MongoDB Scalability and Performance Evaluation - ESLE

Structure

| Module | Description |
| --- | --- |
| gcp | Google Cloud infrastructure module |
| gcp/k8s | MongoDB replica set Kubernetes deployment module |
| gcp/terraform | Terraform GKE cluster module |
| results-usl | Final scalability results with the write concern factor |
| results-usl/experimentUSL1 | Scalability results with write concern w = majority |
| results-usl/experimentUSL2 | Scalability results with write concern w = 1 |
| results-usl/experimentUSL3 | Scalability results with write concern w = majority (6 replica nodes) |
| results-usl/experimentUSL4 | Scalability results with write concern w = 1 (6 replica nodes) |
| results-usl/workload1 | Scalability results for experiments 1 and 2 |
| results-v4 | Final experiment results with 7 factors and 2 levels (5 repetitions) |
| results-v3 | Third experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| results-v2 | Second experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| results-v1 | First experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| workloads | Workload definition files |
| logs | Logging modules |
| runner | Workload runner script |
| Dockerfile | Dockerfile for the runner with YCSB |
| Docker Compose File | Docker Swarm cluster definition |
| concierge | Workload module cleaner |
| janitor | Database cleaner |
| moca | Our own benchmark tool attempt |
| results-aggregator | Tool for aggregating experiment results |
| get_results | Copies experiment results from the cloud environment and aggregates them |
| rebuild_pods | Rebuilds the MongoDB Kubernetes StatefulSet and Service |

7 Factors with 2 Levels

| Factor | Level -1 | Level 1 |
| --- | --- | --- |
| Write Concern (A) | Majority | 1 Ack |
| Replica Writer Thread Range (B) | [0:16] Threads | [0:128] Threads |
| Read Concern (C) | 1 Ack | Majority |
| Read Preference (D) | Primary Preferred | Secondary Preferred |
| Replica Batch Limit (E) | 50 MB | 100 MB |
| Replica Node Configuration (F) | Primary-Secondary-Secondary | Primary-Secondary-Arbiter |
| Chaining (G) | Disabled | Enabled |

Experiment Iteration Structure (Example)

| File | Description |
| --- | --- |
| outputs | YCSB output folder |
| results-throughput.dat | Throughput results |
| results-latency-insert.dat | Insert operation latency results |
| results-latency-read.dat | Read operation latency results |
| results-latency-scan.dat | Scan operation latency results |
| results-latency-update.dat | Update operation latency results |

How to switch between the different factor levels?

Write Concern, Read Concern, Read Preference

All three of these factors are passed directly to our runner script, since they are part of the MongoDB connection string for each client request (see the example connection string after the flag listing below):

./runner.sh <other-flags> -W <write-concern> -R <read-concern> -P <read-preference>

Write Concern Majority (Level -1)

-W majority

Write Concern 1 Ack (Level 1)

-W 1

Read Concern 1 Ack (Level -1)

-R local

Read Concern Majority (Level 1)

-R majority

Read Preference Primary Preferred (Level -1)

-P primaryPreferred

Read Preference Secondary Preferred (Level 1)

-P secondaryPreferred
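
For reference, these flags map onto standard MongoDB connection string options (w, readConcernLevel, readPreference). As a purely illustrative sketch, assuming the replica set name and host names used in the Kubernetes examples below, a URI built for the Level -1 settings of all three factors could look like:

mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017,mongo-2.mongo:27017/?replicaSet=rs0&w=majority&readConcernLevel=local&readPreference=primaryPreferred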

Replica Writer Thread Range, Replica Batch Limit

Both of these factors are set as server parameters in the MongoDB Kubernetes deployment YAML file (a consolidated sketch follows the snippets below).

Replica Writer Thread Range [0:16] Threads (Level -1)

        - "--setParameter"
        - "replWriterMinThreadCount=0"
        - "--setParameter"
        - "replWriterThreadCount=16"

Replica Writer Thread Range [0:128] Threads (Level 1)

        - "--setParameter"
        - "replWriterMinThreadCount=0"
        - "--setParameter"
        - "replWriterThreadCount=128"

Replica Batch Limit 50MB (Level -1)

        - "--setParameter"
        - "replBatchLimitBytes=52428800"

Replica Batch Limit 100MB (Level 1)

        - "--setParameter"
        - "replBatchLimitBytes=104857600"

Replica Node Configuration

Primary-Secondary-Secondary (Level -1)

Create the replica set; for example, a replica set named "rs0" with mongo-0 as the primary and mongo-1 and mongo-2 as secondaries:

kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [{_id: 0, host: "mongo-0.mongo:27017"}, {_id: 1, host: "mongo-1.mongo:27017"}, {_id: 2, host: "mongo-2.mongo:27017"}]});'

Primary-Secondary-Arbiter (Level 1)

Create the replica set; for example, a replica set named "rs0" with mongo-0 as the primary, mongo-1 as a secondary, and mongo-2 as an arbiter:

kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [{_id: 0, host: "mongo-0.mongo:27017"}, {_id: 1, host: "mongo-1.mongo:27017"}, {_id: 2, host: "mongo-2.mongo:27017", arbiterOnly: true}]});'
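
In either configuration, you can verify that every member reached its expected state (PRIMARY, SECONDARY, or ARBITER) before running experiments. This check is our addition, not part of the original setup:

kubectl exec mongo-0 -- mongo --quiet --eval 'rs.status().members.forEach(function(m) { print(m.name, m.stateStr); });'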

Chaining

Chaining Disabled (Level -1)

Create the replica set with the setting chainingAllowed set to false (members array omitted for legibility):

kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [...], settings: {chainingAllowed: false}});'
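
To confirm the setting took effect, an optional check (our addition) reads it back from the replica set configuration:

kubectl exec mongo-0 -- mongo --quiet --eval 'rs.conf().settings.chainingAllowed'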

Chaining Enabled (Level 1)

Create the replica set with the setting chainingAllowed set to true (members array omitted for legibility):

kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [...], settings: {chainingAllowed: true}});'

Then force one of the secondaries to use the other secondary as its sync source. In this example, mongo-0 is the primary and we force mongo-2 to sync from mongo-1:

kubectl exec mongo-2 -- mongo --eval 'db.adminCommand( { replSetSyncFrom: "mongo-1.mongo:27017" });'
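
To verify the override, inspect each member's sync source; the field is syncSourceHost on recent MongoDB versions and syncingTo on older ones. This check is our addition, not part of the original steps:

kubectl exec mongo-0 -- mongo --quiet --eval 'rs.status().members.forEach(function(m) { print(m.name, m.syncSourceHost || m.syncingTo || "-"); });'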

How to run using GCP?

Provision the infrastructure:

cd gcp/terraform
terraform apply
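
When the experiments are finished, the same module tears the cluster down with standard Terraform (not a script provided by this repo):

terraform destroy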

Connect to the cluster using the command-line access snippet from the GCP console, for example:

gcloud container clusters get-credentials <cluster_name> --region <region> --project <project_id>

To watch the creation of the pods (optional):

watch -x kubectl get pods
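
Alternatively, to block until a given pod is ready instead of watching, standard kubectl can wait on the readiness condition (shown here as a convenience, not part of the original steps):

kubectl wait --for=condition=Ready pod/mongo-0 --timeout=300s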

Clean the existing environment (if one exists) and create the StatefulSet and Service. This also initiates the replica set, taking system parameters such as chaining and architecture (PSS or PSA) as booleans:

cd ..
./rebuild_pods.sh -c <chaining_enabled> -a <arbiter_exists>
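
For example, assuming the script accepts true/false literals for its boolean flags (the exact accepted values are defined by the script itself), a PSA deployment with chaining disabled might be:

./rebuild_pods.sh -c false -a true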

How to run the experiments?

Run a pod with our YCSB image hosted on Docker Hub:

kubectl run ycsb --rm -it --image aaugusto11/ycsb -- /bin/bash

Or build a local image of ycsb and run the pod:

cd ../../

docker build -t ycsb:latest .

kubectl run ycsb --rm -it --image ycsb:latest --image-pull-policy=Never -- /bin/bash

Run the script to perform a benchmark experiment:

./runner.sh -w workload1 -e experiment1 -i 5 -c 1 -x throughput -m 16 -n 16 -s 1 -r 5 -W 1 -R majority -P primary

This will run workload1 as the experiment with id 1, performing 5 iterations on the cloud (-c 1), with 16 client threads (from 16 to 16 in increments of 1) and repeating each run of the workload 5 times. Each request uses writeConcern w = 1 and readConcern = majority, reading from the primary.

Run the script to perform a scalability experiment:

./runner.sh -w workload1 -e experiment2 -i 1 -c 1 -x throughput -m 1 -n 100 -s 5 -r 5 -W 1 -R local -P primary

This will run workload1 as the experiment with id 2, performing 1 iteration on the cloud (-c 1), scaling from 1 to 100 client threads in increments of 5 and repeating each run of the workload 5 times. Each request uses writeConcern w = 1 and readConcern = local, reading from the primary.

If running on the cloud, copy the experiments folder from the pod to the local environment:

kubectl cp default/ycsb:/experiments/experiment1 ./results/experiment1 -c ycsb
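
If you ran several experiments, a small shell loop copies them all the same way (the experiment names here are illustrative); the repository's get_results script automates the copy and aggregation as well:

for e in experiment1 experiment2; do
  kubectl cp default/ycsb:/experiments/$e ./results/$e -c ycsb
done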

Authors

Group 01

Team members

| Number | Name | User | Email |
| --- | --- | --- | --- |
| 90704 | Andre Augusto | https://github.com/AndreAugusto11 | [email protected] |
| 90744 | Lucas Vicente | https://github.com/WARSKELETON | [email protected] |
| 90751 | Manuel Mascarenhas | https://github.com/Mascarenhas12 | [email protected] |
