MongoDB Scalability and Performance Evaluation - ESLE

Structure

| Module | Description |
| --- | --- |
| gcp | Google Cloud infrastructure module |
| gcp/k8s | MongoDB replica set Kubernetes deployment module |
| gcp/terraform | Terraform GKE cluster module |
| results-usl | Final scalability results with the write concern factor |
| results-usl/experimentUSL1 | Scalability results with write concern w = majority |
| results-usl/experimentUSL2 | Scalability results with write concern w = 1 |
| results-usl/experimentUSL3 | Scalability results with write concern w = majority (6 replica nodes) |
| results-usl/experimentUSL4 | Scalability results with write concern w = 1 (6 replica nodes) |
| results-usl/workload1 | Scalability results for experiments 1 and 2 |
| results-v4 | Final experiment results with 7 factors and 2 levels (5 repetitions) |
| results-v3 | Third experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| results-v2 | Second experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| results-v1 | First experimental results attempt with 7 factors and 2 levels (5 repetitions) |
| workloads | Workload definition files |
| logs | Logging modules |
| runner | Workload runner script |
| Dockerfile | Dockerfile for the runner with YCSB |
| Docker Compose File | Docker Swarm cluster definition |
| concierge | Workload module cleaner |
| janitor | Database cleaner |
| moca | Our own benchmark tool attempt |
| results-aggregator | Tool for aggregating experiment results |
| get_results | Copies experiment results from the cloud environment and aggregates them |
| rebuild_pods | Rebuilds the MongoDB Kubernetes StatefulSet and Service |

7 Factors with 2 Levels

| Factor | Level -1 | Level 1 |
| --- | --- | --- |
| Write Concern (A) | Majority | 1 Ack |
| Replica Writer Thread Range (B) | [0:16] Threads | [0:128] Threads |
| Read Concern (C) | 1 Ack | Majority |
| Read Preference (D) | Primary Preferred | Secondary Preferred |
| Replica Batch Limit (E) | 50 MB | 100 MB |
| Replica Node Configuration (F) | Primary-Secondary-Secondary | Primary-Secondary-Arbiter |
| Chaining (G) | Disabled | Enabled |

Experiment Iteration Structure (Example)

| File | Description |
| --- | --- |
| outputs | YCSB output folder |
| results-throughput.dat | Throughput results |
| results-latency-insert.dat | Insert operation latency results |
| results-latency-read.dat | Read operation latency results |
| results-latency-scan.dat | Scan operation latency results |
| results-latency-update.dat | Update operation latency results |

How to switch between the different factor levels?

Write Concern, Read Concern, Read Preference

All three of these factors are passed directly to our runner script, since they are part of the MongoDB connection string for each client request (see the example connection string after the flag listing below):

./runner.sh <other-flags> -W <write-concern> -R <read-concern> -P <read-preference>

Write Concern Majority (Level -1)

-W majority

Write Concern 1 Ack (Level 1)

-W 1

Read Concern 1 Ack (Level -1)

-R local

Read Concern Majority (Level 1)

-R majority

Read Preference Primary Preferred (Level -1)

-P primaryPreferred

Read Preference Secondary Preferred (Level 1)

-P secondaryPreferred
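
For reference, these flags map onto standard MongoDB connection string options (w, readConcernLevel, readPreference). As a purely illustrative sketch, assuming the replica set name and host names used in the Kubernetes examples below, a URI built for the Level -1 settings of all three factors could look like:

mongodb://mongo-0.mongo:27017,mongo-1.mongo:27017,mongo-2.mongo:27017/?replicaSet=rs0&w=majority&readConcernLevel=local&readPreference=primaryPreferred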

Replica Writer Thread Range, Replica Batch Limit

Both of these factors are set as server parameters in the MongoDB Kubernetes deployment YAML file (a consolidated sketch follows the snippets below).

Replica Writer Thread Range [0:16] Threads (Level -1)

        - "--setParameter"
        - "replWriterMinThreadCount=0"
        - "--setParameter"
        - "replWriterThreadCount=16"

Replica Writer Thread Range [0:128] Threads (Level 1)

        - "--setParameter"
        - "replWriterMinThreadCount=0"
        - "--setParameter"
        - "replWriterThreadCount=128"

Replica Batch Limit 50MB (Level -1)

        - "--setParameter"
        - "replBatchLimitBytes=52428800"

Replica Batch Limit 100MB (Level 1)

        - "--setParameter"
        - "replBatchLimitBytes=104857600"

Replica Node Configuration

Primary-Secondary-Secondary (Level -1)

Create the replica set; for example, a replica set named "rs0" with mongo-0 as the primary and mongo-1 and mongo-2 as secondaries:

kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [{_id: 0, host: "mongo-0.mongo:27017"}, {_id: 1, host: "mongo-1.mongo:27017"}, {_id: 2, host: "mongo-2.mongo:27017"}]});'

Primary-Secondary-Arbiter (Level 1)

Create the replica set; for example, a replica set named "rs0" with mongo-0 as the primary, mongo-1 as a secondary, and mongo-2 as an arbiter:

kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [{_id: 0, host: "mongo-0.mongo:27017"}, {_id: 1, host: "mongo-1.mongo:27017"}, {_id: 2, host: "mongo-2.mongo:27017", arbiterOnly: true}]});'
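
In either configuration, you can verify that every member reached its expected state (PRIMARY, SECONDARY, or ARBITER) before running experiments. This check is our addition, not part of the original setup:

kubectl exec mongo-0 -- mongo --quiet --eval 'rs.status().members.forEach(function(m) { print(m.name, m.stateStr); });'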

Chaining

Chaining Disabled (Level -1)

Create the replica set with the setting chainingAllowed set to false (members array omitted for legibility):

kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [...], settings: {chainingAllowed: false}});'
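
To confirm the setting took effect, an optional check (our addition) reads it back from the replica set configuration:

kubectl exec mongo-0 -- mongo --quiet --eval 'rs.conf().settings.chainingAllowed'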

Chaining Enabled (Level 1)

Create the replica set with the setting chainingAllowed set to true (members array omitted for legibility):

kubectl exec mongo-0 -- mongo --eval 'rs.initiate({_id: "rs0", version: 1, members: [...], settings: {chainingAllowed: true}});'

Then force one of the secondaries to use the other secondary as its sync source. In this example, mongo-0 is the primary and we force mongo-2 to sync from mongo-1:

kubectl exec mongo-2 -- mongo --eval 'db.adminCommand( { replSetSyncFrom: "mongo-1.mongo:27017" });'
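
To verify the override, inspect each member's sync source; the field is syncSourceHost on recent MongoDB versions and syncingTo on older ones. This check is our addition, not part of the original steps:

kubectl exec mongo-0 -- mongo --quiet --eval 'rs.status().members.forEach(function(m) { print(m.name, m.syncSourceHost || m.syncingTo || "-"); });'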

How to run using GCP?

Provision the infrastructure:

cd gcp/terraform
terraform apply
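
When the experiments are finished, the same module tears the cluster down with standard Terraform (not a script provided by this repo):

terraform destroy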

Connect to the cluster using the command-line access snippet from the GCP console, for example:

gcloud container clusters get-credentials <cluster_name> --region <region> --project <project_id>

To watch the creation of the pods (optional):

watch -x kubectl get pods
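
Alternatively, to block until a given pod is ready instead of watching, standard kubectl can wait on the readiness condition (shown here as a convenience, not part of the original steps):

kubectl wait --for=condition=Ready pod/mongo-0 --timeout=300s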

Clean the existing environment (if one exists) and create the StatefulSet and Service. This also initiates the replica set, taking system parameters such as chaining and architecture (PSS or PSA) as booleans:

cd ..
./rebuild_pods.sh -c <chaining_enabled> -a <arbiter_exists>
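
For example, assuming the script accepts true/false literals for its boolean flags (the exact accepted values are defined by the script itself), a PSA deployment with chaining disabled might be:

./rebuild_pods.sh -c false -a true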

How to run the experiments?

Run a pod with our YCSB image hosted on Docker Hub:

kubectl run ycsb --rm -it --image aaugusto11/ycsb -- /bin/bash

Or build a local image of ycsb and run the pod:

cd ../../

docker build -t ycsb:latest .

kubectl run ycsb --rm -it --image ycsb:latest --image-pull-policy=Never -- /bin/bash

Run the script to perform a benchmark experiment:

./runner.sh -w workload1 -e experiment1 -i 5 -c 1 -x throughput -m 16 -n 16 -s 1 -r 5 -W 1 -R majority -P primary

This will run workload1 as the experiment with id 1, performing 5 iterations on the cloud (-c 1), with 16 client threads (from 16 to 16 in increments of 1) and repeating each run of the workload 5 times. Each request uses writeConcern w = 1 and readConcern = majority, reading from the primary.

Run the script to perform a scalability experiment:

./runner.sh -w workload1 -e experiment2 -i 1 -c 1 -x throughput -m 1 -n 100 -s 5 -r 5 -W 1 -R local -P primary

This will run workload1 as the experiment with id 2, performing 1 iteration on the cloud (-c 1), scaling from 1 to 100 client threads in increments of 5 and repeating each run of the workload 5 times. Each request uses writeConcern w = 1 and readConcern = local, reading from the primary.

If running on the cloud, copy the experiments folder from the pod to the local environment:

kubectl cp default/ycsb:/experiments/experiment1 ./results/experiment1 -c ycsb
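
If you ran several experiments, a small shell loop copies them all the same way (the experiment names here are illustrative); the repository's get_results script automates the copy and aggregation as well:

for e in experiment1 experiment2; do
  kubectl cp default/ycsb:/experiments/$e ./results/$e -c ycsb
done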

Authors

Group 01

Team members

| Number | Name | User | Email |
| --- | --- | --- | --- |
| 90704 | Andre Augusto | https://github.com/AndreAugusto11 | [email protected] |
| 90744 | Lucas Vicente | https://github.com/WARSKELETON | [email protected] |
| 90751 | Manuel Mascarenhas | https://github.com/Mascarenhas12 | [email protected] |
