Guide on how to use TPCC benchmark #190

thomasloubejac · 2018-11-06T15:02:40Z

During a randomly generate session:

It seems that the database performance decreases over time
and starts converging around 100 txn/s.

During a tuning session (the celery worker shows recommendations):

The database performance rarely goes over 100 txn/s.

So I'm wondering if I did anything wrong?
I had 500 random configurations tested before switching to the tuning session.

Am I supposed to reload the database? If so, how often?

thomasloubejac · 2018-11-07T13:17:39Z

I let it run all night; it dropped to 65 txn/s over 1000 tuning session loops.

nabti · 2019-01-14T08:36:53Z

Hi @thomasloubejac,
Sorry, this is not an answer to your question. I was just wondering if you could explain how to use it, step by step.

Thank you for your help

thomasloubejac · 2019-01-15T09:07:15Z

What I give you here worked for me on postgres-9.6; you'll have to adapt it for other DBMSs.

Clone this repo on your client and server machines.

On your client Machine

Clone the oltpbenchmark repo on your client machine.
Make sure you have installed every package needed:
apt-get install git openjdk-8-jdk-headless gradle fabric
Make sure that you set up an account for ottertune on your client's database, give it a password, and grant it the right to create databases.
Also make sure you can connect to that account directly from the user you will be running the program as. (I had to add -h localhost to every psql command in the ottertune/client/driver/fabfile.py file.) To connect, this is the command I have to use: psql -U user -d database -h localhost -W
Edit the configuration files:

ottertune/client/driver/driver_config.json
ottertune/client/controller/config/{whatever suits your case} (this path has to appear in controller_config in driver_config.json)
For example I have: "database_url": "jdbc:postgresql://localhost:5432/dbtest"
dbtest being the database I will run the benches on.
oltpbench/config/{whatever benchmark you want to run} (this path has to appear in oltpbench_config in driver_config.json)
Fill these in with the right information (user and password always refer to the account you made on your dbms, never to your current user and passwd).

See a later post below if you want further information on the configuration files.
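A wrong path in driver_config.json tends to show up only once fab is already running, so it can help to check the file first. The following is a minimal sketch and not part of ottertune: the function name check_driver_config is made up for illustration, and the key names are the ones from the driver_config.json posted later in this thread. It also assumes that the controller's database_url should end with the database_name, which matches the example files in this thread but may not hold for every setup.

```python
import json
import os


def check_driver_config(path):
    """Return a list of problems found in a driver_config.json-style file."""
    with open(path) as f:
        cfg = json.load(f)

    problems = []
    # Key names as they appear in the driver_config.json posted in this thread.
    for key in ("oltpbench_home", "oltpbench_config", "controller_config"):
        value = cfg.get(key)
        if not value:
            problems.append("missing key: " + key)
        elif not os.path.exists(value):
            problems.append(key + " points to a missing path: " + value)

    # Assumption: the controller's JDBC URL should end with the database name.
    controller_path = cfg.get("controller_config")
    name = cfg.get("database_name")
    if name and controller_path and os.path.isfile(controller_path):
        with open(controller_path) as f:
            url = json.load(f).get("database_url", "")
        if not url.endswith("/" + name):
            problems.append("database_url does not end with /" + name)
    return problems
```

Calling check_driver_config("driver_config.json") from the driver directory returns an empty list when nothing looks off.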

Build the things
/{path to oltpbench}/oltpbench/$ ant
to build the benchmark (note that you don't have to rebuild it if you only modify the XML configurations)
/{path to ottertune}/ottertune/client/controller/$ gradle build
to make sure the controller works OK.
In /{path to ottertune}/ottertune/client/driver/ run

fab create_database (you don't want to run benches on a non-existent database... I did it, so don't bother trying)
fab load_oltpbench (same story lol)

The last command should take no more than 3 minutes.
The output should look like this:

ottertune_dev/ottertune/client/driver$ fab create_database && fab load_oltpbench
[localhost] Executing task 'create_database'
[localhost] local: PGPASSWORD=test123 createdb -e dbtest -U djamel -h localhost
SELECT pg_catalog.set_config('search_path', '', false)
CREATE DATABASE dbtest;

Done.
[localhost] Executing task 'load_oltpbench'
[localhost] local: ./oltpbenchmark -b tpcc -c /ottertune_dev/oltpbench/config/tpcc_config_postgres.xml --create=true --load=true
07:53:42,865 (DBWorkload.java:262) INFO  - ======================================================================

Benchmark:     TPCC {com.oltpbenchmark.benchmarks.tpcc.TPCCBenchmark}
Configuration: /ottertune_dev/oltpbench/config/tpcc_config_postgres.xml
Type:          POSTGRES
Driver:        org.postgresql.Driver
URL:           jdbc:postgresql://localhost:5432/dbtest
Isolation:     TRANSACTION_READ_COMMITTED
Scale Factor:  70.0

07:53:42,869 (DBWorkload.java:263) INFO  - ======================================================================
07:53:42,883 (DBWorkload.java:522) INFO  - Creating new TPCC database...
07:53:43,054 (DBWorkload.java:524) INFO  - Finished!
07:53:43,055 (DBWorkload.java:525) INFO  - ======================================================================
07:53:43,055 (DBWorkload.java:548) INFO  - Loading data into TPCC database...
08:24:26,300 (DBWorkload.java:550) INFO  - Finished!
08:24:26,300 (DBWorkload.java:551) INFO  - ======================================================================
08:24:26,300 (DBWorkload.java:591) INFO  - Skipping benchmark workload execution

Done.

At this point you need to have your server machine set up. Make sure everything works with:

fab loop

Once you have made it this far without errors, you have to change the oltpbench configuration.
Andy and Dana advise using this:

<scalefactor>200</scalefactor>

    <!-- The workload -->
    <terminals>50</terminals>
    <works>
        <work>
          <time>200</time>
          <rate>unlimited</rate>
          <weights>45,43,4,4,4</weights>
        </work>
     </works>

So put these values in your oltpbench/config/{yourconfiguration}. Then, in the driver directory:

fab drop_database
fab create_database
fab load_oltpbench

If it doesn't give errors straight away, it worked and you're good to go. This might take a while (27 min for a less beefy configuration in my case). Note that the beefier the config, the longer it takes.
If it works for you, great. It didn't for me, so I had to modify the configuration:
too high a scalefactor prevented oltpbench from working for some reason (the limiting value was 95 in my case).
Drop it down and repeat until fab load_oltpbench doesn't give you errors.
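Since the scalefactor may have to be lowered several times, editing the XML by script can save some typing. This is only a sketch under assumptions: set_scalefactor is a made-up helper, it expects the config to be a well-formed XML document with a single root element, and the standard-library parser used here drops XML comments when it rewrites the text.

```python
import xml.etree.ElementTree as ET


def set_scalefactor(xml_text, value):
    """Return xml_text with the first <scalefactor> element set to value."""
    root = ET.fromstring(xml_text)
    node = next(root.iter("scalefactor"), None)
    if node is None:
        raise ValueError("no <scalefactor> element found")
    node.text = str(value)
    # Note: comments from the original file are not preserved by this parser.
    return ET.tostring(root, encoding="unicode")
```

After each change the database still has to be rebuilt with fab drop_database, fab create_database and fab load_oltpbench, as described above.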

Then

fab loop

This is to make sure that you have enough time in your workload.
If you don't, it will give you something like:

ottertune/client/driver$ fab loop
[localhost] Executing task 'loop'
[localhost] local: sync; sudo bash -c "echo 1 > /proc/sys/vm/drop_caches"
[sudo] password for thomas: 
[localhost] local: sudo service postgresql restart
[localhost] local: df -h /dev/sda1
2019-01-15 08:57:20,582 [INFO]  Current Disk Usage: 22%
[localhost] local: ./oltpbenchmark -b tpcc -c /ottertune_dev/oltpbench/config/tpcc_config_postgres.xml --execute=true -s 5 -o outputfile > /ottertune_dev/ottertune/client/driver/oltp.log 2>&1 &
[localhost] local: sudo gradle run -PappArgs="-c /ottertune_dev/ottertune/client/controller/config/sample_postgres_config.json -d output/" --no-daemon
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:run
09:02:56,188 (Main.java:99) INFO  - Experiment time is set to: -1
09:02:56,210 (Main.java:105) INFO  - Experiment output directory is set to: output/
09:02:57,194 (Main.java:134) INFO  - First collection of metrics before experiment
> Building 75% > :run[localhost] local: sudo kill -2 29606
09:02:57,673 (Main.java:177) INFO  - Starting the experiment ...
09:02:57,692 (Main.java:191) INFO  - Done running the experiment
09:02:57,696 (Main.java:218) INFO  - Second collection of metrics after experiment
09:02:58,271 (Main.java:243) WARN  - Failed to upload results from the controller

BUILD SUCCESSFUL

Total time: 5 mins 37.18 secs
[localhost] local: python3 ../../server/website/script/upload/upload.py            ../controller/output/ 23Y4E4QZ610CFBS72M6F http://vslike:8000/new_result/
b'\n<!DOCTYPE html>\n<html lang="en">\n<head>\n  <meta http-equiv="content-type" content="text/html; charset=utf-8">\n  <meta name="robots" content="NONE,NOARCHIVE">\n  <title>ZeroDivisionError at /new_result/</title>\n
[...]

and a really long HTML file. Note that <title>ZeroDivisionError at /new_result/</title> means that the bench didn't even have time to run. I suspect there is a problem with the timing of the actions in this function. Note that a task is killed before the experiment has even started.
If it worked, check ottertune/client/controller/output/summary.json and look at observation_time. This is the effective duration of the benchmark.

ottertune/client/driver$ cat ../controller/output/summary.json | jq
{
  "start_time": 1547542977673,
  "end_time": 1547542977692,
  "observation_time": 0,
  "database_type": "postgres",
  "database_version": "9.6",
  "workload_name": "workload_name"
}
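The same check can be done from a script before uploading anything. A minimal sketch, assuming start_time and end_time are epoch milliseconds (their last three digits match the log timestamps 09:02:57,673 and 09:02:57,692 above); check_summary and the min_observation threshold are made up for illustration, and the unit of observation_time is not confirmed here.

```python
import json


def check_summary(summary_text, min_observation=1):
    """Return (ok, elapsed_ms) for the text of a controller summary.json."""
    summary = json.loads(summary_text)
    # Assumption: start_time and end_time are epoch milliseconds.
    elapsed_ms = summary["end_time"] - summary["start_time"]
    # A run that never really started reports an observation_time of 0.
    ok = summary["observation_time"] >= min_observation
    return ok, elapsed_ms
```

For the summary.json shown above this gives (False, 19): the experiment lasted 19 ms, so the benchmark did not really run.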

Repeat the steps until you find the maximum values that your configuration will support.
Then you can choose the values you want. I cannot advise you on what values to use;
I took the scalefactor down to 70 because otherwise my server wouldn't run it fast enough.
The number of terminals doesn't influence whether oltpbench works or not, in my experience.
Here is the configuration I use, for instance:

<scalefactor>70</scalefactor>

    <!-- The workload -->
    <terminals>50</terminals>
    <works>
        <work>
          <time>320</time>
          <rate>unlimited</rate>
          <weights>45,43,4,4,4</weights>
        </work>
     </works>

On your server machine

Edit ottertune/server/website/script/installation/bootstrap.sh.
Some of the paths in it are not correct. Either fix the file or run the instructions yourself. Note that this is the file where you'll find the credentials to log in to the website's database. This is useful if you have to make modifications and don't want to reset the website every time you do.
Edit your knobs in ottertune/server/website/website/fixtures/{knobs for your dbms}
I found the most important ones were:
global.bgwriter_delay
global.bgwriter_lru_maxpages
global.checkpoint_completion_target
global.checkpoint_timeout
global.deadlock_timeout
global.default_statistics_target
global.effective_cache_size
global.effective_io_concurrency
global.random_page_cost
global.shared_buffers
global.work_mem
global.maintenance_work_mem

I removed every other knob because it either makes the process slower or it is one you shouldn't touch (according to my DBA).
Reduce the search range by changing default and maxval. I cannot help you with that either, but see my fixture file in a later post below.
Note that you can find the units and vartypes in ottertune/server/website/website/types.py; this will help you know what the values are or should be.

Note that the tuning session will generate configurations within the range [default, maxval].
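When editing default and maxval by hand it is easy to leave a default outside its own range. The sketch below lists such knobs; it is not part of ottertune, the function name is made up, and it assumes the fixture is a JSON list of entries with a "fields" object holding name, minval, default and maxval as numeric strings (as in the fixture posted later in this thread). Knobs without a numeric range are skipped.

```python
import json


def knobs_with_default_out_of_range(fixture_text):
    """Return names of knobs whose default is outside [minval, maxval]."""
    bad = []
    for entry in json.loads(fixture_text):
        fields = entry["fields"]
        # Skip knobs that have no numeric range (for example enum-like knobs).
        if fields.get("minval") is None or fields.get("maxval") is None:
            continue
        low = float(fields["minval"])
        default = float(fields["default"])
        high = float(fields["maxval"])
        if not low <= default <= high:
            bad.append(fields["name"])
    return bad
```

An empty result means every ranged knob has a default that sits inside its range.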

You're good to go:
Run the website and celery as explained here: https://github.com/cmu-db/ottertune/wiki/Getting-Started-ottertune
If you have to make modifications to knobs on the go, you can do it in the ottertune database.
If you make modifications to knobs that reduce the range, you might get errors saying that values are out of range. I didn't try it, but you could make a new session and use the upload_batch feature on your client side to upload all the results you've had so far. Don't forget to leave out the ones with out-of-range values.

Last

Client side: fab run_loops:1000 (during a randomly generate session, on advice from the team)
Server side: when you decide to switch to tuning_session you have to run celerybeat. If you don't, it won't work. There's a minimum amount of data to have before you can run a tuning session.
Side note: it is useless to run celerybeat during a randomly generate session.

Sometimes I had to restart the website and the celery worker before it accepted running the tuning session.

PS: I haven't even succeeded in getting results yet; I'm still trying to figure things out.

nabti · 2019-01-15T09:31:42Z

Thank you very much @thomasloubejac for taking the time to explain this step by step. I'll try to test this for my case, which is MySQL on Percona Server.

Thank you again

thomasloubejac · 2019-01-15T10:07:18Z

Note: make sure to mkdir the save_path directory.
ottertune/client/driver$ cat driver_config.json | jq
{
  "database_type": "postgres",
  "database_name": "dbtest",
  "database_disk": "/dev/sda1",
  "database_conf": "/etc/postgresql/9.6/main/postgresql.conf",
  "username": "thomas",
  "password": "thomas",
  "oltpbench_home": "/ottertune_dev/oltpbench",
  "oltpbench_config": "/ottertune_dev/oltpbench/config/tpcc_config_postgres.xml",
  "oltpbench_workload": "tpcc",
  "oltpbench_log": "/ottertune_dev/ottertune/client/driver/oltp.log",
  "controller_config": "/ottertune_dev/ottertune/client/controller/config/sample_postgres_config.json",
  "save_path": "/ottertune_dev/results",
  "upload_url": "http://your_server_machine_IP_Address:8000",
  "upload_code": "23Y4E4QZ610CFBS72M6F"
}
ottertune/client/controller$ cat config/sample_postgres_config.json | jq
{
  "database_type": "postgres",
  "database_url": "jdbc:postgresql://localhost:5432/dbtest",
  "username": "thomas",
  "password": "thomas",
  "upload_code": "DEPRECATED",
  "upload_url": "DEPRECATED",
  "workload_name": "workload_name"
}
oltpbench$ cat config/tpcc_config_postgres.xml

<!-- Connection details -->
<dbtype>postgres</dbtype>
<driver>org.postgresql.Driver</driver>
<DBUrl>jdbc:postgresql://localhost:5432/dbtest</DBUrl>
<username>thomas</username>
<password>thomas</password>
<isolation>TRANSACTION_READ_COMMITTED</isolation>

<!-- Scale factor is the number of warehouses in TPCC -->
<scalefactor>70</scalefactor>

<!-- The workload -->
<terminals>50</terminals>
<works>
    <work>
        <time>320</time>
        <rate>unlimited</rate>
        <weights>45,43,4,4,4</weights>
    </work>
</works>

<!-- TPCC specific -->
<transactiontypes>
    <transactiontype>
        <name>NewOrder</name>
    </transactiontype>
    <transactiontype>
        <name>Payment</name>
    </transactiontype>
    <transactiontype>
        <name>OrderStatus</name>
    </transactiontype>
    <transactiontype>
        <name>Delivery</name>
    </transactiontype>
    <transactiontype>
        <name>StockLevel</name>
    </transactiontype>
</transactiontypes>
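To see which weight goes with which transaction, the two lists in the config can be paired up. This is a sketch under two assumptions that the thread does not confirm: that the comma-separated weights are matched by position to the transactiontype entries, and that the full file wraps the elements shown above in a single root element. The function name weights_by_transaction is made up for illustration.

```python
import xml.etree.ElementTree as ET


def weights_by_transaction(xml_text):
    """Pair each <transactiontype> name with the weight at the same position."""
    root = ET.fromstring(xml_text)
    names = [t.findtext("name", "").strip() for t in root.iter("transactiontype")]
    weights_node = root.find(".//weights")
    if weights_node is None or not weights_node.text:
        raise ValueError("no <weights> element found")
    weights = [float(w) for w in weights_node.text.split(",")]
    if len(names) != len(weights):
        raise ValueError("number of weights and transaction types differ")
    # Assumption: weights are positional, in the order the types are listed.
    return dict(zip(names, weights))
```

Under those assumptions, the configuration above pairs NewOrder with 45, Payment with 43, and the remaining three types with 4 each, which adds up to 100.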

thomasloubejac · 2019-01-15T10:08:51Z

ottertune/server/website/website/fixtures/$ cat postgres9-6_knobs.json

[
    {
        "fields": {
            "resource": "4",
            "vartype": "2",
            "category": "Resource Usage / Background Writer",
            "tunable": true,
            "description": "",
            "dbms": 1,
            "default": "200",
            "context": "sighup",
            "unit": "2",
            "summary": "Background writer sleep time between rounds.",
            "name": "global.bgwriter_delay",
            "scope": "global",
            "minval": "10",
            "maxval": "10000",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "4",
            "vartype": "2",
            "category": "Resource Usage / Background Writer",
            "tunable": true,
            "description": "",
            "dbms": 1,
            "default": "100",
            "context": "sighup",
            "unit": "3",
            "summary": "Background writer maximum number of LRU pages to flush per round.",
            "name": "global.bgwriter_lru_maxpages",
            "scope": "global",
            "minval": "0",
            "maxval": "1000",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "4",
            "vartype": "3",
            "category": "Write-Ahead Log / Checkpoints",
            "tunable": true,
            "description": "",
            "dbms": 1,
            "default": "0.5",
            "context": "sighup",
            "unit": "3",
            "summary": "Time spent flushing dirty buffers during checkpoint, as fraction of checkpoint interval.",
            "name": "global.checkpoint_completion_target",
            "scope": "global",
            "minval": "0.3",
            "maxval": "1.0",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "4",
            "vartype": "2",
            "category": "Write-Ahead Log / Checkpoints",
            "tunable": true,
            "description": "",
            "dbms": 1,
            "default": "300000",
            "context": "sighup",
            "unit": "2",
            "summary": "Sets the maximum time between automatic WAL checkpoints.",
            "name": "global.checkpoint_timeout",
            "scope": "global",
            "minval": "30000",
            "maxval": "86400000",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "4",
            "vartype": "2",
            "category": "Lock Management",
            "tunable": true,
            "description": "",
            "dbms": 1,
            "default": "1000",
            "context": "superuser",
            "unit": "2",
            "summary": "Sets the time to wait on a lock before checking for deadlock.",
            "name": "global.deadlock_timeout",
            "scope": "global",
            "minval": "1",
            "maxval": "2147483647",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "4",
            "vartype": "2",
            "category": "Query Tuning / Other Planner Options",
            "tunable": true,
            "description": "This applies to table columns that have not had a column-specific target set via ALTER TABLE SET STATISTICS.",
            "dbms": 1,
            "default": "100",
            "context": "user",
            "unit": "3",
            "summary": "Sets the default statistics target.",
            "name": "global.default_statistics_target",
            "scope": "global",
            "minval": "100",
            "maxval": "400",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "1",
            "vartype": "2",
            "category": "Query Tuning / Planner Cost Constants",
            "tunable": true,
            "description": "That is, the portion of the kernel's disk cache that will be used for PostgreSQL data files. This is measured in disk pages, which are normally 8 kB each.",
            "dbms": 1,
            "default": "4294967296",
            "context": "user",
            "unit": "1",
            "summary": "Sets the planner's assumption about the size of the disk cache.",
            "name": "global.effective_cache_size",
            "scope": "global",
            "minval": "8192",
            "maxval": "12884901888",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "4",
            "vartype": "2",
            "category": "Resource Usage / Asynchronous Behavior",
            "tunable": true,
            "description": "For RAID arrays, this should be approximately the number of drive spindles in the array.",
            "dbms": 1,
            "default": "1",
            "context": "user",
            "unit": "3",
            "summary": "Number of simultaneous requests that can be handled efficiently by the disk subsystem.",
            "name": "global.effective_io_concurrency",
            "scope": "global",
            "minval": "1",
            "maxval": "50",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "4",
            "vartype": "3",
            "category": "Query Tuning / Planner Cost Constants",
            "tunable": true,
            "description": "",
            "dbms": 1,
            "default": "4.0",
            "context": "user",
            "unit": "3",
            "summary": "Sets the planner's estimate of the cost of a nonsequentially fetched disk page.",
            "name": "global.random_page_cost",
            "scope": "global",
            "minval": "1.0",
            "maxval": "20",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "1",
            "vartype": "2",
            "category": "Resource Usage / Memory",
            "tunable": true,
            "description": "",
            "dbms": 1,
            "default": "8388608",
            "context": "postmaster",
            "unit": "1",
            "summary": "Sets the number of shared memory buffers used by the server.",
            "name": "global.shared_buffers",
            "scope": "global",
            "minval": "131072",
            "maxval": "10737418240",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "1",
            "vartype": "2",
            "category": "Write-Ahead Log / Settings",
            "tunable": true,
            "description": "",
            "dbms": 1,
            "default": "32768",
            "context": "postmaster",
            "unit": "1",
            "summary": "Sets the number of disk-page buffers in shared memory for WAL.",
            "name": "global.wal_buffers",
            "scope": "global",
            "minval": "32768",
            "maxval": "16777216",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "resource": "1",
            "vartype": "2",
            "category": "Resource Usage / Memory",
            "tunable": true,
            "description": "This much memory can be used by each internal sort operation and hash table before switching to temporary disk files.",
            "dbms": 1,
            "default": "4194304",
            "context": "user",
            "unit": "1",
            "summary": "Sets the maximum memory to be used for query workspaces.",
            "name": "global.work_mem",
            "scope": "global",
            "minval": "65536",
            "maxval": "17179869184",
            "enumvals": null
        },
        "model": "website.KnobCatalog"
    },
    {
        "fields": {
            "category": "Resource Usage / Memory",
            "maxval": "17179869184",
            "dbms": 1,
            "name": "global.maintenance_work_mem",
            "minval": "1048576",
            "default": "67108864",
            "tunable": true,
            "enumvals": null,
            "vartype": "2",
            "context": "user",
            "scope": "global",
            "summary": "Sets the maximum memory to be used for maintenance operations.",
            "unit": "1",
            "description": "This includes operations such as VACUUM and CREATE INDEX.",
            "resource": "1"
        },
        "model": "website.KnobCatalog"
    }
]

thomasloubejac · 2019-01-15T10:14:34Z

Ok then I hope this will help you!

bohanjason · 2019-01-17T07:59:24Z

Now we have an end-to-end example using OtterTune to tune Postgres 9.6 on an AWS instance. It may be a good example to get started with.

nabti · 2019-01-17T08:03:01Z

@bohanjason Thank you very much

bohanjason · 2019-01-17T14:16:43Z

@nabti No problem, sorry for the late reply.

molujii · 2019-12-18T09:52:27Z

@thomasloubejac can you make a video for this? It would be a great help, because for the last 1.5 months I have been trying to run it and still can't.
One more question: can you run this on VMware Linux?

molujii · 2019-12-30T10:56:36Z

@thomasloubejac @nabti were you able to run the whole project end-to-end?

neonadia mentioned this issue Jul 11, 2019

Cold Start Doesn't Work #206

Closed
