Add context storage benchmarking #144

Merged
merged 119 commits into dev from feat/db-benchmark on Sep 28, 2023
119 commits
a870dd2
add context storage benchmarking
RLKRo Jun 7, 2023
afcdce9
add all dff.utils modules to doc
RLKRo Jun 7, 2023
365e996
fix doc
RLKRo Jun 7, 2023
4d6a23f
add support for benchmarking multiple context storages at a time
RLKRo Jun 7, 2023
5815511
add type annotations; option to pass multiple context storages; expor…
RLKRo Jun 8, 2023
17b11fc
add option to get results as a dataframe
RLKRo Jun 8, 2023
2c83245
format
RLKRo Jun 8, 2023
e8564b1
add tutorial for benchmark
RLKRo Jun 9, 2023
a07588c
update benchmark dependencies
RLKRo Jun 9, 2023
258be60
update benchmark utils
RLKRo Jun 19, 2023
b970e83
add benchmark_dbs and benchmark_streamlit
RLKRo Jun 20, 2023
3c42227
update dependencies
RLKRo Jun 20, 2023
15795df
use python3.8 compatible typing
RLKRo Jun 20, 2023
13cc2d7
return ydb & reorder benchmark sets
RLKRo Jun 20, 2023
25c5b7e
add more benchmark cases
RLKRo Jun 21, 2023
2df1054
improve diff viewing
RLKRo Jun 21, 2023
cd0eabf
reduce dialog len for extreme cases
RLKRo Jun 21, 2023
185d950
change benchmark format
RLKRo Jun 21, 2023
c16cd28
change benchmark format: generic factory
RLKRo Jun 21, 2023
5de5a83
bugfix: repeated format update
RLKRo Jun 21, 2023
8fc62a8
move generic benchmark tools to utils
RLKRo Jun 21, 2023
1f450e5
add average read+update column
RLKRo Jun 21, 2023
4404ef3
add mass compare tab
RLKRo Jun 21, 2023
5fb2e69
update extreme cases params
RLKRo Jun 21, 2023
4c51473
set step_dialog_len to 1 by default
RLKRo Jun 22, 2023
0a35013
print exception message during benchmark
RLKRo Jun 23, 2023
b6a38a1
store read times under supposed dialog len
RLKRo Jun 23, 2023
eda7ccc
set streamlit version in dependencies
RLKRo Jun 26, 2023
60bf152
add exist_ok flag for saving to file
RLKRo Jun 26, 2023
ec19a30
move average results calculations from streamlit to benchmark utils
RLKRo Jun 26, 2023
8abc60f
rename lengths to dimensions
RLKRo Jun 27, 2023
c65ca37
remove context checking comments
RLKRo Jun 27, 2023
fb999e3
move partial benchmarking logic to partial benchmark file
RLKRo Jun 27, 2023
e413025
add partial file saving
RLKRo Jun 27, 2023
827861b
update report function
RLKRo Jun 27, 2023
a25ee56
add BenchmarkConfig class to avoid repetition of parameters
RLKRo Jun 27, 2023
0424b43
revert add partial file saving
RLKRo Jun 28, 2023
e6b88b7
fix benchmark name
RLKRo Jun 28, 2023
f68d71c
update streamlit for new benchmark_config
RLKRo Jun 28, 2023
0fe49a4
update benchmark_new_format.py
RLKRo Jun 28, 2023
9e8f310
fix bug with delisting benchmarks
RLKRo Jun 28, 2023
7494772
not include zero-point in graphs
RLKRo Jun 28, 2023
99a4e36
move get_context_updater to BenchmarkConfig
RLKRo Jun 29, 2023
f787577
add benchmark_dir variable for benchmark_dbs.py
RLKRo Jun 29, 2023
1dbb7f9
add get_context method to BenchmarkConfig
RLKRo Jun 29, 2023
8e2ab54
change update benchmarking logic
RLKRo Jun 29, 2023
8741334
return empty dicts as update_times if context_updater is None
RLKRo Jul 2, 2023
5a4eb75
remove write_times from average_results
RLKRo Jul 2, 2023
a4e71c7
add support for no update times
RLKRo Jul 2, 2023
d7cb290
group sizes stat
RLKRo Jul 2, 2023
d3d66c0
add benchmark tests
RLKRo Jul 3, 2023
9ee9e45
Merge branch 'dev' into feat/db-benchmark
RLKRo Jul 3, 2023
e995195
clear context storage at the end of each context_num cycle
RLKRo Jul 3, 2023
a92ca57
save benchmarks in a list
RLKRo Jul 4, 2023
370d138
add json schema for benchmark results
RLKRo Jul 5, 2023
435af6e
move streamlit to utils
RLKRo Jul 5, 2023
40a053c
refactor benchmark streamlit
RLKRo Jul 5, 2023
f8ecda9
remove partial-dev comparison tools
RLKRo Jul 5, 2023
32f5dd8
add doc & update tutorial
RLKRo Jul 6, 2023
d9a161d
move benchmark configs to utils
RLKRo Jul 6, 2023
f52931a
remove partial from benchmark_dbs.py
RLKRo Jul 6, 2023
41ea85c
remove format updater for benchmark files
RLKRo Jul 6, 2023
465f41b
fix mass compare bug
RLKRo Jul 6, 2023
958db10
lint & format
RLKRo Jul 6, 2023
6d55a3f
add utils to backup_files for test_coverage
RLKRo Jul 6, 2023
a9402ee
skip benchmark tests if benchmark not installed
RLKRo Jul 6, 2023
7b5c502
Revert "remove format updater for benchmark files"
RLKRo Jul 6, 2023
5e98c0e
move format updater to utils
RLKRo Jul 6, 2023
a9a7af2
move format updating to a separate function
RLKRo Jul 6, 2023
266e280
remove old format support from format updater
RLKRo Jul 6, 2023
d18025c
use format updater in streamlit
RLKRo Jul 6, 2023
d9cbca5
add ability to upload files to streamlit, add all files from one dire…
RLKRo Jul 6, 2023
65ef6a2
store benchmarks for a specific db in one file
RLKRo Jul 10, 2023
97002c8
change report function, update tutorial
RLKRo Jul 10, 2023
99ee6a2
remove literal include of the files
RLKRo Jul 10, 2023
791be50
format
RLKRo Jul 10, 2023
7c5f7ed
add ability to edit name and description of benchmark sets
RLKRo Jul 11, 2023
2d7432b
add help for displayed metrics
RLKRo Jul 11, 2023
10e6f29
reformat
RLKRo Jul 13, 2023
8f23fd5
preserve file order when delisting benchmarks
RLKRo Jul 18, 2023
5c04f6e
Merge branch 'dev' into feat/db-benchmark
RLKRo Aug 16, 2023
2b2c2bb
remove typing as tp
RLKRo Aug 16, 2023
acb0557
add type annotations
RLKRo Aug 16, 2023
fbb3a5f
move model configuration to kwargs
RLKRo Aug 16, 2023
bdc6277
change misc key type to str
RLKRo Aug 16, 2023
a36ddd3
accept context factory for benchmark instead of context
RLKRo Aug 16, 2023
5d671a0
add type hints for test cases
RLKRo Aug 17, 2023
819dcae
uncomment dbs in benchmark_dbs.py
RLKRo Aug 17, 2023
8e53b2e
add an explanation in tutorial
RLKRo Aug 17, 2023
e50158c
remove benchmark_new_format.py
RLKRo Aug 17, 2023
55fdece
add info messages
RLKRo Aug 17, 2023
62a33d9
randomize strings returned by `get_dict`
RLKRo Aug 17, 2023
c1f6fbe
add comments
RLKRo Aug 17, 2023
837e285
rename benchmark.context_storage to db_benchmark.benchmark
RLKRo Aug 17, 2023
85df6d1
move report function to a separate module
RLKRo Aug 17, 2023
7054c1b
import Path instead of pathlib
RLKRo Aug 17, 2023
9cf1ef0
fix doc
RLKRo Aug 17, 2023
7584902
reformat
RLKRo Aug 17, 2023
aa65bc7
add imports to __init__.py
RLKRo Aug 17, 2023
25eabff
minor report change
RLKRo Aug 17, 2023
e928397
replace deprecated method call
RLKRo Aug 17, 2023
72e991b
rename context vars
RLKRo Aug 17, 2023
87b1820
generalize BenchmarkConfig
RLKRo Aug 18, 2023
2420455
add .dockerignore && add benchmark files to ignore
RLKRo Aug 18, 2023
988f0ac
reformat
RLKRo Aug 18, 2023
4263b7e
move databases inside the benchmark_dir
RLKRo Aug 18, 2023
ebcbc07
fix doc
RLKRo Aug 18, 2023
8968b28
Merge branch 'dev' into feat/db-benchmark
RLKRo Aug 18, 2023
5e604bd
use tutorial directives
RLKRo Aug 18, 2023
0fd1fc6
Merge branch 'dev' into feat/db-benchmark
RLKRo Aug 22, 2023
3d54611
remove ability to add files from filesystem
RLKRo Aug 24, 2023
e422621
delete files from filesystem when sets are deleted via the interface
RLKRo Aug 24, 2023
ec80dc3
link files referenced in the documentation to docs/source
RLKRo Aug 24, 2023
e441da0
add dependency info for streamlit app
RLKRo Aug 24, 2023
512ed66
add streamlit screenshots inside the tutorial
RLKRo Aug 24, 2023
378d87e
Merge branch 'dev' into feat/db-benchmark
RLKRo Sep 11, 2023
bb0e86b
reupload images
RLKRo Sep 11, 2023
8c419b5
add more exception info
RLKRo Sep 11, 2023
093fa52
Merge branch 'dev' into feat/db-benchmark
RLKRo Sep 28, 2023
32 changes: 32 additions & 0 deletions .dockerignore
@@ -0,0 +1,32 @@
*.DS_Store*
*.egg-info/
dist/
venv/
build/
docs/source/apiref
docs/source/_misc
docs/source/release_notes.rst
docs/source/tutorials
*__pycache__*
*.idea/*
.idea/*
*.pyc
.pytest_cache/*
.mypy_cache
modules/*
dm_pickle*
dialogue_manager*
GlobalUserTableAccessor*
memory_debugging*
opening_database*
_globals.py
venv*
.vscode
.coverage
.pytest_cache
htmlcov
tutorials/context_storages/dbs
dbs
benchmarks
benchmark_results_files.json
uploaded_benchmarks
2 changes: 1 addition & 1 deletion .github/workflows/test_coverage.yml
@@ -41,7 +41,7 @@ jobs:

       - name: clean environment
         run: |
-          export backup_files=( tests tutorials .env_file makefile .coveragerc pytest.ini docs )
+          export backup_files=( tests tutorials utils .env_file makefile .coveragerc pytest.ini docs )
           mkdir /tmp/backup
           for i in "${backup_files[@]}" ; do mv "$i" /tmp/backup ; done
           rm -rf ..?* .[!.]* *
5 changes: 5 additions & 0 deletions .gitignore
@@ -4,6 +4,7 @@ dist/
 venv/
 build/
 docs/source/apiref
+docs/source/_misc
 docs/source/release_notes.rst
 docs/source/tutorials
 *__pycache__*
@@ -25,3 +26,7 @@ venv*
 .pytest_cache
 htmlcov
 tutorials/context_storages/dbs
+dbs
+benchmarks
+benchmark_results_files.json
+uploaded_benchmarks
1 change: 1 addition & 0 deletions README.md
@@ -36,6 +36,7 @@ pip install dff[postgresql] # dependencies for using PostgreSQL
 pip install dff[sqlite] # dependencies for using SQLite
 pip install dff[ydb] # dependencies for using Yandex Database
 pip install dff[telegram] # dependencies for using Telegram
+pip install dff[benchmark] # dependencies for benchmarking
 pip install dff[full] # full dependencies including all options above
 pip install dff[tests] # dependencies for running tests
 pip install dff[test_full] # full dependencies for running all tests (all options above)
12 changes: 12 additions & 0 deletions dff/utils/db_benchmark/__init__.py
@@ -0,0 +1,12 @@
# -*- coding: utf-8 -*-
# flake8: noqa: F401
from dff.utils.db_benchmark.benchmark import (
    time_context_read_write,
    DBFactory,
    BenchmarkConfig,
    BenchmarkCase,
    save_results_to_file,
    benchmark_all,
)
from dff.utils.db_benchmark.report import report
from dff.utils.db_benchmark.basic_config import BasicBenchmarkConfig, basic_configurations
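Since the package `__init__` re-exports the public names, downstream code can pull the whole benchmarking API from one place. A minimal sketch (assuming the `benchmark` extra from the README is installed):

from dff.utils.db_benchmark import BenchmarkConfig, benchmark_all, basic_configurations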
218 changes: 218 additions & 0 deletions dff/utils/db_benchmark/basic_config.py
@@ -0,0 +1,218 @@
"""
Basic Config
------------
This module contains basic benchmark configurations.

It defines a simple configuration class (:py:class:`~.BasicBenchmarkConfig`)
as well as a set of configurations that covers different dialogs a user might have and some edge cases
(:py:data:`~.basic_configurations`).
"""
from typing import Tuple, Optional
import string
import random

from humanize import naturalsize
from pympler import asizeof

from dff.script import Message, Context
from dff.utils.db_benchmark.benchmark import BenchmarkConfig


def get_dict(dimensions: Tuple[int, ...]):
    """
    Return a misc dictionary built according to `dimensions`.

    :param dimensions:
        Dimensions of the dictionary.
        Each element of the dimensions tuple is the number of keys on the corresponding level of the dictionary.
        The last element of the dimensions tuple is the length of the string values of the dict.

        e.g. dimensions=(1, 2) returns a dictionary with 1 key that points to a string of length 2,
        whereas dimensions=(1, 2, 3) returns a dictionary with 1 key that points to a dictionary
        with 2 keys, each of which points to a string of length 3.

        So, the length of dimensions is the depth of the dictionary, while its elements are
        the widths of the dictionary at each level.
    """

    def _get_dict(dimensions: Tuple[int, ...]):
        if len(dimensions) < 2:
            # base case: a random string of length dimensions[0]
            return "".join(random.choice(string.printable) for _ in range(dimensions[0]))
        return {str(i): _get_dict(dimensions[1:]) for i in range(dimensions[0])}

    if len(dimensions) > 1:
        return _get_dict(dimensions)
    elif len(dimensions) == 1:
        return _get_dict((dimensions[0], 0))
    else:
        return _get_dict((0, 0))
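A quick sanity check of the shape contract described in the docstring; this sketch only assumes the `get_dict` defined in this file:

import random

from dff.utils.db_benchmark.basic_config import get_dict

random.seed(0)  # make the random leaf strings reproducible
sample = get_dict((2, 3, 4))
assert set(sample) == {"0", "1"}  # 2 keys on the first level
assert set(sample["0"]) == {"0", "1", "2"}  # 3 keys on the second level
assert len(sample["0"]["0"]) == 4  # leaf values are strings of length 4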


def get_message(message_dimensions: Tuple[int, ...]):
    """
    Return a message with a non-empty misc field.

    :param message_dimensions: Dimensions of the misc field of the message. See :py:func:`~.get_dict`.
    """
    return Message(misc=get_dict(message_dimensions))


def get_context(
    dialog_len: int,
    message_dimensions: Tuple[int, ...],
    misc_dimensions: Tuple[int, ...],
) -> Context:
    """
    Return a context with non-empty misc, labels, requests and responses fields.

    :param dialog_len: Number of labels, requests and responses.
    :param message_dimensions:
        A parameter used to generate messages for requests and responses. See :py:func:`~.get_message`.
    :param misc_dimensions:
        A parameter used to generate the misc field. See :py:func:`~.get_dict`.
    """
    return Context(
        labels={i: (f"flow_{i}", f"node_{i}") for i in range(dialog_len)},
        requests={i: get_message(message_dimensions) for i in range(dialog_len)},
        responses={i: get_message(message_dimensions) for i in range(dialog_len)},
        misc=get_dict(misc_dimensions),
    )
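For example, a tiny context built with this helper has exactly the advertised shape (a sketch using the functions above):

from dff.utils.db_benchmark.basic_config import get_context

ctx = get_context(dialog_len=2, message_dimensions=(1, 2), misc_dimensions=(1, 2))
assert len(ctx.labels) == len(ctx.requests) == len(ctx.responses) == 2
assert ctx.labels[0] == ("flow_0", "node_0")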


class BasicBenchmarkConfig(BenchmarkConfig, frozen=True):
    """
    A simple benchmark configuration that generates contexts using two parameters:

    - `message_dimensions` -- to configure the way messages are generated.
    - `misc_dimensions` -- to configure the size of the context's misc field.

    Dialog length is configured using `from_dialog_len`, `to_dialog_len`, `step_dialog_len`.
    """

    context_num: int = 30
    """
    Number of times the contexts will be benchmarked.
    Increasing this number decreases the standard error of the mean for the benchmarked data.
    """
    from_dialog_len: int = 300
    """Starting dialog len of a context."""
    to_dialog_len: int = 311
    """
    Final dialog len of a context.
    :py:meth:`~.BasicBenchmarkConfig.context_updater` will return contexts
    while their dialog len is less than `to_dialog_len`.
    """
    step_dialog_len: int = 1
    """
    Increment step for dialog len.
    :py:meth:`~.BasicBenchmarkConfig.context_updater` will return contexts,
    increasing dialog len by `step_dialog_len`.
    """
    message_dimensions: Tuple[int, ...] = (10, 10)
    """
    Dimensions of the misc dictionaries inside messages.
    See :py:func:`~.get_message`.
    """
    misc_dimensions: Tuple[int, ...] = (10, 10)
    """
    Dimensions of the misc dictionary.
    See :py:func:`~.get_dict`.
    """

    def get_context(self) -> Context:
        """
        Return a context with `from_dialog_len`, `message_dimensions`, `misc_dimensions`.

        Wraps :py:func:`~.get_context`.
        """
        return get_context(self.from_dialog_len, self.message_dimensions, self.misc_dimensions)

    def info(self):
        """
        Return the fields of this instance and the sizes of objects defined by this config.

        :return:
            A dictionary with two keys.
            Key "params" stores the fields of this configuration.
            Key "sizes" stores string representations of the following values:

            - "starting_context_size" -- size of a context with `from_dialog_len`.
            - "final_context_size" -- size of a context with `to_dialog_len`.
              A context of this size will never actually be benchmarked.
            - "misc_size" -- size of the misc field of a context.
            - "message_size" -- size of the misc field of a message.
        """
        return {
            "params": self.model_dump(),
            "sizes": {
                "starting_context_size": naturalsize(asizeof.asizeof(self.get_context()), gnu=True),
                "final_context_size": naturalsize(
                    asizeof.asizeof(get_context(self.to_dialog_len, self.message_dimensions, self.misc_dimensions)),
                    gnu=True,
                ),
                "misc_size": naturalsize(asizeof.asizeof(get_dict(self.misc_dimensions)), gnu=True),
                "message_size": naturalsize(asizeof.asizeof(get_message(self.message_dimensions)), gnu=True),
            },
        }
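A sketch of the two-key structure in practice; the exact size strings vary between runs because the payload strings are random:

from dff.utils.db_benchmark.basic_config import BasicBenchmarkConfig

info = BasicBenchmarkConfig().info()
assert set(info) == {"params", "sizes"}
print(info["sizes"]["starting_context_size"])  # a human-readable size string, e.g. "1.5M" (varies)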

    def context_updater(self, context: Context) -> Optional[Context]:
        """
        Update the context to have `step_dialog_len` more labels, requests and responses,
        unless the resulting dialog len would equal or exceed `to_dialog_len`,
        in which case None is returned.
        """
        start_len = len(context.labels)
        if start_len + self.step_dialog_len < self.to_dialog_len:
            for i in range(start_len, start_len + self.step_dialog_len):
                context.add_label((f"flow_{i}", f"node_{i}"))
                context.add_request(get_message(self.message_dimensions))
                context.add_response(get_message(self.message_dimensions))
            return context
        else:
            return None
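The updater is meant to be called repeatedly until it signals exhaustion by returning None; a sketch of that loop with small numbers:

from dff.utils.db_benchmark.basic_config import BasicBenchmarkConfig

config = BasicBenchmarkConfig(from_dialog_len=1, to_dialog_len=5, step_dialog_len=2)
ctx = config.get_context()
while ctx is not None:
    print(len(ctx.labels))  # prints 1, then 3; growth stops before reaching to_dialog_len
    ctx = config.context_updater(ctx)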


basic_configurations = {
    "large-misc": BasicBenchmarkConfig(
        from_dialog_len=1,
        to_dialog_len=50,
        message_dimensions=(3, 5, 6, 5, 3),
        misc_dimensions=(2, 4, 3, 8, 100),
    ),
    "short-messages": BasicBenchmarkConfig(
        from_dialog_len=500,
        to_dialog_len=550,
        message_dimensions=(2, 30),
        misc_dimensions=(0, 0),
    ),
    "default": BasicBenchmarkConfig(),
    "large-misc--long-dialog": BasicBenchmarkConfig(
        from_dialog_len=500,
        to_dialog_len=550,
        message_dimensions=(3, 5, 6, 5, 3),
        misc_dimensions=(2, 4, 3, 8, 100),
    ),
    "very-long-dialog-len": BasicBenchmarkConfig(
        context_num=10,
        from_dialog_len=10000,
        to_dialog_len=10050,
    ),
    "very-long-message-len": BasicBenchmarkConfig(
        context_num=10,
        from_dialog_len=1,
        to_dialog_len=3,
        message_dimensions=(10000, 1),
    ),
    "very-long-misc-len": BasicBenchmarkConfig(
        context_num=10,
        from_dialog_len=1,
        to_dialog_len=3,
        misc_dimensions=(10000, 1),
    ),
}
"""
A set of configurations that covers many dialog cases (as well as some edge cases).

:meta hide-value:
"""