
Add context storage benchmarking #144

Merged
merged 119 commits into dev from feat/db-benchmark
Sep 28, 2023
Conversation

RLKRo
Member

@RLKRo RLKRo commented Jun 7, 2023

Description

Add the `dff.utils.benchmark.context_storage` module, which contains functions for benchmarking context storages.

This PR also includes all modules inside `dff.utils` (benchmarking and caching) in doc building.

Checklist

  • I have covered the code with tests
  • I have added comments to my code to help others understand it
  • I have updated the documentation to reflect the changes
  • I have performed a self-review of the changes
  • Fix file inclusion in documentation (the API reference links files located in `/utils/`; documentation will potentially fail if served at `/docs/build`)

@RLKRo RLKRo added documentation Improvements or additions to documentation enhancement New feature or request labels Jun 7, 2023
@RLKRo RLKRo self-assigned this Jun 7, 2023
@RLKRo RLKRo marked this pull request as draft June 9, 2023 08:56
Comment on lines 58 to 63
return Context(
    labels={i: (f"flow_{i}", f"node_{i}") for i in range(dialog_len)},
    requests={i: Message(text=f"request_{i}") for i in range(dialog_len)},
    responses={i: Message(text=f"response_{i}") for i in range(dialog_len)},
    misc={str(i): i for i in range(misc_len)},
)
Collaborator


Maybe we should randomize message length to reflect the irregular nature of real messages?
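One way to act on this suggestion: pad each message body to a random length. A minimal sketch (the helper name and length bounds are hypothetical, not part of the PR):

```python
import random

def random_message_text(prefix: str, min_len: int = 10, max_len: int = 100) -> str:
    """Return a message body padded to a random length, mimicking irregular real messages."""
    body_len = random.randint(min_len, max_len)
    return prefix + "." * body_len

# Hypothetical usage inside the context generator:
texts = [random_message_text(f"request_{i}_") for i in range(5)]
```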


# check returned context
if actual_context != context:
    raise RuntimeError(f"True context:\n{context}\nActual context:\n{actual_context}")
Collaborator

@pseusys pseusys Jun 9, 2023


How would this succeed if we read, say, only the 3 last requests from the context storage?
We should account for that, or manually set the read/write policy of the context storage to ALL.
But that should only be done after partial context updates are merged.
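If the storage persists only a limited history window, the strict equality check above would have to be relaxed to compare just the retained window. A sketch of that idea with plain dicts (function name and window size are hypothetical):

```python
def check_last_requests(expected: dict, actual: dict, window: int = 3) -> bool:
    """Compare only the `window` most recent request keys, for storages
    that persist a limited context history instead of the full dialog."""
    expected_tail = {k: expected[k] for k in sorted(expected)[-window:]}
    return expected_tail == actual

# A full history vs. a storage that kept only the last 3 requests:
full = {i: f"request_{i}" for i in range(10)}
partial = {i: f"request_{i}" for i in range(7, 10)}
```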

f"Size of one context: {context_size} ({tqdm.format_sizeof(context_size, divisor=1024)})"
)

print(f"Starting benchmarking with following parameters:\n{benchmark_config}")
Collaborator


Maybe return the output as a string or a list of strings instead of printing?
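The suggestion could look like this: build the report lines and let the caller decide how to display them. A minimal sketch (function name is hypothetical):

```python
def benchmark_report_lines(benchmark_config: dict) -> list:
    """Build the benchmark report as a list of strings instead of printing,
    so the caller can print, log, or display it elsewhere."""
    lines = ["Starting benchmarking with the following parameters:"]
    lines.extend(f"  {key}: {value}" for key, value in benchmark_config.items())
    return lines

lines = benchmark_report_lines({"context_num": 100, "dialog_len": 300})
```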

Member Author

@RLKRo RLKRo left a comment


> Please consider adding files created by both benchmarking and the benchmark visualization website to .gitignore and .dockerignore.

Added benchmark files to the ignore files:
2420455

> Moreover, the project could benefit from separate make targets for clearing benchmark data.

Moved the database directory inside the benchmark directory:
4263b7e
Now everything generated by benchmark_dbs.py is stored in one place and can easily be removed.

> Please also consider stopping and removing all the Docker images after every benchmark.

This seems out of scope for the task that benchmarking tries to solve.
These modifications feel like they should be made on the client side (by writing custom benchmarking scripts).

from uuid import uuid4
import pathlib
from time import perf_counter
import typing as tp
Member Author

@RLKRo RLKRo Aug 16, 2023


Replaced module import with object imports:
2b2c2bb

    context_storage: DBContextStorage,
    context: Context,
    context_num: int,
    context_updater=None,
Member Author


Added type annotation:
acb0557
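For readers skimming the diff, the kind of measurement this function performs can be sketched as follows. This is a simplified stand-in, not the actual implementation: a plain dict plays the role of the storage, and the real function takes more parameters:

```python
from time import perf_counter
from typing import Any, Dict, List, Tuple

def time_read_write_sketch(
    context_storage: Dict[str, Any],
    context: Any,
    context_num: int,
) -> Tuple[List[float], List[float]]:
    """Time `context_num` write/read round-trips against the storage,
    returning per-operation write and read durations in seconds."""
    write_times, read_times = [], []
    for i in range(context_num):
        key = f"ctx_{i}"
        start = perf_counter()
        context_storage[key] = context
        write_times.append(perf_counter() - start)
        start = perf_counter()
        _ = context_storage[key]
        read_times.append(perf_counter() - start)
    return write_times, read_times

writes, reads = time_read_write_sketch({}, {"misc": {}}, 10)
```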



def time_context_read_write(
    context_storage: DBContextStorage,
Member Author


The benchmark requires a `clear` method from the context storage. Also, I don't see why someone would want to benchmark a plain `Dict` as a context storage.

I think a much cleaner solution would be to create a `DBContextStorage` that wraps `Dict`.
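A minimal sketch of such a wrapper (the class name and interface are assumptions for illustration, not DFF's actual `DBContextStorage` API):

```python
class DictContextStorage:
    """In-memory context storage exposing the minimal interface the benchmark
    needs, including the `clear` method a bare dict-based setup may lack."""

    def __init__(self):
        self._storage = {}

    def __setitem__(self, key, value):
        self._storage[key] = value

    def __getitem__(self, key):
        return self._storage[key]

    def __len__(self):
        return len(self._storage)

    def clear(self):
        self._storage.clear()

storage = DictContextStorage()
storage["ctx"] = {"requests": {}}
restored = storage["ctx"]
storage.clear()
```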


uri: str
"""URI of the context storage."""
factory_module: str = "dff.context_storages"
Member Author


It's not that relevant, but it doesn't require much and might be helpful in case someone wants to benchmark their own context storage.
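The idea behind the `factory_module` field can be sketched with `importlib`: the module path is resolved at runtime, so users can point it at their own package. Field and factory names beyond those shown in the snippet above are assumptions:

```python
import importlib
from dataclasses import dataclass

@dataclass
class DBFactorySketch:
    """Build a context storage from a configurable factory module,
    so custom storages can be benchmarked without code changes."""
    uri: str
    factory_module: str = "dff.context_storages"
    factory: str = "context_storage_factory"  # hypothetical factory name

    def db(self):
        # Resolve the factory function from its module and call it with the URI.
        module = importlib.import_module(self.factory_module)
        return getattr(module, self.factory)(self.uri)

# Demo with a stdlib module standing in for dff.context_storages:
demo = DBFactorySketch(uri="db.json", factory_module="pathlib", factory="Path").db()
```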

Comment on lines 243 to 244
class Config:
allow_mutation = False
Member Author

@RLKRo RLKRo Aug 16, 2023


Moved param to model kwargs:
fbb3a5f


def _get_dict(dimensions: tp.Tuple[int, ...]):
    if len(dimensions) < 2:
        return "." * dimensions[0]
Member Author

@RLKRo RLKRo Aug 17, 2023


Strings are now randomized:
62a33d9
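The randomized variant could look roughly like this, assuming the fixed `"." * n` filler was replaced with random characters of the same length (a sketch, not the committed code):

```python
import random
import string
import typing as tp

def _get_dict(dimensions: tp.Tuple[int, ...]):
    """Build a nested dict of the given dimensions; leaf values are
    random strings of the innermost dimension's length."""
    if len(dimensions) < 2:
        return "".join(random.choices(string.printable, k=dimensions[0]))
    return {i: _get_dict(dimensions[1:]) for i in range(dimensions[0])}

sample = _get_dict((2, 3, 4))
```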

}

# benchmark
benchmark_dir = pathlib.Path("benchmarks")
Member Author

@RLKRo RLKRo Aug 17, 2023


Added comments:
c1f6fbe

)


def report(
Member Author

@RLKRo RLKRo Aug 17, 2023


Done:
85df6d1

"""
from_dialog_len: int = 300
"""Starting dialog len of a context."""
to_dialog_len: int = 311
Member Author


No, there are no statistics behind this number; it's just so that we'd have 10 update steps. I don't think that needs to be explained in the doc.

Also, I don't think that counts as a magic number.
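The arithmetic behind the comment, assuming a dialog-length step of 1: the context is written at `from_dialog_len` and then updated once per length up to (but not including) `to_dialog_len`, which yields 10 update steps:

```python
from_dialog_len = 300
to_dialog_len = 311

# Dialog lengths reached after the initial write at `from_dialog_len`:
update_lengths = list(range(from_dialog_len + 1, to_dialog_len))
```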

def _get_dict(dimensions: tp.Tuple[int, ...]):
    if len(dimensions) < 2:
        return "." * dimensions[0]
    return {i: _get_dict(dimensions[1:]) for i in range(dimensions[0])}
Member Author

@RLKRo RLKRo Aug 18, 2023


Generalized BenchmarkConfig:
87b1820

Now users can create their own `BenchmarkConfig`s using a simple interface supported by the Streamlit app and the `report` function.

I don't think random configs should be included in the library, though: I don't see a good use case for that, and it would be too complicated.

@RLKRo RLKRo mentioned this pull request Aug 22, 2023
4 tasks
@RLKRo RLKRo requested review from pseusys and kudep August 22, 2023 19:41
RLKRo added 5 commits August 24, 2023 23:34
They can already be uploaded via the `Upload benchmark results` interface.
Since files can no longer be added via their filesystem path, deleted benchmarks are always the uploaded ones.
@RLKRo RLKRo mentioned this pull request Aug 25, 2023
6 tasks
@RLKRo RLKRo requested a review from ruthenian8 September 11, 2023 21:10
@RLKRo RLKRo mentioned this pull request Sep 21, 2023
4 tasks
@RLKRo RLKRo merged commit a3ea816 into dev Sep 28, 2023
@RLKRo RLKRo deleted the feat/db-benchmark branch September 28, 2023 12:40