-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add common logic for creating and cleaning buckets in tests #277
base: master
Are you sure you want to change the base?
Add common logic for creating and cleaning buckets in tests #277
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, the change is going in the right direction.
CHANGELOG.md
Outdated
@@ -6,6 +6,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 | |||
|
|||
## [Unreleased] | |||
|
|||
### Added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it more like an infrastructure thing? It's for developers, not the actual users of the SDK, right?
b2sdk/_test_manager/__init__.py
Outdated
@@ -0,0 +1,9 @@ | |||
###################################################################### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this directory named with a leading underscore? I don't believe I saw this notation anywhere else. Have you considered doing another package, if that was supposed to be installed with [dev]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so it also seems to me that creating a separate package would be a cleaner approach, something like b2sdk-test
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a moments's thought, I concluded that we could actually go down the most common path I've encountered, which is simply package the test in the main package, so it would be b2sdk/test. Numpy, pandas or django use such a patter. What do you think?
b2sdk/_test_manager/api.py
Outdated
|
||
def clean_buckets(self) -> None: | ||
for bucket in self.buckets: | ||
with contextlib.suppress(BucketIdNotFound, v3BucketIdNotFound, NonExistentBucket): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per chance, could you explain to me the difference between v2
and v3
versions of this exception? My memory is foggy and I was wondering whether we could just use one of them.
b2sdk/_test_manager/api.py
Outdated
# Name fraction is used for runners inside the same matrix, time fraction is used for runners in different runs. | ||
# To avoid collision when the same runners are fired in different commits at the same time we also use GITHUB_SHA | ||
random.seed( | ||
environ.get('RUNNER_NAME', 'local') + environ.get('GITHUB_SHA', 'local') + str(time.time_ns()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we skip all that and use str(uuid.uuid4())
? I've checked and Python library uses system uuid.h
which in turns is said to be good enough to provide random value. This would remove reliance on the clock entirely.
Also, worth noting that SystemRandom
would be nice, but I'm not sure whether it's even available on Windows :/
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes in the new version I got rid of seed and used uui4, we will see how it goes
b2sdk/_test_manager/api.py
Outdated
BUCKET_NAME_LENGTH = 50 | ||
BUCKET_NAME_CHARS = string.ascii_letters + string.digits + '-' | ||
|
||
BUCKET_NAME_PREFIX = 'b2tst' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a hard limit on the length of the bucket name that we're hitting? If not, why not name it b2test
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently after using the uuid it will normally be b2test-{uuid4}
:) As far as i know the maximum limit is 63 char
test/integration/test_upload.py
Outdated
@@ -11,7 +11,7 @@ | |||
import io | |||
from typing import Optional | |||
|
|||
from .fixtures import b2_auth_data # noqa | |||
from .fixtures import * # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we not, please?
b2sdk/_test_manager/api.py
Outdated
for bucket in buckets: | ||
print(bucket) | ||
|
||
def count_and_print_buckets(self) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simple search provides me with no usages of this function. Can you point me to an actual usage, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was intended to be a bit of statistic in total as you write in the comment below also we can put this function into cleanup_buckets
function to always track how many buckets have been created when closing tests
b2sdk/_test_manager/api.py
Outdated
@backoff.on_exception( | ||
backoff.expo, | ||
TooManyRequests, | ||
max_tries=8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to extract file removal as a function/method and put a backoff on it? What do you think?
b2sdk/_test_manager/api.py
Outdated
print(f'Total bucket count: {len(buckets)}') | ||
|
||
for bucket in buckets: | ||
if not bucket.name.startswith(BUCKET_NAME_PREFIX): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want an operation like that? What about other runners running in parallel on the same set of keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, this function is intended as a kind of garbage collector after all the tests have been performed in order to collect the "leftovers". You are right about if several commits run the tests there may be unwanted deletions. It seems to me that it is enough here to add the first 10-15 characters from the SHA of the commit to our BUCKET_NAME_PREFIX
. This way we will be sure that after all finished runs if there are any buckets left the garbage collector will remove them. What do you think?
Mixin class for B2Api, which enables bucket tracking. | ||
This mixin will add a `buckets` member to the B2Api instance and will use it track created and | ||
deleted buckets. The main purpose of this are tests -- the `buckets` member can be used in test | ||
teardown to ensure proper bucket cleanup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding a statistic of how many buckets in total is created in tests and how many buckets exists at once when tests are ran? This could help us to determine whether we can use -xdist
to parallelise tests or to run several workflows in parallel.
module _test_manager has been created from which an Api class containing common logic for creating and deleting buckets can be imported. This way, in B2-CLI there will be no need to duplicate the code related to name generation, creation and deletion of buckets
much of the logic was used from this PR: Fix leaking buckets in integration tests #829