Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moto uses a single instance of the ...Response class for all requests for some services #6745

Closed
dfangl opened this issue Aug 31, 2023 · 1 comment · Fixed by #6746
Closed

Comments

@dfangl
Copy link
Contributor

dfangl commented Aug 31, 2023

Motivation

In LocalStack, we began seeing some strange behavior, especially with s3 buckets, which were not created in the region or accounts the request targets. This leads to test failures, with rather confusing logs.

For example, a bucket in account 123456789012 in region us-east-1 is created twice. Since the create bucket operation is idempotent in us-east-1, it should prove without an error. However, this fails, if some other s3 operation runs at the same time under a different account.

This test can reproduce this issue (it is a test against LocalStack, but the culprit here is indeed moto):

    def test_parallel_bucket_creation(self, aws_client_factory):
        num_threads = 10
        create_barrier = threading.Barrier(num_threads)
        errored = False

        def _create_bucket(runner: int):
            nonlocal errored
            bucket_name = f"bucket-{short_uid()}"
            s3_client = aws_client_factory(
                region_name="us-east-1", aws_access_key_id=f"{runner:012d}"
            ).s3
            create_barrier.wait()
            try:
                s3_client.create_bucket(Bucket=bucket_name)
                s3_client.create_bucket(Bucket=bucket_name)
            except Exception:
                LOG.exception("Create bucket failed")
                errored = True

        thread_list = []
        for i in range(1, num_threads + 1):
            thread = threading.Thread(target=_create_bucket, args=[i])
            thread.start()
            thread_list.append(thread)

        for thread in thread_list:
            thread.join()

        assert not errored

The problem

After some investigation of this issue, which (before above test) was occurring only occasionally, we found the culprit.
For some services (list can be found at the end), moto uses a single ...Response instance for all requests. So, concurrent requests will affect each others state on the setupClass method which is used over all response classes. There are unaffected services, which use the dispatch method, but a rather large part, including S3, is affected.

In S3, for example, these are the problematic lines:

moto/moto/s3/urls.py

Lines 13 to 23 in d3f5e3d

url_paths = {
# subdomain bucket
"{0}/$": S3ResponseInstance.bucket_response,
# subdomain key of path-based bucket
"{0}/(?P<key_or_bucket_name>[^/]+)$": S3ResponseInstance.ambiguous_response,
"{0}/(?P<key_or_bucket_name>[^/]+)/$": S3ResponseInstance.ambiguous_response,
# path-based bucket + key
"{0}/(?P<bucket_name_path>[^/]+)/(?P<key_name>.+)": S3ResponseInstance.key_response,
# subdomain bucket + key with empty first part of path
"{0}/(?P<key_name>/.*)$": S3ResponseInstance.key_response,
}

They use a single instance, for all requests, which leads to the mentioned behavior.

Impact

The impact is of course only noticeable in parallel scenarios. In our case, we found it out only in terraform tests, which perform a lot of S3 operations for validation, while our lambda provider wanted to create a bucket to store an archive in a different account.
Also, the impact is mostly region/account specific, since this is the state most often used from the ...Response classes. However, other state (like headers) are affected too, but the impact is less noticeable.

Please note that we did not test the behavior against moto itself, but from the import path of urls.py, it seems like moto would be affected equally as LocalStack, if we did not miss something.

Proposed fixes

  • We have a fix in the pipeline against our moto fork (Fix usage of a single object instances for multiple concurrent requests localstack/moto#70) which I am happy to open here as well. However, we wanted to discuss first if this is the route forward. It basically uses the unbound version of a method as a parameter to a function, which calls this method on a new instance then. It was the least intrusive solution we found quickly, and the type checker should catch wrong usages of the dispatch_method class method. We tested this solution with above test, and it works.

  • We could choose a version were we do not pass the whole method, but just a string with the method name. Would work the same, but without type checking. However, this would be a tad less "hacky".

  • A refactoring of the found services to use dispatch. This would be more impactful, and we would basically parse the request url twice, which we might want to avoid.

If you have any more ideas for solution, we would be happy to work on them to get something merged here as well, and remove our downstream fix.

Affected Services

These services:

amp
apigateway
apigatewayv2
appconfig
appsync
awslambda
cloudfront
cognitoidp
databrew
ebs
elastictranscoder
glacier
greengrass
guardduty
instance_metadata
iot
managedblockchain
mq
pinpoint
quicksight
route53
s3
s3control
eventbridgescheduler

use a single instance for all the requests. Unaffected services use the dispatch method. There are some services which use the classmethod dispatch on an instance, but since that does not make a difference (even though it can be confusing), we do not necessarily need to change it.

@bblommers
Copy link
Collaborator

Hey @dfangl, thanks for raising this! I had realized the potential problem, and have also seen intermittent failures - I just hadn't connected the dots.

In the long term, I think/hope we can refactor/improve our dispatch method to deal with (most of) these cases. But the method_dispatch-fix sounds like a good and quick solution, so if you want to submit a PR here as well, that would be very welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants