Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add encoding url to list objects #7122

Merged
merged 1 commit into from
Dec 21, 2022
Merged

add encoding url to list objects #7122

merged 1 commit into from
Dec 21, 2022

Conversation

shirady
Copy link
Contributor

@shirady shirady commented Dec 6, 2022

Signed-off-by: shirady [email protected]

Explain the changes

Today we do not encode any field in our response to list objects and list object v2.
In this PR we add encoding to fields specified in the documentation.

EncodingType
Encoding type used by Amazon S3 to encode object key names in the XML response.
If you specify the encoding-type request parameter, Amazon S3 includes this element in the response, and returns encoded key name values in the following response elements:
Delimiter, Prefix, Key, and StartAfter.
Type: String
Valid Values: url

Issues: Fixed #xxx / Gap #xxx

originally was found since the test test_bucket_listv2_encoding_basic and test_bucket_list_encoding_basic were failing.

Testing Instructions:

see here

  • Doc added/updated
  • Tests added

@shirady shirady requested a review from dannyzaken December 6, 2022 12:25
@shirady shirady self-assigned this Dec 6, 2022
src/endpoint/s3/ops/s3_get_bucket.js Outdated Show resolved Hide resolved
src/endpoint/s3/ops/s3_get_bucket.js Outdated Show resolved Hide resolved
@shirady
Copy link
Contributor Author

shirady commented Dec 7, 2022

Note about the implementation:
As a premise, we decide to change the encoding after we get the reply (as opposed to sending it as a param and changing the implementation of list objects in every namespace).

There were two options for this implementation:

  1. Call the function of encoding for every field needed - every time check the flag and if it should be encoded then call the function that encodes the value (notice those fields are inside the return object).
  2. Create a structure that will have all the fields that should be encoded, it will be encoded as needed (according to the flag).

second option was chosen for readability and easier maintenance.

@pull-request-size pull-request-size bot added size/L and removed size/M labels Dec 7, 2022
@shirady
Copy link
Contributor Author

shirady commented Dec 7, 2022

This PR will be merged after adding an action regarding the Ceph S3 tests
or when I'll have the status of all the tests (before and after this change).

Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shirady don't we have the same encoding type that we need to handle also in list objects, list versions and list uploads? if so we can make it all in one to make it uniform...

src/endpoint/s3/ops/s3_get_bucket.js Outdated Show resolved Hide resolved
src/endpoint/s3/ops/s3_get_bucket.js Outdated Show resolved Hide resolved
@shirady
Copy link
Contributor Author

shirady commented Dec 8, 2022

@shirady don't we have the same encoding type that we need to handle also in list objects, list versions and list uploads? if so we can make it all in one to make it uniform...

Regarding implementing the encoding type in other operations (for example, list-object-versions and list-multipart-uploads, we can reuse as a second stage.
@guymguym, if it will be fine, we can reuse the changes that we will implement here.
But 'how this change will affect the Ceph S3 tests?' is still an open question.

when I started with the Ceph S3 tests I understand that we do not support the encoding type in any op (if we did I just had to copy the function from other ops). So we could add those tests to 'Not implemented' (out of the list of tests) or try to implement, from the PR this decision is clear.

During my investigations, I found out that the client that we are using in Ceph S3 tests is using boto3 (AWS SDK for python) which sets automatically the encoding type url (see link):

“Encoding type is automatically set because there are unicode characters which can break the XML parser. To work around the issue, we automatically set the encoding type parameter to url if the customer hasn't already set it. We then decode the keys on the way back, again only if we provided that parameter.”

Today the test are hard-coded, by doing this change the client will receive encoded fields that the tester did not expect.
That is why I commented:

This PR will be merged after adding an action regarding the Ceph S3 tests or when I'll have the status of all the tests (before and after this change).

@guymguym
Copy link
Member

guymguym commented Dec 8, 2022

ok so why not add the same handling to list objects etc? what is the problem with it?

@shirady
Copy link
Contributor Author

shirady commented Dec 8, 2022

ok so why not add the same handling to list objects etc? what is the problem with it?

We can, I separated the functions so it will be easier to reuse.

@dannyzaken
Copy link
Contributor

ok so why not add the same handling to list objects etc? what is the problem with it?

We can, I separated the functions so it will be easier to reuse.

The effort seems minimal. I suggest doing it in this PR

@shirady
Copy link
Contributor Author

shirady commented Dec 20, 2022

I added the changes of adding encoding support to list-object-version and list-multipart-uploads.

I will try to explain how these changes will be reflected in the status of Ceph S3 tests:
In August, our status was: 98 failed tests, in December our status is 87 failed tests - list_of_running_all_s3_tests.txt

The encoding-type url is added in 3 operations:

  1. list-objects and list-object-v2: 89 failed tests.

    tests test_bucket_listv2_encoding_basic and test_bucket_list_encoding_basic are not passing yet due to faulty inside the test (I opened an issue).

    test_bucket_list_prefix_basic and test_bucket_list_prefix_delimiter_basic become failed tests due to a bug in the client boto3 (I also opened an issue about it - for Ceph and boto3 as well).
  2. list-object-version: 87 failed tests
.
    No change.
  3. list-multipart-uploads: 88 failed tests.
    The test test_object_acl_create_contentlength_none also failed (but it is supposed to fail in all other cases since it is not implemented).
    Irrelevant change.

Additional notes:

A) I used the testing instructions, but I had to change the wait time in kubectl to finish the running of all the tests, in file run_test_job.sh change the line:

-        kubectl wait --for=condition=complete job/${TEST_RUN_NAME} --timeout=500s -n ${NAMESPACE}

+        kubectl wait --for=condition=complete job/${TEST_RUN_NAME} --timeout=800s -n ${NAMESPACE}

B) I created a copy of this repo (noobaa-core) and ran the tests in GitHub Action if you would like to observe it see the process s3-tests and the list of the tests (the repo will be available for a limited time and will be deleted after this PR will be closed):

original
list-objects and list-object-v2
list-object-version
list-multipart-uploads

C) Now I have the status, unlike what I commented before, this PR will be merged after it gets your approval.

This PR will be merged after adding an action regarding the Ceph S3 tests or when I'll have the status of all the tests (before and after this change).

Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, just some comments to organize the code a bit

src/endpoint/s3/s3_errors.js Outdated Show resolved Hide resolved
src/endpoint/s3/s3_utils.js Outdated Show resolved Hide resolved
src/endpoint/s3/ops/s3_get_bucket.js Outdated Show resolved Hide resolved
src/endpoint/s3/ops/s3_get_bucket.js Show resolved Hide resolved
src/endpoint/s3/s3_utils.js Outdated Show resolved Hide resolved
@shirady shirady requested a review from guymguym December 21, 2022 07:37
Signed-off-by: shirady <[email protected]>

add formatter to the file

Signed-off-by: shirady <[email protected]>

change call encode url for every field
instead of using object fields_to_encode.

Signed-off-by: shirady <[email protected]>

add encodying url to list multi part uploads and list versions

Signed-off-by: shirady <[email protected]>
Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good @shirady

@shirady shirady merged commit 09dc819 into noobaa:master Dec 21, 2022
@shirady shirady deleted the encoding-list-objects-s3-test branch December 26, 2022 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants