Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Client doesn't handle basic auth passwords with special characters in url correctly #60

Open
NoahCarnahan opened this issue Apr 13, 2022 · 10 comments
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@NoahCarnahan
Copy link

What is the bug?
It seems like OpenSearch::Client doesn't handle hosts/urls that contain passwords in the url with special characters correctly. When making requests with one of these urls, I get an "Unauthorized" error. I believe the problem is that __parse_host doesn't decode the special characters.

How can one reproduce the bug?
Example code snippet:

client = OpenSearch::Client.new(
    url:  "https://user:pass%7D%[email protected]:15000",
    transport_options: { ssl: { verify: false } }
)
client.search(index: '_all', body: {})

I can however get around this problem by passing a hash in the url parameter that does the decoding myself:

url = URI("https://user:pass%7D%[email protected]:15000")
client = OpenSearch::Client.new(
  url: {
    scheme: url.scheme,
    user: CGI.unescape(url.user),
    password: CGI.unescape(url.password),
    host: url.host,
    path: url.path,
    port: url.port
  },
  transport_options: { ssl: { verify: false } }
)
client.search(index: '_all', body: {})

What is the expected behavior?
I expect that passwords embedded in the url with standard url encoding should not cause authentication errors

What is your host/environment?

  • OS: macOS 12.3.1
  • Version: 1.0.0
@NoahCarnahan NoahCarnahan added the bug Something isn't working label Apr 13, 2022
@jayeshathila
Copy link
Contributor

Let me take a look on this. Will get back.

@jayeshathila
Copy link
Contributor

The password pass%7D%7Dword is used as is (without decoding). @NoahCarnahan you mean to say the decoding should be happening here ?

@NoahCarnahan
Copy link
Author

Correct, I believe that the library should decode the password before trying to use it for authentication.

I realize I left a detail out of my initial bug report:
The password for this user is pass}}word. } is a special character, therefore it must be encoded to be included in the basic auth section of the url. As per url encoding rules, } is replaced by %7D, which gives us a url of https://user:pass%7D%[email protected]:15000 in this example.

If I were to try specifying the password without URL encoding like https://user:pass}}[email protected]:15000, then an exception would be thrown about an improperly formatted url.

Additionally, I can use curl with the https://user:pass%7D%[email protected]:15000 version of the url to successfully access my AWS OpenSearch cluster.

@dblock
Copy link
Member

dblock commented Apr 19, 2022

Ruby default parser doesn't seem to want to decode the password.

2.6.5 :009 > parsed = URI.parse('https://user:pass%7D%[email protected]:15000')
 => #<URI::HTTPS https://user:pass%7D%[email protected]:15000> 
2.6.5 :010 > parsed.password
 => "pass%7D%7Dword" 

The RFC says "Within the user and password field, any ":", "@", or "/" must be encoded."

> URI.parse('https://user:pass%[email protected]:15000').password
 => "pass%40word" 

So this doesn't decode the password either.

It seems like in most libraries unencoding is left to the caller, e.g. in addressable:

require 'addressable'
Addressable::URI.unencode('https://user:pass%[email protected]:15000/%40%40')
 => "https://user:pass@[email protected]:15000/@@" 

Calling decode first would be a breaking change. I vote to leave things as is.

Note that using URLs with username/password in them is deprecated in the RFC, because this will log the user's password. That's obviously a bad security practice, so I would get rid of any code that passes username/password into the URL.

@jayeshathila
Copy link
Contributor

Agree with @dblock , this will be breaking change. We should leave the decoding to caller.

@dblock
Copy link
Member

dblock commented Apr 20, 2022

@NoahCarnahan Do you feel strongly otherwise? Otherwise let's close this?

@NoahCarnahan
Copy link
Author

I agree that the default ruby parser doesn’t do any decoding, but I think it’s the responsibility of opensearch-ruby as a user of URI to do the decoding. As a point of comparison, the opensearch-py library handles URLs with encoded characters just fine (encoded characters in the password part of the url don’t prevent the library from authenticating with the server).

I recognize that passing username and password in the url isn’t ideal, but I do wonder if this library also can’t handle escaped characters in the path part of the url, which I imagine could come up with certain index names (I haven’t tested this).

Ultimately, I’m not a contributor to this project, so I’m not going to push too hard for the change, but I do feel that the current behavior of the library doesn’t handle all valid RFC 1738 urls correctly, which seems like a bug to me.

@dblock
Copy link
Member

dblock commented Apr 20, 2022

@NoahCarnahan If you have some spare time, add a few RSpec tests and document current behavior (e.g. a test that checks that encoded path part is decoded)? This way we will know what we are changing, if we are and it will help the next person.

@wbeckler
Copy link

It looks like the password part is as per the RFC and is also a deprecated practice. I propose we close this issue and if anyone finds an issue with the url parsing, we can create a new issue.

@dblock
Copy link
Member

dblock commented Sep 28, 2022

I think we should implement some specs and document this behavior before we close it.

@wbeckler wbeckler added the good first issue Good for newcomers label Sep 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants