Unnecessary url encoding of some query parameters characters #74

Closed
dbeacham opened this issue Apr 16, 2017 · 2 comments

dbeacham commented Apr 16, 2017

I think the set of characters that are left unencoded in query parameters can be expanded from

ALPHA / DIGIT / "-" / "_" / "." / "~"

by adding

":" / "@" / "/" / "!" / "$" / "'" / "(" / ")" / "*" / ","

i.e. ":", "@" and "/" (the extra characters allowed by pchar and query) plus the sub-delims, leaving out "?" (the query component delimiter) and "&", ";", "=", "+" (the form-urlencoded subcomponent delimiters).

This is based on the query ABNF in RFC 3986 Appendix A. The relevant rules are:

unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
sub-delims    = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
pct-encoded   = "%" HEXDIG HEXDIG

pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
query         = *( pchar / "/" / "?" ) 

and on RFC 3986 Section 2.4, "When to Encode or Decode", quoted in part below:

Under normal circumstances, the only time when octets within a URI
are percent-encoded is during the process of producing the URI from
its component parts. This is when an implementation determines which
of the reserved characters are to be used as subcomponent delimiters
and which can be safely used as data. Once produced, a URI is always
in its percent-encoded form.

Please check my reasoning; I can send over a PR if you agree.
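
To make the proposal concrete, here is a minimal sketch of the two character sets as Haskell predicates, together with a byte-level escaper for each. It depends only on base and bytestring. All names (`isUnreserved`, `isExtraQueryChar`, `escapeStrict`, `escapeMinimal`) are hypothetical and are not part of http-types; the strict variant is only assumed to mirror the behavior this issue describes.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- RFC 3986 "unreserved": ALPHA / DIGIT / "-" / "." / "_" / "~".
isUnreserved :: Word8 -> Bool
isUnreserved w =
     (w >= 0x41 && w <= 0x5A)   -- A-Z
  || (w >= 0x61 && w <= 0x7A)   -- a-z
  || (w >= 0x30 && w <= 0x39)   -- 0-9
  || w `B.elem` "-._~"

-- Extra characters proposed in this issue: ":", "@", "/" and the
-- sub-delims other than "&", ";", "=", "+".
isExtraQueryChar :: Word8 -> Bool
isExtraQueryChar w = w `B.elem` ":@/!$'()*,"

-- Percent-encode a single byte as "%XX" with uppercase hex digits.
percentEncode :: Word8 -> B.ByteString
percentEncode w = B.pack [0x25, hex (w `shiftR` 4), hex (w .&. 0x0F)]
  where
    hex n
      | n < 10    = 0x30 + n          -- '0'..'9'
      | otherwise = 0x41 + (n - 10)   -- 'A'..'F'

-- Escape every byte that the predicate does not allow through.
escapeWith :: (Word8 -> Bool) -> B.ByteString -> B.ByteString
escapeWith keep =
  B.concatMap (\w -> if keep w then B.singleton w else percentEncode w)

-- Hypothetical strict escaper: keeps only the unreserved set.
escapeStrict :: B.ByteString -> B.ByteString
escapeStrict = escapeWith isUnreserved

-- Hypothetical minimal escaper: also keeps the proposed extra characters.
escapeMinimal :: B.ByteString -> B.ByteString
escapeMinimal = escapeWith (\w -> isUnreserved w || isExtraQueryChar w)
```

Under these assumptions, `escapeMinimal` leaves ":" untouched but still encodes "+", because "+" is deliberately excluded from the proposed set as a form-urlencoded delimiter.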

Contributor

tkvogt commented Jan 28, 2018

Is anyone working on this?
The github library has a related issue: the whole search API currently does not work.
A query to GitHub such as

?q=a+created:2012-12-07..2012-12-17+language:Haskell&sort=stars&order=desc

is always URL-encoded by http-types into

?q=a%2Bcreated%3A2012-12-07..2012-12-17%2Blanguage%3AHaskell&sort=stars&order=desc

So either ":" and "+" should not be URL-encoded, or the caller should state explicitly what to encode, like

data QueryElement = QE BS.ByteString -- encode string
                  | QN BS.ByteString -- do not encode

[QE "a", QN "+", QE "created", QN ":", QE "2012-12-07..2012-12-17", QN "+", ..]

Would you accept a PR that adds new functions, that use [QueryElement] instead of QueryString?
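
The QueryElement idea above could be sketched roughly as follows. This is an illustration under assumptions, not an existing http-types API: the `QE`/`QN` constructors follow the comment, while `escapeStrict`, `renderElements` and `searchQuery` are hypothetical names, and the element list is spelled out in full from the GitHub query quoted earlier.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
import Data.Bits (shiftR, (.&.))
import Data.Word (Word8)

-- One piece of a query value, as proposed in the comment above.
data QueryElement
  = QE B.ByteString   -- percent-encode this piece
  | QN B.ByteString   -- emit this piece verbatim
  deriving (Eq, Show)

-- Hypothetical strict escaper: keeps only ALPHA / DIGIT / "-" / "." / "_" / "~".
escapeStrict :: B.ByteString -> B.ByteString
escapeStrict = B.concatMap enc
  where
    enc :: Word8 -> B.ByteString
    enc w
      | unreserved w = B.singleton w
      | otherwise    = B.pack [0x25, hex (w `shiftR` 4), hex (w .&. 0x0F)]
    unreserved w =
         (w >= 0x41 && w <= 0x5A)
      || (w >= 0x61 && w <= 0x7A)
      || (w >= 0x30 && w <= 0x39)
      || w `B.elem` "-._~"
    hex n
      | n < 10    = 0x30 + n
      | otherwise = 0x41 + (n - 10)

-- Hypothetical renderer: encode QE pieces, pass QN pieces through unchanged.
renderElements :: [QueryElement] -> B.ByteString
renderElements = B.concat . map render
  where
    render (QE bs) = escapeStrict bs
    render (QN bs) = bs

-- The value of the "q" parameter from the GitHub search query above.
searchQuery :: [QueryElement]
searchQuery =
  [ QE "a", QN "+"
  , QE "created", QN ":", QE "2012-12-07..2012-12-17", QN "+"
  , QE "language", QN ":", QE "Haskell"
  ]
```

With this shape the caller, not the library, decides which "+" and ":" characters act as delimiters, so existing functions would not need to change their encoding.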

Owner

aristidb commented Jan 28, 2018

I'm not keen on changing the behavior of existing functions. Just the other day I had to revert a change that altered an encoding in a theoretically valid way, because it broke some URL signing.

However, I'd be happy to accept a PR adding new functions. Even just a function renderQueryBuilderMinimalEscape would be fine, I guess. Your approach allowing explicit unescaped segments would also be acceptable.

(And to answer your first question, I know of nobody actively working on this.)
