Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SET-URL #3385

Closed
wants to merge 2 commits into from
Closed

Conversation

shamazmazum
Copy link
Contributor

As discussed in #3383, I open another PR which fixes navigation to URLs like "github.com/atlas-engineer".

It brings two changes:

  1. Uses valid-url-p to check if "https://" + query is a valid URL.
  2. In valid-url-p move the check of TLD validity out of (or (not check-dns-p) ...) block. Maybe it is my fault (I'm too lazy to do git blame), but this has nothing to do with DNS lookup.

Now you can set URL to github.com, github.com/foobar, wiki foobar or foobar and get what you want in all those cases.

@shamazmazum
Copy link
Contributor Author

Oops, one test fails, investigating

@shamazmazum shamazmazum reopened this Apr 10, 2024
@shamazmazum shamazmazum force-pushed the fix-set-url-2 branch 2 times, most recently from f4f0e30 to bb48522 Compare April 10, 2024 04:51
@shamazmazum
Copy link
Contributor Author

shamazmazum commented Apr 10, 2024

Done! As I said, I would like to remove DNS lookups completely in later commits, so URLs like http://foobar would be valid URLs. DNS is not used to check if an URL is valid since 377e9ab, so as for now I think this is a good idea to check if a query is an URL without a scheme or a search: if a query contains a dot and (valid-url-p (str:concat "https://" (query query)) :check-dns-p nil) is non-NIL, then it is not a search.

@aadcg Please review

@aadcg
Copy link
Member

aadcg commented Apr 11, 2024

@shamazmazum I'll review soon, thanks.

@aadcg
Copy link
Member

aadcg commented Apr 15, 2024

@shamazmazum I think there are two issues in this PR.

  1. As we have agreed, we don't want to change type signatures in this PR. See the change in initialize-instance :after.
  2. The issue we're solving is related to the call (and check-dns-p (valid-tld-p (query query))). Note that (cl-tld:get-tld "github.com") => "com", while (cl-tld:get-tld "github.com/") => NIL. For this reason, the later case is interpreted as a search engine query. The solution you propose seems too fragile.

@shamazmazum
Copy link
Contributor Author

shamazmazum commented Apr 15, 2024

  1. initialize-instance still can accept check-dns-p, obviously, but OK, I bring it back.
  2. The solution I propose it simple: if it looks like an URL, check if it is a valid URL with quri. cl-tld just does not work here. Moreover, quri was also used before 377e9ab (the commit which introduced this bug). If you think that this is not OK, I would like to see any examples where this approach does not work.

UPD: I may be wrong, but firefox does the same: if a query contains a dot, it tries to interpret it as an URL. If your URL does not have a dot, you have to specify the scheme. Surely, firefox never did any DNS lookups to determine if a query is an URL or not.

Before 377e9ab the conditional was:

((and check-dns-p
      (valid-url-p (str:concat "https://" (query query))
                   :check-dns-p check-dns-p))
 (setf (query query) (str:concat "https://" (query query))))

I just replace check-dns-p with a test for a dot and always pass nil to :check-dns-p (as you already do).

UPD2: Check this code from initialize-instance on master:

(cond
    ((valid-url-p (query query)
                  :check-dns-p nil)
     ;; Valid URLs should be passed forward.
     nil)
    ((and check-dns-p
          (valid-tld-p (query query)))
     (setf (query query) (str:concat "https://" (query query))))

What is the logic of using valid-url-p in the first conditional and a weaker check valid-tld-p in the second?

@aadcg
Copy link
Member

aadcg commented Apr 15, 2024

@shamazmazum enter about.this, github.com and 192.168.0.1 on Firefox and note that the first one is interpreted as a search query. This seems to be the counter-example you've requested.

@shamazmazum
Copy link
Contributor Author

shamazmazum commented Apr 15, 2024

@aadcg I agree with the first example. I still suggest using valid-url-p because it parses an URL and then you can pass the host part to valid-tld-p. We can do valid-url-p which checks TLD and a weaker predicate which omits this test (e.g. when the scheme was explicitly supplied). Obviously, you will have to change the type of valid-url-p to something like (-> valid-url-p (string &key (:check-tld-p boolean)) (values boolean &optional)).

Otherwise, I don't know. Cut everything after / in the query? But github.com/foo bar baz is obviously not an URL. Reinvent URL parsing in valid-tld-p? What for if we have quri.

I am out of ideas :) I'll just let it be if you don't like the idea with TLD check in valid-url-p.

@shamazmazum
Copy link
Contributor Author

BTW, there is a TLD check in valid-url-p which never runs.

@aadcg
Copy link
Member

aadcg commented Apr 15, 2024

@aadcg I agree with the first example. I still suggest using valid-url-p because it parses an URL and then you can pass the host part to valid-tld-p. We can do valid-url-p which checks TLD and a weaker predicate which omits this test (e.g. when the scheme was explicitly supplied). Obviously, you will have to change the type of valid-url-p to something like (-> valid-url-p (string &key (:check-tld-p boolean)) (values boolean &optional)).

Overall, I agree. If we follow this approach, then we don't need to check for dots in the query or am I missing something?


To put things into perspective, I've noticed the mess while fixing #2134. I took the approach of changing the minimum. I did mistakes nonetheless (8b00a2f and the bug you mention when querying "github.com/user/repo"), but the goal was to keep all of the knobs in place to easily revert to a previous state. It is now clear that it makes little sense to try to fix what is fundamentally broken. Therefore, I take my word back and suggest that we iterate on what the ideal solution would look like without any constraints. Sorry, I had the hope that this wouldn't spiral so deep.


Some things to take into account:

  1. The only "DNS check" we care for is the one done by cl-tld. I'm not sure whether, linguistically and technically, we should call it a DNS check or TLD check.
  2. Note that the DNS/TLD-check (in the above sense) is a needed parameter, see prompter:filter-preprocessor and prompter:filter-postprocessor from new-url-or-search-source.
  3. Ensure that issues Nyxt freezes when typing very fast in prompt buffer #2134 and set-url: entering url without http(s):// prefix no longer works v3.11.3 #3349 remain fixed.
  4. Bonus points for fixing this issue.

Let me see the full solution you suggest and then we'll go from there. Thanks!

shamazmazum added a commit to shamazmazum/nyxt that referenced this pull request Apr 16, 2024
Entering a query like "github.com/atlas-engineer" goes directly to
the desired web page instead of making a search. This fixes atlas-engineer#3385.
@shamazmazum
Copy link
Contributor Author

shamazmazum commented Apr 16, 2024

@aadcg I hope, I did it :) I split it into two commits: the first fixes the issue with minimal changes and the second does some clean-up which removes :check-dns-p parameter and also lookup-hostname function, because it is not used anymore. If you now allow me to change signatures (as I think, you do), I suggest accepting all those commits.

#2134 remains fixed (at least I see not delays on my computer), #3349 fixed, IP addresses work (127.0.0.1 goes directly to https://127.0.0.1). Additional test cases added.

Please, re-check if I missed something (I hope not).

Copy link
Member

@aadcg aadcg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamazmazum overall, I think it's almost ready.

I couldn't find regressions when comparing with master and many issues have been solved.

When comparing to how Firefox behaves, I could only see that it handles localhost in a more DWIM fashion. E.g. issue python3 -m http.server and request localhost:8000. On Firefox it works without adding http://, unlike Nyxt.

source/urls.lisp Show resolved Hide resolved
source/buffer.lisp Outdated Show resolved Hide resolved
source/buffer.lisp Outdated Show resolved Hide resolved
source/buffer.lisp Outdated Show resolved Hide resolved
source/urls.lisp Show resolved Hide resolved
source/buffer.lisp Show resolved Hide resolved
source/urls.lisp Show resolved Hide resolved
@shamazmazum
Copy link
Contributor Author

OK. Thanks. I'll do it all later, but firstly I would like to know why use with-slots instead of with-accessors. These two mean different things. If you are sure that you want the first, then OK, sure :) But do you really want it?

Entering a query like "github.com/atlas-engineer" goes directly to
the desired web page instead of making a search. This fixes atlas-engineer#3385.
These functions are used no more as nyxt doesn't make DNS lookups
to check if URL is correct now.
@shamazmazum
Copy link
Contributor Author

@aadcg I did what you said. You still have to type http://localhost if you want to visit localhost, I hope it's OK more or less. BTW check something like Images.jl (a Julia package) in Nyxt and Firefox :)

aadcg pushed a commit that referenced this pull request Apr 17, 2024
Fixes the following issue reported in #3385. Input such as
"github.com/atlas-engineer" must be interpreted as a URL, not as a search query.
@aadcg
Copy link
Member

aadcg commented Apr 17, 2024

@shamazmazum I've pushed your work in commits 14dac1e, 424eff6, 1f58316 and ec082a3. Thank you for your patience!

@aadcg aadcg closed this Apr 17, 2024
shamazmazum added a commit to shamazmazum/nyxt that referenced this pull request May 23, 2024
Entering a query like "github.com/atlas-engineer" goes directly to
the desired web page instead of making a search. This fixes atlas-engineer#3385.
aadcg pushed a commit that referenced this pull request May 27, 2024
Fixes the following issue reported in #3385. Input such as
"github.com/atlas-engineer" must be interpreted as a URL, not as a search query.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants