Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search through https://search.quarkus.io #1825

Merged
merged 5 commits into from
Jan 15, 2024

Conversation

yrodiere
Copy link
Member

@yrodiere yrodiere commented Oct 27, 2023

Creating as draft, for discussion. Please do not merge yet.

Summary of the changes:

  • Search in /guides and /version/*/guides is now implemented with a Vue.js application, replacing the custom Javascript of search-filter.js.
  • Search is performed remotely by https://github.com/quarkusio/search.quarkus.io
  • When https://github.com/quarkusio/search.quarkus.io fails to respond in a reasonable time (currently 400ms) or errors out, we default to Javascript search (basically the same implementation than before, just without JQuery)
  • When searching, we no longer display anything else than matching cards (no category headings in particular). This allows ordering hits by score (relevance to the searched terms).

To test locally:

You should notice search results are slightly more relevant, though we didn't try to tune search very much for now. You will see improvements though, e.g. matches against content that is not in the title/summary but elsewhere in the guide

@github-actions
Copy link

github-actions bot commented Oct 27, 2023

🙈 The PR is closed and the preview is expired.

@yrodiere yrodiere force-pushed the search-quarkus-io branch 2 times, most recently from 9bae26e to 5ca41f8 Compare October 30, 2023 12:58
@ynojima
Copy link
Member

ynojima commented Oct 30, 2023

If this supports localized sites (es.quarkus.io, pt.quarkus.io, cn.quarkus.io, and ja.quarkus.io), it's really appriciated :)

@yrodiere
Copy link
Member Author

yrodiere commented Oct 30, 2023

If this supports localized sites (es.quarkus.io, pt.quarkus.io, cn.quarkus.io, and ja.quarkus.io), it's really appriciated :)

From what I saw the guides in localized sites are still in English. And I don't think there are plans to translate the guides.

So, since this search is about the content of the guides, I don't think there's anything to do...

@ynojima
Copy link
Member

ynojima commented Oct 30, 2023

Not sure about other locales, but speaking of ja.quarkus.io, which I'm working on, all the v2.7 guides are localized (https://ja.quarkus.io/version/2.7/guides/), and more than half of the latest v3.5 guides are localized(https://ja.quarkus.io/guides/).

@yrodiere
Copy link
Member Author

yrodiere commented Oct 30, 2023

Not sure about other locales, but speaking of ja.quarkus.io, which I'm working on, all the v2.7 guides are localized (https://ja.quarkus.io/version/2.7/guides/), and more than half of the latest v3.5 guides are localized(https://ja.quarkus.io/guides/).

That's definitely different from the other locales, but sure we can index that as well.

Could you please open an issue at https://github.com/quarkusio/search.quarkus.io with some information on where to find these guides (GitHub repo, path in the repo in particular).

Also I have zero experience with search in Japanese and can't even read it, so it's definitely going to be challenging to tune it for good relevance; I'll need feedback from someone who can understand the results.

@ynojima
Copy link
Member

ynojima commented Oct 30, 2023

I'll need feedback from someone who can understand the results.

Of course I'm happy to help this :)

I filed an issue: quarkusio/search.quarkus.io#33
If the issue is not easy and you need to focus on the original quarkus.io search for initial release, adding a flag to jekyll configuration file to hide the search box helps localized sites.

@yrodiere yrodiere force-pushed the search-quarkus-io branch 3 times, most recently from 2207750 to a7958da Compare November 8, 2023 11:56
@maxandersen
Copy link
Member

I tried a few queries but didn't spot much difference in list of matches - wondering if i'm just choosing bad ones that are already tagged :) Got some good examples?

One thing I did notice is that search no longer retain tutorials/concepts/references section titles - is that intentional? The order still seem to be the exact same as old one thus not sure ?

@yrodiere yrodiere force-pushed the search-quarkus-io branch 2 times, most recently from 627eedb to a17dced Compare November 8, 2023 13:42
@yrodiere
Copy link
Member Author

yrodiere commented Nov 8, 2023

@maxandersen

I tried a few queries but didn't spot much difference in list of matches - wondering if i'm just choosing bad ones that are already tagged :) Got some good examples?

I think the timeout I originally set to fall back to Javascript search was too low; search is fast on the server side but our Openshift cluster seems to be located far from Europe and this results in rather high latency depending on your connection.

So, in short, you've been testing the fallback to Javascript search. Which doesn't yield good search results indeed, but at least fallback works, so, yay!

I pushed an update with a higher timeout. Try again with orm for example. And please note we didn't try very hard to optimize relevance yet.

Before:

image

After:

image

Another search you can try is elasticsearch: you'll notice that the guide "Centralized log management (Graylog, Logstash, Fluentd)" didn't appear in the results before, but now it does. That's because we do full-text search: we also look into the content of each guide, not just its title, summary and keywords.

One thing I did notice is that search no longer retain tutorials/concepts/references section titles - is that intentional? The order still seem to be the exact same as old one thus not sure ?

It's intentional: we need to remove section titles in order to provide relevance sort, and yes we do sort by relevance.

The order bieng the same as the old one is a result of using Javascript search, see above. It just happens that I remove section titles with Javascript search as well, because I didn't want to duplicate the code just for this.

@yrodiere yrodiere force-pushed the search-quarkus-io branch 4 times, most recently from 3b241ac to c3cc7cb Compare November 9, 2023 11:43
@yrodiere
Copy link
Member Author

yrodiere commented Nov 9, 2023

Hey @edewit @ia3andy , I've been told you know your way around Javascript :)
I don't, but I did what I could with the vue.js application in this PR.

When you find some time, would you be so kind as to review this? It's probably never going to be perfect, but I'd like to at least avoid obvious mistakes...

I added instructions on how to run it locally in the PR description, and you can test it live here: https://quarkus-website-pr-1825-preview.surge.sh/guides/

@brunobat reported one bug already: it seems that if you type relatively fast in the text box, characters get mangled.

I think it comes from the fact the text box gets refreshed while you type. I don't know why it even gets refreshed since, well, it's supposed to be write-only, but maybe you have an idea?

@brunobat
Copy link
Contributor

brunobat commented Nov 9, 2023

@gsmet
Copy link
Member

gsmet commented Nov 10, 2023

I played a bit more with it today and given the search is using a AND when fallabacking to Javascript, I would lean towards also using a AND when using the service.

@yrodiere
Copy link
Member Author

I played a bit more with it today and given the search is using a AND when fallabacking to Javascript, I would lean towards also using a AND when using the service.

quarkusio/search.quarkus.io#43

@yrodiere
Copy link
Member Author

FWIW the search service now indexes the full HTML body of each guide, which includes e.g. configuration property references. So the search now matches on configuration properties.

cc @michelle-purcell

@ia3andy
Copy link
Contributor

ia3andy commented Nov 13, 2023

My only concern is that we already use Lit and React in other Quarkus projects (code.quarkus and DevUI) using yet another framework will make things hard to maintain in the future. Would it be possible to switch to LitElement?

@yrodiere
Copy link
Member Author

My only concern is that we already use Lit and React in other Quarkus projects (code.quarkus and DevUI) using yet another framework will make things hard to maintain in the future. Would it be possible to switch to LitElement?

If someone sends a PR, sure. Personally I've already spent a few days just making it work, so I can't afford to spend a week learning another framework and trying to make it work...

@ia3andy
Copy link
Contributor

ia3andy commented Nov 13, 2023

My only concern is that we already use Lit and React in other Quarkus projects (code.quarkus and DevUI) using yet another framework will make things hard to maintain in the future. Would it be possible to switch to LitElement?

If someone sends a PR, sure. Personally I've already spent a few days just making it work, so I can't afford to spend a week learning another framework and trying to make it work...

From what I've looked, there is not much UI code yet, so better choose wisely. Also if you look at Lit, it seems really close to the code you provided (I am not familiar with Vue either) so I don't think it will be a big gap from one to the other...

@yrodiere
Copy link
Member Author

yrodiere commented Nov 13, 2023

From what I've looked, there is not much UI code yet

That's related to my concern. When I say I'm not good at Javascript, I mean it. This "not much" literally took me days of back and forth to get something that seems to work (with that one remaining bug...).

In fact I'm pretty sure I've spent as much time on the frontend than on the backend at this point... If we exclude OpenShift deployment :D

Also if you look at Lit, it seems really close to the code you provided (I am not familiar with Vue either) so I don't think it will be a big gap from one to the other...

And there you have it: "not much UI code" => a few days; "not a big gap" => how many more days?

I really can't afford to spend that kind of time right now, especially since this is essentially a side project and I'm late on pretty much every main project I'm assigned to (Hibernate Search, Quarkus Hibernate extensions, ...).

If you feel strongly about this I'd gladly accept a patch from an expert like you that basically erases my stuff with the same features (including infinite scrolling etc.) but on a better framework and with better code.

@yrodiere
Copy link
Member Author

I pushed an update this morning to use the new domain (search.quarkus.io), but it seems the preview wasn't updated.... ?

I'll try pushing again.

@yrodiere yrodiere changed the title Prototype of search through https://github.com/quarkusio/search.quarkus.io Search through https://github.com/quarkusio/search.quarkus.io Dec 20, 2023
@yrodiere yrodiere changed the title Search through https://github.com/quarkusio/search.quarkus.io Search through https://search.quarkus.io Dec 20, 2023
@yrodiere
Copy link
Member Author

We have just agreed to merge this early next year: https://quarkusio.zulipchat.com/#narrow/stream/187038-dev/topic/search.2Equarkus.2Eio/near/409365021

@maxandersen
Copy link
Member

tried finding info about "quarkus.websocket.max-frame-size" but couldn't get it to show something relevant.

tried the full string with and without quotes - also tried "max-frame-size" and even with quotes it kept searching for individual parts.

Should have found at least https://quarkus.io/guides/all-config that has it mentioned.

@yrodiere
Copy link
Member Author

yrodiere commented Jan 8, 2024

tried finding info about "quarkus.websocket.max-frame-size" but couldn't get it to show something relevant.

tried the full string with and without quotes - also tried "max-frame-size" and even with quotes it kept searching for individual parts.

Should have found at least https://quarkus.io/guides/all-config that has it mentioned.

FWIW this is being discussed there: quarkusio/search.quarkus.io#117

@yrodiere
Copy link
Member Author

tried finding info about "quarkus.websocket.max-frame-size" but couldn't get it to show something relevant.
tried the full string with and without quotes - also tried "max-frame-size" and even with quotes it kept searching for individual parts.
Should have found at least https://quarkus.io/guides/all-config that has it mentioned.

FWIW this is being discussed there: quarkusio/search.quarkus.io#117

Fixed and deployed just now. Try it out :)

@yrodiere yrodiere marked this pull request as ready for review January 11, 2024 15:37
@yrodiere yrodiere merged commit 59b0526 into quarkusio:develop Jan 15, 2024
1 check passed
@yrodiere
Copy link
Member Author

Merged! Thanks all for your comments.

If you notice any problems, please report to https://github.com/quarkusio/search.quarkus.io/issues

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants