ENH: Add custom tsquery from websearch function and related tests #838

Merged
merged 4 commits into from
Nov 11, 2024


adelavega
Member

Added a custom function to parse web searches in a "quasi" PubMed-like format.

The custom parser does the following:

  • Converts "AND", "OR", and "NOT" to tsquery operators
  • Respects parentheses, and treats parenthesized groups as units
  • Quotes mean that the quoted words must appear in that specific order. Currently, special characters inside quotes (e.g. parentheses) are not escaped, although we could add that. It may also behave strangely if dashes are in quotes (e.g. "pre-diagnosis").
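
For readers following along, the three rules above can be illustrated with a minimal sketch. This is not the function added in this PR: the name `to_tsquery_sketch`, the tokenizer regex, and the implicit AND between adjacent terms are all assumptions made for illustration. Only uppercase AND/OR/NOT are treated as operators here, and the output is not uppercased.

```python
import re

# Hypothetical tokenizer: a quoted phrase, "(", ")", or a bare word.
_TOKEN = re.compile(r'"([^"]*)"|(\()|(\))|([^\s()"]+)')


def to_tsquery_sketch(query: str) -> str:
    """Illustrative only: turn a PubMed-like search string into tsquery syntax."""
    out = []
    for phrase, lparen, rparen, word in _TOKEN.findall(query):
        if rparen:
            out.append(")")
            continue
        if word in ("AND", "OR"):
            out.append("&" if word == "AND" else "|")
            continue
        # Everything else starts an operand: "(", NOT, a phrase, or a word.
        if out and out[-1] not in ("&", "|", "!", "("):
            out.append("&")  # assumption: adjacent operands are ANDed
        if lparen:
            out.append("(")
        elif word == "NOT":
            out.append("!")
        elif word:
            out.append(word)
        else:
            # Quoted phrase: words must be adjacent and in order, so join
            # them with <->. Hyphens are treated like spaces; other special
            # characters are NOT escaped, as noted in the description.
            out.append("<->".join(re.split(r"[\s-]+", phrase.strip())))
    return " ".join(out)


print(to_tsquery_sketch('"pre-diagnosis" AND (autism OR NOT adhd)'))
# pre<->diagnosis & ( autism | ! adhd )
```

The sketch does no validation, so unbalanced parentheses or quotes pass straight through to the output.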

Needs additional testing to ensure behavior is as expected. I added a few queries, but more would be helpful.
Could have undefined behavior if an invalid combination of parens and quotes is used, but to be fair that is just an invalid query.
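
One possible way to turn that undefined behavior into an explicit error is to reject unbalanced queries before parsing. The sketch below is an assumption, not code from this PR: `check_balanced` is a hypothetical helper, and raising `ValueError` is just one choice that an API layer could map to an error response.

```python
def check_balanced(query: str) -> None:
    """Raise ValueError if quotes or parentheses in `query` do not pair up.

    Hypothetical helper; parentheses inside quotes are ignored.
    """
    depth = 0
    in_quote = False
    for ch in query:
        if ch == '"':
            in_quote = not in_quote
        elif in_quote:
            continue  # characters inside a quoted phrase are not structural
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ')' in query")
    if in_quote:
        raise ValueError("unclosed quote in query")
    if depth:
        raise ValueError("unclosed '(' in query")


check_balanced('"Autism Spectrum Disorder (ASD)" AND (memory OR attention)')  # passes
```

A query such as `(a OR b` would raise here instead of reaching the parser.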

Needed from API side:

  • Confirmation from @jdkent that the correct change to the code was made
  • Return error codes to user via API
  • Test actual queries on db to ensure tsquery behaves as expected

@jdkent can you take this and run with it?

@adelavega
Member Author

adelavega commented Nov 4, 2024

Should we disallow special characters in quotes?
For example, "Autism Spectrum Disorder (ASD)": add this as a test case.

@adelavega
Member Author

adelavega commented Nov 4, 2024

Can we get a document that has the term "pre-diagnosis" in it?

Currently, "pre-diagnosis" compiles to PRE<->DIAGNOSIS.

Should it be "PRE-DIAGNOSIS" instead?

@jdkent
Member

jdkent commented Nov 5, 2024

currently: "pre-diagnosis" compiles to PRE<->DIAGNOSIS
should it be "PRE-DIAGNOSIS" instead?

Testing against the db, both queries gave the same result, so it's safe to replace the hyphen with <->.

@adelavega
Member Author

noice

@adelavega
Member Author

What do we have left to do here?

@jdkent jdkent merged commit ea5cdc4 into master Nov 11, 2024
16 checks passed
@jdkent jdkent deleted the enh/adv_search branch November 11, 2024 18:59
@adelavega
Member Author

Awesome! So for deployment I think we need to make some frontend and documentation changes so users know what's going on @nicoalee

Although, IIRC basic searches should operate very similarly, correct?

@jdkent jdkent mentioned this pull request Nov 12, 2024