[Feature Request]: FINAL search engine features for 9.5 #600

Closed
3 tasks done
python357-1 opened this issue Nov 22, 2024 · 9 comments · Fixed by #606
Labels
Priority: High An important issue requiring attention TagStudio: Library Relating to the TagStudio library system TagStudio: Search The TagStudio search engine Type: Enhancement New feature or request

Comments

@python357-1
Collaborator

python357-1 commented Nov 22, 2024

Checklist

  • I am using an up-to-date version.
  • I have read the documentation.
  • I have searched existing issues.

Description

This issue describes the current consensus on which features the search engine will have for the 9.5 release.

REQUIRED for 9.5:

new queries - a definition and implementation of a "query language" which can be converted to SQL queries to be made against the database

  • Boolean operators: AND, OR, and NOT operators for said queries
  • tag, tag_id, mediatype, filetype, path, and special queries: types of queries to be run against entries/tags in the database
| query | description | allowed values | example |
| --- | --- | --- | --- |
| tag | searches by tag name, with an optional "disambiguation" syntax that specifies which tag you are looking for, if there are multiple matches | string | tag: Mario; tag: Mario[parent=nintendo] |
| tag_id | searches by the internal ID of a tag | int | tag_id: 1001 |
| mediatype | searches by name property of MediaCategories (will eventually use the translated name) | string | mediatype: video |
| filetype | searches by the file extension | string | filetype: jpg |
| path | searches by complete path of files. allows globs | string | path: folder/* |
| special | searches by special metadata of entries | "untagged", "unlinked" | special: untagged |
  • field content search: TODO
  • sortable results (possibly in a different issue/PR)
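To make the "converted to SQL queries" requirement concrete, here is a minimal sketch of how a parsed query could be turned into a parameterized `WHERE` clause. It is not TagStudio's implementation: the node classes, the `to_sql` and `search` helpers, and the three-table schema (`entries`, `tags`, `tag_entries`) are all assumptions made up for this illustration. `mediatype`, `special: unlinked`, and tag disambiguation are left out because they depend on details this issue does not specify.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Constraint:
    kind: str  # "tag", "tag_id", "filetype", "path" or "special"
    value: str


@dataclass
class Not:
    child: object


@dataclass
class And:
    children: list


@dataclass
class Or:
    children: list


# Hypothetical schema: entries(id, path), tags(id, name), tag_entries(tag_id, entry_id).
# Opening half of a correlated subquery; each caller appends the closing parenthesis.
HAS_TAG = "EXISTS (SELECT 1 FROM tag_entries te WHERE te.entry_id = e.id"


def to_sql(node):
    """Return a (WHERE fragment, parameter list) pair for one AST node."""
    if isinstance(node, Constraint):
        if node.kind == "tag":
            inner = "SELECT id FROM tags WHERE name = ? COLLATE NOCASE"
            return f"{HAS_TAG} AND te.tag_id IN ({inner}))", [node.value]
        if node.kind == "tag_id":
            return f"{HAS_TAG} AND te.tag_id = ?)", [int(node.value)]
        if node.kind == "filetype":
            return "e.path LIKE ?", ["%." + node.value]
        if node.kind == "path":
            return "e.path GLOB ?", [node.value]
        if node.kind == "special" and node.value == "untagged":
            return f"NOT {HAS_TAG})", []
        raise ValueError(f"unsupported constraint: {node}")
    if isinstance(node, Not):
        sql, params = to_sql(node.child)
        return f"NOT ({sql})", params
    if isinstance(node, (And, Or)):
        joiner = " AND " if isinstance(node, And) else " OR "
        parts = [to_sql(child) for child in node.children]
        sql = joiner.join(f"({fragment})" for fragment, _ in parts)
        return sql, [p for _, params in parts for p in params]
    raise TypeError(f"unknown node: {node!r}")


def search(conn, node):
    """Run the query and return the matching paths in sorted order."""
    where, params = to_sql(node)
    rows = conn.execute(
        f"SELECT e.path FROM entries e WHERE {where} ORDER BY e.path", params
    )
    return [path for (path,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id INTEGER PRIMARY KEY, path TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag_entries (tag_id INTEGER, entry_id INTEGER);
    INSERT INTO entries VALUES
        (1, 'folder/mario.jpg'), (2, 'folder/luigi.png'), (3, 'other/notes.txt');
    INSERT INTO tags VALUES (1001, 'Mario');
    INSERT INTO tag_entries VALUES (1001, 1);
""")

# tag: mario OR (path: folder/* AND NOT filetype: jpg)
query = Or([
    Constraint("tag", "mario"),
    And([Constraint("path", "folder/*"), Not(Constraint("filetype", "jpg"))]),
])
print(search(conn, query))  # ['folder/luigi.png', 'folder/mario.jpg']
```

Values are always passed as bound parameters rather than spliced into the SQL string, so user-typed search text cannot alter the statement. The `path` case leans on SQLite's `GLOB` operator, which is case-sensitive; whether that matches the intended glob semantics is an open question here.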

STRETCH GOALS for 9.5

  • new search UI (will be in a separate issue/PR)
  • HAS query (waiting on composition tags to be implemented)
@python357-1 python357-1 added the Type: Enhancement New feature or request label Nov 22, 2024
@CyanVoxel CyanVoxel added TagStudio: Library Relating to the TagStudio library system Priority: High An important issue requiring attention TagStudio: Search The TagStudio search engine labels Nov 22, 2024
@CyanVoxel CyanVoxel moved this to 🚧 In progress in TagStudio Development Nov 22, 2024
@CyanVoxel CyanVoxel added this to the Alpha v9.5 (Post-SQL) milestone Nov 22, 2024
@Computerdores
Collaborator

I am interested in working on a parser for this. Feedback on my current WIP grammar would be nice:

ANDList        = ORList ( ["AND"] ORList )* ;
ORList         = Term ( "OR", Term )* ;
Term           = Constraint | "(", ANDList, ")" ;

Constraint     = [ConstraintType, ":"], Literal, "[", PropertyList, "]" ;

ConstraintType = "tag" | "mediaType" ; (* not a complete list *)
PropertyList   = Property, (",", Property)* ;
Property       = ULITERAL, "=", Literal ;
Literal        = ULITERAL | QLITERAL ;

Notes:

  • QLITERAL means a quoted literal (e.g.: "test 'hello' end", 'test "hello" end', 'What\'s this, \ escaping?')
  • ULITERAL means an unquoted literal (like the value of this property: test=true)
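For illustration, the two literal kinds could be tokenized roughly as follows. This is only a sketch under assumptions: the `TOKEN_RE` pattern, the `tokenize` helper, and the choice of which characters end an unquoted literal are invented here and are not taken from the actual parser work.

```python
import re

# One alternative per token kind; the group name doubles as the token type.
TOKEN_RE = re.compile(r"""
    (?P<QLITERAL>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<PUNCT>[:\[\]=,()])
  | (?P<ULITERAL>[^\s:\[\]=,()"']+)
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Split a query string into (kind, value) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            # Typically an unterminated quote.
            raise ValueError(f"unexpected input at position {pos}: {text[pos:]!r}")
        pos = match.end()
        kind, value = match.lastgroup, match.group()
        if kind == "WS":
            continue
        if kind == "QLITERAL":
            # Drop the surrounding quotes, then resolve backslash escapes.
            value = re.sub(r"\\(.)", r"\1", value[1:-1])
        tokens.append((kind, value))
    return tokens


print(tokenize('tag:"Super Mario"[parent=nintendo]'))
print(tokenize(r"'What\'s this'"))  # [('QLITERAL', "What's this")]
```

Keeping the quoted/unquoted distinction in the token kind lets a later parsing stage treat a quoted `"OR"` as plain text rather than as an operator.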

@python357-1
Collaborator Author

Looks pretty good to me! I'm just curious, is the reason for ANDLists and ORLists being separate things for which one takes precedence over the other? If so, does the current grammar make ANDs take precedence over ORs? That would be ideal

@Computerdores
Collaborator

Computerdores commented Nov 26, 2024

> Looks pretty good to me!

Thanks! I am also much more happy with this than with my previous version. This is much simpler and should be easier to parse.

> is the reason for ANDLists and ORLists being separate things for which one takes precedence over the other?

Yes!

> does the current grammar make ANDs take precedence over ORs? That would be ideal

No it does not, OR would be evaluated first here. AND taking precedence over OR would be more intuitive for me too, however iirc @CyanVoxel suggested this somewhere in the discussion on the discord and I generally don't care too much about it.

(Cyan, If I am not supposed to ping you let me know, I asked on the discord before, but a big discussion broke out right after so I believe you missed it)
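As a worked example of why the precedence choice matters, the sketch below evaluates the query `mario OR luigi AND nintendo` both ways over a tiny in-memory library. The entries, tag names, and the `has` helper are invented for illustration only.

```python
# Hypothetical library: path -> set of tag names.
entries = {
    "a.jpg": {"mario"},
    "b.jpg": {"luigi", "nintendo"},
    "c.jpg": {"luigi"},
}


def has(tag):
    """Return the set of paths carrying the given tag."""
    return {path for path, tags in entries.items() if tag in tags}


# Query: mario OR luigi AND nintendo

# AND binds tighter: mario OR (luigi AND nintendo)
and_first = has("mario") | (has("luigi") & has("nintendo"))

# OR binds tighter: (mario OR luigi) AND nintendo
or_first = (has("mario") | has("luigi")) & has("nintendo")

print(sorted(and_first))  # ['a.jpg', 'b.jpg']
print(sorted(or_first))   # ['b.jpg']
```

With OR evaluated first, `a.jpg` drops out even though it carries the `mario` tag, which is the result most people would not expect from this query.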

@CyanVoxel
Member

> No it does not, OR would be evaluated first here. AND taking precedence over OR would be more intuitive for me too, however iirc @CyanVoxel suggested this somewhere in the discussion on the discord and I generally don't care too much about it.

Unless it was a looooong time ago, I don't remember preferring OR over AND; the opposite seems more intuitive to me as well. I know I brought up a preference for implicit AND when no operator is given, but I'm not sure if that's related here.

> (Cyan, If I am not supposed to ping you let me know, I asked on the discord before, but a big discussion broke out right after so I believe you missed it)

It's alright to ping me, I miss too much stuff to warrant not being pinged 🙃

@Computerdores
Collaborator

Computerdores commented Nov 26, 2024

> > No it does not, OR would be evaluated first here. AND taking precedence over OR would be more intuitive for me too, however iirc @CyanVoxel suggested this somewhere in the discussion on the discord and I generally don't care too much about it.
>
> Unless it was a looooong time ago, I don't remember preferring OR over AND; the opposite seems more intuitive to me as well. I know I brought up a preference for implicit AND when no operator is given, but I'm not sure if that's related here.

Alright nvm then. I have a first go at the parser almost done (still with OR preference though) so I might open a Draft PR in the next days.

@Computerdores
Collaborator

Updated Grammar (AND now binds stronger than OR as is normal):

ORList         = ANDList ( "OR", ANDList)* ;
ANDList        = Term ( ["AND"] Term )* ;
Term           = Constraint | "(", ORList, ")" ;

Constraint     = [ConstraintType, ":"], Literal, "[", PropertyList, "]" ;

ConstraintType = "tag" | "mediaType" ; (* not a complete list *)
PropertyList   = Property, (",", Property)* ;
Property       = ULITERAL, "=", Literal ;
Literal        = ULITERAL | QLITERAL ;

Notes:

  • QLITERAL means a quoted literal (e.g.: "test 'hello' end", 'test "hello" end', 'What\'s this, \ escaping?')
  • ULITERAL means an unquoted literal (like the value of this property: test=true)
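A recursive-descent parser for this grammar could look roughly like the sketch below, with one method per rule. It is an illustration and not the parser from the PR: the `Parser` class, the tuple-based AST, and the simplified tokenizer (no backslash escapes) are assumptions. It also assumes the bracketed property list is optional, since examples such as `tag: Mario` omit it, and it accepts quoted property names for brevity.

```python
import re

TOKEN_RE = re.compile(r"""
    \s*(
        "[^"]*" | '[^']*'        # quoted literal (escapes omitted for brevity)
      | [()\[\]:=,]              # punctuation
      | [^\s()\[\]:=,"']+        # unquoted literal or the keywords AND / OR
    )
""", re.VERBOSE)
RESERVED = set("()[]:=,") | {"AND", "OR"}


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize: {text[pos:]!r}")
        tokens.append(match.group(1))  # quoted tokens keep their quotes
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens, self.pos = tokenize(text), 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        if token is None:
            raise ValueError("unexpected end of query")
        self.pos += 1
        return token

    def expect(self, wanted):
        token = self.advance()
        if token != wanted:
            raise ValueError(f"expected {wanted!r}, got {token!r}")

    def parse(self):
        node = self.or_list()
        if self.peek() is not None:
            raise ValueError(f"unexpected token {self.peek()!r}")
        return node

    def or_list(self):  # ORList = ANDList ( "OR" ANDList )*
        items = [self.and_list()]
        while self.peek() == "OR":
            self.advance()
            items.append(self.and_list())
        return items[0] if len(items) == 1 else ("OR", items)

    def and_list(self):  # ANDList = Term ( ["AND"] Term )*
        items = [self.term()]
        while self.peek() not in (None, "OR", ")"):
            if self.peek() == "AND":  # the keyword is optional
                self.advance()
            items.append(self.term())
        return items[0] if len(items) == 1 else ("AND", items)

    def term(self):  # Term = Constraint | "(" ORList ")"
        if self.peek() == "(":
            self.advance()
            node = self.or_list()
            self.expect(")")
            return node
        return self.constraint()

    def constraint(self):
        ctype, value = None, self.literal()
        if self.peek() == ":":
            self.advance()
            ctype, value = value, self.literal()
        props = {}
        if self.peek() == "[":  # assumed optional, see the note above
            self.advance()
            while True:
                key = self.literal()
                self.expect("=")
                props[key] = self.literal()
                if self.peek() != ",":
                    break
                self.advance()
            self.expect("]")
        return ("constraint", ctype, value, props)

    def literal(self):
        token = self.advance()
        if token in RESERVED:
            raise ValueError(f"expected a literal, got {token!r}")
        return token[1:-1] if token[0] in "\"'" else token


print(Parser("a OR b c").parse())
```

For `a OR b c` the result is an `OR` node whose second child is an `AND` node over `b` and `c`, so the implicit AND groups before the OR. Because `or_list` calls `and_list` and not the other way round, the precedence follows directly from the call structure.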

@Computerdores
Collaborator

With this proposal we are currently missing a way to search for untagged files.

My idea on how to fix this would be the following syntax: special:untagged.

The advantage of this is that this could also be used to add other special criteria like e.g. special:unlinked to search for unlinked entries. And it would nicely integrate into the grammar and the existing code base on #606

@CyanVoxel
Member

> With this proposal we are currently missing a way to search for untagged files.
>
> My idea on how to fix this would be the following syntax: special:untagged.
>
> The advantage of this is that this could also be used to add other special criteria like e.g. special:unlinked to search for unlinked entries. And it would nicely integrate into the grammar and the existing code base on #606

I like this approach. Simply typing "empty" or "untagged" was nice in 9.4, but realistically it shadowed any potential tags with those names. This new approach avoids that issue while also making sure it plays nicely with the grammar 👍
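The shadowing problem can be sketched with two toy interpreters for a single search term. Both functions are simplified models written for this illustration, not code from 9.4 or from #606, and treating a bare word as a tag search is an assumption.

```python
SPECIAL_VALUES = {"untagged", "unlinked"}


def interpret_bare(term):
    """Model of the bare-keyword style: reserved words win over tag names."""
    if term.lower() in {"untagged", "empty"}:
        return ("special", "untagged")
    return ("tag", term)


def interpret_prefixed(term):
    """Model of the prefixed style: only an explicit 'special:' is special."""
    kind, sep, value = term.partition(":")
    if not sep:
        return ("tag", term)  # assumption: bare words search by tag name
    kind, value = kind.strip(), value.strip()
    if kind == "special" and value not in SPECIAL_VALUES:
        raise ValueError(f"unknown special query: {value!r}")
    return (kind, value)


# A tag that happens to be called "untagged" is unreachable in the bare style...
print(interpret_bare("untagged"))              # ('special', 'untagged')
# ...but stays searchable once the special query has its own prefix.
print(interpret_prefixed("untagged"))          # ('tag', 'untagged')
print(interpret_prefixed("special:untagged"))  # ('special', 'untagged')
```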

@python357-1
Collaborator Author

Added a table for all current search queries. Let me know if anything needs to be changed

Projects
Status: ✅ Done