Standardize free text search of RDF data #40
That was considered for SPARQL 1.1. The problem is "standard". The WG felt that the expectation would be a single free text search language, in the same way that regex is the F&O regex language. Regex implementations have converged enough that the differences across programming languages are small, and regex is part of the runtime of many languages. Free text search was not in the same state at the time. If the WG had had to come up with that free text search language, the work item would have squeezed out many of the other things in SPARQL 1.1, there being finite time and people to do the work. A looser idea of "standard", such as an access point to text search facilities without an exact definition, is one possibility.
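For comparison, the regex facility that SPARQL 1.1 did standardize can be sketched as follows. This is only an illustration: the choice of `rdfs:label` and the search term are assumptions, not taken from the discussion.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Case-insensitive pattern match over labels with the standard REGEX function.
# REGEX only matches patterns; tokenization, stemming and relevance ranking
# are outside its definition, which is the gap free text search would fill.
SELECT ?s ?label
WHERE {
  ?s rdfs:label ?label .
  FILTER regex(?label, "search", "i")
}
```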
https://www.w3.org/2009/sparql/wiki/Feature:FullText (shared by Axel Polleres)
Hi, one approach to handling this feature could be to think about a standardized way to use external indexing services, by means of a specific usage of the For example for blazegraph:
this document seems to have a nice overview of various different implementations:
Removing "SPARQL: " on transferred issue.
Indeed, this was hard to do in SPARQL 1.1, since it would require some survey work to understand the specific use cases. However, I think the key is that nearly every web site and app has a search bar. I've seen a lot of architectural complexity added just to support that on backends. Some basic support seems to me to be very important, but the scope should be limited.
Another approach to handling FTS is used in the Halyard project:
SELECT ?subj ?pred
WHERE {
  ?subj ?pred "(search~1 algorithm~1) AND (grant ingersoll)"^^halyard:search
}
https://merck.github.io/Halyard/usage.html#cooperation-with-elasticsearch
In this case the actual FTS is done on literals externally (on an external Elasticsearch instance), similarly to Blazegraph. The syntax is rather different from the others cited above.
Regarding Halyard's full-text search syntax, with @asotona we're actually considering changing it to the more common and uniform
Full Text Search is important. Let's find what to consolidate around as a new feature and what to leave flexible. Perfect functional alignment isn't possible if systems use external libraries, but is there one "text search" feature which has some common syntax across systems? We have two issues for better calling external functionality in "multiple returns from a function #6" and Are they enough as a mechanism? Or maybe FTS is so important that it deserves special syntax, with a URI to a specific FTS implementation (even if that syntax is calling the same mechanisms)? Or would different systems give a different URI to their capabilities, which the user has to see? Rough example:
The XQuery Fulltext extension is at its core not specific to XML, and could easily be lifted wholesale into SPARQL. It took a lot of time and effort to develop, and it would be good not to duplicate this effort.
My attempt to list triplestores with full-text search functionality:
See comment from @hartig in this related issue: #193 (comment)
Apache Jena text search is based on Apache Lucene.
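As a rough illustration of what this looks like in a query, here is a minimal sketch using Jena's `text:query` property function. It assumes a dataset configured with a text index over `rdfs:label`; the search string and the limit of 10 are illustrative only.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Ask the text index for subjects whose rdfs:label matches the query string
# (at most 10 hits), then fetch the label itself from the RDF data.
# Assumes the dataset has a text index that covers rdfs:label.
SELECT ?s ?label
WHERE {
  ?s text:query (rdfs:label "search algorithm" 10) .
  ?s rdfs:label ?label .
}
```

The query string is handed to the underlying index, so its syntax is that of the index's query parser rather than anything SPARQL defines, which is the kind of divergence between systems this issue is about.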
Several RDF stores support free text search, but there's no standard way to do it.
Proposed by Kjetil Kjernsmo in W3C Graph workshop lightning talk: https://www.w3.org/Data/events/data-ws-2019/assets/lightning/KjetilKjernsmo.pptx