simple search, v1.1 #26
Comments
The new version can be called with parameters DICT and KEY: https://www.sanskrit-lexicon.uni-koeln.de/simplet/DICT/KEY.
It also admits optional additional parameters: /SIMPLE_INPUT/OUTPUT/ACCENT
The SIMPLE_INPUT parameter specifies the assumed spelling of KEY, and this value is visible in another menu.
When not specified in the URL, SIMPLE_INPUT defaults to 'default'. This assumes a phonetic type spelling, which
When SIMPLE_INPUT is one of the other values (slp1, hk, itrans), then the spelling of KEY is assumed to use the peculiarities
In addition to the SIMPLE_INPUT parameter, some additional enhancements have been made to better model spelling |
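A minimal Python sketch of how such a URL might be assembled from the parameters described above. The function name `simple_search_url` is hypothetical, and the sketch assumes the optional parameters are strictly positional (so OUTPUT cannot be given without SIMPLE_INPUT). The accepted values of OUTPUT and ACCENT are not listed in this thread, so they are passed through unchecked.

```python
from urllib.parse import quote

# Base URL of the test version, as given in this issue.
BASE = "https://www.sanskrit-lexicon.uni-koeln.de/simplet"
# SIMPLE_INPUT values named in this issue.
SIMPLE_INPUT_VALUES = ("default", "slp1", "hk", "itrans")


def simple_search_url(dict_code, key, simple_input=None, output=None, accent=None):
    """Build BASE/DICT/KEY[/SIMPLE_INPUT[/OUTPUT[/ACCENT]]] (hypothetical helper)."""
    if simple_input is not None and simple_input not in SIMPLE_INPUT_VALUES:
        raise ValueError(f"unknown SIMPLE_INPUT: {simple_input!r}")
    optional = [simple_input, output, accent]
    # Drop trailing parameters that were not supplied.
    while optional and optional[-1] is None:
        optional.pop()
    # Assumption: parameters are positional, so no gaps are allowed.
    if any(p is None for p in optional):
        raise ValueError("an optional parameter requires all earlier ones")
    parts = [dict_code, key] + optional
    return BASE + "/" + "/".join(quote(p, safe="") for p in parts)


print(simple_search_url("mw", "rupa"))
print(simple_search_url("mw", "rUpa", "slp1"))
```

Leaving out SIMPLE_INPUT relies on the server-side default ('default') rather than writing it into the URL.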
Working on bug exemplified by 'rupa'. Problem is that 'ru' has variants, but then the variants of 'u' are lost! |
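One way such a loss can arise is a longest-match-first expansion, sketched below in Python. This is an illustration only: the variant table and both function names are hypothetical, and the actual cause in the simple-search code is not stated in this thread.

```python
# Hypothetical variant table, for illustration only (slp1-style letters).
VARIANTS = {
    "ru": ["ru", "f"],  # assumption: 'ru' may stand for vocalic r ('f' in slp1)
    "u": ["u", "U"],    # assumption: 'u' may stand for short or long u
}
MAX_LEN = max(len(k) for k in VARIANTS)


def greedy_variants(word):
    """Longest-match-first: once 'ru' matches, 'u' is never expanded."""
    if not word:
        return {""}
    for n in range(min(MAX_LEN, len(word)), 0, -1):
        head = word[:n]
        if head in VARIANTS or n == 1:
            alts = VARIANTS.get(head, [head])
            return {a + t for a in alts for t in greedy_variants(word[n:])}


def all_variants(word):
    """Try every segmentation, so 'r' + 'u' is expanded as well as 'ru'."""
    if not word:
        return {""}
    out = set()
    for n in range(1, min(MAX_LEN, len(word)) + 1):
        head = word[:n]
        if head in VARIANTS or n == 1:
            alts = VARIANTS.get(head, [head])
            out |= {a + t for a in alts for t in all_variants(word[n:])}
    return out


print(sorted(greedy_variants("rupa")))  # ['fpa', 'rupa']  ('rUpa' is lost)
print(sorted(all_variants("rupa")))     # ['fpa', 'rUpa', 'rupa']
```

Trying every segmentation produces more candidates, which is a trade-off against over-generation.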
Some problems from sanskrit-lexicon/COLOGNE#167 resolved
STILL unresolved
|
@gasyoun (or others) Please point out where some 'low-hanging fruit' improvements to simple search might be. |
transitions should be different for slp1 than default
They are the same. |
If I still had a grandpa alive, I wish it would be you - Jim the magician. The four resolved ones work perfectly.
Right, just a few more common ones.
Yes, and it's not critical. Good to have, because Sanskrit words live in strange ways, but not critical, as hard to solve.
Agree. |
@Andhrabharati & @funderburkjim
|
@gasyoun Coming to the subject matter of Tamilish Sanskrit, this is kind of a better style, I should say. I have seen far worse texts, even more unimaginable than the spellings of original DLI titles (now I see that various language-wise teams are working on cleaning those titles). And finally, why am I addressed in this?!! |
non-AI, rule based.
Oh, ok.
Can you show me a sample?
Let's document the worst ones?
You might have some samples I have missed above. |
The reason for not finding kadācid is that it is not a single word by grammatical rules (though in print, many books club the two words "कदा चित्" together). Look at this in MW, for example:
So are the words like "कदा चन"-
|
I do know that. But many people still look for it as a single word. I would want to have it as an entry point. |
The only way to do this is to ignore the spaces in the "texts" to get such entries (and that was the way the manuscript texts were, before the punctuation system [space, quote marks, exclamation & question marks, comma, ... ... ...] got introduced in Indian texts). |
Not only Indian; the same was true of Latin until the Middle Ages. |
tamilish alternates
Based on examples above: These are 'solved':
These are mentioned in comment, but not believed to be problems:
Still no matches:
cut/paste good results
I've got good results with small test of cut/paste of words from wikipedia.
unwanted substitutions
There are still sometimes too many results, as with 'natha':
Allowing initial 'n' to be replaced by 'm' is the main culprit in this example.
many skd spelling differences now resolved.
skd usually (always?) shows the nominative singular for substantive headwords.
|
Agree. As an additional option it makes sense - when you know what you actually search for. |
search time difference
There is a noticeable difference in search time between local machine and cologne. This is probably a combination of:
|
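To put numbers on such a difference, a rough timing helper along these lines could be run against both servers. This is a sketch under assumptions: `fetch` and `median_seconds` are hypothetical names, and the helper measures whole-request wall-clock time, so it cannot separate network latency from server-side search time.

```python
import statistics
import time
import urllib.request


def fetch(url):
    """Download the full response body for one search URL."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def median_seconds(url, fetcher=fetch, repeats=3):
    """Median wall-clock seconds over `repeats` requests to `url`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetcher(url)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

Calling `median_seconds(...)` once with a Cologne search URL and once with the matching URL on a local installation gives two comparable figures. The median is used so that a single slow request does not dominate the result.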
And that is even after the ngrams are turned off? Cologne seems really slow on this.
Now we have a problem of over-generation.
A dream come true. Thanks, @funderburkjim |
p/b removed.
Removed this spelling equivalence in simplet.
Now brahman (default/mw) gives 16 results:
Still 8+ seconds at Cologne (are you also seeing slow times in Cologne search?)
When using prior version (/simple), same search for brahman takes about 2 seconds, and
With simple and simplet search engines it is hard to know the 'cause' of the differences.
Current comparisons between the prior version (simple) and dev version (simplet)
@gasyoun As SEO expert, how do you think we should proceed? |
No, it did not look like 8 to me, quicker, close to 3 as your experience
I see no reason why not. Speeding it up might take longer than expected.
Not only production - it's ready to go outside the |
Change .htaccess:
/simple/ now goes to version 1.1 (formerly called /simplet/).
/simple1.0/ now goes to version 1.0 (formerly called /simple/).
What does that mean? |
While it is trivial to add a simple option to basicdisplay's input menu, it may be better to think of the whole system of basic, list, advanced-search displays as legacy applications, which will remain as is for the foreseeable future. Indeed with simple-search, there is no need for basic or list, in my opinion. However, basic, list, etc. do have the advantage that they can be easily installed as local applications. On the other hand, Advanced search has some unique features that simple-search lacks:
So I think I get your idea, but that it is premature to spend much time thinking about it. |
Agree. But adding it will hurt in no way. A big remake is big. Let's do the trivial.
I do not see why these options can't be implemented in simple search. |
A new version of simple search is currently available under a 'test' url:
https://sanskrit-lexicon.uni-koeln.de/simplet/
The previous version is also available, under https://www.sanskrit-lexicon.uni-koeln.de/simple/
Would hope to have some users experiment with the new version before making the new version available under
https://sanskrit-lexicon.uni-koeln.de/simple/