simple search, v1.1a #27

Open
funderburkjim opened this issue Feb 15, 2021 · 18 comments

@funderburkjim
Contributor

funderburkjim commented Feb 15, 2021

v1.1a of simple-search is quite different under the hood from v1.1 and previous versions.

It may currently be used with 'simple1.1a' instead of 'simple'.
e.g. https://sanskrit-lexicon.uni-koeln.de/simple1.1a/mw/munih

The performance should be more uniform across words.

The program depends heavily on sqlite fts, the full-text search (a.k.a. inverted index) capability of sqlite. Building and querying inverted indexes underpins most
search applications, but usually (as with those based on Java Lucene) this requires a separate server.
However, sqlite's inverted indexes can be installed and queried much like any other sqlite database.
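A minimal sketch of what "queried much like any other sqlite database" can look like from Python. The table name `entries`, its two columns, and the sample spellings are assumptions made for illustration, not the schema simple1.1a uses; it also assumes the local sqlite build has fts4 compiled in.

```python
import sqlite3

def build_index(entries):
    """Create an in-memory fts4 table from (headword, spellings) pairs."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: 'spellings' holds space-separated search keys.
    conn.execute("CREATE VIRTUAL TABLE entries USING fts4(headword, spellings)")
    conn.executemany(
        "INSERT INTO entries(headword, spellings) VALUES (?, ?)", entries
    )
    return conn

def find_headwords(conn, typed):
    """Return headwords whose 'spellings' column contains the typed token."""
    rows = conn.execute(
        "SELECT headword FROM entries WHERE spellings MATCH ?", (typed,)
    )
    return sorted(row[0] for row in rows)

# Illustrative data only: slp1 headwords with made-up simplified spellings.
sample = [
    ("SradDA", "sraddha shraddha"),
    ("SrAdDa", "sraddha shraddha"),
    ("muni", "muni"),
]
conn = build_index(sample)
matches = find_headwords(conn, "sraddha")
```

Note that the default fts4 tokenizer folds ASCII upper case to lower case, so case-sensitive slp1 text would need extra care if it were indexed directly.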

This functionality (using the fts4 version) is available via version 3.6 of Python at
Cologne. It is not natively available in the sqlite version of PHP at Cologne. However, a PHP
program can query an fts table via Python by using 'shell_exec'. That's what is being used
in simple1.1a.
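For illustration, the Python side of such a 'shell_exec' call could be a small command-line script that prints JSON for PHP to decode. This is only a sketch under assumptions: the script's arguments, the table name `hwfts`, and the column `headword` are hypothetical and not taken from the simple1.1a code.

```python
import json
import sqlite3
import sys

def run_query(conn, term):
    """Return a JSON list of headwords matching 'term' in a hypothetical fts table."""
    rows = conn.execute(
        "SELECT headword FROM hwfts WHERE headword MATCH ?", (term,)
    )
    return json.dumps(sorted(row[0] for row in rows))

def main(argv):
    # Expected call (hypothetical): python3 query.py <database file> <term>
    db_path, term = argv[1], argv[2]
    conn = sqlite3.connect(db_path)
    try:
        return run_query(conn, term)
    finally:
        conn.close()

if __name__ == "__main__" and len(sys.argv) == 3:
    # PHP would capture this output from shell_exec and json_decode it.
    print(main(sys.argv))
```

On the PHP side the search term should be escaped (for example with `escapeshellarg`) before it is placed on the command line.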

There's a lot more to say, technically, about this approach. But first, use it some and let me
know if any important features have been missed. You can still get at the prior 1.1 version by 'simple'.

Also, as a teaser, try searching for some declined forms (e.g. muniBiH, devena, sisunAm) -- this
is one area where v1.1a might be able to be extended much more readily than prior versions.
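As a rough idea of how declined forms could be mapped back to headwords, here is a sketch that rewrites a few endings. The ending table is a tiny hypothetical sample chosen to cover the three forms above; it is not the rule set used by v1.1a.

```python
# Hypothetical sample: (inflected ending, ending of the stem it is rewritten to).
ENDINGS = [
    ("BiH", ""),    # muniBiH -> muni
    ("ena", "a"),   # devena  -> deva
    ("unAm", "u"),  # sisunAm -> sisu (as typed, with short u)
]

def stem_candidates(form):
    """Return possible stems for an inflected form, in table order."""
    candidates = []
    for ending, stem_ending in ENDINGS:
        # Require something to remain before the ending.
        if form.endswith(ending) and len(form) > len(ending):
            candidates.append(form[: -len(ending)] + stem_ending)
    return candidates
```

Each candidate would still have to be checked against the actual headword list, since an ending can match by accident.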

@gasyoun
Member

gasyoun commented Mar 8, 2021

The program depends heavily on sqlite fts,

Wonder if it could be used for offline desktop version of the dictionaries as well.

devena

devena works, wow, @funderburkjim

some declined forms (e.g. muniBiH, devena, sisunAm)

Is there a list of which forms are handled and which are not? I did not find pronominals.

New case:
sradha
3 results: śarada śaradā saraḍa
sraddha
3 results: śraddhā śrāddha śraddha

Should we suppose that if I search for sradha, I might have meant śraddhā?

@gasyoun gasyoun added the enhancement New feature or request label Mar 8, 2021
@gasyoun
Member

gasyoun commented Mar 24, 2021

New case:
I was wrongly looking for viśravasa but needed viśravas actually.

@gasyoun
Member

gasyoun commented Apr 14, 2021

@funderburkjim agree with sradha?

@funderburkjim
Contributor Author

Haven't thought about sradha example. Thanks for reminder.

Can you think of a generalization of this? Is it only 'dD' (slp1)? or is this one instance of a more comprehensive pattern?

@gasyoun
Member

gasyoun commented Apr 14, 2021

Can you think of a generalization of this?

like tT?

Is it only 'dD' (slp1)?

I believe t as well.
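One way to state the generalization being discussed: an aspirated stop typed alone may stand for the cluster of the plain stop plus the aspirated stop (dD, tT, and so on). The sketch below generates such variants in slp1; extending the pattern to all ten aspirated stops is an assumption, since only dD and tT have been mentioned here.

```python
# slp1 aspirated stops mapped to their unaspirated counterparts.
# Only D and T are discussed above; the other pairs are an assumed extension.
ASPIRATE_TO_PLAIN = {
    "K": "k", "G": "g", "C": "c", "J": "j", "W": "w",
    "Q": "q", "T": "t", "D": "d", "P": "p", "B": "b",
}

def cluster_variants(slp1):
    """Return variants with one aspirated stop expanded to a cluster (D -> dD)."""
    variants = []
    for i, ch in enumerate(slp1):
        plain = ASPIRATE_TO_PLAIN.get(ch)
        if plain is None:
            continue
        if i > 0 and slp1[i - 1] == plain:
            continue  # already a cluster such as dD
        variants.append(slp1[:i] + plain + slp1[i:])
    return variants
```

For example, `cluster_variants("SraDA")` yields `["SradDA"]`, which is the slp1 spelling of śraddhā.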

@gasyoun
Member

gasyoun commented Apr 16, 2021

I entered aṃśumāna and found out it should actually have been aṃśumat.
So I got 0 results.
Still, the search could propose that aṃśumān was meant instead of aṃśumāna.

https://archive.org/details/in.ernet.dli.2015.308381/page/n147/mode/2up

@gasyoun
Member

gasyoun commented May 10, 2021

@funderburkjim people still confuse SLP1 and simple. We can't change the SLP1 name, but I am not so sure about simple.

Near-exact matches should come higher in the list than variants, even popular ones. Do you agree in this case?

bhisma
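A sketch of the ranking idea: strip diacritics from each suggestion and sort by similarity to the typed string, so that a suggestion differing only in diacritics comes first. The function names and the candidate list in the usage note are made up for illustration; this is not how 'simple' currently orders its results.

```python
import unicodedata
from difflib import SequenceMatcher

def ascii_fold(text):
    """Drop combining marks (macrons, underdots, accents) and lowercase."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()

def rank_suggestions(query, candidates):
    """Sort candidates so the closest diacritic-free match comes first."""
    folded_query = ascii_fold(query)

    def score(candidate):
        return SequenceMatcher(None, folded_query, ascii_fold(candidate)).ratio()

    # Highest similarity first; ties broken alphabetically for stable output.
    return sorted(candidates, key=lambda c: (-score(c), c))
```

With an illustrative list such as `["bhīma", "bhasman", "bhīṣma"]`, the query `bhisma` puts `bhīṣma` first because its folded form equals the query.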

@funderburkjim
Contributor Author

I agree this is confusing.

Suggestion: Get rid of the menu called 'input'.

Worth a try?

@gasyoun
Member

gasyoun commented May 10, 2021

Suggestion: Get rid of the menu called 'input'.

If simple is the default, let's go for it.

@gasyoun
Member

gasyoun commented Jul 11, 2022

@funderburkjim sankhya gives as expected:

5 results: saṃkhyā sāṃkhya saṃkhya śaṅkya śāṅkhya

With saṁkhya the ṁ gets lost, and none of the suggestions offered is relevant:

5 results: sakhya śakya śākya śākhya sākhya

@funderburkjim
Contributor Author

@gasyoun The 'm-dot-above' is now handled in 'simple'
image

@funderburkjim
Contributor Author

@gasyoun However, the '1.1a' version does not catch this. This is an unexpected difference between /simple1.1a/ and /simple/.
image
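For reference, one way to make both versions treat the 'm-dot-above' alike is to normalize it to the 'm-dot-below' before any further transliteration. This is a sketch of that idea only; the function name is hypothetical and the actual fix in 'simple' may work differently.

```python
import unicodedata

# m with dot above (U+1E41 / U+1E40) -> m with dot below (U+1E43 / U+1E42)
ANUSVARA_VARIANTS = str.maketrans({"\u1e41": "\u1e43", "\u1e40": "\u1e42"})

def normalize_anusvara(text):
    """Map dot-above anusvara to dot-below, including decomposed input."""
    # NFC first, so 'm' followed by a combining dot above becomes U+1E41.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(ANUSVARA_VARIANTS)
```

After this step, saṁkhya and saṃkhya reach the search as the same string.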

@gasyoun
Member

gasyoun commented Sep 10, 2022

@funderburkjim if we want to use the same SIMPLE page for English to Sanskrit translations, it becomes troublesome.

god

If you type anything from a phone, the first letter in the input box is capitalized by default, and nothing is found.

Godss

Even if we type an English word in lowercase, the result is found but is not counted as such: the result count remains 0.
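A sketch of one possible guard against the phone auto-capitalization problem: lowercase the input only for dictionaries with English headwords, because case is significant in SLP1. The set of dictionary codes and the assumption that English headwords are stored in lowercase are both hypothetical.

```python
# Hypothetical set of dictionary codes whose headwords are English.
ENGLISH_HEADWORD_DICTS = {"mwe", "ae"}

def prepare_query(text, dictionary):
    """Trim the input and fold case only where case carries no meaning."""
    text = text.strip()
    if dictionary.lower() in ENGLISH_HEADWORD_DICTS:
        return text.lower()
    # Sanskrit input is left alone: in SLP1, 'B' and 'b' are different letters.
    return text
```

So `prepare_query("God", "mwe")` gives `god`, while `prepare_query("muniBiH", "mw")` is returned unchanged.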

@funderburkjim
Contributor Author

I hadn't really thought about 'simple' for MWE, AE, etc. Probably the current logic is
inappropriate for non-Sanskrit headwords.

The code base of 'simple' has become complicated enough to be difficult to manage.

And the UI needs to be rethought, as the interactions among the user choices have
become difficult to predict.

I currently almost always use the 'input=slp1' setting. That way the 'Suggestion' list is available, but the spelling change features of 'input=simple' are not present at all. Perhaps this
setting should be separated out as a 'suggest' app and removed from 'simple', since this
usage is not really in the spirit of 'simple'.

@gasyoun
Member

gasyoun commented Sep 26, 2022

Probably the current logic is inappropriate for non-Sanskrit headwords.

Yep, not working. Devanagari input in /simple does not work either. It worked well before but stopped working lately.


The code base of 'simple' has become complicated enough to be difficult to manage.

There should not be so much code that one gets lost in it. Or is there?

And the UI needs to be rethought, as the interactions among the user choices have
become difficult to predict.

Would you like to test different scenarios? I have proposed one that a student asked me about recently.

I currently almost always use 'input=slp1' setting -- that way the 'Suggestion' list is available but the spelling change features of 'input=simple' are not present at all.

So you're like a robot. There are 5 people on the Earth who think in SLP1.

since this usage is not really in the spirit of 'simple'.

I agree. But I do not see an issue with leaving it, either.

@gasyoun
Member

gasyoun commented Nov 14, 2022

lakṣmīvān will not show me lakṣmīvat - should we try to show Nominative forms? @funderburkjim
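A sketch of how nominative forms could fall back to the stem headword when the typed form has no direct hit. It works on slp1 strings; the two-rule table (-vAn to -vat, -mAn to -mat) and the function name are assumptions for illustration, covering only the cases reported in this thread.

```python
# Hypothetical rules: nominative singular ending -> stem ending (slp1).
NOMINATIVE_TO_STEM = (("vAn", "vat"), ("mAn", "mat"))

def lookup_with_nominative_fallback(form, headwords):
    """Return the form itself if it is a headword, else matching stem headwords."""
    if form in headwords:
        return [form]
    hits = []
    for nominative, stem in NOMINATIVE_TO_STEM:
        if form.endswith(nominative):
            candidate = form[: -len(nominative)] + stem
            if candidate in headwords:
                hits.append(candidate)
    return hits

# Headwords named in this thread, written in slp1.
known = {"lakzmIvat", "aMSumat", "Ayuzmat"}
result = lookup_with_nominative_fallback("lakzmIvAn", known)
```

Because the rewrite is only tried after a failed direct lookup, it cannot hide an existing headword that happens to end in -vAn or -mAn.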

@gasyoun
Member

gasyoun commented Nov 16, 2022

halahala will never show hālāhala, but it should @funderburkjim
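One way to catch this kind of miss is an index keyed on a vowel-length-insensitive form of each headword. The sketch below uses slp1, where long a, i, u are written A, I, U; the function names are hypothetical and only these three vowels are folded.

```python
from collections import defaultdict

# slp1 long vowels folded to their short counterparts.
SHORTEN = str.maketrans({"A": "a", "I": "i", "U": "u"})

def length_key(slp1):
    """Return the headword with vowel length ignored."""
    return slp1.translate(SHORTEN)

def build_length_index(headwords):
    """Group headwords by their length-insensitive key."""
    index = defaultdict(list)
    for headword in headwords:
        index[length_key(headword)].append(headword)
    return index
```

A search for halahala would then look up `length_key("halahala")` in the index and find hAlAhala (hālāhala) listed under the same key.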


@gasyoun
Member

gasyoun commented Dec 2, 2022

Searching for āyuṣman will not show us āyuṣmant, which is āyuṣmat in MW; neither is āyuṣmān generated, but mentioned inside the article @funderburkjim

ayushman
