We seek to provide a functional, REST API-based webapp for accessing various dictionaries stored in a database. We intend for the database to serve a large, eclectic mix of dictionaries which are not already available in a single place.
(in decreasing order of importance)
- Any web developer should be able to make simple REST API calls to our backend to easily get entries from many dictionaries (most of which aren't already available in this manner).
- Imagine being able to highlight a word while reading a text and use the context menu to get its meanings, grammatical info (linga, vibhakti, vachana, puruSha) etc.
- Imagine being able to make simple commentaries (like this) with a few mouse clicks per word (selecting the meaning most appropriate to the context).
- End users (Sanskrit students and scholars) should be able to thoroughly investigate a term in the dictionaries using just a browser, without having to install any software.
- End users should be able to point out errors (e.g. via a link).
- Users should be able to log in and submit new words to a "user dictionary" (as in <sanskritdictionary.de> or wiktionary, but with the convenience of OpenID and less fuss).
- We have several projects which collect dictionaries for various languages (link). These dictionaries are further subcategorized by topic, entry language etc. (e.g. Ayurveda, sa-german).
- Currently, the simplest programmatic way to access dictionary data is through babylon files (some here - open one to check out the format).
- We ultimately have the (extensible) JSON structure exemplified below for each entry:
{
"_id": "stardict-sanskrit__sa-head__sa-entries__vAchaspatyam-sa__vAchaspatyam-sa__22721",
"_rev": "1-bce5a5d86829ba751b32be7839771d42",
"entry": {
"text": "{@कटि(टी)@}¦ स्त्री कट-इन्। <br><br>१ श्रोणिदेशे(काकांल)। कृदिकारान्तत्वात् वा ङीप्। <br>“सहासनमभिप्रेप्सुरुत्कृष्टस्पाप्यपकृ-ष्टजः। अट्यां कृताङ्कोनिर्वास्यः” मनुः। <br>“कटिश्च तस्या-तिकृतप्रमाणा” भा॰ व॰ <br><br>१० <br><br>५४ । <br>“सव्येन च कटीदेशेगृह्य वाससि पाण्डवः” भा॰ आ॰ <br><br>१६ <br><br>३ <br>“कटिस्तु हरतेमनः” सा॰ द॰। तत्र तच्छब्दस्य ग्राम्यत्वमुक्तम्। ङीबन्तः<br><br>२ पिप्पल्पां स्त्री मेदि॰"
},
"jsonClass": "DictEntry",
"location": {
"dictionaryId": "stardict-sanskrit__sa-head__sa-entries__vAchaspatyam-sa__vAchaspatyam-sa",
"entryNumber": 22721,
"jsonClass": "DictLocation"
},
"headwords": [
{
"text": "kaTi"
},
{
"text": "कटि"
}
]
}
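As a rough illustration of how a client might consume the entry structure above, here is a minimal Python sketch. The `DictEntry` dataclass and the `parse_entry` helper are hypothetical names introduced for this example (they are not part of the project), and the sample `text` value is shortened.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DictEntry:
    """Flattened view of one entry document (hypothetical helper type)."""
    entry_id: str
    dictionary_id: str
    entry_number: int
    headwords: List[str]
    text: str


def parse_entry(doc: dict) -> DictEntry:
    """Pick the fields shown in the example out of a raw entry document."""
    location = doc["location"]
    return DictEntry(
        entry_id=doc["_id"],
        dictionary_id=location["dictionaryId"],
        entry_number=location["entryNumber"],
        # Each headword is an object with a "text" field.
        headwords=[h["text"] for h in doc.get("headwords", [])],
        text=doc["entry"]["text"],
    )


DICT_ID = "stardict-sanskrit__sa-head__sa-entries__vAchaspatyam-sa__vAchaspatyam-sa"

# Sample document mirroring the structure above; the entry text is shortened.
sample_doc = {
    "_id": DICT_ID + "__22721",
    "entry": {"text": "{@कटि(टी)@}¦ स्त्री कट-इन्।"},
    "jsonClass": "DictEntry",
    "location": {
        "dictionaryId": DICT_ID,
        "entryNumber": 22721,
        "jsonClass": "DictLocation",
    },
    "headwords": [{"text": "kaTi"}, {"text": "कटि"}],
}

parsed = parse_entry(sample_doc)
print(parsed.headwords)
```

Because the structure is extensible, the sketch reads only the fields it needs and ignores everything else.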
- For dictionary metadata, we have the following structure:
{
"_id": "amarakosha",
"name": "अमरकोशः",
"authors": ["अमरसिंहः"],
"licenseLink": "http://some-link",
"canonicalSource": "http://some-link",
"categories": ["sanskrit to sanskrit", "thesaurus"],
"issuePage": "https://github.com/sanskrit-coders/stardict-sanskrit/issues"
}
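The `categories` field in the metadata above is what would let a client restrict a lookup to one kind of dictionary. A minimal sketch of that filtering step follows; the function name `dictionaries_in_category` and the second sample document are made up for illustration.

```python
from typing import Iterable, List


def dictionaries_in_category(metadata_docs: Iterable[dict], category: str) -> List[str]:
    """Return the _id of every dictionary whose metadata lists the category."""
    return [
        doc["_id"]
        for doc in metadata_docs
        if category in doc.get("categories", [])
    ]


metadata_docs = [
    {
        "_id": "amarakosha",
        "name": "अमरकोशः",
        "categories": ["sanskrit to sanskrit", "thesaurus"],
    },
    # Hypothetical second dictionary, present only to show the filter at work.
    {
        "_id": "example-sa-german",
        "name": "Example",
        "categories": ["sa-german"],
    },
]

thesauri = dictionaries_in_category(metadata_docs, "thesaurus")
print(thesauri)
```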
In practice, the API that CouchDB already provides is enough. The examples below are from the vedavaapi.org deployment (for https equivalents and other servers, see the "Deployment" section).
- Querying dictionary entries: use the dict_entries database and the index_headwords index
- couchdb documentation for the general find call.
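For orientation, a headword lookup through CouchDB's `_find` endpoint might be assembled as in the sketch below. The database name `dict_entries` and the index name `index_headwords` come from the notes above, and the base URL is the one listed under deployments (it may be unstable or differ for other servers). The selector shape, and the use of `index_headwords` as the `use_index` value, are assumptions that have not been checked against the deployed index definition. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Base URL taken from the deployment list; adjust for other servers.
COUCHDB_URL = "https://api.vedavaapi.org/couchdb"


def build_headword_query(headword: str, limit: int = 40) -> dict:
    """Build a Mango query body matching entries with the given headword.

    Assumption: headwords are stored as a list of {"text": ...} objects,
    as in the example entry, so $elemMatch is used to match one of them.
    """
    return {
        "selector": {"headwords": {"$elemMatch": {"text": headword}}},
        "use_index": "index_headwords",  # assumed to be the design document name
        "limit": limit,
    }


def build_find_request(headword: str, base_url: str = COUCHDB_URL,
                       db: str = "dict_entries") -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the _find endpoint."""
    body = json.dumps(build_headword_query(headword)).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{db}/_find",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_find_request("kaTi")
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual call; the matching entries would then be in the `docs` array of the JSON response.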
Ideally, for the sake of simplicity, we would have the following (in decreasing order of importance):
/words/xyz
yields the appropriate entry if it exists; otherwise returns a list of n=40 words starting with the substring 'xyz' - from all dictionaries.
/dictionaries/dictionaryId/words/xyz
- same as above, restricted to one dictionary.
/dictionaries/dictionaryCategory/words/xyz
does the same - except for all dictionaries in a given category.
/words_with_substring/xyz
etc.
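The intended behaviour of `/words/xyz` (exact entry if present, otherwise up to 40 headwords starting with `xyz`) can be sketched as follows. This is only an in-memory illustration over a sorted list of headwords; the function name `lookup` and the shape of the return value are assumptions, not a description of an existing implementation.

```python
import bisect
from typing import Dict, List, Union


def lookup(sorted_headwords: List[str], query: str,
           limit: int = 40) -> Dict[str, Union[str, List[str]]]:
    """Return an exact match if present, else up to `limit` prefix matches.

    `sorted_headwords` must be sorted in Python's default string order,
    so that all words sharing a prefix sit next to each other.
    """
    start = bisect.bisect_left(sorted_headwords, query)
    if start < len(sorted_headwords) and sorted_headwords[start] == query:
        return {"match": query}

    suggestions: List[str] = []
    for word in sorted_headwords[start:]:
        if not word.startswith(query):
            break  # past the block of words sharing the prefix
        suggestions.append(word)
        if len(suggestions) == limit:
            break
    return {"suggestions": suggestions}


headwords = sorted(["kaTi", "kaTaka", "kara", "gaja"])
print(lookup(headwords, "kaTi"))
print(lookup(headwords, "kaT"))
```

Restricting the lookup to one dictionary or to one category would amount to running the same function over that subset's headword list.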
(in decreasing order of importance)
- Should be mobile friendly, with flowing text.
- Searching:
  - User is able to search for a word in multiple dictionaries.
  - As a user types, a dropdown of suggestions appears, saving unnecessary typing.
  - User can restrict search to certain dictionaries or dictionary categories.
  - Complex (ok if slower) searches based on substrings (or eventually regular expressions), matching headwords.
  - Complex search within entries.
- Results:
  - A term can appear in many dictionaries, so the user should be able to quickly navigate to the dictionary or dictionary category of their choice.
  - Some history (what words were recently looked up).
- Stats:
  - Collect site use stats (e.g. using Google Analytics).
  - Submit lookup stats to the server.
- Example from Goldendict (desktop) here.
- Use a NoSQL database with good replication characteristics, such as CouchDB. CouchDB already provides a simple API, and it is extensible.
- Leverage the REST API that CouchDB provides to the extent possible.
- Use offline caching intelligently to minimize communication with the server. Caching ideas:
  - Headword-dictionary mappings
  - PouchDB has size limitations due to its use of the browser's IndexedDB, so we cannot use it heavily.
- Use Polymer for the example frontend because of its utility and its easy compatibility with web components standards.
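To make the headword-dictionary mapping idea concrete: a client could keep a small index of which dictionaries contain a given headword, and contact the server only for the entries it actually needs. The sketch below shows the data structure in Python for consistency with the other examples; a browser frontend would do the equivalent in JavaScript. The function name `build_headword_index` and the short dictionary ids are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_headword_index(entries: Iterable[dict]) -> Dict[str, Set[str]]:
    """Map each headword to the set of dictionary ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc in entries:
        dictionary_id = doc["location"]["dictionaryId"]
        for headword in doc.get("headwords", []):
            index[headword["text"]].add(dictionary_id)
    return dict(index)


# Minimal entry documents with hypothetical, shortened dictionary ids.
entries = [
    {"location": {"dictionaryId": "dict-a"},
     "headwords": [{"text": "kaTi"}, {"text": "कटि"}]},
    {"location": {"dictionaryId": "dict-b"},
     "headwords": [{"text": "kaTi"}]},
]

headword_index = build_headword_index(entries)
print(sorted(headword_index["kaTi"]))
```

Such a mapping stores only ids rather than entry text, which keeps it small relative to the entries themselves.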
- Want to host a replica and make things faster for folks in your geographical area? Just open an issue in this project and let us know.
- Ahmedabad, IN
- Bay area, USA (dev machine, unstable) https://api.vedavaapi.org/couchdb/dict_entries/_all_docs
- Polymer app
- You want to host copies (or even develop a superior UI?) and make things faster for folks in your geographical area? Just open an issue in this project and let us know. We'd love to list it here.