BEN Study #32
Comments
Note that scientific names should not be treated as Skt. words; they must be retained as spelt in the English (Latin) text. |
Not critical, still good to have.
Yep, so it is a general notion. |
I don't find Paê in csl-orig/v02/ben/ben.txt.
Yes, I took this shortcut here in benfey, so tooltips for the works would be available in displays. |
The 'sh' list is good. These errors in conversion to IAST need to be corrected. Possibly there are other errors generated in conversion to IAST. |
Sorry, I missed the 'g' in '[Pagêxx'; and here is the list for it.
Seems e10 of the LN encoding got applied here to get the ê. |
Like many others, the BEN scan at CDSL is bad and has led to many errors in the digitisation. BTW, I have been on BEN for the last two days, and in another two days will be posting my file for study by the CDSL team. |
Have I said enough times I like this guy? )) |
Some more points-
|
Incidentally,
Apart from displaying the pdf page, is there any use for this |
In old AS notation, e10 was used for ê. Therefore, it seems to be an erroneous side-effect of converting AS to IAST. |
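A minimal sketch of how the damaged page markers might be repaired, assuming every one of them has the form `[Pagê` followed by the remaining digits (that is, the original read `[Page10` plus those digits). The function name and the sample strings are hypothetical and are not taken from the CDSL scripts.

```python
import re

# Assumption: each damaged marker came from "[Page10.." where the AS code
# "e10" was wrongly turned into "ê", leaving "[Pagê..".
DAMAGED_PAGE = re.compile(r"\[Pagê")


def repair_page_markers(line: str) -> str:
    """Restore '[Page10' wherever the conversion produced '[Pagê'."""
    return DAMAGED_PAGE.sub("[Page10", line)
```

Because the pattern is anchored on the literal `[Pag` prefix, a legitimate ê elsewhere in the text (for example in a French word) is left alone.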
What is AS, Alphabet-Sequence? I read somewhere Jim mentioning the LN notation (Letter-Number), and so used that term in my post above. |
There is no formal definition. We sometimes called this encoding 'Anglicized Sanskrit'. |
As I noticed, the error might've crept in, because those [Page...] strings are tagged as Sanskrit {%[Page...]%} |
I thought that the 'Anglicized Sanskrit' term is more used for words like Sanskrit, Aryan, Brahmin etc. as mentioned by MW!! |
In some other thread, there was a discussion on using PDFs for linking to the citations in CDSL dictionaries. I was reminded of this because Benfey used the Gorresio ed. of the Ramayana. It seems Gorresio spent 24 years of his life bringing out his (critical) ed. of the Ramayana, at the behest of Burnouf, and it became very popular in the Western countries in those days. And @gasyoun was pondering the whereabouts of the Bombay ed. and Calcutta ed. that are widely referred to in the "European" lexicons of Sanskrit. One can find these and many more editions of the Ramayana at http://onlinebooks.library.upenn.edu/webbin/book/lookupname?key=V%26amacr%3Blm%26imacr%3Bki So you may think again about using PDF links for the citations across all the CDSL works. |
This terminology is due to @thomasincambodia , who originally used it in mw; see CDSL.pdf, where he termed it 'Anglicized Sanskrit'. Over time, Thomas has used variations of his original AS notation; and has extended the usage to represent any Latin alphabet-with-diacritics in whatever language. I thought it best to remove this letter-number representation in the digitizations, by replacing the letter-number codes with Unicode characters. In this replacement process, there is always the issue that some letter-number sequences should NOT be replaced by Unicode; for example the 'e10' in It is good that you point out the erroneous conversion of 'e10' to circumflex in Benfey. These need to be changed. |
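The "some sequences should NOT be replaced" problem can be sketched as follows. This is only an illustration under stated assumptions: the mapping table is a tiny hypothetical subset of the AS codes, the `[Page…]` marker is assumed to be the only span that needs protecting, and `as_to_unicode` is not the name of any existing CDSL function.

```python
import re

# Illustrative subset of letter-number codes; the real AS table is much larger.
AS_TO_UNICODE = {"a1": "ā", "i1": "ī", "n2": "ṇ", "t2": "ṭ", "e10": "ê"}

# Try longer codes first so that "e10" is matched as a whole.
_CODES = sorted(AS_TO_UNICODE, key=len, reverse=True)
_CODE_RE = re.compile("|".join(map(re.escape, _CODES)))

# Assumption: page markers look like "[Page" ... "]" and must stay untouched.
_PROTECTED_RE = re.compile(r"(\[Page[^\]]*\])")


def as_to_unicode(text: str) -> str:
    """Convert AS codes to Unicode, skipping the protected page markers."""
    parts = _PROTECTED_RE.split(text)
    # With one capture group, the odd-indexed parts are the protected markers.
    return "".join(
        part if i % 2 else _CODE_RE.sub(lambda m: AS_TO_UNICODE[m.group()], part)
        for i, part in enumerate(parts)
    )
```

Splitting on the protected pattern first keeps the conversion table itself simple; any further spans that must not be converted would be added to the protected pattern rather than to the table.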
Your ref. to this paper by Thomas has reminded me of another wish in MW revision; to add Winternitz's corrections, apart from incorporating MW's own addenda into the main text. And can you get from Thomas, the details of other 'private' works that he was mentioning in this paper? |
Suggest you make a new issue regarding mw, and address question to @thomasincambodia . |
Can you think of some plan by which we can correct those bad places in the text? Full proofing is the best way out, but it definitely takes more time. |
So a few thousand of them in each dictionary.
One can't browse such an amount in full. Only randomly.
Do you have a link to the Winternitz's corrections?
Yes, the links to the Ramayana and Mahabharata are what comes to mind, but there we lack an idea of what exactly to do, as the older editions were never digitised, only scanned. If we can't link to the exact shloka in the book, linking at least to the chapter would make sense. @Andhrabharati in the case of Gorresio, which scan would you propose? |
It all depends on the person on the job!
Yes, I do have the PDF.
I recall @funderburkjim asking you to make this a student's project, marking the pdf page number against the citation, so that the page can be displayed. I have two diff. scans of Gorresio volumes. Need to look into both, to decide which is the better one. |
Here is the file, @gasyoun- |
I do not see any Greek here, just the French ç |
@Andhrabharati |
I was referring to your statement under "Further corrections and tags" in 1.5 of the CDSL.pdf, @thomasincambodia |
"private corrections lists" should have been "unpublished correction lists" |
I would rather call these Sanskrit loanwords especially when the form has been altered/adapted to English, but there are borderline cases: Rigveda, pandit, karma, Shiva etc. Pa1nduiden Pa1n2ini Pa1riga1ta Pa1rtha Pa1rvati1 Pa1t2ala1 Pa1t2ala1-Blüthe It can be said that AS words are always nouns/names. Also mark the initial capital, as there are no capital letters in Indian scripts. |
Interesting thought. |
sorry, of course meant GS = Germanized Sanskrit |
I agree with Thomas's viewpoint that discussions can be pursued without unnecessary judgmental tone. I would like to appreciate hard work done by Thomas and his team, of which we are reaping fruits. To be blunt, whatever we do here in this repository is ultimately a correction or feature addition to the work which was handed over to us because of the hard work put in by Thomas et al. I also appreciate the fact that @Andhrabharati is quite methodical in his approach and has been bringing forth many issues which have not been attended to hitherto. We need that vigour too. Kind request is to focus on content of the issue being raised, and keep the value judgments away from discussion. |
Agree with every word of Dhaval. |
For the last 4 days, I was completely bed-ridden (with high fever), away from the computer. I have just started sitting at the computer since this evening. So, I am just posting my BEN_main work as is, without spending any more time, though I had many pending aspects to cover in it. This work is made in a format close enough to the CDSL one, and I hope it will be accepted as is. Not many comments henceforth from my side, as my words are harsh at times (they come from my heart, without any bad intention), and they seem to be unbearable. I would just like to say now that the ls count increased from 113 to 219 and the ab count from 107 to 282. And until I hear back about my MW etym., IEG and this BEN work, I will take a good rest doing nothing (for CDSL, of course!). |
Sanskrit being the language of the Gods, all other languages are the languages of Angels, hence AS = Anglicized Sanskrit. |
No, the main purpose is to provide a link between the entry and the printed text, which is available from the scan. |
Nice idea! |
I browsed through the first 200 pages of the old scans, found several places especially where the image was |
Anglicized and Anglicised are just spelling variants. But if we take angels as the base |
one more: Angelificated |
Sorry to hear that; hope you will recover quickly and completely. |
I am recovered, @funderburkjim; only little weakness, no obstacle for any working. |
@jmigliori I have noticed many Greek words in the Benfey dictionary with a Roman 'j' in between, and found that it denotes a gliding sound. There is also one word with a Roman 'y' in it, σαγyω, at the entry word सञ्ज् (p. 996). Does this also have some significance (like the 'j' above)? Here is the page image for your reference: https://user-images.githubusercontent.com/75209130/153760455-c58381c9-16bb-423b-ab7c-cd5b304b5908.png |
I’ve never encountered that before, but that sounds reasonable. Typically in Classical Greek two gammas produce an /ŋɡ/ sound, so this could be the lexicographers indicating it was instead pronounced with a gliding sound. |
Here is the BEN_main.txt with the Greek strings filled in. It now stands corrected for j (U+006A) > ϳ (U+03F3), as mentioned above. |
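For anyone wanting to reproduce the j > ϳ correction, here is a rough sketch. It assumes that a Latin j belongs to a Greek word whenever it directly touches a Greek letter (Greek or Greek Extended block); that rule and the function name are my assumptions, not a description of how BEN_main.txt was actually prepared.

```python
import re

# Greek and Coptic block plus Greek Extended block, as a character-class body.
GREEK = "\u0370-\u03ff\u1f00-\u1fff"

# A Latin "j" (U+006A) that is directly preceded or followed by a Greek letter.
J_IN_GREEK = re.compile(f"(?<=[{GREEK}])j|j(?=[{GREEK}])")


def latin_j_to_yot(text: str) -> str:
    """Replace Latin j inside Greek strings with Greek yot (U+03F3)."""
    return J_IN_GREEK.sub("\u03f3", text)
```

A j in ordinary Latin-script words is not touched, and neither is the Roman 'y' in σαγyω, which would need a separate decision.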
Some "interesting" findings in this.
[Page10xx is made [Paêxx
sh in non-skt strings made as ṣ
BEN sh (non-Skt) made as ṣ.txt
Most importantly, the <ls> marking is limited to the work names alone and is not extended to the content numbers (citations), as I have been mentioning for MW, PWG etc. while I was working on those.