Common words and forms missing from LEMLAT #6
Thank you, Neven.
Very helpful, indeed.
The forms you propose to include in the DB are mostly "exceptional forms" (in LEMLAT's terminology) of already recorded lemmas. See the documentation for the details of such forms.
Basically, such forms are not segmented by LEMLAT, and their analysis is fully hard-coded in a specific table of the db (called "forme_ecc").
I will check each of these forms against the lexicographic sources of LEMLAT (Georges, OLD, Laterculi + Onomasticon of Forcellini). If they are there, they will be included in the db (this might be the case for "nosse"). If not, we will have to decide how to handle them, as we want to keep separate in the "lessario" table those forms not reported by the sources of LEMLAT (there is a specific column for this information: src).
Thank you again!
Marco
… On 21 Aug 2017, at 11:49, Neven Jovanović ***@***.***> wrote:
We have tested LEMLAT on a classical Latin reading-list corpus of some 23,700 words and 8,538 different word forms: Terence's Adelphoe, Horace's Odes Bk. 1, Tibullus Bk. 1, Seneca's Letters Bk. 1 (all editions from the PerseusDL collection). Besides various forms of personal names (and some typos in our sources), there were 40 word forms not recognized by LEMLAT: a tiny percentage of all forms (under 0.5%), but the list is below. Some reasons for not recognizing the forms seem to be orthographical (ë; omitted -p- in emta, demsi; oe in foeneraret; words joined instead of separated, as in illiusmodi). Some have to do with meter in comedy: the elided -n', from -ne, is regularly not recognized by LEMLAT. Some missing forms are fairly common: norimus, nosse.
I propose that the forms from the list below be added to the LEMLAT database.
adteruisse
audistin
coëmisse
demseris
demsi
egon
emta
emtae
emtam
foeneraret
haecine
hancine
hocine
hoscine
illan
illiusmodi
ipsus
lucu
men
norimus
nosse
nossem
nostin
numquidnam
poëta
poëtae
posthaec
propediem
quamobrem
quamprimum
quandoquidem
quorundam
quotannis
refrixerit
sumtuosa
tamdiu
tantummodo
tercentenas
tetigin
tun
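The orthographic causes named above (ë, omitted -p-, oe in foeneraret, elided -n' from -ne) could be triaged with a small script before the forms are checked against the sources. The following is only a rough sketch in Python: the function name `suggest` and all four rules are hypothetical heuristics written for the forms in this list. They are not part of LEMLAT and would produce false positives on a general word list.

```python
import re
import unicodedata


def suggest(form):
    """Return {likely cause: candidate spelling} for an unrecognized form.

    Hypothetical heuristics, tuned only to the list in this issue.
    """
    suggestions = {}

    # Diaeresis: decompose, drop U+0308 (combining diaeresis), recompose.
    decomposed = unicodedata.normalize("NFD", form)
    plain = unicodedata.normalize("NFC", decomposed.replace("\u0308", ""))
    if plain != form:
        suggestions["diaeresis"] = plain

    # Omitted -p-: restore "p" between "m" and a following "t" or "s".
    with_p = re.sub(r"m(?=[ts])", "mp", form)
    if with_p != form:
        suggestions["omitted -p-"] = with_p

    # "oe" for "e": deliberately limited to word-initial "foen-".
    if form.startswith("foen"):
        suggestions["oe for e"] = "fen" + form[4:]

    # Elided -n (from -ne): propose the form without the enclitic.
    if len(form) > 1 and form.endswith("n"):
        suggestions["elided -n"] = form[:-1]

    return suggestions


for word in ["coëmisse", "emta", "foeneraret", "egon", "nosse"]:
    print(word, suggest(word))
```

Forms for which `suggest` returns an empty dictionary (for example nosse or norimus) are not explained by spelling and would still need to be checked by hand.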
When making this change, bear in mind that it affects LEMLAT and only potentially the LiLa lemma bank as well. Neven's list is a list of forms not recognized by LEMLAT. They are not lemmas and do not affect the LiLa lemma bank, except in the two cases described below. Impact on the LiLa lemma bank:
Flavio is the best person to make changes to LEMLAT, because he has a clear overall picture of the tables in lemlat_db. As a reminder, new lemmas/wr are added to LiLa only if one of these conditions is met:
We have tested LEMLAT on a corpus of classical Latin texts from a university reading list. The corpus contains some 23,700 words and 8,538 different word forms: Terence's Adelphoe, Horace's Odes Bk. 1, Tibullus Bk. 1, Seneca's Letters Bk. 1 (all editions from the PerseusDL collection). Besides various forms of personal names (and some typos in our sources), there were 40 word forms not recognized by LEMLAT: a tiny percentage of all forms (under 0.5%), but the list is below. Some reasons for not recognizing the forms seem to be orthographical (ë; omitted -p- in emta, demsi; oe in foeneraret; words joined instead of separated, as in illiusmodi). Some have to do with meter in comedy: the elided -n', from -ne, is regularly not recognized by LEMLAT. Some missing forms are fairly common: norimus, nosse.
I propose that the forms from the list below be added to the LEMLAT database.