Improving Lesk Overlaps #15

alvations · 2015-01-15T01:10:49Z

This is more of a performance issue than a theoretical one. In theory, the algorithms are implemented as presented in their respective papers: simple overlaps.

Going after the state of the art would mean the implementation no longer matches what is presented in the papers. The supervised learning part is a long shot, since feature extraction is another headache.

Improving the overlaps is probably the better move for the current code.

- Look at the normalization: https://github.com/alvations/pywsd/blob/master/pywsd/lesk.py#L41
- Think about the effects of lemmatized vs. unlemmatized overlaps (currently, the overlap is lemmatized by default).
- Handle tie-breakers when the number of overlaps is the same.
- Fall back on MFS (most frequent sense), which involves extracting the MFS from an annotated corpus.
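
To make the items above concrete, here is a minimal sketch of how they could fit together. It is not the current pywsd code: `normalize`, `toy_lemmatize`, `simple_lesk_sketch`, the stopword set, the lemma table, and the sense ids are all hypothetical names and toy data made up for illustration, and the MFS ranking is assumed to have been extracted from an annotated corpus beforehand.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical toy resources; a real setup would use a proper stopword
# list and lemmatizer instead.
STOPWORDS: Set[str] = {"a", "an", "the", "of", "in", "on", "to", "and",
                       "or", "that", "is", "i", "my", "at"}
TOY_LEMMAS: Dict[str, str] = {"deposits": "deposit", "deposited": "deposit",
                              "banks": "bank", "rivers": "river"}


def toy_lemmatize(token: str) -> str:
    """Stand-in for a real lemmatizer: look the token up in a toy table."""
    return TOY_LEMMAS.get(token, token)


def normalize(text: str,
              lemmatize: Optional[Callable[[str], str]] = toy_lemmatize) -> Set[str]:
    """Strip punctuation, lowercase, drop stopwords, optionally lemmatize."""
    tokens = (tok.strip(".,;:!?()\"'").lower() for tok in text.split())
    tokens = (tok for tok in tokens if tok and tok not in STOPWORDS)
    if lemmatize is not None:
        tokens = (lemmatize(tok) for tok in tokens)
    return set(tokens)


def simple_lesk_sketch(context: str,
                       signatures: Dict[str, str],
                       mfs_order: List[str],
                       lemmatize: Optional[Callable[[str], str]] = toy_lemmatize) -> str:
    """Return the sense whose signature overlaps most with the context.

    signatures: sense id -> gloss/example text for that sense.
    mfs_order:  sense ids, most frequent first (assumed precomputed).
    Ties, including the case where every overlap is zero, are broken by
    the rank in mfs_order, so MFS acts as both tie-breaker and fallback.
    """
    context_tokens = normalize(context, lemmatize)
    rank = {sense: i for i, sense in enumerate(mfs_order)}

    def score(sense: str):
        overlap = len(context_tokens & normalize(signatures[sense], lemmatize))
        # Higher overlap wins; on equal overlap, the lower MFS rank wins.
        return (overlap, -rank.get(sense, len(rank)))

    return max(signatures, key=score)


signatures = {
    "bank.finance": "a financial institution that accepts deposits",
    "bank.river": "sloping land beside a river",
}
mfs_order = ["bank.river", "bank.finance"]  # made-up frequency ranking
sentence = "I deposited money at the bank"

# Lemmatized: "deposited" and "deposits" both map to "deposit" and overlap.
lemmatized_choice = simple_lesk_sketch(sentence, signatures, mfs_order)
# Unlemmatized: no overlap for either sense, so the MFS ranking decides.
raw_choice = simple_lesk_sketch(sentence, signatures, mfs_order, lemmatize=None)
```

With this toy data, `lemmatized_choice` is `"bank.finance"` and `raw_choice` is `"bank.river"`, which shows how much the normalization step can change the outcome. Folding the MFS rank into the score tuple keeps the tie-breaking and the fallback in one place, but whether that is the right policy for pywsd is an open question.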

alvations added the enhancement label Jan 15, 2015

alvations changed the title ~~Lesk Overlap not doing well.~~ Improving Lesk Overlaps Jan 15, 2015

alvations added this to the Version 1.1 milestone Jan 18, 2015
