Improving Lesk Overlaps #15

Open
alvations opened this issue Jan 15, 2015 · 0 comments

Comments

@alvations
Owner

This is more of a performance issue than a theoretical one. In theory, the algorithms are implemented as presented in their respective papers, as simple overlaps.

Going after the state of the art would mean the implementation no longer matches what the papers describe. A supervised learning component is a long shot, since feature extraction is another headache.

Improving the overlaps is probably the better move for the current code:

  1. Look at the normalization: https://github.com/alvations/pywsd/blob/master/pywsd/lesk.py#L41
  2. Think about the effect of lemmatized vs. unlemmatized overlaps (currently, overlaps are lemmatized by default).
  3. Handle tie-breaking when the number of overlaps is the same.
  4. Fall back on MFS (most frequent sense), which involves extracting the MFS from an annotated corpus.
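
To make the four points above concrete, here is a minimal sketch of how they could fit together. It is not the current pywsd code: `overlap_score`, `pick_sense`, the toy lemmatizer, the signatures, and the frequency counts are all hypothetical and made up for illustration. Normalization here is assumed to mean dividing by the signature size, which may differ from what `lesk.py` does.

```python
from typing import Callable, Dict, List, Optional


def overlap_score(context: List[str], signature: List[str],
                  normalize: bool = False) -> float:
    """Count word types shared by context and signature.

    With normalize=True the count is divided by the number of distinct
    signature words (an assumed normalization, for illustration only).
    """
    sig_types = set(signature)
    if not sig_types:
        return 0.0
    shared = set(context) & sig_types
    return len(shared) / len(sig_types) if normalize else float(len(shared))


def pick_sense(context: List[str],
               signatures: Dict[str, List[str]],
               sense_freq: Dict[str, int],
               lemmatize: Optional[Callable[[str], str]] = None,
               normalize: bool = False) -> Optional[str]:
    """Pick the best-overlapping sense, with tie-breaking and MFS fallback."""
    if lemmatize is not None:
        # Lemmatized overlap: map both sides to lemmas before comparing.
        context = [lemmatize(w) for w in context]
        signatures = {s: [lemmatize(w) for w in sig]
                      for s, sig in signatures.items()}
    scores = {s: overlap_score(context, sig, normalize)
              for s, sig in signatures.items()}
    best = max(scores.values(), default=0.0)
    if best == 0.0:
        # No overlap at all: every sense is a candidate (MFS fallback).
        candidates = list(signatures)
    else:
        candidates = [s for s, v in scores.items() if v == best]
    if not candidates:
        return None
    # Ties go to the most frequent sense, then to the sense name so the
    # result is deterministic.
    return min(candidates, key=lambda s: (-sense_freq.get(s, 0), s))


# Toy data (hypothetical): signatures and counts are invented, not WordNet's.
SIGNATURES = {
    "bank.n.01": ["sloping", "land", "beside", "river", "water"],
    "bank.n.02": ["financial", "institution", "accepts", "deposits", "money"],
}
SENSE_FREQ = {"bank.n.01": 25, "bank.n.02": 20}
TOY_LEMMAS = {"deposited": "deposit", "deposits": "deposit"}


def toy_lemmatize(word: str) -> str:
    """Stand-in for a real lemmatizer; looks words up in TOY_LEMMAS."""
    return TOY_LEMMAS.get(word, word)


if __name__ == "__main__":
    ctx = ["i", "deposited", "money", "at", "the", "bank"]
    print(pick_sense(ctx, SIGNATURES, SENSE_FREQ))                  # overlap wins
    print(pick_sense(["river", "money"], SIGNATURES, SENSE_FREQ))   # tie-break
    print(pick_sense(["we", "sat"], SIGNATURES, SENSE_FREQ))        # MFS fallback
```

In this toy setup, lemmatizing raises the overlap for the financial sense from one shared word ("money") to two ("money", "deposit"), which is the kind of effect point 2 asks about. In practice the frequency table would have to come from an annotated corpus, as point 4 notes.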
@alvations changed the title from "Lesk Overlap not doing well." to "Improving Lesk Overlaps" on Jan 15, 2015
@alvations added this to the Version 1.1 milestone on Jan 18, 2015