[distsim] feature request - scope limitation by top N terms #298

Open
gilnoh opened this issue Nov 7, 2013 · 0 comments

gilnoh commented Nov 7, 2013

(This feature may already exist without my knowing about it; I am filing this just in case.)

Lexical resource: for evaluation and efficiency purposes, it is often convenient to limit the scope of a knowledge resource, for example to the top 10k terms in the corpus.

Currently, however, the only way to limit the terms (elements) is to set min-count. This achieves a similar effect, but it would be nice to limit the terms directly by asking for, say, the "top 10k most frequent terms" that appear in the corpus. That would produce a lexical resource of roughly predictable size, with a known set of lexical terms.
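
To make the difference concrete, here is a minimal sketch of the two filtering strategies. It is not distsim code: the function names `filter_by_min_count` and `filter_by_top_n`, the toy corpus, and the alphabetical tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def filter_by_min_count(counts, min_count):
    """Keep every term seen at least `min_count` times (roughly the current behaviour)."""
    return {term for term, count in counts.items() if count >= min_count}


def filter_by_top_n(counts, n):
    """Keep the `n` most frequent terms (the requested behaviour).

    Ties are broken alphabetically so the result is deterministic;
    this tie-breaking rule is an assumption, not something the issue specifies.
    """
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Toy corpus, purely illustrative.
corpus_tokens = "the cat sat on the mat the cat ran".split()
counts = Counter(corpus_tokens)

# With min-count, the number of surviving terms depends on the corpus.
by_min_count = filter_by_min_count(counts, 2)

# With top-N, the number of surviving terms is at most N.
by_top_n = filter_by_top_n(counts, 3)
```

With a min-count threshold the size of the result varies from corpus to corpus, whereas a top-N cut-off never returns more than N terms, which is the predictability asked for above.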
