[Feature] Gensim similarity text analysis #213

sethwoodworth · 2012-12-19T21:56:53Z

This is a big feature, listed here as a placeholder for the conversation about whether and when to add it.

Gensim is a free Python framework designed to automatically extract
semantic topics from documents, as efficiently (computer-wise) and
painlessly (human-wise) as possible.
...
Once these statistical patterns are found, any plain text documents can
be succinctly expressed in the new, semantic representation, and
queried for topical similarity against other documents.

http://radimrehurek.com/gensim/intro.html
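To make "queried for topical similarity" concrete, here is a minimal standard-library sketch of the idea. It is not Gensim's API or its algorithm: Gensim builds semantic models such as LSI or LDA, whereas this stand-in uses plain TF-IDF vectors with cosine similarity. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_by_similarity`) and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase, split on whitespace,
    # keep purely alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    # Represent each document as a sparse {term: weight} dict.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed IDF so that no weight collapses to zero.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_index, vectors):
    # Score every other document against the query, most similar first.
    query = vectors[query_index]
    scores = [
        (i, cosine(query, vec))
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample documents.
docs = [
    "human machine interface for computer applications",
    "survey of user opinion of computer system response time",
    "graph minors and trees",
    "intersection graph of paths in trees",
]
vectors = tfidf_vectors(docs)
ranking = rank_by_similarity(2, vectors)
```

Here `ranking` lists the other documents ordered by similarity to "graph minors and trees"; the document that shares "graph" and "trees" with it comes first. A real integration would presumably swap the TF-IDF step for one of Gensim's models and keep the same query-then-rank shape.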

There is also a pre-packaged server implementation of the library, which looks ideal as a dedicated processing server for document similarity.

https://github.com/piskvorky/gensim-simserver

It uses a strong copyleft free software license, the AGPL:

This means you may use simserver freely in your application (even
commercial application!), but you must then open-source your
application as well, under an AGPL-compatible license.

But luckily for us, our license is totally compatible with theirs.
