An experiment comparing the performance of text segmentation algorithms.

contours/textseg

textseg

To build, run cabal configure followed by cabal build; this creates an executable in the dist directory. You may first need to install some dependencies with cabal install.

The TopicTiling segmentation algorithm requires the "lda" executable from GibbsLDA++ to be in your $PATH. The stock release is not sufficient: a slightly modified version, one that implements the mode method for topic assignment, is required.

The "NLTK" segmentation algorithm, which uses the TextTiling implementation in NLTK, requires that NLTK be installed and importable from Python.
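
Before running the experiment, it can help to confirm that both external prerequisites are visible. The script below is a minimal sketch and not part of textseg: the function name missing_prerequisites is hypothetical, and it assumes the Python interpreter you run it with is the same one textseg will invoke. It only checks that an executable named lda is on $PATH and that the nltk module can be found; it cannot tell whether lda is the modified GibbsLDA++ build.

```python
import importlib.util
import shutil


def missing_prerequisites(executables=("lda",), modules=("nltk",)):
    """Return human-readable descriptions of prerequisites that were not found.

    Hypothetical helper for illustration; the defaults mirror the two
    requirements named in this readme.
    """
    missing = []
    for exe in executables:
        # shutil.which searches the directories listed in $PATH.
        if shutil.which(exe) is None:
            missing.append(f"executable not on PATH: {exe}")
    for mod in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(mod) is None:
            missing.append(f"Python module not installed: {mod}")
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    for problem in problems:
        print(problem)
    if not problems:
        print("lda and nltk were both found")
```

An empty result means only that something named lda and a module named nltk were found, not that they are the right versions.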

TODO: write the rest of this readme
