Annif 0.51
This release includes a new STWFSA backend, a wrapper around STWFSAPY, which is a lexical algorithm based on finite state automata. It achieves its best results with short texts, i.e., titles and author keywords, and is best suited to English-language data.
The NN ensemble backend has been improved with better handling of source weights. Retraining NN ensemble models after updating Annif to this version is recommended, since the quality of results can decrease if old models are used.
A new option has been added to several CLI commands: --docs-limit/-d limits the number of documents to process, for example to create learning-curve data. Several bugs have also been fixed.
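To illustrate the idea behind a document limit, here is a minimal Python sketch of how capping the number of processed documents can produce learning-curve data. It is not Annif's implementation: the names limit_documents and learning_curve_sizes, the document shape and the toy corpus are all assumptions made for this example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

# Hypothetical document shape for this sketch: (text, subject labels).
Document = Tuple[str, List[str]]


def limit_documents(docs: Iterable[Document], docs_limit: int) -> Iterator[Document]:
    """Yield at most docs_limit documents from docs, preserving order."""
    return islice(docs, docs_limit)


def learning_curve_sizes(total: int, steps: int) -> List[int]:
    """Return evenly spaced training-set sizes, the last one equal to total."""
    return [total * i // steps for i in range(1, steps + 1)]


# Toy corpus standing in for a real training set (illustrative data only).
corpus: List[Document] = [(f"title {i}", [f"subject-{i % 3}"]) for i in range(10)]

for size in learning_curve_sizes(len(corpus), steps=5):
    subset = list(limit_documents(corpus, size))
    # A real workflow would train and evaluate a model on each subset here.
    print(f"limit={size}: using {len(subset)} documents")
```

With the real tool, each step of such a loop would correspond to one run of a supported command with a different --docs-limit value, followed by an evaluation.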
New features:
#438 Lexical STWFSAPY Backend (credit @mo-fu)
#465 Limit document number CLI option
Improvements:
#457/#458 Improved handling of source weights in NN ensemble
Bug fixes:
#454/#455 Address SonarCloud complaints
#459/#460 Pass limit parameter to Maui Server during train
#463 Fix TruncatingCorpus iterator