Releases: capreolus-ir/capreolus
v0.2.6
- Bugfix: Pytorch best dev weights selection (#109)
- Bugfix: keep QIDs with no relevant docs returned by the first stage ranker (#110)
- Bugfix: long queries can no longer cause `maxseqlen` to be exceeded (issue #105)
- Add AutoTokenizer to extractors and a `maxqlen` option to truncate long queries (#112)
- Add separate `testthreshold` option to rerank task
- Add Pytorch and Tensorflow CEDRKNRM rerankers
- Trainer improvements, including Tensorflow AMP support and aligning `itersize` options between TF and Pytorch (#121)
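As a rough illustration of why a `maxqlen` cap keeps inputs within `maxseqlen`, here is a minimal sketch. It is not the Capreolus extractor code: the helper name `build_input` and the `[CLS] query [SEP] doc [SEP]` layout are assumptions made for this example.

```python
def build_input(query_tokens, doc_tokens, maxseqlen, maxqlen):
    """Hypothetical helper: build [CLS] query [SEP] doc [SEP] within maxseqlen."""
    # Truncate the query first so it can never consume the whole budget.
    query = query_tokens[:maxqlen]
    # Three positions are reserved for [CLS] and the two [SEP] markers.
    doc_budget = maxseqlen - len(query) - 3
    if doc_budget < 0:
        raise ValueError("truncated query length + 3 exceeds maxseqlen")
    # The document receives whatever room is left.
    doc = doc_tokens[:doc_budget]
    return ["[CLS]", *query, "[SEP]", *doc, "[SEP]"]


# A 50-token query is cut to 8 tokens, so the total stays at maxseqlen.
tokens = build_input(["q"] * 50, ["d"] * 500, maxseqlen=32, maxqlen=8)
```

Without the query cap, a long enough query would leave a negative document budget, which is the failure mode the bugfix above describes.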
v0.2.5
- Add `environment.yml` to simplify installation with Conda
- Bugfix: force QIDs in folds.json files to be strings
- Bugfix: add `multithreaded` option to Pytorch trainer controlling whether a separate worker is used with Pytorch DataLoader. Default to false; true is faster, but conflicting Python packages appear to cause deadlock. This is reproducible on Colab and when using Anaconda's default packages (rather than installing from Miniconda).
- Update docs
v0.2.4.1
Bug fix: allow spaces in paths passed to Anserini (e.g., the collection path)
v0.2.4
- New rerankers: PARADE and BERT-MaxP variants
- Collection iterator: `for doc in collection_object: ...`
- Experimental support for Birch reranker
- Bug fix: `evaluator` import should no longer break Python 3.6
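The collection iterator pattern mentioned above can be sketched as follows. `ToyCollection` is a hypothetical stand-in, not the Capreolus class, and what the real iterator yields per document is an assumption here.

```python
class ToyCollection:
    """Hypothetical stand-in for a collection; not the Capreolus class."""

    def __init__(self, docs):
        # Maps document ID -> document text.
        self._docs = dict(docs)

    def __iter__(self):
        # Yield (docid, text) pairs; the real iterator may yield something else.
        yield from self._docs.items()


collection_object = ToyCollection({"d1": "first document", "d2": "second document"})
for doc in collection_object:
    print(doc)
```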
v0.2.3
- Report results interpolated with searcher
- Add new, more efficient `EmbedText` that avoids generating random embeddings for OOV terms. The previous extractor is now `SlowEmbedText`
- Bug fix: Add missing NF queries file
v0.2.2
v0.2.1
- Searcher refactor, with BM25, BM25RM3, BM25PRF, AxiomaticSemanticMatching, DirichletQL, and QLJM supporting lists of parameters
- support setting a relevance level threshold (i.e., `trec_eval --level_for_rel`)
- miscellaneous cleanup and convenience methods
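To show what a relevance level threshold changes, here is a small sketch in the spirit of `trec_eval --level_for_rel`, where a document counts as relevant only if its judgment is at least the threshold. The function names and toy data are assumptions for illustration and are not part of Capreolus or trec_eval.

```python
def relevant_docs(qrels, level_for_rel=1):
    """Keep only docids whose judgment label is at least `level_for_rel`."""
    return {
        qid: {docid for docid, label in judged.items() if label >= level_for_rel}
        for qid, judged in qrels.items()
    }


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked docids that are relevant."""
    return sum(docid in relevant for docid in ranking[:k]) / k


# Toy graded judgments: 2 = highly relevant, 1 = relevant, 0 = not relevant.
qrels = {"q1": {"d1": 2, "d2": 1, "d3": 0}}
ranking = ["d2", "d1", "d3"]

strict = relevant_docs(qrels, level_for_rel=2)
lenient = relevant_docs(qrels, level_for_rel=1)
p_strict = precision_at_k(ranking, strict["q1"], k=2)
p_lenient = precision_at_k(ranking, lenient["q1"], k=2)
```

The same ranking scores differently under the two thresholds, which is why the level used should be reported alongside any metric.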
v0.2.0
Flexible pipeline
- Replace fixed pipeline with a configurable version based on profane
- Add support for Tensorboard, Tensorflow, Tensorflow+TPUs, and TF-Ranking
- Add support for several new datasets, including TREC-COVID and COVID-QA
- Add support for all Anserini searchers
- Unit and integration tests