v5.1.0

@davidmezzetti released this 18 Oct 14:13

This release adds new model support for the translation pipeline, OpenAI Whisper support in the transcription pipeline and ARM Docker images. Topic modeling was also improved, including a new notebook showing how to use BM25/TF-IDF indexes to drive topic models.

See below for full details on the new features, improvements and bug fixes.

New Features

  • Multiarch Docker image (#324)
  • Add notebook covering classic topic modeling with BM25 (#360)
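
For readers new to the idea behind the BM25 notebook, here is a minimal, library-free sketch of how BM25 term weights can pick out the distinctive terms of each document, which is the kind of signal a classic topic model can build on. It is not txtai's implementation: the function name `bm25_top_terms`, the parameters `k1=1.2` and `b=0.75`, the non-negative IDF variant and the toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_top_terms(docs, topn=2, k1=1.2, b=0.75):
    """Return the topn highest-scoring BM25 terms for each tokenized document.

    Hypothetical helper for illustration only; not part of txtai.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n

    # Document frequency: number of documents that contain each term
    df = Counter(term for d in docs for term in set(d))

    labels = []
    for d in docs:
        tf = Counter(d)
        scores = {}
        for term, freq in tf.items():
            # Non-negative IDF variant (an assumption, other variants exist)
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document length normalization
            norm = freq + k1 * (1 - b + b * len(d) / avgdl)
            scores[term] = idf * freq * (k1 + 1) / norm

        # Sort by score descending, then alphabetically to break ties
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels.append(ranked[:topn])

    return labels


# Toy corpus, whitespace-tokenized
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the stock market fell".split(),
]

labels = bm25_top_terms(docs)
for terms in labels:
    print(terms)
```

Terms that appear in every document, such as "the", receive a low IDF and drop out of the labels, while terms unique to a document rise to the top. See the notebook from #360 for how txtai itself approaches this.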

Improvements

  • Read authentication parameters from storage task (#332)
  • Update scoring algorithms (#351)
  • Add config option for list of stopwords to ignore with topic generation (#352)
  • Allow for setting custom translation model path (#355)
  • Update caption pipeline to call image-to-text pipeline (#361)
  • Update transcription pipeline to call automatic-speech-recognition pipeline (#362)
  • Only pass tokenizer to pipeline when necessary (#363)
  • Improve default max length logic for text generation (#364)
  • Update transcription notebook (#365)
  • Update translation notebook (#366)
  • Move mkdocs dependencies from docs.yml to setup.py (#368)

Bug Fixes

  • GitHub Actions build error with torch 1.12 on macOS (#300)
  • SQLite JSON support not built into Python Windows builds < 3.9 (#356)
  • Use tags field in application.add (#359)
  • Fix issue with Application autosequencing (#367)