v3.0.0
txtai 3.0.0 is a major release with many new features. It overhauls the project structure, consolidates logic into pipelines and introduces workflows.
Summary of txtai features:
- 🔎 Large-scale similarity search with multiple index backends (Faiss, Annoy, Hnswlib)
- 📄 Create embeddings for text snippets, documents, audio and images. Supports transformers and word vectors.
- 💡 Machine-learning pipelines to run extractive question-answering, zero-shot labeling, transcription, translation, summarization and text extraction
- ↪️️ Workflows that join pipelines together to aggregate business logic. txtai processes can be microservices or full-fledged indexing workflows.
- 🔗 API bindings for JavaScript, Java, Rust and Go
- ☁️ Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes)
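
To illustrate the workflow idea above, here is a minimal pure-Python sketch of joining batch-oriented steps into a single callable. It is not txtai's API: `SimpleWorkflow`, `normalize` and `shorten` are hypothetical names invented for this example, and the two functions merely stand in for real pipelines such as summarization or translation.

```python
from typing import Callable, Iterable, Iterator, List

# A "pipeline" here is any callable that maps a batch of strings to a batch of strings.
Pipeline = Callable[[List[str]], List[str]]


def normalize(texts: List[str]) -> List[str]:
    """Hypothetical stand-in pipeline: collapse runs of whitespace."""
    return [" ".join(text.split()) for text in texts]


def shorten(texts: List[str], words: int = 3) -> List[str]:
    """Hypothetical stand-in pipeline: keep only the first few words."""
    return [" ".join(text.split()[:words]) for text in texts]


class SimpleWorkflow:
    """Illustrative only: runs each batch of elements through every task in order."""

    def __init__(self, tasks: List[Pipeline], batch: int = 2):
        self.tasks = tasks
        self.batch = batch

    def __call__(self, elements: Iterable[str]) -> Iterator[str]:
        batch: List[str] = []
        for element in elements:
            batch.append(element)
            # Process and emit results as soon as a full batch is collected
            if len(batch) == self.batch:
                yield from self.process(batch)
                batch = []

        # Process any remaining elements
        if batch:
            yield from self.process(batch)

    def process(self, batch: List[str]) -> List[str]:
        # The output of each task becomes the input of the next one
        for task in self.tasks:
            batch = task(batch)
        return batch


workflow = SimpleWorkflow([normalize, shorten])
results = list(workflow(["  hello   world ", "txtai   3.0.0  is a major release", "x"]))
print(results)
```

Because the sketch yields results batch by batch, input can be streamed from a generator without loading everything into memory first.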
New Features
- Add Dockerfile for API (#59)
- Require Faiss 1.7.0 (#60)
- Add summary pipeline (#65)
- Add text extraction pipeline (#66)
- Add transcription pipeline (#67)
- Add translation pipeline (#68)
- Add workflow framework (#69)
- Add additional pipeline abstraction layer for tensor frameworks (#70)
- Add tests for new v3 functionality (#71)
- Add notebooks covering new v3 functionality (#73)
- Add Pipeline Factory (#76)
- Add API extensions (#77)
- Add workflow builder application (#80)
- Add text segmentation pipeline (#81)
- Add workflow to API (#82)
- Add service workflow task (#83)
- Add object storage workflow task (#84)
- Add URL workflow task (#85)
Improvements
- Refactor code into smaller components and modules (#63)
- Modify pipeline to accept GPU device id (#64)
- Allow direct download of sentence-transformer models (#72)
- Update documentation, add site through GitHub Pages (#75)
- Modularize the API (#78)
- Add default truncation to pipelines (#79)