## v0.5.0
This release adds new dedicated RAG examples that showcase state-of-the-art use cases, switches to the latest NVIDIA API Catalog endpoints, and refactors the API interface of the chain-server component. It also improves the developer experience with GitHub Pages based documentation and a streamlined example deployment flow using dedicated compose files.
### Added
- GitHub Pages based documentation.
- New dedicated RAG examples showcasing state-of-the-art use cases.
- Support for delete and list APIs in the chain-server component.
- Streamlined RAG example deployment:
  - Dedicated docker compose files for every example.
  - Dedicated docker compose files for launching vector DB solutions.
- New configurations to control the top-k and confidence score of the retrieval pipeline.
- A notebook covering how to train SLMs with various techniques using the NeMo Framework.
- More experimental examples showcasing new use cases.
- New dedicated notebook showcasing a RAG pipeline using web pages.
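
As an illustration of the new environment-variable-driven retrieval settings, a `compose.env` fragment might look like the sketch below. The variable names are hypothetical placeholders chosen for this example, not the project's documented keys:

```shell
# Hypothetical compose.env fragment -- variable names are illustrative only.

# Number of document chunks the retriever returns per query (top-k).
export APP_RETRIEVER_TOPK=4

# Minimum similarity score a chunk must meet to be included as context.
export APP_RETRIEVER_SCORETHRESHOLD=0.25
```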
### Changed
- Switched from NVIDIA AI Foundation to NVIDIA API Catalog endpoints for accessing cloud-hosted LLM models.
- Refactored the API schema of the chain-server component to support runtime configuration of LLM parameters such as temperature, max tokens, and chat history.
- Renamed the `llm-playground` service in compose files to `rag-playground`.
- Switched base containers for all components from PyTorch to Ubuntu, optimizing both container build time and container size.
- Deprecated YAML-based configuration to avoid confusion; all configuration is now environment variable based.
- Removed the requirement to hardcode `NVIDIA_API_KEY` in the `compose.env` file.
- Upgraded all Python dependencies for the chain-server and rag-playground services.
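
To illustrate the runtime LLM-parameter support in the refactored chain-server API, a client request might carry the sampling settings in the request body rather than in a static config file. The field names below are assumptions made for illustration, not the documented schema:

```python
import json

# Hypothetical request body for a chain-server generate call.
# Field names (temperature, max_tokens, use_knowledge_base) are
# illustrative assumptions, not the project's actual schema.
payload = {
    "messages": [
        {"role": "user", "content": "Summarize the uploaded document."}
    ],
    "temperature": 0.2,        # chosen per request instead of a static config
    "max_tokens": 256,         # runtime cap on generated tokens
    "use_knowledge_base": True # toggle retrieval augmentation per request
}

# Serialized form that would be sent as the HTTP request body.
body = json.dumps(payload)
```

The point of the refactor is that such parameters are accepted per request at runtime, so clients can vary them call by call.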
### Fixed
- Fixed a bug that caused hallucinated answers when the retriever fails to return any documents.
- Fixed accuracy issues across all the examples.