v0.7.0

@shubhadeepd shubhadeepd released this 18 Jun 15:52
· 93 commits to main since this release
b43e8b0

This release switches all examples to use cloud-hosted, GPU-accelerated LLM and embedding models from the NVIDIA API Catalog by default. It also deprecates support for deploying on-prem models using the NeMo Inference Framework Container, and adds support for deploying accelerated generative AI models across cloud, data center, and workstation using the latest NVIDIA NIM for LLMs.
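Because the examples now default to cloud-hosted endpoints, they need an API Catalog credential at runtime. A minimal sketch (the placeholder key value is illustrative; `NVIDIA_API_KEY` is the variable name used by the langchain-nvidia-ai-endpoints package, but check each example's README for the exact setup):

```shell
# Sketch: export an NVIDIA API Catalog key so the examples can reach
# the cloud-hosted LLM and embedding endpoints. Obtain a key from the
# API Catalog; the placeholder value below is not a real key.
export NVIDIA_API_KEY="nvapi-your-key-here"
```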

Added

Changed

  • All examples now use Llama 3 models from the NVIDIA API Catalog by default. A summary of the updated examples and the models they use is available here.
  • Switched the default embedding model of all examples to Snowflake's arctic-embed-l model.
  • Added more verbose logging and support for configuring the chain server's log level via the LOG_LEVEL environment variable.
  • Bumped versions of the langchain-nvidia-ai-endpoints and sentence-transformers packages and the Milvus containers.
  • Updated base containers to use the Ubuntu 22.04 image nvcr.io/nvidia/base/ubuntu:22.04_20240212.
  • Added llama-index-readers-file as a dependency to avoid runtime package installation within the chain server.
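The LOG_LEVEL behavior described above can be sketched as follows; this is a minimal illustration of reading a log level from that environment variable, not the chain server's actual implementation (the logger name `chain_server` and the INFO default are assumptions):

```python
import logging
import os

# Read the desired level from LOG_LEVEL, defaulting to INFO when the
# variable is unset or set to an unrecognized value (assumed behavior).
level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
level = getattr(logging, level_name, logging.INFO)

logging.basicConfig(level=level)
logger = logging.getLogger("chain_server")  # hypothetical logger name

# Emitted only when LOG_LEVEL=DEBUG (or another level at DEBUG or below).
logger.debug("debug logging enabled")
```

Running the server with `LOG_LEVEL=DEBUG` would then surface the more verbose logs mentioned in this release.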

Deprecated