Documentation updates
davidmezzetti committed Nov 24, 2024
1 parent ff4d4da commit 0f916d3
Showing 5 changed files with 20 additions and 10 deletions.
9 changes: 6 additions & 3 deletions README.md
@@ -32,9 +32,11 @@ txtai is an all-in-one embeddings database for semantic search, LLM orchestratio
![architecture](https://raw.githubusercontent.com/neuml/txtai/master/docs/images/architecture.png#gh-light-mode-only)
![architecture](https://raw.githubusercontent.com/neuml/txtai/master/docs/images/architecture-dark.png#gh-dark-mode-only)

-Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases. This enables vector search with SQL, topic modeling, graph analysis and more.
+Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases.

-Embeddings databases can stand on their own and/or serve as a powerful knowledge source for large language model (LLM) prompts. Build autonomous agents, retrieval augmented generation (RAG) processes and multi-model workflows.
+This foundation enables vector search and/or serves as a powerful knowledge source for large language model (LLM) applications.

+Build autonomous agents, retrieval augmented generation (RAG) processes, multi-model workflows and more.

Summary of txtai features:

@@ -44,6 +46,7 @@ Summary of txtai features:
- ↪️️ Workflows to join pipelines together and aggregate business logic. txtai processes can be simple microservices or multi-model workflows.
- 🤖 Agents that intelligently connect embeddings, pipelines, workflows and other agents together to autonomously solve complex problems
- ⚙️ Build with Python or YAML. API bindings available for [JavaScript](https://github.com/neuml/txtai.js), [Java](https://github.com/neuml/txtai.java), [Rust](https://github.com/neuml/txtai.rs) and [Go](https://github.com/neuml/txtai.go).
+- 🔋 Batteries included with defaults to get up and running fast
- ☁️ Run local or scale out with container orchestration

txtai is built with Python 3.9+, [Hugging Face Transformers](https://github.com/huggingface/transformers), [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) and [FastAPI](https://github.com/tiangolo/fastapi). txtai is open-source under an Apache 2.0 license.
@@ -194,7 +197,7 @@ See the table below for the current recommended models. These models all allow c
| [Image Captions](https://neuml.github.io/txtai/pipeline/image/caption) | [BLIP](https://hf.co/Salesforce/blip-image-captioning-base) |
| [Labels - Zero Shot](https://neuml.github.io/txtai/pipeline/text/labels) | [BART-Large-MNLI](https://hf.co/facebook/bart-large) |
| [Labels - Fixed](https://neuml.github.io/txtai/pipeline/text/labels) | Fine-tune with [training pipeline](https://neuml.github.io/txtai/pipeline/train/trainer) |
-| [Large Language Model (LLM)](https://neuml.github.io/txtai/pipeline/text/llm) | [Mistral 7B OpenOrca](https://hf.co/Open-Orca/Mistral-7B-OpenOrca) |
+| [Large Language Model (LLM)](https://neuml.github.io/txtai/pipeline/text/llm) | [Llama 3.1 Instruct](https://hf.co/meta-llama/Llama-3.1-8B-Instruct) |
| [Summarization](https://neuml.github.io/txtai/pipeline/text/summary) | [DistilBART](https://hf.co/sshleifer/distilbart-cnn-12-6) |
| [Text-to-Speech](https://neuml.github.io/txtai/pipeline/audio/texttospeech) | [ESPnet JETS](https://hf.co/NeuML/ljspeech-jets-onnx) |
| [Transcription](https://neuml.github.io/txtai/pipeline/audio/transcription) | [Whisper](https://hf.co/openai/whisper-base) |
4 changes: 2 additions & 2 deletions docs/faq.md
@@ -66,8 +66,8 @@ import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from txtai import LLM

-# Load Mistral-7B-OpenOrca
-path = "Open-Orca/Mistral-7B-OpenOrca"
+# Load Phi 3.5-mini
+path = "microsoft/Phi-3.5-mini-instruct"
model = AutoModelForCausalLM.from_pretrained(
path,
torch_dtype=torch.bfloat16,
7 changes: 5 additions & 2 deletions docs/index.md
@@ -34,9 +34,11 @@ txtai is an all-in-one embeddings database for semantic search, LLM orchestratio
![architecture](images/architecture.png#gh-light-mode-only)
![architecture](images/architecture-dark.png#gh-dark-mode-only)

-Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases. This enables vector search with SQL, topic modeling, graph analysis and more.
+Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases.

-Embeddings databases can stand on their own and/or serve as a powerful knowledge source for large language model (LLM) prompts. Build autonomous agents, retrieval augmented generation (RAG) processes and multi-model workflows.
+This foundation enables vector search and/or serves as a powerful knowledge source for large language model (LLM) applications.

+Build autonomous agents, retrieval augmented generation (RAG) processes, multi-model workflows and more.

Summary of txtai features:

@@ -46,6 +48,7 @@ Summary of txtai features:
- ↪️️ Workflows to join pipelines together and aggregate business logic. txtai processes can be simple microservices or multi-model workflows.
- 🤖 Agents that intelligently connect embeddings, pipelines, workflows and other agents together to autonomously solve complex problems
- ⚙️ Build with Python or YAML. API bindings available for [JavaScript](https://github.com/neuml/txtai.js), [Java](https://github.com/neuml/txtai.java), [Rust](https://github.com/neuml/txtai.rs) and [Go](https://github.com/neuml/txtai.go).
+- 🔋 Batteries included with defaults to get up and running fast
- ☁️ Run local or scale out with container orchestration

txtai is built with Python 3.9+, [Hugging Face Transformers](https://github.com/huggingface/transformers), [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) and [FastAPI](https://github.com/tiangolo/fastapi). txtai is open-source under an Apache 2.0 license.
2 changes: 1 addition & 1 deletion docs/models.md
@@ -10,7 +10,7 @@ See the table below for the current recommended models. These models all allow c
| [Image Captions](./pipeline/image/caption.md) | [BLIP](https://hf.co/Salesforce/blip-image-captioning-base) |
| [Labels - Zero Shot](./pipeline/text/labels.md) | [BART-Large-MNLI](https://hf.co/facebook/bart-large) |
| [Labels - Fixed](./pipeline/text/labels.md) | Fine-tune with [training pipeline](./pipeline/train/trainer.md) |
-| [Large Language Model (LLM)](./pipeline/text/llm.md) | [Mistral 7B OpenOrca](https://hf.co/Open-Orca/Mistral-7B-OpenOrca) |
+| [Large Language Model (LLM)](./pipeline/text/llm.md) | [Llama 3.1 Instruct](https://hf.co/meta-llama/Llama-3.1-8B-Instruct) |
| [Summarization](./pipeline/text/summary.md) | [DistilBART](https://hf.co/sshleifer/distilbart-cnn-12-6) |
| [Text-to-Speech](./pipeline/audio/texttospeech.md) | [ESPnet JETS](https://hf.co/NeuML/ljspeech-jets-onnx) |
| [Transcription](./pipeline/audio/transcription.md) | [Whisper](https://hf.co/openai/whisper-base) |
8 changes: 6 additions & 2 deletions examples/01_Introducing_txtai.ipynb
@@ -35,17 +35,21 @@
"\n",
"[txtai](https://github.com/neuml/txtai) is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.\n",
"\n",
-"Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases. This enables vector search with SQL, topic modeling, retrieval augmented generation (RAG) and more.\n",
+"Embeddings databases are a union of vector indexes (sparse and dense), graph networks and relational databases.\n",
"\n",
-"Embeddings databases can stand on their own and/or serve as a powerful knowledge source for large language model (LLM) prompts.\n",
+"This foundation enables vector search and/or serves as a powerful knowledge source for large language model (LLM) applications.\n",
"\n",
+"Build autonomous agents, retrieval augmented generation (RAG) processes, multi-model workflows and more.\n",
"\n",
"The following is a summary of key features:\n",
"\n",
"- 🔎 Vector search with SQL, object storage, topic modeling, graph analysis and multimodal indexing\n",
"- 📄 Create embeddings for text, documents, audio, images and video\n",
"- 💡 Pipelines powered by language models that run LLM prompts, question-answering, labeling, transcription, translation, summarization and more\n",
"- ↪️️ Workflows to join pipelines together and aggregate business logic. txtai processes can be simple microservices or multi-model workflows.\n",
"- 🤖 Agents that intelligently connect embeddings, pipelines, workflows and other agents together to autonomously solve complex problems\n",
"- ⚙️ Build with Python or YAML. API bindings available for [JavaScript](https://github.com/neuml/txtai.js), [Java](https://github.com/neuml/txtai.java), [Rust](https://github.com/neuml/txtai.rs) and [Go](https://github.com/neuml/txtai.go).\n",
+"- 🔋 Batteries included with defaults to get up and running fast\n",
"- ☁️ Run local or scale out with container orchestration\n",
"\n",
"txtai is built with Python 3.9+, [Hugging Face Transformers](https://github.com/huggingface/transformers), [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) and [FastAPI](https://github.com/tiangolo/fastapi). txtai is open-source under an Apache 2.0 license."
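Note on the reworded introduction: the description of an embeddings database as "a union of vector indexes (sparse and dense), graph networks and relational databases" may be easier to picture with a toy model. The sketch below is not txtai's implementation and does not use its API. `ToyEmbeddingsDB`, `embed` and `cosine` are hypothetical names invented for illustration, and the term-count "embedding" is only a stand-in for a real model. It shows the three parts side by side: a vector index for similarity, a SQLite table for content, and a graph linking similar rows.

```python
import math
import sqlite3


def embed(text, vocab):
    """Toy dense vector: term counts over a fixed vocabulary (stand-in for a real model)."""
    tokens = text.lower().split()
    return [float(tokens.count(term)) for term in vocab]


def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyEmbeddingsDB:
    """Hypothetical illustration only: vector index + relational store + graph."""

    def __init__(self, vocab):
        self.vocab = vocab
        self.vectors = {}  # vector index: id -> vector
        self.graph = {}    # graph network: id -> ids of similar rows
        self.db = sqlite3.connect(":memory:")  # relational database
        self.db.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, text TEXT)")

    def index(self, texts, threshold=0.5):
        # Store content in SQL and vectors in the index
        for uid, text in enumerate(texts):
            self.db.execute("INSERT INTO docs (id, text) VALUES (?, ?)", (uid, text))
            self.vectors[uid] = embed(text, self.vocab)
        # Connect rows whose vectors are similar enough
        for a, va in self.vectors.items():
            self.graph[a] = {
                b for b, vb in self.vectors.items()
                if b != a and cosine(va, vb) >= threshold
            }

    def search(self, query, limit=1):
        # Rank by vector similarity, then read the text back from SQL
        q = embed(query, self.vocab)
        scores = {uid: cosine(q, v) for uid, v in self.vectors.items()}
        top = sorted(scores, key=scores.get, reverse=True)[:limit]
        sql = "SELECT text FROM docs WHERE id = ?"
        return [(uid, self.db.execute(sql, (uid,)).fetchone()[0], scores[uid]) for uid in top]


db = ToyEmbeddingsDB(vocab=["vector", "search", "sql", "graph"])
db.index([
    "vector search with sql",
    "graph analysis of topics",
    "sql vector search over documents",
])
results = db.search("graph analysis")
print(results)
print(db.graph)
```

In this toy, the first and third rows share the same vocabulary terms and end up linked in the graph, while the query matches only the second row. A real system would swap the term counts for model-generated embeddings and an approximate nearest neighbor index.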
