feat: Added Document storage pros and cons to FAQ
tazarov committed Sep 5, 2024
1 parent 84cb74b commit 3fc9402
Showing 1 changed file with 23 additions and 1 deletion.
24 changes: 23 additions & 1 deletion docs/faq/index.md
@@ -12,7 +12,9 @@ with Chroma. These information below is based on interactions with the Chroma co

### What does Chroma use to index embedding vectors?

Chroma uses its own [fork]() of HNSW lib for indexing and searching embeddings.
Chroma uses its own [fork]() of HNSW lib for indexing and searching embeddings. In addition to HNSW, Chroma also uses a
brute-force index, which buffers newly added embeddings before they are merged into the HNSW graph and is searched
exhaustively using the same distance metric as the HNSW index.
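
To make the buffering idea concrete, here is a minimal, purely conceptual sketch of a brute-force buffer in plain Python. It is not Chroma's implementation: the class name `BruteForceBuffer`, the `batch_size` parameter, and the `flush()` method are hypothetical names chosen for illustration, and cosine distance is assumed as the metric.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


class BruteForceBuffer:
    """Toy write buffer (hypothetical): holds vectors until `batch_size` is reached."""

    def __init__(self, batch_size, distance=cosine_distance):
        self.batch_size = batch_size
        self.distance = distance
        self.pending = {}  # id -> vector, not yet merged into the graph index

    def add(self, vec_id, vector):
        self.pending[vec_id] = vector
        # Tell the caller whether the buffer is due to be drained into the graph.
        return len(self.pending) >= self.batch_size

    def query(self, vector, k):
        # Exhaustive search: score every buffered vector, keep the k closest.
        scored = [(self.distance(vector, v), i) for i, v in self.pending.items()]
        return sorted(scored)[:k]

    def flush(self):
        # Hand the buffered vectors to whatever updates the graph index.
        drained, self.pending = self.pending, {}
        return drained


buf = BruteForceBuffer(batch_size=3)
buf.add("a", [1.0, 0.0])
buf.add("b", [0.0, 1.0])
is_full = buf.add("c", [1.0, 1.0])
nearest = buf.query([1.0, 0.1], k=2)
print(is_full, [vec_id for _, vec_id in nearest])
```

Because the buffer uses the same distance function as the main index would, results from both can be merged and ranked together at query time.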

**Alternative Questions:**

@@ -66,6 +68,26 @@ print(ef(["test"]))
the embeddings before adding them to Chroma (pass `normalize_embeddings=True` to the `SentenceTransformerEmbeddingFunction`
EF constructor).
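
For embeddings computed outside of that embedding function, normalization can also be done by hand before adding them. The following is a minimal sketch in plain Python; `l2_normalize` is a hypothetical helper, not part of Chroma, and it assumes L2 (unit-length) normalization is what is wanted.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


unit = l2_normalize([3.0, 4.0])
print(unit)
```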

### Should I store my documents in Chroma?

> Note: This applies to Chroma single-node and local embedded clients (Chroma version ca. 0.5.x).

Chroma allows users to store both embeddings and documents, alongside metadata, in collections. Documents and metadata
are both optional, and depending on your use case you may choose to store them in Chroma, externally, or not at all.

Here are some pros/cons to help you decide whether to store your documents in Chroma:

**Pros:**

- Keeps all the data in the same place. You don't have to manage a separate DB for the documents
- Allows you to do keyword searches on the documents

**Cons:**

- The database can grow substantially in size because documents are effectively stored twice: once as metadata for
  queries and once in the FTS5 index.
- Queries can take a performance hit.
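
If the cons outweigh the pros, one option is to keep documents in an external store keyed by the same IDs as the embeddings and look them up after a vector query. The sketch below uses an in-memory SQLite table as a stand-in for that store; the table layout, the `fetch_documents` helper, and the sample IDs are all assumptions made for illustration, and no Chroma calls are shown.

```python
import sqlite3

# Hypothetical external store: a plain SQLite table keyed by the same IDs
# that would be used for the embeddings in the collection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO documents (id, body) VALUES (?, ?)",
    [
        ("doc-1", "HNSW is a graph-based index."),
        ("doc-2", "FTS5 powers keyword search."),
    ],
)


def fetch_documents(ids):
    """Return document bodies for `ids`, preserving the order of `ids`."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, body FROM documents WHERE id IN ({placeholders})", list(ids)
    ).fetchall()
    by_id = dict(rows)
    # Unknown IDs are skipped rather than raising.
    return [by_id[i] for i in ids if i in by_id]


# Stand-in for the IDs a vector query would return, nearest first.
result_ids = ["doc-2", "doc-1"]
print(fetch_documents(result_ids))
```

The trade-off is the one listed under the pros: with documents held elsewhere, there is a second store to manage, and keyword search over documents has to be handled outside Chroma.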

## Commonly Encountered Problems

### Collection Dimensionality Mismatch