Add embeddings cluster documentation
davidmezzetti committed May 21, 2021
1 parent 6ac0a8c commit b9e84d9
Showing 1 changed file with 15 additions and 0 deletions.
15 changes: 15 additions & 0 deletions docs/api.md
```
docker run --name txtai.api --runtime=nvidia -p 8000:8000 --rm -it txtai.api
```

This will bring up an API instance without having to install Python, txtai or any dependencies on your machine!

## Distributed embeddings clusters

The API supports combining multiple API instances into a single logical embeddings index. An example configuration is shown below.

```yaml
cluster:
shards:
- http://127.0.0.1:8002
- http://127.0.0.1:8003
```
This configuration aggregates the API instances above into index shards. Data is evenly split among the shards at index time. Queries are run in parallel against each shard and the results are joined together. This approach enables horizontal scaling and supports very large index clusters.

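The split-at-index-time, scatter-gather-at-query-time flow described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not txtai's implementation: the `Shard` class stands in for a remote API instance, and token overlap stands in for a real ANN similarity query.

```python
from concurrent.futures import ThreadPoolExecutor

class Shard:
    """Toy stand-in for a remote API shard instance (hypothetical)."""

    def __init__(self):
        self.data = []

    def add(self, documents):
        # Each shard indexes only the documents routed to it
        self.data.extend(documents)

    def search(self, query, limit):
        # Score by naive token overlap; a real shard would run an ANN query
        tokens = set(query.split())
        scored = [(uid, len(tokens & set(text.split()))) for uid, text in self.data]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:limit]

class Cluster:
    """Sketch of the scatter-gather pattern the cluster config describes."""

    def __init__(self, shards):
        self.shards = shards

    def add(self, documents):
        # Evenly split data among shards at index time (round-robin here)
        for index, document in enumerate(documents):
            self.shards[index % len(self.shards)].add([document])

    def search(self, query, limit=3):
        # Run the query against every shard in parallel, then join results
        with ThreadPoolExecutor() as pool:
            partials = list(pool.map(lambda shard: shard.search(query, limit), self.shards))
        merged = [result for partial in partials for result in partial]
        merged.sort(key=lambda item: item[1], reverse=True)
        return merged[:limit]

cluster = Cluster([Shard(), Shard()])
cluster.add([(0, "cats and dogs"), (1, "feel good story"), (2, "climate change")])
results = cluster.search("dogs", limit=1)
```

Because each shard only holds a fraction of the data, both indexing and querying scale horizontally as shards are added.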
This method is only recommended for datasets with 1 billion+ records. The ANN libraries can easily support smaller datasets, where this method is not worth the additional complexity. At this time, new shards cannot be added after the initial index is built.

## Differences between Python and API

The txtai API provides all the major functionality found in this project. But there are differences due to the nature of JSON and variations across the supported programming languages. For example, any callable Python method is exposed as a named endpoint (i.e. instead of summary() the method call would be summary.summary()).
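To illustrate the naming convention, a small helper can build the URL for such a named endpoint. This is a hypothetical helper, not part of txtai; the base URL and parameter name are assumptions for illustration.

```python
from urllib.parse import urlencode

def endpoint_url(base, method, **params):
    # In Python, a pipeline instance is callable directly (e.g. summary(text));
    # over the API, that call maps to a named endpoint path instead.
    return f"{base}/{method}?{urlencode(params)}"

# Hypothetical: the summary() call maps to a named summary endpoint
url = endpoint_url("http://localhost:8000", "summary", text="Long article text")
```

Generated API clients follow the same convention, which is why the method call becomes summary.summary() rather than a bare callable.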
