amikos-tech/chromadb-java-client

Chroma Vector Database Java Client

This is a basic Java implementation of the Chroma Vector Database API.

This client works with Chroma version 0.4.3.

Features

Embeddings Support

  • OpenAI API
  • Cohere API (including Multi-language support)
  • Sentence Transformers
  • PaLM API
  • Custom Embedding Function

Feature Parity with ChromaDB API

  • Reset
  • Heartbeat
  • List Collections
  • Raw SQL
  • Get Version
  • Create Collection
  • Delete Collection
  • Collection Add
  • Collection Get (partial without additional parameters)
  • Collection Count
  • Collection Query
  • Collection Modify
  • Collection Update
  • Collection Upsert
  • Collection Create Index
  • Collection Delete - delete documents in collection

TODO

Usage

Clone the repository and install the package locally:

git clone git@github.com:amikos-tech/chromadb-java-client.git

Install dependencies:

mvn clean compile

Ensure you have an instance of Chroma running. We recommend one of the two following options:

Run tests:

Important: Since the tests use the OpenAI API, you need to set the OPENAI_API_KEY environment variable. The simplest way is to create a .env file in the root of the repository and define the variable there.

mvn test
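
A .env file is a plain list of KEY=value lines. The sketch below shows one way such a file could be resolved by hand, preferring the real environment variable and falling back to the file. The names EnvSketch, parse, and resolve are hypothetical and not part of this client, which relies on the dotenv library in its example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of chromadb-java-client.
class EnvSketch {
    // Parse KEY=value lines, skipping blank lines and '#' comments.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> values = new HashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // ignore lines without a key or without '='
            }
            values.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return values;
    }

    // Prefer the process environment; fall back to the .env file if it exists.
    static String resolve(String key, Path envFile) throws IOException {
        String fromEnv = System.getenv(key);
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return fromEnv;
        }
        if (!Files.exists(envFile)) {
            return null;
        }
        return parse(Files.readAllLines(envFile)).get(key);
    }

    public static void main(String[] args) throws IOException {
        String key = resolve("OPENAI_API_KEY", Paths.get(".env"));
        System.out.println(key == null ? "OPENAI_API_KEY is not set" : "OPENAI_API_KEY found");
    }
}
```

Running a check like this before mvn test can make a missing key easier to spot.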

Example

import com.google.gson.internal.LinkedTreeMap;
import io.github.cdimascio.dotenv.Dotenv;
import tech.amikos.chromadb.Client;
import tech.amikos.chromadb.Collection;
import tech.amikos.chromadb.EmbeddingFunction;
import tech.amikos.chromadb.OpenAIEmbeddingFunction;
import tech.amikos.chromadb.handler.ApiException;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TestApi {
    public void testQueryExample() throws ApiException {
        // Connect to a locally running Chroma instance.
        Client client = new Client("http://localhost:8000");
        // Read the OpenAI API key from the .env file.
        Dotenv dotenv = Dotenv.load();
        String apiKey = dotenv.get("OPENAI_API_KEY");
        EmbeddingFunction ef = new OpenAIEmbeddingFunction(apiKey);
        Collection collection = client.createCollection("test-collection", null, true, ef);
        // One metadata map per document, in the same order as the documents.
        List<Map<String, String>> metadata = new ArrayList<>();
        metadata.add(new HashMap<String, String>() {{
            put("type", "scientist");
        }});
        metadata.add(new HashMap<String, String>() {{
            put("type", "spy");
        }});
        // Add two documents with ids "1" and "2".
        collection.add(null, metadata, Arrays.asList("Hello, my name is John. I am a Data Scientist.", "Hello, my name is Bond. I am a Spy."), Arrays.asList("1", "2"));
        // Query by text and request up to 10 results.
        LinkedTreeMap<String, Object> qr = collection.query(Arrays.asList("Who is the spy"), 10, null, null, null);
        System.out.println(qr);
    }
}

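The above should print output similar to the following (exact distances depend on the embedding model, and the metadatas entries reflect the values added to the collection):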

{ids=[[2, 1]], distances=[[0.28461432651150426, 0.5096168232841949]], metadatas=[[{key=value}, {key=value}]], embeddings=null, documents=[[Hello, my name is Bond. I am a Spy., Hello, my name is John. I am a Data Scientist.]]}
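
The printed result is a map of nested lists, with one inner list per query text. As a rough sketch of how it could be read, the code below pairs each id with its distance for the first query. It assumes the values under ids and distances are java.util.List instances shaped as in the output above; QueryResultSketch and describeFirstQuery are hypothetical names, and a plain HashMap stands in for the LinkedTreeMap returned by the client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of chromadb-java-client.
class QueryResultSketch {
    // Pair each returned id with its distance for the first query text.
    // Assumes "ids" and "distances" hold lists of lists of equal shape.
    @SuppressWarnings("unchecked")
    static List<String> describeFirstQuery(Map<String, Object> result) {
        List<List<Object>> ids = (List<List<Object>>) result.get("ids");
        List<List<Object>> distances = (List<List<Object>>) result.get("distances");
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < ids.get(0).size(); i++) {
            lines.add(ids.get(0).get(i) + " -> " + distances.get(0).get(i));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in data with the same shape as the output shown above.
        Map<String, Object> result = new HashMap<>();
        result.put("ids", Arrays.asList(Arrays.asList("2", "1")));
        result.put("distances", Arrays.asList(Arrays.asList(0.28461432651150426, 0.5096168232841949)));
        for (String line : describeFirstQuery(result)) {
            System.out.println(line);
        }
    }
}
```

Results come back ordered by ascending distance, so the first entry is the closest match.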

Development Notes

We have made some minor changes to the ChromaDB API specification (src/main/resources/openapi/api.yaml) so that it works with Java and Swagger Codegen. Statically typed languages like Java do not handle the anyOf and oneOf keywords well. This is also why we don't use the generated Java client for the OpenAI API.
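
To illustrate the mismatch: a schema field declared as oneOf a string or an array of strings has no single Java type, so it ends up as Object or needs a hand-written wrapper. The sketch below shows one common workaround. StringOrList is a hypothetical class and is not taken from this repository or from the generated code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical wrapper for a field that is oneOf(string, array of strings).
final class StringOrList {
    private final List<String> values;
    private final boolean single;

    private StringOrList(List<String> values, boolean single) {
        this.values = values;
        this.single = single;
    }

    // Wrap a single string value.
    static StringOrList of(String value) {
        return new StringOrList(Collections.singletonList(value), true);
    }

    // Wrap a list of string values.
    static StringOrList of(List<String> values) {
        return new StringOrList(values, false);
    }

    // True when the value was supplied as a single string.
    boolean isSingle() {
        return single;
    }

    // Uniform view: a single string is exposed as a one-element list.
    List<String> asList() {
        return values;
    }

    public static void main(String[] args) {
        System.out.println(StringOrList.of("a").asList());
        System.out.println(StringOrList.of(Arrays.asList("a", "b")).asList());
    }
}
```

Callers then work with asList() regardless of which form was supplied, and the serializer can check isSingle() to decide which JSON shape to emit.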

Contributing

Pull requests are welcome.
