[Bug]: EOFError: Ran out of input #3637

Open
tejas-blitz opened this issue Jan 31, 2025 · 0 comments
Labels
bug Something isn't working

Comments


tejas-blitz commented Jan 31, 2025

What happened?

I am running Chroma DB on an EC2 instance to store 3M embeddings. After 12.5M, the push started failing.

Code:

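import chromadb
from chromadb.utils.data_loaders import ImageLoader
from chromadb.utils.embedding_functions import OpenCLIPEmbeddingFunction

# HOST and PORT are placeholders for the EC2 instance running the Chroma server.
chroma_client = chromadb.HttpClient(host=HOST, port=PORT)

image_loader = ImageLoader()
multimodal_ef = OpenCLIPEmbeddingFunction()

multimodal_db = chroma_client.get_or_create_collection(
    name="multimodal_db",
    embedding_function=multimodal_ef,
    data_loader=image_loader,
)

# Fetch a single record, including its embedding.
result = multimodal_db.get(
    limit=1,
    include=["embeddings"],
)
print(result)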

I am using 300 Python workers, each pushing 5K embeddings per batch.
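For reference, a per-worker push loop of the kind described above might look like the sketch below. It is not the code that produced the error. `push_fn` is a hypothetical callable standing in for whatever wraps the collection's write call, and the retry count and backoff delays are assumed values, not settings from this report.

```python
import time
from typing import Callable, Iterator, Sequence


def iter_batches(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def push_with_retry(
    push_fn: Callable[[Sequence], None],
    items: Sequence,
    batch_size: int = 5000,   # 5K per batch, as in the report
    max_attempts: int = 3,    # assumed value
    base_delay: float = 1.0,  # assumed value, in seconds
) -> int:
    """Push items batch by batch, retrying a failed batch with exponential backoff.

    push_fn is a hypothetical wrapper around the collection write call.
    Returns the number of items pushed; re-raises the last error once
    max_attempts is exhausted for a batch.
    """
    pushed = 0
    for batch in iter_batches(items, batch_size):
        for attempt in range(1, max_attempts + 1):
            try:
                push_fn(batch)
                break
            except Exception:
                if attempt == max_attempts:
                    raise
                # Back off before retrying: base_delay, 2x, 4x, ...
                time.sleep(base_delay * 2 ** (attempt - 1))
        pushed += len(batch)
    return pushed
```

Retrying a write that partly succeeded can resend IDs that are already stored, so the wrapped call would need to tolerate duplicates.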

Error:

---------------------------------------------------------------------------
HTTPStatusError                           Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/chromadb/api/base_http_client.py in _raise_chroma_error(resp)
     98         try:
---> 99             resp.raise_for_status()
    100         except httpx.HTTPStatusError:

6 frames
/usr/local/lib/python3.11/dist-packages/httpx/_models.py in raise_for_status(self)
    828         message = message.format(self, error_type=error_type)
--> 829         raise HTTPStatusError(message, request=request, response=self)
    830 

HTTPStatusError: Server error '500 Internal Server Error' for url 'http://HOST:PORT/api/v2/tenants/default_tenant/databases/default_database/collections/eba65d07-e5fc-4e0d-a8dc-1364ea172cad/get'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
<ipython-input-5-b80efb7d61a4> in <cell line: 0>()
      4 # multimodal_db.peek()
      5 # multimodal_db.compact()
----> 6 result = multimodal_db.get(
      7     ids=["fuwxFFjY"],
      8     include=["embeddings", "metadatas"]

/usr/local/lib/python3.11/dist-packages/chromadb/api/models/Collection.py in get(self, ids, where, limit, offset, where_document, include)
    131         )
    132 
--> 133         get_results = self._client._get(
    134             collection_id=self.id,
    135             ids=get_request["ids"],

/usr/local/lib/python3.11/dist-packages/chromadb/telemetry/opentelemetry/__init__.py in wrapper(*args, **kwargs)
    148                 global tracer, granularity
    149                 if trace_granularity < granularity:
--> 150                     return f(*args, **kwargs)
    151                 if not tracer:
    152                     return f(*args, **kwargs)

/usr/local/lib/python3.11/dist-packages/chromadb/api/fastapi.py in _get(self, collection_id, ids, where, sort, limit, offset, page, page_size, where_document, include, tenant, database)
    372             limit = page_size
    373 
--> 374         resp_json = self._make_request(
    375             "post",
    376             f"/tenants/{tenant}/databases/{database}/collections/{collection_id}/get",

/usr/local/lib/python3.11/dist-packages/chromadb/api/fastapi.py in _make_request(self, method, path, **kwargs)
     88 
     89         response = self._session.request(method, url, **cast(Any, kwargs))
---> 90         BaseHTTPClient._raise_chroma_error(response)
     91         return orjson.loads(response.text)
     92 

/usr/local/lib/python3.11/dist-packages/chromadb/api/base_http_client.py in _raise_chroma_error(resp)
    101             trace_id = resp.headers.get("chroma-trace-id")
    102             if trace_id:
--> 103                 raise Exception(f"{resp.text} (trace ID: {trace_id})")
    104             raise (Exception(resp.text))

Exception: {"error":"EOFError('Ran out of input')"} (trace ID: 0)

I have tried this, but a new problem has now emerged: when I run the same code, I get an empty array for the embeddings:
'embeddings': array([], dtype=float64),

Does someone know a solution to this?
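To tell the empty-embeddings case apart from a normal response in a script, a small check along these lines may help. It is only a sketch: `has_embeddings` is a hypothetical helper, and it assumes the result is a dict-like object whose "embeddings" entry is either missing, None, or a sequence (list or NumPy array). It uses `len()` rather than truthiness because truth-testing a multi-element NumPy array raises an error.

```python
from typing import Any, Mapping


def has_embeddings(result: Mapping[str, Any]) -> bool:
    """Return True if the result carries at least one embedding.

    Hypothetical helper: treats a missing key, None, and an empty
    sequence or array as "no embeddings".
    """
    emb = result.get("embeddings")
    return emb is not None and len(emb) > 0
```

For example, `has_embeddings(result)` could be called right after `multimodal_db.get(...)` to log the IDs for which nothing came back.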

Versions

chromadb.__version__ is 0.6.3

Relevant log output

tejas-blitz added the bug label Jan 31, 2025