'text' parameter is deprecated and will be ignored. Future versions will remove this argument.
'tables' parameter is deprecated and will be ignored. Future versions will remove this argument.
Error while processing job ID 0: ../data/multimodal_test.pdf
[]: failed
Failed to process the message.
↪ Event that caused this failure: annotation::1bf595f6-2274-4097-8956-0d9b6841ed55 -> All images must have the same dimensions for gRPC batching. Found: [(532, 963, 3), (575, 970, 3)]
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
Cell In[11], line 17
1 from nv_ingest_client.client import Ingestor
3 ingestor = (
4 Ingestor(message_client_hostname="localhost")
5 .files("../data/multimodal_test.pdf")
(...)
14 ).vdb_upload()
15 )
---> 17 results = ingestor.ingest()
File ~/.local/lib/python3.10/site-packages/nv_ingest_client/client/interface.py:228, in Ingestor.ingest(self, **kwargs)
226 result = self._client.fetch_job_result(self._job_ids, **fetch_kwargs)
227 if self._vdb_bulk_upload:
--> 228 self._vdb_bulk_upload.run(result)
229 # only upload as part of jobs user specified this action
230 self._vdb_bulk_upload = None
File ~/.local/lib/python3.10/site-packages/nv_ingest_client/util/milvus.py:95, in MilvusOperator.run(self, records)
93 if isinstance(collection_name, str):
94 create_nvingest_collection(collection_name, **create_params)
---> 95 write_to_nvingest_collection(records, collection_name, **write_params)
96 elif isinstance(collection_name, dict):
97 split_params_list = _dict_to_params(collection_name, write_params)
File ~/.local/lib/python3.10/site-packages/nv_ingest_client/util/milvus.py:570, in write_to_nvingest_collection(records, collection_name, milvus_uri, minio_endpoint, sparse, enable_text, enable_charts, enable_tables, enable_images, bm25_save_path, compute_bm25_stats, access_key, secret_key, bucket_name)
568 bm25_ef = None
569 if sparse and compute_bm25_stats:
--> 570 bm25_ef = create_bm25_model(
571 records,
572 enable_text=enable_text,
573 enable_charts=enable_charts,
574 enable_tables=enable_tables,
575 enable_images=enable_images,
576 )
577 bm25_ef.save(bm25_save_path)
578 elif sparse and not compute_bm25_stats:
File ~/.local/lib/python3.10/site-packages/nv_ingest_client/util/milvus.py:456, in create_bm25_model(records, enable_text, enable_charts, enable_tables, enable_images)
453 analyzer = build_default_analyzer(language="en")
454 bm25_ef = BM25EmbeddingFunction(analyzer)
--> 456 bm25_ef.fit(all_text)
457 return bm25_ef
File ~/.local/lib/python3.10/site-packages/milvus_model/sparse/bm25/bm25.py:126, in BM25EmbeddingFunction.fit(self, corpus)
125 def fit(self, corpus: List[str]):
--> 126 self._rebuild(corpus)
File ~/.local/lib/python3.10/site-packages/milvus_model/sparse/bm25/bm25.py:112, in BM25EmbeddingFunction._rebuild(self, corpus)
110 self._clear()
111 corpus = self._tokenize_corpus(corpus)
--> 112 term_document_frequencies = self._compute_statistics(corpus)
113 self._calc_idf(term_document_frequencies)
114 self._calc_term_indices()
File ~/.local/lib/python3.10/site-packages/milvus_model/sparse/bm25/bm25.py:80, in BM25EmbeddingFunction._compute_statistics(self, corpus)
78 term_document_frequencies[word] += 1
79 self.corpus_size += 1
---> 80 self.avgdl = total_word_count / self.corpus_size
81 return term_document_frequencies
ZeroDivisionError: division by zero
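The last frame above divides `total_word_count` by `self.corpus_size`, so an empty corpus leaves the divisor at zero. Below is a minimal sketch of that step with an explicit empty-corpus check. The function name `average_document_length` and the error message are hypothetical and written only for illustration; this is not the `milvus_model` implementation, just the same arithmetic on a pre-tokenized corpus.

```python
from typing import List


def average_document_length(tokenized_corpus: List[List[str]]) -> float:
    """Mirror the avgdl step from the traceback, with an explicit empty check."""
    corpus_size = 0
    total_word_count = 0
    for document in tokenized_corpus:
        total_word_count += len(document)
        corpus_size += 1
    if corpus_size == 0:
        # Without this check the division below raises ZeroDivisionError,
        # which hides the upstream reason the corpus is empty.
        raise ValueError("no text chunks to fit BM25 on; check earlier pipeline stages")
    return total_word_count / corpus_size
```

A check like this would not fix the ingest, but it would surface the missing text chunks directly instead of a division error several frames deep.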
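The ZeroDivisionError is raised by the Milvus BM25 code (`milvus_model`) during the sparse vector calculation, because no chunks of text reach that stage and the corpus is empty. The real error appears to be the earlier failure in the log, "All images must have the same dimensions for gRPC batching", which reports two images of shapes (532, 963, 3) and (575, 970, 3). This suggests the images are not being resized to a common shape before batching.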
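To make the dimension mismatch concrete, here is a small sketch that checks whether a list of image shapes is uniform and computes the smallest shape all of them could be padded to. The helper `common_padded_shape` is hypothetical and not part of nv-ingest; padding to a common size is only one possible approach, and the actual fix may resize the images instead.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def common_padded_shape(shapes: List[Shape]) -> Shape:
    """Return the smallest (H, W, C) that every image could be padded to."""
    if not shapes:
        raise ValueError("no image shapes given")
    channels = {c for _, _, c in shapes}
    if len(channels) != 1:
        raise ValueError(f"mixed channel counts: {sorted(channels)}")
    height = max(h for h, _, _ in shapes)
    width = max(w for _, w, _ in shapes)
    return (height, width, channels.pop())


shapes = [(532, 963, 3), (575, 970, 3)]  # shapes reported in the log
print(len(set(shapes)) == 1)        # False: the batch is not uniform
print(common_padded_shape(shapes))  # (575, 970, 3)
```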
Version
main
Which installation method(s) does this occur on?
No response
Describe the bug.
Running the example https://github.com/NVIDIA/nv-ingest/blob/main/examples/langchain_multimodal_rag.ipynb in a brev.dev environment fails during ingest with the error shown in the log above.