Exception in chat.py due to maximal_marginal_relevance Invalid Argument in DeepLake Similarity Search #28

Open
digitalbuddha opened this issue Feb 29, 2024 · 3 comments

@digitalbuddha

Howdy! I was going through the README, and all was well until I got to the search step.

Describe the bug
An uncaught exception occurs in the chat.py module when executing a similarity search through the DeepLake vector store. The traceback indicates that the maximal_marginal_relevance argument is not a valid parameter for the search method. This results in a failure of the search_db function, impacting the chat application's ability to process and respond to user inputs.

To Reproduce
1. Set up the environment and dependencies as per the project requirements.
2. Run the chat application using the command: python src/main.py chat --activeloop-dataset-name my-dataset
3. Input a query that triggers the search_db function, for example "what are the apis of the project".
4. The application throws the exception and terminates.

Expected behavior
The expected behavior is for the application to successfully process the query and return relevant results without crashing. The maximal_marginal_relevance argument should either be correctly handled or removed if it's not applicable to the similarity search method in the DeepLake vector store.

Exception below:

2024-02-29 09:03:58.916 Uncaught app exception
Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
  File "/Users/mnakhimovich/workspace/Chat-with-Github-Repo/src/utils/chat.py", line 93, in <module>
    run_chat_app(args.activeloop_dataset_path)
  File "/Users/mnakhimovich/workspace/Chat-with-Github-Repo/src/utils/chat.py", line 42, in run_chat_app
    output = search_db(db, user_input)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/mnakhimovich/workspace/Chat-with-Github-Repo/src/utils/chat.py", line 85, in search_db
    return qa.run(query)
           ^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain_core/_api/deprecation.py", line 145, in warning_emitting_wrapper
    return wrapped(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain/chains/base.py", line 545, in run
    return self(args[0], callbacks=callbacks, tags=tags, metadata=metadata)[
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain_core/_api/deprecation.py", line 145, in warning_emitting_wrapper
    return wrapped(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain/chains/base.py", line 378, in __call__
    return self.invoke(
           ^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain/chains/base.py", line 163, in invoke
    raise e
  File "/opt/homebrew/lib/python3.11/site-packages/langchain/chains/base.py", line 153, in invoke
    self._call(inputs, run_manager=run_manager)
  File "/opt/homebrew/lib/python3.11/site-packages/langchain/chains/retrieval_qa/base.py", line 141, in _call
    docs = self._get_docs(question, run_manager=_run_manager)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain/chains/retrieval_qa/base.py", line 221, in _get_docs
    return self.retriever.get_relevant_documents(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain_core/retrievers.py", line 244, in get_relevant_documents
    raise e
  File "/opt/homebrew/lib/python3.11/site-packages/langchain_core/retrievers.py", line 237, in get_relevant_documents
    result = self._get_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 674, in _get_relevant_documents
    docs = self.vectorstore.similarity_search(query, **self.search_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain_community/vectorstores/deeplake.py", line 530, in similarity_search
    return self._search(
           ^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/langchain_community/vectorstores/deeplake.py", line 402, in _search
    self._validate_kwargs(kwargs, "search")
  File "/opt/homebrew/lib/python3.11/site-packages/langchain_community/vectorstores/deeplake.py", line 929, in _validate_kwargs
    raise TypeError(
TypeError: `maximal_marginal_relevance` are not a valid argument to search method

@vardotexe

Hey, did you find a solution for this?

@mameesie

mameesie commented Oct 6, 2024

Same here.

@mameesie

mameesie commented Oct 6, 2024

In src/utils/chat.py, replace the search_db function with the following and it should work:

def search_db(db, query):
    """Search for a response to the query in the DeepLake database using MMR."""
    # Create a retriever that uses MMR search
    retriever = db.as_retriever(search_type="mmr")
    
    # Set the search parameters
    retriever.search_kwargs = {
        "distance_metric": "cos",
        "fetch_k": 100,  # Number of initial results to fetch
        "k": 10,         # Number of results to finally return
        #"lambda_mult": 0.5  # MMR diversity parameter (0 = max diversity, 1 = max relevance)
    }
    
    # Create a ChatOpenAI model instance
    model = ChatOpenAI(model="gpt-3.5-turbo")
    
    # Create a RetrievalQA instance from the model and retriever
    qa = RetrievalQA.from_llm(model, retriever=retriever)
    
    # Return the result of the query
    return qa.run(query)
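
For anyone wondering why the original call fails: going by the traceback, the retriever forwards its search_kwargs straight into similarity_search, and the vector store then rejects any key it does not recognise. Below is a rough toy model of that flow. It is not the real langchain or DeepLake code: ToyRetriever, validate_kwargs, and the ALLOWED_SEARCH_KWARGS set are made-up stand-ins for illustration only, and the assumption that the old code passed maximal_marginal_relevance inside search_kwargs is inferred from the error message.

```python
# Toy model of the failure seen in the traceback. NOT the real library code.
# ALLOWED_SEARCH_KWARGS is an illustrative assumption, not DeepLake's actual list.
ALLOWED_SEARCH_KWARGS = {"k", "fetch_k", "distance_metric", "lambda_mult"}


def validate_kwargs(kwargs):
    """Raise TypeError if any keyword is not in the allowed set."""
    invalid = sorted(set(kwargs) - ALLOWED_SEARCH_KWARGS)
    if invalid:
        names = ", ".join(invalid)
        raise TypeError(f"`{names}` are not a valid argument to search method")


class ToyRetriever:
    """Hypothetical stand-in for a retriever wrapping a vector store."""

    def __init__(self, search_type="similarity", search_kwargs=None):
        self.search_type = search_type
        self.search_kwargs = search_kwargs or {}

    def get_relevant_documents(self, query):
        # The kwargs are passed through unchanged, so an unknown key fails here.
        validate_kwargs(self.search_kwargs)
        return f"{self.search_type} search for {query!r}"


if __name__ == "__main__":
    # Presumed old style: MMR requested via an unrecognised keyword.
    old = ToyRetriever(search_kwargs={"maximal_marginal_relevance": True, "k": 10})
    try:
        old.get_relevant_documents("what are the apis of the project")
    except TypeError as exc:
        print("old style fails:", exc)

    # Style used in the fix: MMR chosen via search_type, only known keys passed.
    new = ToyRetriever(
        search_type="mmr",
        search_kwargs={"distance_metric": "cos", "fetch_k": 100, "k": 10},
    )
    print(new.get_relevant_documents("what are the apis of the project"))
```

The point of the fix is the same as in the toy: MMR is selected through search_type="mmr" instead of a keyword argument, so only keys the search method accepts are forwarded.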
