An advanced Retrieval-Augmented Generation (RAG) solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve. This project showcases a sophisticated deterministic graph acting as the "brain" of a highly controllable autonomous agent capable of answering non-trivial questions from your own data.
Explore my comprehensive guide on RAG techniques to complement this advanced agent implementation.
Explore my GenAI Agents Repository for many other AI agent implementations and tutorials that complement this one.
- Sophisticated Deterministic Graph: Acts as the "brain" of the agent, enabling complex reasoning.
- Controllable Autonomous Agent: Capable of answering non-trivial questions from custom datasets.
- Hallucination Prevention: Ensures answers are solely based on provided data, avoiding AI hallucinations.
- Multi-step Reasoning: Breaks down complex queries into manageable sub-tasks.
- Adaptive Planning: Continuously updates its plan based on new information.
- Performance Evaluation: Uses Ragas metrics for comprehensive quality assessment.
- PDF Loading and Processing: Load PDF documents and split them into chapters.
- Text Preprocessing: Clean and preprocess the text for better summarization and encoding.
- Summarization: Generate extensive summaries of each chapter using large language models.
- Book Quotes Database Creation: Create a database for specific questions that will need access to quotes from the book.
- Vector Store Encoding: Encode the book content and chapter summaries into vector stores for efficient retrieval.
- Question Processing:
- Anonymize the question by replacing named entities with variables.
- Generate a high-level plan to answer the anonymized question.
- De-anonymize the plan and break it down into retrievable or answerable tasks.
- Task Execution:
- For each task, decide whether to retrieve information or answer based on context.
- If retrieving, fetch relevant information from vector stores and distill it.
- If answering, generate a response using chain-of-thought reasoning.
- Verification and Re-planning:
- Verify that generated content is grounded in the original context.
- Re-plan remaining steps based on new information.
- Final Answer Generation: Produce the final answer using accumulated context and chain-of-thought reasoning.
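The steps above can be sketched as a small control loop. This is a minimal sketch under stated assumptions, not the repository's implementation: `AgentState`, `run_agent`, and every callable passed in are hypothetical names that stand in for the LLM calls and vector-store lookups. For brevity, only generated answers go through the grounding check here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentState:
    """Hypothetical state container: the question, remaining tasks, gathered facts."""
    question: str
    plan: List[str]
    context: List[str] = field(default_factory=list)


def run_agent(
    state: AgentState,
    choose_tool: Callable[[str, List[str]], str],    # returns "retrieve" or "answer"
    retrieve: Callable[[str], str],                  # fetch and distill content for a task
    answer: Callable[[str, List[str]], str],         # answer a task from the context
    is_grounded: Callable[[str, List[str]], bool],   # check a candidate against the context
    replan: Callable[[AgentState], List[str]],       # return the updated remaining tasks
    final_answer: Callable[[str, List[str]], str],   # compose the final answer
    max_steps: int = 10,
) -> str:
    """Execute tasks one by one, re-planning after each, then produce the final answer."""
    for _ in range(max_steps):
        if not state.plan:
            break
        task = state.plan.pop(0)
        if choose_tool(task, state.context) == "retrieve":
            state.context.append(retrieve(task))
        else:
            candidate = answer(task, state.context)
            # Keep a generated answer only if it is supported by the context so far.
            if is_grounded(candidate, state.context):
                state.context.append(candidate)
        # Let the planner revise the remaining steps in light of the new information.
        state.plan = replan(state)
    return final_answer(state.question, state.context)
```

Passing the steps in as callables keeps the loop itself deterministic and easy to test with stubs, while the model-backed parts can be swapped freely. The `max_steps` cap is an added safeguard against a planner that never empties the plan.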
The solution is evaluated using the following Ragas metrics:
- Answer Correctness
- Faithfulness
- Answer Relevancy
- Context Recall
- Answer Similarity
The algorithm was tested on the first Harry Potter book. This choice makes it possible to monitor whether the model relies on its pre-trained knowledge or strictly on the information retrieved from the vector stores.
Q: How did the protagonist defeat the villain's assistant?
To solve this question, the following steps are necessary:
- Identify the protagonist of the plot.
- Identify the villain.
- Identify the villain's assistant.
- Search for confrontations or interactions between the protagonist and the villain.
- Deduce the reason that led the protagonist to defeat the assistant.
The agent's ability to break down and solve such complex queries demonstrates its sophisticated reasoning capabilities.
- Python 3.8+
- API key for your chosen LLM provider
- Clone the repository:
git clone https://github.com/NirDiamant/Controllable-RAG-Agent.git
cd Controllable-RAG-Agent
- Set up environment variables:
Create a .env file in the root directory with your API key:
OPENAI_API_KEY=
GROQ_API_KEY=
You can look at the .env.example file for reference.
- Run the following command to build the Docker image:
docker-compose up --build
- Install required packages:
pip install -r requirements.txt
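Before launching the agent, it can help to confirm that the .env file from the setup step contains a usable key. The following is a standard-library-only sketch. How the project itself loads the file is not shown in this README, and `read_env_file` and `configured_providers` are hypothetical helper names. Only the two variable names listed above are checked.

```python
from pathlib import Path
from typing import Dict, List


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(env: Dict[str, str]) -> List[str]:
    """Return which of the two keys named in this README have a non-empty value."""
    return [key for key in ("OPENAI_API_KEY", "GROQ_API_KEY") if env.get(key)]


if __name__ == "__main__":
    found = configured_providers(read_env_file())
    print("Configured keys:", ", ".join(found) if found else "none")
```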
- Explore the step-by-step tutorial:
sophisticated_rag_agent_harry_potter.ipynb
- Run real-time agent visualization (no Docker):
streamlit run simulate_agent.py
- Run real-time agent visualization (with Docker): open your browser and go to
http://localhost:8501/
- LangChain
- FAISS Vector Store
- Streamlit (for visualization)
- Ragas (for evaluation)
- Flexible integration with various LLMs (e.g., OpenAI GPT models, Groq, or others of your choice)
- Encoding the book content in chunks, LLM-generated chapter summaries, and quotes from the book.
- Anonymizing the question so that the general plan is created without bias from the pre-trained knowledge of any LLM involved.
- Breaking down each task from the plan to be executed by custom functions with full control.
- Distilling retrieved content for more accurate LLM generations, minimizing hallucinations.
- Answering a question based on context using a Chain of Thought, which includes both positive and negative examples, to arrive at a well-reasoned answer rather than just a straightforward response.
- Content verification and hallucination-free verification as suggested in "Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection" - https://arxiv.org/abs/2310.11511.
- Utilizing an ongoing updated plan made by an LLM to solve complicated questions. Some ideas are derived from "Plan-and-Solve Prompting" - https://arxiv.org/abs/2305.04091 and the "babyagi" project - https://github.com/yoheinakajima/babyagi.
- Evaluating the model's performance using Ragas metrics such as answer correctness, faithfulness, answer relevancy, context recall, and answer similarity to ensure high-quality answers.
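The anonymization idea from the list above can be illustrated with a small sketch. It assumes the named entities have already been identified (in the project, that step involves an LLM) and only shows the substitution and its reversal. The function names, the `<E1>`-style placeholders, and the sample question are illustrative assumptions, not the repository's code.

```python
import re
from typing import Dict, List, Tuple


def anonymize(question: str, entities: List[str]) -> Tuple[str, Dict[str, str]]:
    """Replace each given entity with a placeholder and return the reverse mapping."""
    mapping: Dict[str, str] = {}
    text = question
    # De-duplicate, then handle longer names first so a full name is replaced
    # before any shorter name contained in it.
    unique = list(dict.fromkeys(entities))
    for i, entity in enumerate(sorted(unique, key=len, reverse=True), start=1):
        variable = f"<E{i}>"
        pattern = r"\b" + re.escape(entity) + r"\b"
        if re.search(pattern, text):
            text = re.sub(pattern, variable, text)
            mapping[variable] = entity
    return text, mapping


def deanonymize(text: str, mapping: Dict[str, str]) -> str:
    """Put the original entity names back into a plan or answer."""
    for variable, entity in mapping.items():
        text = text.replace(variable, entity)
    return text


if __name__ == "__main__":
    anonymized, mapping = anonymize("How did Harry defeat Quirrell?", ["Harry", "Quirrell"])
    print(anonymized)  # the planner only ever sees this entity-free version
    print(deanonymize("Find how <E2> confronted <E1>.", mapping))
```

Because the planner only sees placeholders, it cannot draw on what it may already know about the named characters. The names are restored before the plan is broken into retrievable tasks.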
Contributions are welcome! Please feel free to submit a pull request or open an issue for any suggestions or improvements.
Special thanks to Elad Levi for the valuable advice and ideas.
This project is licensed under the Apache-2.0 License - see the LICENSE file for details.
⭐ If you find this repository helpful, please consider giving it a star!
Keywords: RAG, Retrieval-Augmented Generation, Agent, LangGraph, NLP, AI, Machine Learning, Information Retrieval, Natural Language Processing, LLM, Embeddings, Semantic Search