This framework provides flexible Retrieval Augmented Generation (RAG) by integrating with frameworks like Haystack and LlamaIndex. It uses parameterized configurations to work with a range of transformer models and LLMs. Key features include:
- Integration with Weaviate vector database and Nebula graph database.
- Custom retrievers combining Hybrid Search, Embedding Fine-tuning, Generative Pseudo Labelling, and ReRanking techniques.
Clone the repository:
git clone https://github.com/a-romero/qrage.git
cd qrage
- Navigate to the standard directory:
cd vectordb/weaviate/standard
- Start the Weaviate instance using Docker:
docker-compose up -d
- Navigate to the advanced directory:
cd vectordb/weaviate/advanced
- Start the Weaviate instance using Docker:
docker-compose up -d
Note: The Weaviate service becomes available locally at http://localhost:8080
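Before ingesting data it can help to confirm that the container is actually up. The sketch below polls Weaviate's readiness probe (`/v1/.well-known/ready`) using only the standard library. The helper name `weaviate_is_ready` and the timeout value are assumptions made for illustration and are not part of this repository.

```python
import urllib.error
import urllib.request


def weaviate_is_ready(base_url: str = "http://localhost:8080", timeout: float = 2.0) -> bool:
    """Return True if the Weaviate readiness probe answers with HTTP 200.

    Hypothetical helper: any connection error, HTTP error, or malformed URL
    is treated as "not ready" rather than raised.
    """
    probe = f"{base_url}/v1/.well-known/ready"
    try:
        with urllib.request.urlopen(probe, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

Calling `weaviate_is_ready()` right after `docker-compose up -d` may return False for a few seconds while the service starts, so retrying in a short loop is a reasonable pattern.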
- Navigate to the NebulaGraph directory:
cd graphdb/nebulagraph
- Start the Nebula instance using Docker:
docker-compose up -d
Note: The NebulaGraph service becomes available locally on port 9669 (http://localhost:9669) with the default credentials root/nebula
- OpenAI API key exported to the local environment variable OPENAI_API_KEY
- HuggingFace API key exported to the local environment variable HUGGINGFACEHUB_API_TOKEN
- Cohere API key exported to the local environment variable COHERE_API_KEY
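A quick way to catch a missing key before a pipeline fails midway is to check the environment up front. This is a minimal sketch: the helper `missing_keys` is hypothetical, and whether all three keys are needed for a given run likely depends on which models are selected.

```python
import os

# Variable names taken from the list above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "HUGGINGFACEHUB_API_TOKEN", "COHERE_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the names of required variables that are unset or empty.

    `env` defaults to the process environment; passing a dict makes the
    check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

For example, `missing_keys()` returns an empty list when all three variables are exported in the current shell.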
The following file formats are supported:
- .text
- .md
- .docx
For data input, the following sources are supported:
- Local file
- Local directory (recursive ingestion)
- HTTP(S)
- S3
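To make the source types and recursive ingestion concrete, the sketch below classifies a source string and walks a directory for the supported extensions. It is an illustration only: the function names, the `s3://` URI form, and the returned labels are assumptions, not the framework's actual loader.

```python
from pathlib import Path
from urllib.parse import urlparse

# Extensions copied from the supported-formats list above.
SUPPORTED_SUFFIXES = {".text", ".md", ".docx"}


def classify_source(source: str) -> str:
    """Label a source as 'http', 's3', 'directory', or 'file' (hypothetical labels)."""
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return "http"
    if scheme == "s3":  # assumes S3 sources are given as s3://bucket/key
        return "s3"
    return "directory" if Path(source).is_dir() else "file"


def iter_supported_files(directory: str):
    """Recursively list files under `directory` that have a supported suffix."""
    return sorted(
        p for p in Path(directory).rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )
```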
The embedding models supported are:
- "sentence-transformer" => "sentence-transformers/multi-qa-mpnet-base-dot-v1"
- (OpenAI) "ada" => "text-embedding-ada-002"
- (Cohere) "embed" => "embed-multilingual-v2.0"
haystack_embed.embed(
    source="./data",
    index_name="test",
    recreate_index=True,
    batch_size=5,
    model="sentence-transformer",
    dim=768,
    gpl=False,
    language="en"
)
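The `dim` argument has to match the output size of the chosen embedding model, otherwise the index and the vectors disagree. The lookup below is a sketch of how an alias could be resolved to a model name and dimension. The table `EMBEDDING_MODELS` and the helper `resolve_embedding` are hypothetical; the 768 for "sentence-transformer" comes from the call above, while the other two dimensions are taken from the providers' public model documentation rather than from this repository.

```python
# alias -> (model name, embedding dimension)
EMBEDDING_MODELS = {
    "sentence-transformer": ("sentence-transformers/multi-qa-mpnet-base-dot-v1", 768),
    "ada": ("text-embedding-ada-002", 1536),
    "embed": ("embed-multilingual-v2.0", 768),
}


def resolve_embedding(alias: str):
    """Return (model_name, dim) for a supported alias, or raise ValueError."""
    try:
        return EMBEDDING_MODELS[alias]
    except KeyError:
        supported = ", ".join(sorted(EMBEDDING_MODELS))
        raise ValueError(f"Unknown embedding model {alias!r}; expected one of: {supported}") from None
```

With this table, `resolve_embedding("ada")` yields a dimension of 1536, which is the value that would be passed as `dim` when switching to the OpenAI model.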
The generative models supported are:
- (Mistral) "mistral" => "mistralai/Mistral-7B-Instruct-v0.1"
- (TII) "falcon" => "tiiuae/falcon-7b-instruct"
- (OpenAI) "gpt-3.5-turbo"
- (OpenAI) "gpt-4"
- (OpenAI) "gpt-4-turbo" => "gpt-4-1106-preview"
- (Cohere) "command"
haystack_generate.generateWithVectorDB(
    query="How would Revolut be impacted by AIG going bankrupt?",
    index_name="test",
    embedding_model="sentence-transformer",
    dim=768,
    generative_model="gpt-4",
    top_k=5,
    draw_pipeline=False
)
Response:
Retriever: <haystack.nodes.retriever.dense.EmbeddingRetriever object at 0x7f39f1be1050>
Prompt: <haystack.nodes.prompt.prompt_node.PromptNode object at 0x7f390e2ce010>
Ranker: <haystack.nodes.ranker.cohere.CohereRanker object at 0x7f390c3c1dd0>
Batches: 100%|██████████| 1/1 [00:00<00:00, 22.21it/s]
Answer: {'answers': [<Answer {'answer': "If AIG were to go bankrupt, Revolut could be impacted because it sells AIG car insurance. This could result in a loss of an insurance provider and potentially disrupt services for customers who have purchased AIG car insurance through Revolut. It might also impact Revolut's plans for launching mortgage loans.", 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_ids': ['073d20d7-9223-eac5-9333-54d528bc5a91', '4a723bfc-a5fd-9905-3ec7-c71ffacc2ccb', '26f31a1e-7109-1b8c-c25c-9f5283be1a93', '03aefc09-9643-93c2-4e83-f343006077a9', '67835bb8-e6fb-1c6c-a246-6e74c5f9b9b3'], 'meta': {'prompt': "Given the provided Documents, answer the Query.\n\n Query: How would Revolut be impacted by AIG going bankrupt?\n\n Documents: News • Jul 31, 2023\nRevolut to sell AIG car insurance, may be closer to launching mortgage loans News • Aug 4, 2023\nChannel News Asia – Revolut to stop crypto services for US customers\nNews • Aug 2, 2023\nFintech Singapore – Revolut Launches Instant Card Transfers to Over 80 Countries\nNews • Aug 2, 2023\nCrowdfund Insider – Revolut, Game4Ukraine to Raise Funds for Reconstruction of Ukrainian School\nNumber of Articles\n1,699\nRevolut SAVE\nGo back to Revolut's Signals & News\x0c16/08/2023, 23:44 Revolut - Recent News & Activity\nhttps://www.crunchbase.com/organization/revolut/signals_and_news/timeline 2/9\nNews • Jul 31, 2023\nRTE.ie – Revolut starts phasing in car insurance offering\nNews • Jul 31, 2023\nIrish Examiner – Revolut to sell AIG car insurance, may be closer to launching mortgage loans\nNews • Jul 31, 2023\nThe Independent.ie – ‘30pc cheaper rates’ promised – Revolut car insurance launches in Ireland today\nwith a quote taking just ‘minutes’ on the app\nNews • Jul 25, 2023\nPYMNTS.com – Revolut Extends Accounts in US to Non-Citizens\nNews • Jul 24, 2023\nFinextra – Revolut launches joint accounts in the UK\nNews • Jul 19, 2023\nFStech – Monzo, Revolut and Wise demand ‘urgent review’ of hidden international fees\nNews • Jul 14, 2023\nSifted – Revolut moves closer to super-app status by adding tour and travel experience bookings\nNews • Jul 14, 2023\ntechbuzzireland – Revolut launches marketplace with over 300,000 tours, activities, and attractions\nas it supercharges trips around the world\nNews • Jul 14, 2023\nAltFi – Revolut expands travel offering with in-app marketplace\nNews • Jul 14, 2023\nBusinessCloud – Revolut launches ‘Experiences’ marketplace\nNews
...
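The printed result is verbose because it echoes the full prompt. If the call returns the dictionary shown above, the answer text and source document IDs can be pulled out of the `answers` list. The sketch below assumes Haystack 1.x `Answer` objects, which expose `answer` and `document_ids` attributes; the helper `summarize_result` is hypothetical, and a `SimpleNamespace` stands in for a real `Answer` so the example runs without Haystack installed.

```python
from types import SimpleNamespace


def summarize_result(result: dict) -> list:
    """Keep only the answer text and supporting document IDs from a pipeline result."""
    return [
        {"answer": a.answer, "document_ids": list(a.document_ids or [])}
        for a in result.get("answers", [])
    ]


# Stand-in for a Haystack Answer object (assumption for illustration only).
fake_answer = SimpleNamespace(
    answer="Revolut could be impacted because it sells AIG car insurance.",
    document_ids=["073d20d7-9223-eac5-9333-54d528bc5a91"],
)
summary = summarize_result({"answers": [fake_answer]})
print(summary[0]["answer"])
```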
Using a WebRetriever RAG:
haystack_generate.generateWithWebsite(
    "Write a brief introduction of Revolut's CEO",
    domains=["crunchbase.com"],
    litm_ranker=True,
    max_length=800
)
Response:
Prompt: <haystack.nodes.prompt.prompt_node.PromptNode object at 0x7f3bad9094d0>
return self.fget.__get__(instance, owner)()
Ranker: <haystack.nodes.ranker.lost_in_the_middle.LostInTheMiddleRanker object at 0x7f3bad90ba10>
Answer: {'answers': [<Answer {'answer': 'Nikolay Storonsky is the Founder and Chief Executive Officer (CEO) of Revolut, an innovative financial technology company. Prior to founding Revolut, he served as an Equity Derivatives Trader at Credit Suisse and Lehman Brothers.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_ids': ['f88e3d044dc68bf9d77f6ad8f08f5493'], 'meta': {'prompt': "Given the provided Documents, answer the Query.\n\n Query: Write a brief introduction of Revolut's CEO\n\n Documents: Overview. Nikolay Storonsky is the Founder and CEO at Revolut. He is a former Equity Derivatives Trader at Credit Suisse and Lehman Brothers. Economic School.\n Answer: \n "}}>], 'invocation_context': {'query': "Write a brief introduction of Revolut's CEO", 'documents': [<Document: {'content': 'Overview. Nikolay Storonsky is the Founder and CEO at Revolut. He is a former Equity Derivatives Trader at Credit Suisse and Lehman Brothers. Economic School.', 'content_type': 'text', 'score': None, 'meta': {'url': 'https://www.crunchbase.com/person/nikolay-storonsky', 'timestamp': 1701187258, 'search.score': 0.18181818181818182, 'search.position': 1, 'snippet_text': 'Overview. Nikolay Storonsky is the Founder and CEO at Revolut. He is a former Equity Derivatives Trader at Credit Suisse and Lehman Brothers. Economic School.', '_split_id': 0, 'score': '0.67864573'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'f88e3d044dc68bf9d77f6ad8f08f5493'}>], 'answers': [<Answer {'answer': 'Nikolay Storonsky is the Founder and Chief Executive Officer (CEO) of Revolut, an innovative financial technology company. 
Prior to founding Revolut, he served as an Equity Derivatives Trader at Credit Suisse and Lehman Brothers.', 'type': 'generative', 'score': None, 'context': None, 'offsets_in_document': None, 'offsets_in_context': None, 'document_ids': ['f88e3d044dc68bf9d77f6ad8f08f5493'], 'meta': {'prompt': "Given the provided Documents, answer the Query.\n\n Query: Write a brief introduction of Revolut's CEO\n\n Documents: Overview. Nikolay Storonsky is the Founder and CEO at Revolut. He is a former Equity Derivatives Trader at Credit Suisse and Lehman Brothers. Economic School.\n Answer: \n "}}>], 'prompts': ["Given the provided Documents, answer the Query.\n\n Query: Write a brief introduction of Revolut's CEO\n\n Documents: Overview. Nikolay Storonsky is the Founder and CEO at Revolut. He is a former Equity Derivatives Trader at Credit Suisse and Lehman Brothers. Economic School.\n Answer: \n "]}, 'documents': [<Document: {'content': 'Overview. Nikolay Storonsky is the Founder and CEO at Revolut. He is a former Equity Derivatives Trader at Credit Suisse and Lehman Brothers. Economic School.', 'content_type': 'text', 'score': None, 'meta': {'url': 'https://www.crunchbase.com/person/nikolay-storonsky', 'timestamp': 1701187258, 'search.score': 0.18181818181818182, 'search.position': 1, 'snippet_text': 'Overview. Nikolay Storonsky is the Founder and CEO at Revolut. He is a former Equity Derivatives Trader at Credit Suisse and Lehman Brothers. Economic School.', '_split_id': 0, 'score': '0.67864573'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'f88e3d044dc68bf9d77f6ad8f08f5493'}>], 'root_node': 'Query', 'params': {}, 'query': "Write a brief introduction of Revolut's CEO", 'node_id': 'PromptNode'}
/usr/lib/python3.11/tempfile.py:895: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmp8g9evya6'>
_warnings.warn(warn_message, ResourceWarning)
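With `litm_ranker=True`, the response above shows a `LostInTheMiddleRanker` in the pipeline. The idea behind that kind of ranker is to place the most relevant documents at the start and end of the prompt, where LLMs tend to attend best, and push the least relevant ones to the middle. The function below is a simplified sketch of that reordering and not Haystack's implementation, which also handles details such as word-count limits.

```python
def lost_in_the_middle_order(docs_by_relevance: list) -> list:
    """Reorder a most-relevant-first list so the weakest items sit in the middle.

    Items at even positions fill the front in order; items at odd positions
    fill the back in reverse, so ranks 1 and 2 end up at the two edges.
    """
    head = docs_by_relevance[0::2]
    tail = docs_by_relevance[1::2][::-1]
    return head + tail


# Ranks 1 (best) to 5 (worst): the best two land at the edges.
print(lost_in_the_middle_order([1, 2, 3, 4, 5]))  # [1, 3, 5, 4, 2]
```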
llamaindex_embed.kg_index(
    source=path,
    space_name=index_name
)
Response:
(Revolut, is making money accessible by, driving borderless finance in Latin America)
(Revolut, says Latin America is, key region for growth)
(Revolut, to boost crypto team by, 20% despite US exit)
(Revolut CEO, on growing a multi-currency card into, financial super app)
(Revolut, self-shutters US crypto business as, Coinbase moves to dismiss SEC suit)
(Revolut and Elon Musk's X, can become, everything apps)
(Revolut, to stop crypto services for, US customers)
(Revolut, launches instant card transfers to, over 80 countries)
(Revolut and Game4Ukraine, to raise funds for, reconstruction of Ukrainian school)
(Revolut, starts phasing in, car insurance offering)
(Revolut, sells, AIG car insurance)
(Revolut, may be closer to launching, mortgage loans)
(Revolut, launches, joint accounts in the UK)
(Revolut, demands urgent review of, hidden international fees)
(Revolut, moves closer to super-app status by adding, tour and travel experience bookings)
(Revolut, expands travel offering with, in-app marketplace)
...
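Each line of the response above is a (subject, predicate, object) triplet extracted for the knowledge graph. For inspecting or post-processing that printed output, a small parser can turn the lines back into tuples. This is a sketch under the assumption that the format is exactly as printed; `parse_triplet` is a hypothetical helper, and any extra commas are assumed to belong to the object.

```python
def parse_triplet(line: str):
    """Parse '(subject, predicate, object)' into a 3-tuple, or return None."""
    text = line.strip()
    if not (text.startswith("(") and text.endswith(")")):
        return None
    # Split on the first two ", " separators only, so commas in the object survive.
    parts = text[1:-1].split(", ", 2)
    if len(parts) != 3:
        return None
    subject, predicate, obj = (p.strip() for p in parts)
    return subject, predicate, obj


lines = [
    "(Revolut, sells, AIG car insurance)",
    "(Revolut, launches, joint accounts in the UK)",
    "...",
]
triplets = [t for t in map(parse_triplet, lines) if t is not None]
print(triplets)
```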
retrieve_pipeline.get_response_with_VKBRetriever(
    "How would Revolut be impacted by AIG going bankrupt?",
    generative_model="gpt-4",
    index_name=index_name,
    space_name=index_name
)
Response:
Knowledge Graph index: <llama_index.indices.knowledge_graph.base.KnowledgeGraphIndex object at 0x7f67d0680550>
WARNING - llama_index.service_context - chunk_size_limit is deprecated, please specify chunk_size instead
If AIG were to go bankrupt, Revolut could be impacted as it sells AIG car insurance. This could potentially disrupt their insurance services and they might need to find a new insurance provider. However, the specific impact would depend on the details of their agreement with AIG and their contingency plans for such events.
Planned improvements:
- Support for JSON and CSV documents
- Process file metadata and use it with the retriever
- LLM validation
- Structured prompting with Pydantic
- Support for router fine tuning
- Implement chain of thought