Retrieval-Augmented Generation Experiment

An initial exploration of retrieval-augmented generation for a research software repository.

Quickstart

Clone the Repository:
```
git clone <repository-url>
```
Add the API URL and Query to Your .env File:
- Add the following lines to the .env file in the project's root directory:
```
API_URL="Your API URL here"
QUERY="Your query here"
```
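As an illustration of how these two values could be read from Python, here is a minimal sketch that uses only the standard library. The helper names `load_env` and `get_setting` are hypothetical and are not taken from this repository; the notebooks may load the file differently (for example with a dotenv package).

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Parse simple KEY="value" lines from a .env file into a dict.

    Hypothetical helper for illustration; not part of this repository.
    """
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw_line in env_file.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_setting(name, env_values):
    """Prefer a real environment variable, then fall back to the .env value."""
    value = os.environ.get(name) or env_values.get(name)
    if not value:
        raise KeyError(f"{name} is not set in the environment or the .env file")
    return value
```

With a .env file like the one above in the working directory, `get_setting("API_URL", load_env())` returns the configured URL, and a missing key raises a `KeyError` instead of silently passing an empty value on.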
Navigate to the Project Directory:
```
cd <project-directory>
```
Install the Necessary Dependencies:
```
pip install -r requirements.txt
```
Python 3.12.4 is recommended.
Create the directories:

Create the Dataset & Vectorizations:
- Before running the RAG experiment notebook (2_rag_experiment.ipynb), create the dataset and generate the text vectorizations used by the retrieval step.
- To do this, run the 1_vectorisation.ipynb notebook. The data is saved to your machine, so it is available the next time you open the project.
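To make the role of the vectorizations concrete, the following is a rough sketch of the retrieval idea: each text is turned into a vector once, and a query is later matched against those stored vectors. It assumes plain term counts and cosine similarity as a stand-in for whatever embedding model 1_vectorisation.ipynb actually uses; the function names and sample documents are invented for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a sparse term-count vector (a stand-in for real embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, vectors, top_k=2):
    """Return the top_k documents ranked by similarity to the query."""
    query_vector = vectorize(query)
    scored = [
        (cosine_similarity(query_vector, vector), doc)
        for doc, vector in zip(documents, vectors)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


# Invented sample documents; the real dataset comes from the notebook.
documents = [
    "Install the package with pip and a requirements file.",
    "The solver integrates differential equations numerically.",
    "Plots are written to the output directory as PNG files.",
]
vectors = [vectorize(doc) for doc in documents]  # computed once, reused per query
best = retrieve("How do I install with pip?", documents, vectors, top_k=1)
print(best[0])
```

Computing the vectors ahead of time is what makes the saved data reusable: only the query has to be vectorized when a question is asked.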
Install Ollama

Install Ollama and download the model you want to use.
To install llama3, the model we're using in this notebook, run the following command: `ollama run llama3`
Ollama must be running in the background for the chatbots to work.
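Before opening the notebook, it can help to confirm that the Ollama server is reachable. The sketch below assumes Ollama's default local address, `http://localhost:11434`; if your installation listens elsewhere, pass a different URL. The function name `ollama_is_running` is hypothetical and not part of this repository.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers with status 200 at base_url.

    The default URL is an assumption based on Ollama's default port.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: treat as not running.
        return False
```

Calling `ollama_is_running()` at the top of a notebook and stopping early when it returns `False` gives a clearer error than a failed chat request later on.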
