chunking-strategies

Installation

Ensure Python 3.10 is installed on your system. You can download it from the official Python website.

Create a virtual environment:
```
python3.10 -m venv venv
```
Activate the virtual environment:
- On macOS and Linux:
```
source venv/bin/activate
```
- On Windows:
```
.\venv\Scripts\activate
```
Install the required packages:
```
pip install -r requirements.txt
```

This will set up a virtual environment using Python 3.10 and install all the dependencies listed in the requirements.txt file.

Set up environment variables:

Create a .env file in the root directory of your project and add the following lines, replacing the placeholder values with your actual keys:
```
# keys
CUDA_VISIBLE_DEVICES=3,4
OPENAI_API_KEY=your_openai_api_key
HF_TOKEN=your_huggingface_token
HF_HOME=/path/to/your/huggingface/cache
```

To visibly check whether the RAG questions of a dataset can be answered by the corresponding context, run the following command:

python3 -m src.inspect.inspect_spans -d <dataset_name> -m <max_samples>

The dataset names can be found in the file itself.
The file can be used for every dataset that outputs the common EvalSample format.