Documentation update #52

Merged 17 commits on May 17, 2024
README.md (17 changes: 3 additions & 14 deletions)
@@ -1,14 +1,12 @@
# prompto

`prompto` derives from the Italian word "pronto", which means "ready", and could also mean "I prompt" in Italian (if "promptare" were a verb meaning "to prompt").

`prompto` is a Python library that facilitates batch processing of experiments stored as jsonl files. It automates querying API endpoints and logs progress asynchronously. The library is designed to be extensible and can be used to query different models.

## Getting Started

(some key things to note - still need to finish)

The library has functionality to process experiments and to run a pipeline which continually looks for new experiment jsonl files in the input folder. Everything starts with defining a pipeline data folder which contains:
- `input` folder: contains the jsonl files with the experiments
- `output` folder: where the results of the experiments will be stored. When an experiment is run, a folder named after the experiment (the jsonl filename without the `.jsonl` extension) is created within the output folder, and the results and logs for the experiment are stored there
- `media` folder: contains the media files for the experiments. These files must be placed in folders of the same experiment name (the jsonl filename without the `.jsonl` extension)
@@ -28,15 +26,6 @@ Before running the script, ensure you have the following:
- Python >= 3.11
- Poetry (for dependency management)

### Models

- Azure OpenAI
- Need to set the `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_API_ENDPOINT` environment variables. You can also set the `AZURE_OPENAI_API_VERSION` variable. It is also recommended to set the `AZURE_OPENAI_MODEL_ID` environment variable to avoid passing in the `model_name` each time if you use the same model consistently.
- OpenAI
- Need to set the `OPENAI_API_KEY` environment variable. It is also recommended to set the `OPENAI_MODEL_NAME` environment variable to avoid passing in the `model_name` each time if you use the same model consistently.
- Gemini
- Need to set the `GEMINI_PROJECT_ID` and `GEMINI_LOCATION` environment variables. It is also recommended to set the `GEMINI_MODEL_NAME` environment variable to avoid passing in the `model_name` each time if you use the same model consistently.

### Installation

1. **Clone the Repository**
docs/README.md (18 changes: 18 additions & 0 deletions)
@@ -0,0 +1,18 @@
# Documentation

### Getting Started

* [Quickstart](../README.md#getting-started)
* [Installation](../README.md#installation)
* [Examples](../examples)

### Using `prompto`

* [Setting up an experiment file](./experiment_file.md)
* [prompto Pipeline and running experiments](./pipeline.md)
* [prompto commands](./commands.md)

### Reference

* [Implemented APIs](./models.md)
* [Adding new API/model](./add_new_api.md)
docs/add_new_api.md (1 change: 1 addition & 0 deletions)
@@ -0,0 +1 @@
# Instructions to add new API/model
docs/commands.md (104 changes: 104 additions & 0 deletions)
@@ -0,0 +1,104 @@
# Commands

- [Running an experiment file](#running-an-experiment-file)
- [Running the pipeline](#running-the-pipeline)
- [Run checks on an experiment file](#run-checks-on-an-experiment-file)
- [Create judge file](#create-judge-file)
- [Obtain missing results jsonl file](#obtain-missing-results-jsonl-file)
- [Convert images to correct form](#convert-images-to-correct-form)
- [Start up Quart server](#start-up-quart-server)

## Running an experiment file

As detailed in the [pipeline documentation](pipeline.md), you can run a single experiment file using the `prompto_run_experiment` command and passing in a file. To see all arguments of this command, run `prompto_run_experiment --help`.

To run a particular experiment file with the data-folder set to the default path `./data`, you can use the following command:
```
prompto_run_experiment --file path/to/experiment.jsonl
```

This uses the default settings for the pipeline. You can also set the `--max-queries`, `--max-attempts`, and `--parallel` flags as detailed in the [pipeline documentation](pipeline.md).
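
For illustration, here is a sketch of how these flags might be combined (the numeric values are arbitrary, and `--parallel` is assumed to be a boolean switch; see the [pipeline documentation](pipeline.md) for the exact behaviour and defaults of each flag):
```
prompto_run_experiment \
    --file path/to/experiment.jsonl \
    --max-queries 50 \
    --max-attempts 5 \
    --parallel
```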

## Running the pipeline

As detailed in the [pipeline documentation](pipeline.md), you can run the pipeline using the `prompto_run_pipeline` command. To see all arguments of this command, run `prompto_run_pipeline --help`.

To run a particular experiment file with the data-folder set to `pipeline-data`, you can use the following command:
```
prompto_run_pipeline --data-folder pipeline-data
```

This uses the default settings for the pipeline. You can also set the `--max-queries`, `--max-attempts`, and `--parallel` flags as detailed in the [pipeline documentation](pipeline.md).

## Run checks on an experiment file

It is possible to run a check over an experiment file to ensure that all the prompts are valid and the experiment file is correctly formatted. We also check for the required environment variables and log any errors or warnings that are found. To run this check, you can use the `prompto_check_experiment` command, passing in a file. To see all arguments of this command, run `prompto_check_experiment --help`.

To run a check on a particular experiment file, you can use the following command:
```
prompto_check_experiment --file path/to/experiment.jsonl
```

This will run the checks on the experiment file and log any errors or warnings that are found. You can optionally save the logs to a file using the `--log-file` flag (by default, they will be saved to a file in the current directory) and specify the path to the data folder using the `--data-folder` flag.

Lastly, it's possible to automatically move the file to the input folder of the data folder if it is not already there. To do this, you can use the `--move-to-input` flag:
```
prompto_check_experiment \
--file path/to/experiment.jsonl \
--data-folder data \
--log-file path/to/logfile.txt \
--move-to-input
```

## Create judge file

Once an experiment has been run and responses to prompts have been obtained, it is possible to use another LLM as a "judge" to score the responses. This is useful for evaluating the quality of the responses obtained from the model. To create a judge file, you can use the `prompto_create_judge` command, passing in the file containing the completed experiment and a folder (i.e. the judge location) containing the judge template and settings to use. To see all arguments of this command, run `prompto_create_judge --help`.

To create a judge file for a particular experiment file, with the judge location as `./judge` and `gemini-1.0-pro` as the judge, you can use the following command:
```
prompto_create_judge \
--experiment-file path/to/experiment.jsonl \
--judge-location judge \
--judge gemini-1.0-pro
```

In `judge`, you must have two files:
- `template.txt`: this is the template file for the judge prompt. It contains the placeholders `{INPUT_PROMPT}` and `{OUTPUT_RESPONSE}`, which are replaced with the original prompt and the response to be scored.
- `settings.json`: this is the settings json file which contains the settings for the judge(s). The keys are judge identifiers and the values are dictionaries with "api", "model_name" and "parameters" keys that specify the LLM to use as a judge (see the [experiment file documentation](experiment_file.md) for more details on these keys). A minimal sketch of both files is given below.
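
For illustration, a minimal sketch of these two files, using `gemini-1.0-pro` as the judge identifier to match the command above (the template wording and the parameter values are made up for this example). An example `template.txt`:
```
Given the following prompt and response, give a score from 1 to 10 for the quality of the response.

PROMPT: {INPUT_PROMPT}
RESPONSE: {OUTPUT_RESPONSE}
```

and an example `settings.json`:
```
{
    "gemini-1.0-pro": {
        "api": "gemini",
        "model_name": "gemini-1.0-pro",
        "parameters": {"temperature": 0.0}
    }
}
```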

See for example [this judge example](../examples/data/data/judge) which contains example template and settings files.

The judge specified with the `--judge` flag should be a key in the `settings.json` file in the judge location. You can create different judge files using different LLMs as judge by specifying a different judge identifier from the keys in the `settings.json` file.

## Obtain missing results jsonl file

In some cases, you may have run an experiment file and obtained responses for some prompts but not all. To obtain the missing results jsonl file, you can use the `prompto_obtain_missing_results` command, passing in the input experiment file and the corresponding output experiment file. You must also specify a path to a new jsonl file, which will be created if any prompts are missing in the output file. The command matches prompts using an ID key in the `prompt_dict`s of the input and output files; by default, the name of this key is `id`. If the key is different, you can specify it using the `--id` flag. To see all arguments of this command, run `prompto_obtain_missing_results --help`.

To obtain the missing results jsonl file for a particular experiment file with the input experiment file as `path/to/experiment.jsonl`, the output experiment file as `path/to/experiment-output.jsonl`, and the new jsonl file as `path/to/missing-results.jsonl`, you can use the following command:
```
prompto_obtain_missing_results \
--input-experiment path/to/experiment.jsonl \
--output-experiment path/to/experiment-output.jsonl \
--missing-results path/to/missing-results.jsonl
```
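
If any prompts are missing, the new file should itself be a valid experiment file containing the missing `prompt_dict`s, so (assuming you then want to obtain those responses) it can be run like any other experiment file:
```
prompto_run_experiment --file path/to/missing-results.jsonl
```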

## Convert images to correct form

The `prompto_convert_images` command can be used to convert images to the correct form for the multimodal LLMs. This command takes in a folder containing images and checks whether `.jpg`, `.jpeg` and `.png` files are saved in the correct format. If not, they are resaved in the correct format.

To convert images in a folder `./images` to the correct form, you can use the following command:
```
prompto_convert_images --folder images
```

## Start up Quart server

As described in the [Quart API model documentation](models.md#quart-api), we have implemented a simple [Quart API](../src/prompto/models/quart/quart_api.py) that can be used to query a text-generation model from the [Huggingface model hub](https://huggingface.co/models) using the Huggingface `transformers` library. To start up the Quart server, you can use the `prompto_start_quart_server` command along with the Huggingface model name. To see all arguments of this command, run `prompto_start_quart_server --help`.

To start up the Quart server with [`vicgalle/gpt2-open-instruct-v1`](https://huggingface.co/vicgalle/gpt2-open-instruct-v1), at `"http://localhost:8000"`, you can use the following command:
```
prompto_start_quart_server \
--model-name vicgalle/gpt2-open-instruct-v1 \
--host localhost \
--port 8000
```
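
Once the server is running, prompts can be sent to it from an experiment file in the usual way. A rough sketch of what such a line might look like is below; the exact `api` name and any required keys or environment variables are described in the [Quart API model documentation](models.md#quart-api), so treat this line as illustrative only:
```
{"api": "quart", "model_name": "vicgalle/gpt2-open-instruct-v1", "prompt": "Write a short poem about the sea."}
```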
docs/experiment_file.md (22 changes: 22 additions & 0 deletions)
@@ -0,0 +1,22 @@
# Setting up an experiment file

An experiment file is a [JSON Lines (jsonl)](https://jsonlines.org/) file that contains the prompts for the experiments along with any other parameters or metadata that is required for the prompt. Each line in the jsonl file is a valid JSON value which defines a particular input to the LLM for which we will obtain a response. We often refer to a single line in the jsonl file as a "`prompt_dict`" (prompt dictionary).

For all models/APIs, we require the following keys in the `prompt_dict`:
- `prompt`: the prompt for the model
- This is typically a _string_ that is passed to the model to generate a response, but for certain APIs and models it can take other forms. For example, for some API endpoints (e.g. OpenAI (`"api": "openai"`)), the prompt can be a list of strings, in which case we consider it a sequence of prompts to be sent to the model, or a list of dictionaries where each dictionary has a "role" and a "content" key, which can be used to define a conversation history that is sent to the model for a response.
- See the [documentation](models.md) for the specific APIs/models for more details on the different accepted formats of the prompt.
- `api`: the name of the API to query
- See the [available APIs/models](models.md) for the list of supported APIs and the corresponding names to use in the `api` key
- They are defined in the `ASYNC_MODELS` dictionary in the [`prompto.models` module](../src/prompto/models/__init__.py)

In addition, there are other optional keys that can be included in the `prompt_dict`:
- `parameters`: the parameter settings / generation config for the query (given as a dictionary)
- This is a dictionary that contains the parameters for the query. The parameters are specific to the model and the API being used. For example, for the Gemini API (`"api": "gemini"`), some parameters to configure are `temperature`, `max_output_tokens`, `top_p` and `top_k`, which are used to control the generation of the response. For the OpenAI API (`"api": "openai"`), some of these parameters are named differently: for instance, the maximum number of output tokens is set using the `max_tokens` parameter, and `top_k` is not available to set. For Ollama (`"api": "ollama"`), the parameters are different again, e.g. the maximum number of tokens to predict is set using `num_predict`
- See the API documentation for the specific API for the list of parameters that can be set and their default values
- `model_name`: the name of the model to query
- For most API endpoints, it is possible to define the name of the model to query. For example, for the OpenAI API (`"api": "openai"`), the model name could be `"gpt-3.5-turbo"`, `"gpt-4"`, etc.
- This is optional since you can also set the model name in the environment variable (e.g. `OPENAI_MODEL_NAME` for the OpenAI API) and avoid passing it in the `prompt_dict` each time if using the same one consistently
- It is still possible to have a default model name set in the environment variable and override it in the `prompt_dict` if you want to use a different model for a particular prompt

Lastly, there are other optional keys that are only available for certain APIs/models. For example, for the Gemini API, you can have a `multimedia` key which is a list of dictionaries defining the multimedia files (e.g. images/videos) to be used in the prompt to a multimodal LLM. For these, see the documentation for the specific API/model for more details.
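
To make this concrete, below is a sketch of what a small experiment file might look like, with prompts for the OpenAI and Gemini APIs (the model names and parameter values are illustrative, and the `id` key is optional metadata used, for example, by the `prompto_obtain_missing_results` command):
```
{"id": 0, "api": "openai", "model_name": "gpt-3.5-turbo", "prompt": "What is the capital of France?", "parameters": {"temperature": 0.7, "max_tokens": 100}}
{"id": 1, "api": "openai", "model_name": "gpt-4", "prompt": [{"role": "user", "content": "Tell me a joke about cats."}], "parameters": {"temperature": 1.0}}
{"id": 2, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "Summarise the rules of chess in one paragraph.", "parameters": {"temperature": 0.5, "max_output_tokens": 200}}
```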