Commit
1 parent 5a5c71b, commit 8bf54c6
Showing 12 changed files with 374 additions and 169 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Here you will find a repository of configurations that proved to work. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
# skaff-rag-accelerator | ||
|
||
This is a starter kit to deploy a modular RAG locally or in the cloud (or across multiple clouds). | ||
|
||
## Features | ||
|
||
- A configurable RAG setup based around Langchain | ||
- `RAG` and `RagConfig` python classes to help you set things up | ||
- A REST API based on FastAPI to provide easy access to the RAG as a web backend | ||
- A demo Streamlit app to serve as a basic working frontend (not production grade) | ||
- A document loader for the RAG | ||
- User authentication (insecure for now, but usable for conversation history) | ||
- User feedback collection | ||
- Streamed responses | ||
|
||
## Quickstart | ||
|
||
In a fresh env: | ||
```shell | ||
pip install -r requirements.txt | ||
``` | ||
|
||
You will need to set some env vars, either in a .env file at the project root, or just by exporting them like so: | ||
```shell | ||
export OPENAI_API_KEY="xxx" # API key used to query the LLM | ||
export EMBEDDING_API_KEY="xxx" # API key used to query the embedding model | ||
export DATABASE_URL="sqlite:///$(pwd)/database/db.sqlite3" # For local development only. You will need a real, cloud-based SQL database URL for prod. | ||
``` | ||
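
Before starting the servers, it can help to confirm that the three variables above are actually visible to the process. The following is a minimal sketch, not part of the repository: `missing_env_vars` is a hypothetical helper, and it only inspects the process environment, so values kept in a `.env` file show up only once something has loaded that file.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "EMBEDDING_API_KEY", "DATABASE_URL")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; pass a dict to check
    another mapping (hypothetical helper, not part of the repo).
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```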
|
||
Start the backend server locally | ||
```shell | ||
uvicorn backend.main:app | ||
``` | ||
|
||
Start the frontend demo | ||
```shell | ||
streamlit run frontend/app.py | ||
``` | ||
|
||
You should then be able to log in and chat with the bot: | ||
![](login_and_chat.gif) | ||
|
||
|
||
## Architecture | ||
|
||
### The `frontend`, the `backend`, and the `database` | ||
|
||
The whole goal of this repo is to decouple the "computing and LLM querying" part from the "rendering a user interface" part. We do this with a typical 3-tier architecture. | ||
|
||
![](3t_architecture.png) | ||
|
||
- The [frontend](frontend) is the end-user-facing part. It reaches out to the backend **ONLY** through the REST API. We provide a frontend demo here for convenience, but ultimately it could live in a completely different repo, and be written in a completely different language. | ||
- The [backend](backend) provides a REST API to abstract RAG functionalities. It handles calls to LLMs, tracks conversations and users, manages state using a database, and more. To get the gist of the backend, look at the documentation of the API: http://0.0.0.0:8000/docs | ||
- The [database](database) is only accessed by the backend and persists the state of the RAG application. [Explore the data model here.](https://dbdiagram.io/d/RAGAAS-63dbdcc6296d97641d7e07c8) | ||
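
Because the frontend talks to the backend only over HTTP, any client can discover the available routes from the OpenAPI schema. The sketch below assumes the backend is running locally on port 8000 and that FastAPI's default `/openapi.json` endpoint has not been disabled; `fetch_openapi` and `list_routes` are hypothetical helper names, not part of the repository.

```python
import json
from urllib.request import urlopen

# Assumption: the backend was started locally with `uvicorn backend.main:app`.
BACKEND_URL = "http://localhost:8000"


def fetch_openapi(base_url=BACKEND_URL):
    """Download the OpenAPI schema that FastAPI serves at /openapi.json by default."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_routes(schema):
    """Return sorted (METHOD, path) pairs described by an OpenAPI schema dict."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            # Skip non-operation keys such as "parameters" or "summary".
            if method in http_methods:
                routes.append((method.upper(), path))
    return sorted(routes)


# Usage, with the backend running:
# for method, path in list_routes(fetch_openapi()):
#     print(method, path)
```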
|
||
The structure of the repo mirrors this architecture. | ||
|
||
### The RAG | ||
|
||
![](rag_architecture.png) | ||
|
||
In the `backend` folder of the repository, you will find a `rag_components` directory that implements this architecture. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
## Loading documents | ||
|
||
The easiest but least flexible way to load documents into your RAG is to use the `RAG.load_file` method. It will semi-intelligently try to pick the best Langchain loader and parameters for your file. | ||
|
||
```python | ||
from pathlib import Path | ||
|
||
from backend.rag_components.rag import RAG | ||
|
||
|
||
data_directory = Path("data") | ||
|
||
config_directory = Path("backend/config.yaml") | ||
rag = RAG(config_directory) | ||
|
||
for file in data_directory.iterdir(): | ||
    if file.is_file(): | ||
        rag.load_file(file) | ||
``` | ||
|
||
If you want more flexibility, you can use the `rag.load_documents` method, which expects a list of `langchain.docstore.document.Document` objects. | ||
|
||
**TODO: example** | ||
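
Until an official example is added, here is a minimal sketch of how such a list could be built by hand. It assumes `Document` can be constructed from `page_content` and `metadata`; `text_files_to_documents` is a hypothetical helper, and the fallback dataclass exists only so the sketch runs without Langchain installed.

```python
from dataclasses import dataclass, field
from pathlib import Path

try:
    # The class that `load_documents` is described as expecting.
    from langchain.docstore.document import Document
except ImportError:
    # Stand-in with the same two fields, used only if Langchain is missing.
    @dataclass
    class Document:
        page_content: str
        metadata: dict = field(default_factory=dict)


def text_files_to_documents(directory: Path) -> list:
    """Turn every .txt file in `directory` into one Document.

    The file path is stored under the "source" metadata key.
    (Hypothetical helper, not part of the repo.)
    """
    documents = []
    for file in sorted(directory.glob("*.txt")):
        documents.append(
            Document(
                page_content=file.read_text(encoding="utf-8"),
                metadata={"source": str(file)},
            )
        )
    return documents


# Usage, assuming `rag` was created as in the `load_file` example:
# rag.load_documents(text_files_to_documents(Path("data")))
```

Building the documents yourself lets you control the text and metadata of each entry instead of relying on the loader that `load_file` picks.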
|
||
## Document indexing | ||
|
||
The document loader maintains an index of the loaded documents. You can change how this index is updated by setting `vector_store.insertion_mode` in the configuration of your RAG to `None`, `incremental`, or `full`. | ||
|
||
[Details of what that means here.](https://python.langchain.com/docs/modules/data_connection/indexing) |
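
To make the three modes concrete, here is a toy, in-memory model of the cleanup behaviour described in the Langchain indexing documentation. It is not Langchain's implementation and not code from this repository: `sync_index` is a hypothetical function, and both the index and the batch are simplified to dictionaries mapping document content to its source.

```python
def sync_index(index, batch, insertion_mode=None):
    """Return the index contents after loading `batch` (toy model).

    `index` and `batch` map document content -> source identifier.
    """
    if insertion_mode is None:
        # No cleanup: old entries are never deleted.
        stale = set()
    elif insertion_mode == "incremental":
        # Delete old versions of documents whose source appears in the batch.
        touched_sources = set(batch.values())
        stale = {
            content
            for content, source in index.items()
            if source in touched_sources and content not in batch
        }
    elif insertion_mode == "full":
        # Delete everything that is not part of the current batch.
        stale = {content for content in index if content not in batch}
    else:
        raise ValueError(f"Unknown insertion mode: {insertion_mode!r}")

    # All modes de-duplicate: identical content is stored only once.
    updated = {**index, **batch}
    for content in stale:
        del updated[content]
    return updated


index = {"v1 of a": "a.txt", "v1 of b": "b.txt"}
batch = {"v2 of a": "a.txt"}  # a.txt was edited, b.txt is not re-sent

print(sync_index(index, batch, None))           # keeps v1 of a, v1 of b, v2 of a
print(sync_index(index, batch, "incremental"))  # keeps v1 of b, v2 of a
print(sync_index(index, batch, "full"))         # keeps v2 of a only
```

In this model, `incremental` suits loading files one batch at a time, while `full` only makes sense when each batch contains the complete set of documents.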