This project implements a lightweight FastAPI server designed to support a Retrieval Augmented Generation (RAG) system. The server processes various document formats, generates embeddings using Hugging Face's sentence-transformers, and provides efficient querying via ChromaDB for vector-based retrieval.
- FastAPI Server: Lightweight, asynchronous server with non-blocking request handling.
- ChromaDB Integration: Persistent vector database for storing and querying document embeddings.
- Multi-format Support: Ingestion support for PDF, DOC, DOCX, and TXT files.
- Embeddings: Utilizes sentence-transformers/all-MiniLM-L6-v2 for generating document embeddings.
- Concurrency: Efficient handling of multiple requests using FastAPI's async capabilities.
- RAG System: Retrieves relevant document chunks and generates context-aware responses.
- FastAPI: For building the server backend.
- ChromaDB: Vector database for storing document embeddings.
- Sentence-Transformers: For embedding generation.
- Hugging Face Inference API: For generating RAG-based responses.
- PyPDF2 & python-docx: For parsing and ingesting PDF and DOCX documents.
- Asyncio: For handling asynchronous operations and tasks.
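As a rough sketch of how the embedding and storage pieces fit together (the collection name, storage path, and sample texts below are illustrative, not taken from the project's modules):

```python
import chromadb
from sentence_transformers import SentenceTransformer

# Load the embedding model named in the tech stack.
encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Persistent on-disk vector store; path and collection name are illustrative.
client = chromadb.PersistentClient(path="chroma_db")
collection = client.get_or_create_collection(name="documents")

chunks = [
    "FastAPI is a modern, async-friendly Python web framework.",
    "ChromaDB stores document embeddings for vector-based retrieval.",
]
collection.add(
    ids=["chunk-0", "chunk-1"],
    documents=chunks,
    embeddings=encoder.encode(chunks).tolist(),
)

# Embed the question the same way, then search by vector similarity.
query = encoder.encode(["Where are embeddings stored?"]).tolist()
results = collection.query(query_embeddings=query, n_results=1)
print(results["documents"][0][0])  # -> the ChromaDB chunk
```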
1. Clone the repository:

   ```bash
   git clone https://github.com/your-username/fastapi-rag-server.git
   cd fastapi-rag-server
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Start the FastAPI server:

   ```bash
   uvicorn app:app --reload
   ```

5. Open your browser and go to http://127.0.0.1:8000/docs to see the interactive API documentation.
```
rag_server/
├── app.py            # Main FastAPI application
├── vector_db.py      # ChromaDB integration logic
├── load_data.py      # Document loading and splitting logic
├── prompts.py        # RAG prompt generation
├── utils.py          # Utility functions (e.g., async helpers)
├── ingest.py         # Document ingestion logic
├── COLLECTIONS.txt   # List of document collections
└── data/             # Directory for uploaded documents
```
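As a rough illustration of the splitting step handled by load_data.py (the chunk size and overlap values here are arbitrary placeholders, not the project's actual settings):

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks with a small overlap between them."""
    chunks = []
    start = 0
    step = chunk_size - overlap  # advance less than a full chunk so chunks overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Overlap keeps sentences that straddle a boundary visible in both neighboring chunks, which tends to improve retrieval quality.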
`POST /upload`

Upload documents (PDF, DOC, DOCX, or TXT) to the server for ingestion.
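A minimal sketch of what the upload route might look like; the real logic lives in app.py and ingest.py, and DOC/DOCX handling via python-docx is omitted here for brevity:

```python
import io

from fastapi import FastAPI, File, HTTPException, UploadFile
from PyPDF2 import PdfReader

app = FastAPI()

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    data = await file.read()  # non-blocking read of the uploaded bytes
    name = (file.filename or "").lower()
    if name.endswith(".pdf"):
        reader = PdfReader(io.BytesIO(data))
        text = "\n".join(page.extract_text() or "" for page in reader.pages)
    elif name.endswith(".txt"):
        text = data.decode("utf-8", errors="ignore")
    else:
        raise HTTPException(status_code=415, detail="Unsupported file type")
    # ...split `text` into chunks and add their embeddings to ChromaDB...
    return {"filename": file.filename, "characters": len(text)}
```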
`POST /query`

Submit a query to retrieve relevant document chunks using the embeddings generated by sentence-transformers.
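A hedged sketch of the retrieve-then-generate flow behind this endpoint; the request schema, prompt template, and generation model are assumptions for illustration, not the project's actual choices:

```python
import chromadb
from fastapi import FastAPI
from huggingface_hub import InferenceClient
from pydantic import BaseModel
from sentence_transformers import SentenceTransformer

app = FastAPI()
encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
collection = chromadb.PersistentClient(path="chroma_db").get_or_create_collection("documents")
llm = InferenceClient(model="HuggingFaceH4/zephyr-7b-beta")  # model choice is an assumption

class QueryRequest(BaseModel):
    question: str
    n_results: int = 3

@app.post("/query")
async def query(req: QueryRequest):
    # Embed the question, retrieve the nearest chunks, then generate with context.
    embedding = encoder.encode([req.question]).tolist()
    hits = collection.query(query_embeddings=embedding, n_results=req.n_results)
    context = "\n\n".join(hits["documents"][0])
    prompt = f"Answer the question using only this context:\n{context}\n\nQuestion: {req.question}"
    answer = llm.text_generation(prompt, max_new_tokens=256)
    return {"answer": answer, "chunks": hits["documents"][0]}
```

Note that `encoder.encode` is CPU-bound; in a fully async server it would typically be offloaded to a thread (e.g., via `run_in_executor`) to keep the event loop responsive.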
`GET /collections`

View all document collections stored in the database.
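Once the server is running, the endpoints can be exercised from Python; the URLs match the default uvicorn address, while the file path and field names follow the sketches above and may differ from the actual schema:

```python
import requests

BASE = "http://127.0.0.1:8000"

# Upload a document for ingestion (path is a placeholder).
with open("data/example.pdf", "rb") as f:
    print(requests.post(f"{BASE}/upload", files={"file": f}).json())

# Ask a question against the ingested documents.
print(requests.post(f"{BASE}/query", json={"question": "What is ChromaDB?"}).json())

# List the stored collections.
print(requests.get(f"{BASE}/collections").json())
```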
The RAG server is capable of:
- Uploading and processing multiple document formats.
- Efficiently querying stored document collections using vector-based retrieval.
- Providing contextual responses based on the documents ingested.
- Scaling to handle varied document types and multiple collections.
Contributions are welcome! To contribute:
1. Fork the repository.
2. Create a new branch:

   ```bash
   git checkout -b feature-branch-name
   ```

3. Make your changes and commit them:

   ```bash
   git commit -m 'Add some feature'
   ```

4. Push to the branch:

   ```bash
   git push origin feature-branch-name
   ```

5. Submit a pull request!
This project is licensed under the MIT License - see the LICENSE file for details.
This project shows how modern asynchronous Python, sentence-transformer embeddings, and vector-based retrieval with ChromaDB can be combined into an efficient and scalable Retrieval Augmented Generation system.