A medical consultation chatbot using RAG and RAFT techniques on the Gemma 2 model, achieving high performance in the healthcare domain.
- Demo site: https://undoc.vercel.app
- Demo page repository (implementation of the demo site): https://github.com/jagaldol/undoc
The RAFT-finetuned Gemma-2-2b-it model is optimized for healthcare question answering in a RAG pipeline.
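As a rough illustration, the model can be loaded and queried with Hugging Face `transformers` as sketched below. The model ID and generation settings are placeholders (the project's actual loading logic lives in `models/model_loader.py`), and an HF token with Gemma 2 access is required.

```python
# Minimal sketch: load a Gemma-2-2b-it checkpoint and generate one answer.
# "google/gemma-2-2b-it" is the base model; swap in the RAFT fine-tuned
# checkpoint ID if you have one.
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id, token=os.getenv("HF_TOKEN"))
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    token=os.getenv("HF_TOKEN"),
)

messages = [{"role": "user", "content": "배탈 난 거 같은데 어떻게 해?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```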
| Model | Median (RAGAS Semantic Similarity) |
|---|---|
| Gemma-2-2b-it | 0.80 |
| Gemma-2-2b-it + RAG | 0.89 |
| DSF uint8 | 0.88 |
| DSF uint8 + RAG | 0.93 |
| DSF | 0.90 |
| DSF + RAG | 0.94 |
| RAFT uint8 | 0.86 |
| RAFT uint8 + RAG | 0.93 |
| RAFT | 0.88 |
| RAFT + RAG | 0.96 |
With RAFT + RAG, the median score rises from 0.80 to 0.96, a 0.16 (16-point) improvement over the base model! Performance is measured with RAGAS Semantic Similarity.
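For reference, here is a minimal evaluation sketch using the ragas library, assuming the ragas 0.1.x API where the semantic-similarity metric is exposed as `answer_similarity`; the metric embeds the answer and the ground truth, so an embedding backend (e.g. `OPENAI_API_KEY`) must be configured. The example data below is illustrative only.

```python
# Minimal sketch: score one (answer, ground_truth) pair with RAGAS semantic
# similarity. Assumes ragas 0.1.x and a configured embedding backend.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_similarity

data = Dataset.from_dict({
    "question": ["배탈 난 거 같은데 어떻게 해?"],
    "answer": ["따뜻한 물을 마시고 자극적인 음식을 피하세요."],        # model output
    "ground_truth": ["수분을 보충하고 소화가 잘 되는 음식을 드세요."],  # reference answer
})

result = evaluate(data, metrics=[answer_similarity])
print(result)  # e.g. {'answer_similarity': 0.93}
```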
Follow the steps below to set up your development environment and get started with the project:
Copy the .env.example file to .env and update the environment variables as needed:
$ cp .env.example .env
Open the .env file and update the variables with your specific configuration:
HF_TOKEN=your huggingface token
PINECONE_API_KEY=your pinecone api key
PINECONE_INDEX=your pinecone index name
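Below is a minimal sketch of how these variables can be read at runtime, roughly what `utils/environment.py` is expected to do; the actual implementation may differ, and `python-dotenv` is assumed to be available (check the Pipfile).

```python
# Minimal sketch: load .env and expose the settings used by the app.
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root

HF_TOKEN = os.environ["HF_TOKEN"]                  # Hugging Face access token
PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]  # Pinecone API key
PINECONE_INDEX = os.environ["PINECONE_INDEX"]      # Pinecone index name
```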
Use pipenv to install the required dependencies and set up the virtual environment:
$ pipenv sync
This command will create a virtual environment if one does not already exist and install all dependencies as specified in the Pipfile.
To activate the virtual environment, run:
$ pipenv shell
After setting up the environment, you can start the application with the following command:
$ python main.py
Once the server is running, you can access the API at http://localhost:8000.
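For example, you can send a request from Python as sketched below; the route name and JSON fields are assumptions, so check `main.py` for the actual endpoint and schema.

```python
# Hypothetical client call: the path "/chat" and the request/response fields
# are placeholders, not the documented API; adjust them to match main.py.
import requests

resp = requests.post(
    "http://localhost:8000/chat",
    json={"query": "배탈 난 거 같은데 어떻게 해?"},
    timeout=60,
)
print(resp.json())
```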
If you want a single response, run the inference.py script with a query to get a direct answer from the model:
$ python inference.py "배탈 난거 같은데 어떻게 해?"
This script takes a user query (here, Korean for "I think I have an upset stomach, what should I do?") and prints a single response generated by the model.
health-care-advisor
├── main.py
├── models
│ └── model_loader.py
├── notebooks
├── ragchain
│ ├── pipeline_setup.py
│ └── rag_chain.py
├── retriever
│ ├── hybrid_search_retriever.py
│ └── retriever_setup.py
├── template
│ ├── generation_prompt.py
│ └── script.py
└── utils
└── environment.py
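To show how these pieces relate, here is a simplified, purely illustrative sketch of the RAG flow; every function below is a stand-in for the corresponding module, not the project's actual code.

```python
# Illustrative only: stand-ins for the real modules listed above.

def retrieve(query: str, top_k: int = 3) -> list[str]:
    """Stand-in for retriever/hybrid_search_retriever.py: fetch relevant passages
    (in the real project, via hybrid search over a Pinecone index)."""
    return ["passage on managing an upset stomach", "passage on hydration"][:top_k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Stand-in for template/generation_prompt.py: fill the generation prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

def generate(prompt: str) -> str:
    """Stand-in for the Gemma 2 call behind models/model_loader.py."""
    return "(model output)"

def answer(query: str) -> str:
    """End-to-end RAG flow, roughly what ragchain/rag_chain.py wires together."""
    return generate(build_prompt(query, retrieve(query)))

if __name__ == "__main__":
    print(answer("배탈 난 거 같은데 어떻게 해?"))
```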
- Dataset: AI Hub, 초거대 AI 헬스케어 질의응답 데이터 (Large-scale AI Healthcare Question-Answering Data)
- Gemma 2 Model: "Gemma 2: Improving Open Language Models at a Practical Size", 2024.
- RAFT Methodology: "RAFT: Adapting Language Model to Domain Specific RAG", arXiv preprint arXiv:2403.10131, 2024.
- RAGAS Evaluation: "RAGAS: Automated Evaluation of Retrieval Augmented Generation", 2023.
| 임영윤 | 안혜준 |
|---|---|