Scientific question-answering by fine-tuned large language models (LLMs)

Create an environment using the following command:

conda env create --file env.yml

Dataset collection

  • Galaxy help forum
  • Biostars Q&A

Fine-tune Llama2 (2B and 7B)

  • Navigate to the llama2 directory and run python qlora-train.py
  • Uses the Hugging Face Transformers package to download the pre-trained LLMs
  • QLoRA drastically reduces the number of trainable parameters (from 2B to 6 million)
  • Supervised fine-tuning (SFT) sets up the training process
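
The reduction in trainable parameters comes from how LoRA works: for each adapted weight matrix of shape d_out x d_in, the original weights stay frozen and only two small matrices of rank r are trained, adding r * (d_in + d_out) parameters. The sketch below works through that arithmetic. The hidden size, layer count, number of adapted matrices per layer, and rank are assumed values for illustration only; they are not read from qlora-train.py, so the total differs from the 6 million reported above.

```python
def lora_params_per_matrix(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one frozen weight matrix.

    LoRA trains A (rank x d_in) and B (d_out x rank) instead of the
    full d_out x d_in matrix.
    """
    return rank * (d_in + d_out)


def lora_trainable_params(hidden_size: int, num_layers: int,
                          matrices_per_layer: int, rank: int) -> int:
    """Total trainable parameters when adapting square projection matrices."""
    per_matrix = lora_params_per_matrix(hidden_size, hidden_size, rank)
    return num_layers * matrices_per_layer * per_matrix


# Assumed configuration (hypothetical, not taken from the training script):
# 32 layers, hidden size 4096, two adapted projections per layer, rank 8.
total = lora_trainable_params(hidden_size=4096, num_layers=32,
                              matrices_per_layer=2, rank=8)
print(f"{total:,}")  # 4,194,304
```

Raising the rank or adapting more matrices per layer increases the count linearly, which is one way a configuration can land near 6 million.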

Outcomes

[Image: llama2_ans1]

[Image: llama2_answers]

Retrieval augmented generation (RAG)

[Image: rag_llm2]
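
As a rough illustration of the RAG idea (not the pipeline used in this repository), the sketch below retrieves the stored passages most similar to a question and places them in the prompt ahead of the question. It uses a simple word-count cosine similarity in place of a learned embedding model, and the function names and example passages are hypothetical placeholders.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Placeholder passages standing in for forum posts (made up for this example).
docs = [
    "Use the upload tool to add FASTQ files to a Galaxy history.",
    "Read mappers align sequencing reads against a reference genome.",
    "Conda environments isolate package versions.",
]
print(build_prompt("How do I upload FASTQ files to Galaxy?", docs, k=1))
```

The resulting prompt string would then be passed to the LLM, so the model answers with the retrieved forum content in view rather than from its weights alone.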

Save the fine-tuned model to HuggingFace Hub
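
One possible way to do this is with the huggingface_hub package, sketched below under a few assumptions: the fine-tuned model has already been saved to a local directory, you are logged in to the Hub (for example via huggingface-cli login), and the function name, directory, and repository ID are hypothetical placeholders rather than names from this repository.

```python
from pathlib import Path


def push_model_folder(model_dir: str, repo_id: str) -> str:
    """Upload a locally saved model directory to the Hugging Face Hub."""
    folder = Path(model_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"No saved model directory at {folder}")

    # Imported here so the path check above runs even without the package.
    from huggingface_hub import HfApi

    api = HfApi()  # picks up the locally stored access token
    # Create the model repository if it does not exist yet.
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    # Upload every file in the directory (weights, config, tokenizer files).
    api.upload_folder(folder_path=str(folder), repo_id=repo_id,
                      repo_type="model")
    return repo_id


# Example call with placeholder values (replace with your own):
# push_model_folder("path/to/saved-model", "your-username/your-model-name")
```

Uploading the folder as-is works whether it holds full model weights or only the adapter weights produced by QLoRA training.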