In this project, we work with Llama by Meta, so we start by downloading the model from the official download page. To use the model with the Hugging Face classes, we have to convert it with the conversion script.
We've used Llama3.2-3B for text generation and intfloat/e5-base-v2 for text embedding.
python convert_llama_weights_to_hf.py --model_size <size> --llama_version <version> --input_dir <model> --output_dir <model>_compile
# python convert_llama_weights_to_hf.py --model_size 3B --llama_version 3.2 --input_dir Llama3.2-3B --output_dir Llama3.2-3B_compile
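After the conversion, the model can be loaded with the standard Hugging Face classes. A minimal sketch, assuming the output directory from the example command above:

# Minimal loading/generation sketch; the directory name follows the example
# conversion command above and is not fixed by the project.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "Llama3.2-3B_compile"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)

inputs = tokenizer("PostgreSQL is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))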
python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt
As the backbone of this project, we chose PostgreSQL together with the pgvector extension, so both have to be installed as described on their respective websites and GitHub pages.
Then, we create a database and a dedicated user:
CREATE DATABASE <db>;
CREATE USER <user> WITH ENCRYPTED PASSWORD '<password>';
ALTER DATABASE <db> OWNER TO <user>;
GRANT ALL PRIVILEGES ON DATABASE <db> TO <user>;
Then, activate the pg_trgm, pgvector, and vectorscale extensions and create the schema for the embeddings:
/* String Similarity */
CREATE EXTENSION pg_trgm;
/* Activate pgvector */
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS vectorscale CASCADE;
/* create schema */
CREATE SCHEMA embeddings;
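With the extensions and schema in place, embeddings can be stored in a table with a vector column and queried by distance. A minimal sketch, assuming a hypothetical table embeddings.documents and 768-dimensional vectors (the output size of e5-base-v2):

# Minimal sketch; the table name is hypothetical and the credentials are the
# placeholders used throughout this README.
import psycopg2

conn = psycopg2.connect(dbname="<db>", user="<user>", password="<password>",
                        host="<host>", port="<port>")
cur = conn.cursor()

# A table holding the raw text and its 768-dimensional embedding.
cur.execute("""
    CREATE TABLE IF NOT EXISTS embeddings.documents (
        id bigserial PRIMARY KEY,
        content text,
        embedding vector(768)
    );
""")

# Insert a (dummy) embedding by casting the parameter to the vector type.
vec = "[" + ",".join(["0.0"] * 768) + "]"
cur.execute(
    "INSERT INTO embeddings.documents (content, embedding) VALUES (%s, %s::vector);",
    ("example document", vec),
)

# Nearest neighbours by cosine distance (pgvector's <=> operator).
cur.execute(
    "SELECT content FROM embeddings.documents ORDER BY embedding <=> %s::vector LIMIT 5;",
    (vec,),
)
print(cur.fetchall())

conn.commit()
cur.close()
conn.close()

Next, fill in the configuration file with the database credentials and the model paths: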
[DB]
database=<database>
host=<host>
user=<user>
password=<password>
port=<port>
[MODEL]
huggingface_token = <token>
path_generation = meta-llama/Llama-3.2-3B-Instruct
path_embeddings = intfloat/e5-base-v2
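The settings can then be read with Python's configparser; a minimal sketch, assuming the file is called config.ini (the actual file name may differ):

# Minimal sketch; the file name config.ini is an assumption.
import configparser

config = configparser.ConfigParser()
config.read("config.ini")

db = config["DB"]
conn_args = dict(dbname=db["database"], user=db["user"], password=db["password"],
                 host=db["host"], port=db["port"])

model = config["MODEL"]
print(model["path_generation"], model["path_embeddings"])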
To create visualizations of the execution plan, we use Graphviz. Install it as described on its website, then use the Visualizator to render the execution plan as an image.
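For reference, rendering a DOT description of a plan into an image with the graphviz Python package looks roughly like this; the DOT source below is a placeholder, not the project's actual Visualizator output:

# Illustrative only: render a placeholder DOT graph to a PNG with Graphviz.
from graphviz import Source

dot = 'digraph plan { "Seq Scan" -> "Hash Join"; "Index Scan" -> "Hash Join"; }'
Source(dot).render("execution_plan", format="png", cleanup=True)  # writes execution_plan.png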