Download the Langchain-Chatchat with IPEX-LLM integrations from this link. Unzip the content into a directory, e.g. /home/arda/Langchain-Chatchat-ipex-llm.
We recommend using Conda to manage the running environment. For instructions on how to install Conda on your Linux distribution, please visit here.
Run the following commands to create a new Python environment:
conda create -n ipex-llm-langchain-chatchat python=3.11
conda activate ipex-llm-langchain-chatchat
pip install --pre --upgrade ipex-llm[all] --extra-index-url https://download.pytorch.org/whl/cpu
pip3 install torchvision==0.16.2+cpu torchaudio==2.1.2+cpu --index-url https://download.pytorch.org/whl/cpu
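To verify that the environment was set up correctly, you can optionally run a quick import check (a minimal sanity check; it only confirms that the packages installed above can be loaded):
python -c "import torch, ipex_llm; print('torch', torch.__version__)"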
Switch to the root directory of Langchain-Chatchat you've downloaded (refer to the download section), and install the dependencies with the commands below. Note: in the example commands we assume the root directory is /home/arda/Langchain-Chatchat-ipex-llm. Remember to change it to your own path.
cd /home/arda/Langchain-Chatchat-ipex-llm
pip install -r requirements_ipex_llm.txt
pip install -r requirements_api_ipex_llm.txt
pip install -r requirements_webui.txt
# Due to a known issue on CPU, running with the Llama-2 model
# requires transformers==4.34.0 to generate correct results.
pip install transformers==4.34.0
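If you want to confirm that the pinned version took effect, you can optionally check it with pip:
pip show transformers   # the reported version should be 4.34.0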
- In the root directory of Langchain-Chatchat, run the following command to create a config:
python copy_config_example.py
- Edit the file configs/model_config.py, and change MODEL_ROOT_PATH to the absolute path of the parent directory where all the downloaded models (LLMs, embedding models, ranking models, etc.) are stored.
- Edit the file configs/model_config.py, and change EMBEDDING_DEVICE and LLM_DEVICE to cpu (see the illustrative snippet after this list).
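For reference, after the edits the relevant lines in configs/model_config.py could look like the following (the path shown is only an example; keep the rest of the file unchanged):
MODEL_ROOT_PATH = "/home/arda/models"  # example: parent directory holding all downloaded models
EMBEDDING_DEVICE = "cpu"
LLM_DEVICE = "cpu"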
Download the models and place them in the directory MODEL_ROOT_PATH (refer to the details in the Configuration section).
Currently, we support only the LLM/embedding models specified in the table below. You can download these models using the links provided in the table. Note: ensure the model folder name matches the last segment of the model ID following "/"; for example, for THUDM/chatglm3-6b, the model folder name should be chatglm3-6b (see the example layout after the table).
| Model | Category | Download Link |
|---|---|---|
| THUDM/chatglm3-6b | Chinese LLM | HF or ModelScope |
| meta-llama/Llama-2-7b-chat-hf | English LLM | HF |
| BAAI/bge-large-zh-v1.5 | Chinese Embedding | HF |
| BAAI/bge-large-en-v1.5 | English Embedding | HF |
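As an example of the folder-name convention above, if MODEL_ROOT_PATH were set to /home/arda/models (an illustrative path) and all four models from the table were downloaded, the layout would be:
/home/arda/models/
├── chatglm3-6b              # from THUDM/chatglm3-6b
├── Llama-2-7b-chat-hf       # from meta-llama/Llama-2-7b-chat-hf
├── bge-large-zh-v1.5        # from BAAI/bge-large-zh-v1.5
└── bge-large-en-v1.5        # from BAAI/bge-large-en-v1.5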
Run the following commands:
conda activate ipex-llm-langchain-chatchat
source ipex-llm-init -t
export no_proxy='localhost,127.0.0.1'
# We recommend using the cores in one socket for the best performance.
# You may need to change the 0-47 range below accordingly.
numactl -C 0-47 -m 0 python startup.py -a
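If you are unsure how many cores one socket has on your machine (and therefore what range to pass to numactl -C), a quick check with standard Linux tooling is:
lscpu | grep -E 'Socket|Core'   # shows Socket(s) and Core(s) per socket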
Note
The above NUMA configuration leads to optimal performance on Intel(R) Xeon(R) Platinum 8468.
You can find the Web UI's URL printed in the terminal logs, e.g. http://localhost:8501/.
Open a browser and navigate to the URL to use the Web UI.
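Optionally, you can confirm from the command line that the Web UI is reachable before opening the browser (replace the URL with the one printed in your own logs):
curl -I http://localhost:8501/   # should return an HTTP response (e.g. 200 OK) if the Web UI is up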