🚀 Supercharge your LM Studio with a Vector Database! Now with Vision Models!
Ask questions about your documents and get an answer from LM Studio!
(https://www.youtube.com/watch?v=KXYH8zqN8c8)
| GPU | Windows | Linux | macOS | Requirements |
|---|---|---|---|---|
| Nvidia | ✅ | ✅ | | CUDA 11.8 |
| AMD | ❌ | ✅ | | ROCm 5.6 |
| Apple/Metal | | | ✅ | MacOS 12.3+ |
- 🐍Python 3.10 or Python 3.11 (PyTorch is NOT compatible with 3.12, at least as of 12/19/2023).
- Git
- Git Large File Storage.
- Pandoc.
- Microsoft Build Tools (🔥 Unconfirmed 🔥)
Some Windows users have reported installation errors with `hnswlib`, `numpy`, or other libraries. If you encounter this, try installing Microsoft Build Tools. This may or may not require Visual Studio. If installing alongside Visual Studio, check the box for the "Desktop development with C++" workload. If this still doesn't work, try installing again and additionally check the four boxes on the right labeled "SDK." (See the quick import check after this list.)
- Nvidia GPU acceleration (Windows or Linux) requires CUDA 11.8.
- AMD GPU acceleration on Linux requires ROCm 5.6. ROCm 5.7 support is coming soon. PyTorch does not yet support AMD GPUs on Windows.
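If you want to confirm that libraries such as `hnswlib` and `numpy` built correctly after installing Build Tools, here is a quick sanity check (a minimal sketch, assuming the libraries are already installed in your environment; it is not one of this repository's scripts):

```python
# Quick import check for libraries that need a C++ compiler to build on Windows.
# Hypothetical helper -- not part of this repo's scripts.
import importlib

for lib in ("hnswlib", "numpy"):
    try:
        module = importlib.import_module(lib)
        print(f"{lib} imported OK (version {getattr(module, '__version__', 'unknown')})")
    except ImportError as exc:
        print(f"{lib} failed to import: {exc}")
```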
🪟WINDOWS INSTRUCTIONS🪟
🟢 Nvidia GPU ➜ Install CUDA 11.8
CUDA 12+ support is coming as soon as the faster-whisper library supports it.
🔴 AMD GPU - PyTorch currently does not support AMD GPU acceleration on Windows. There are several unofficial workarounds, but I'm unable to verify them since I don't have an AMD GPU and don't use Linux. See HERE, HERE, HERE, and possibly HERE.
Download the ZIP file from the latest "release" and extract the contents anywhere you want. DO NOT simply clone this repository; there may be incremental changes to the scripts between official releases, and those changes may later be undone.
Navigate to the `src` folder, open a command prompt, and create a virtual environment:
python -m venv .
Activate the virtual environment:
.\Scripts\activate
Run setup:
python setup.py
Run this command if you want to double-check that you installed PyTorch and the GPU acceleration software correctly:
python check_gpu.py
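For reference, here is a minimal sketch of the kind of check a script like `check_gpu.py` might perform (an assumption about its contents; the actual script may differ), using PyTorch's built-in CUDA queries:

```python
# Minimal GPU-availability check using PyTorch (illustrative sketch only).
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available:  {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA version:    {torch.version.cuda}")
    print(f"Device name:     {torch.cuda.get_device_name(0)}")
```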
🐧LINUX INSTRUCTIONS🐧
🟢 Nvidia GPUs ➜ Install CUDA 11.8
🔴 AMD GPUs ➜ Install ROCm version 5.6.
THIS REPO also has instructions. Also, although I'm unable to test on my system, here are some "wheels" that I believe should work; however, you'll have to search for and find the right one for your system.
Download the ZIP file from the latest "release" and extract the contents anywhere you want. DO NOT simply clone this repository; there may be incremental changes to the scripts between official releases, and those changes may later be undone.
Navigate to the `src` folder, open a command prompt, and create a virtual environment:
python -m venv .
Activate the virtual environment:
source bin/activate
Run setup:
python setup_linux.py
Run this script if you want to double-check whether you installed PyTorch and the GPU acceleration software correctly:
python check_gpu.py
🍎APPLE INSTRUCTIONS🍎
brew install portaudio
- This requires Homebrew to be installed first. If it's not, run the following command before running `brew install portaudio`:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
For PyTorch to use 🔘Metal/MPS, it requires MacOS 12.3+. Metal/MPS provides GPU acceleration similar to what CUDA (for NVIDIA GPUs) and ROCm (for AMD GPUs) provide.
Install Xcode Command Line Tools.
Download the ZIP file from the latest "release" and extract the contents anywhere you want. DO NOT simply clone this repository; there may be incremental changes to the scripts between official releases, and those changes may later be undone.
Navigate to the `src` folder, open a command prompt, and create a virtual environment:
python -m venv .
Activate the virtual environment:
source bin/activate
Upgrade pip and install the dependencies:
python -m pip install --upgrade pip
pip3 install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2
pip install -r requirements.txt
Upgrade PDF loader by running:
python replace_pdf.py
Run this script if you want to double-check that you installed PyTorch and the GPU acceleration software correctly:
python check_gpu.py
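For reference, here is a minimal sketch of the kind of check a script like `check_gpu.py` might perform on Apple silicon (an assumption about its contents; the actual script may differ), using PyTorch's MPS queries:

```python
# Minimal Metal/MPS availability check using PyTorch (illustrative sketch only).
import torch

print(f"PyTorch version: {torch.__version__}")
print(f"MPS built:       {torch.backends.mps.is_built()}")
print(f"MPS available:   {torch.backends.mps.is_available()}")
```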
🖥️INSTRUCTIONS🖥️
- You only need to create the virtual environment when first installing the program, but you must activate it each time by opening a command prompt/terminal from within the `src` folder and running the appropriate activation command above for your platform. Then start the program:
python gui.py
Only systems with an Nvidia GPU will display GPU power, usage, and VRAM metrics.
- Read the User Guide before sending me questions.
- In the `Vector Models` tab, choose the embedding model you want to download.
- In the `Databases Tab`, choose the directory containing the vector model you want to use to create the database. It can be any of the models you've already downloaded. Do not choose the `Embedding_Models` folder itself.
- After reading the User Manual, set the chunk size and chunk overlap. Remember, any time you want to change these two settings or add/remove documents, you must re-create the database for the changes to take effect.
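To illustrate what these two settings control, here is a minimal sketch using a LangChain-style text splitter (an assumption for illustration; the program's actual splitter and default values may differ):

```python
# Illustration of chunk size and chunk overlap (hypothetical example values).
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,    # maximum characters per chunk
    chunk_overlap=50,  # characters shared between consecutive chunks
)
with open("example.txt", encoding="utf-8") as f:
    chunks = splitter.split_text(f.read())
print(f"{len(chunks)} chunks; first chunk:\n{chunks[0]}")
```

Larger chunks give the LLM more context per match, while more overlap reduces the chance that a relevant passage is cut in half between two chunks.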
- Click the `Choose Documents or Images` button to add files.
- Supported non-image extensions are: `.pdf`, `.docx`, `.epub`, `.txt`, `.html`, `.enex`, `.eml`, `.msg`, `.csv`, `.xls`, `.xlsx`, `.rtf`, `.odt`.
- Supported image extensions are: `.png`, `.jpg`, `.jpeg`, `.bmp`, `.gif`, `.tif`, `.tiff`.
- In the `Tools Tab`, you can also transcribe one or more audio files into `.txt` files to be put into the vector database. Also in the Tools Tab, don't forget to test the vision model you want to use before processing a large number of images.
- In the `Databases Tab`, select one or more files, right-click, and delete them. Then re-create the database.
- Click the `Create Vector Database` button. Wait until the command prompt says "persisted" before proceeding to the next step.
- Start LM Studio and load a model.
The LLM within LM Studio works best with an appropriate "prompt format." In the `Settings Tab` of my program, choose the prompt format from the pulldown menu or enter one manually. For prompt formatting to work, however, you must disable the "automatic prompt formatting" setting in the "Server" portion of LM Studio.
You do not need to do this if you're using LM Studio v0.2.9 or earlier. Moreover, there is a bug specific to LM Studio v0.2.10 that prevents LM Studio from respecting the prompt format you choose. You can fix this by going to the Server settings (far right side) and:
⚠️ deleting any/all text within the `User Message Prefix` box; and
⚠️ deleting any/all text within the `User Message Suffix` box.
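For context, a "prompt format" is simply the chat template the model was trained with. As one common example (your model may expect a different format), a ChatML-style template looks like this:

```python
# Example ChatML-style prompt template (one common format; your model may differ).
prompt_template = (
    "<|im_start|>system\n{system_message}<|im_end|>\n"
    "<|im_start|>user\n{user_message}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
print(prompt_template.format(
    system_message="You are a helpful assistant.",
    user_message="Summarize the retrieved document chunks.",
))
```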
- In the Server tab, click `Start Server`.
- Type (or speak) your question and click `Submit Questions`.
- If you wish to test the quality of the chunk settings, check the `Chunks Only` checkbox. This means the program will not connect to LM Studio and will instead simply provide you with the chunks retrieved from the vector database.
- This program uses "Bark" models to convert the response from LM Studio into audio. You must wait until the ENTIRE response is received, however, before clicking the `Bark Response` button.
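For reference, here is a minimal sketch of how the Bark library is typically used to turn text into audio (not necessarily how this program invokes it):

```python
# Turn text into speech with Bark and save it to a WAV file (illustrative sketch).
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

preload_models()  # downloads/loads the Bark models on first run
audio_array = generate_audio("This is the response from the language model.")
write_wav("bark_response.wav", SAMPLE_RATE, audio_array)
```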
- Both the voice recorder and the audio file transcriber use the `faster-whisper` library. Note that `faster-whisper` only supports CUDA 11.8 currently (CUDA 12+ support is coming soon).
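For reference, here is a minimal sketch of how `faster-whisper` is typically used with CUDA (not necessarily how this program invokes it):

```python
# Transcribe an audio file with faster-whisper on an Nvidia GPU (illustrative sketch).
from faster_whisper import WhisperModel

model = WhisperModel("base.en", device="cuda", compute_type="float16")
segments, info = model.transcribe("recording.wav")
text = " ".join(segment.text.strip() for segment in segments)
print(text)
```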
As of release 3.0 the program includes Vision Models that will generate summaries of what each picture depicts, which are then added to the vector database. I wrote a Medium article on this as well.
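As an illustration of what this kind of image summarization looks like, here is a generic sketch using the BLIP captioning model from Hugging Face `transformers` (this is not necessarily the vision model the program actually ships with):

```python
# Generate a short caption for an image (generic example; not this repo's exact pipeline).
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```

The resulting caption is the kind of text summary that gets embedded and stored alongside your documents.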
All suggestions (positive and negative) are welcome. Contact me at "[email protected]" or feel free to message me on the LM Studio Discord Server.