diff --git a/docs/ai-chat.md b/docs/ai-chat.md
index 636cca96cc..bf9700ff04 100755
--- a/docs/ai-chat.md
+++ b/docs/ai-chat.md
@@ -21,7 +21,9 @@ If you are concerned about these practices, you can either refuse to use AI, or
 
 Alternatively, you can run AI models locally so that your data never leaves your device and is therefore never shared with third parties. As such, local models are a more private and secure alternative to cloud-based solutions and allow you to share sensitive information to the AI model without worry.
 
-## Hardware for Local AI Models
+## AI Models
+
+### Hardware for Local AI Models
 
 Local models are also fairly accessible. It's possible to run smaller models at lower speeds on as little as 8GB of RAM. Using more powerful hardware such as a dedicated GPU with sufficient VRAM or a modern system with fast LPDDR5X memory offers the best experience.
 
@@ -37,31 +39,12 @@ For consumer-grade hardware, it is generally recommended to use [quantized model
 
 To run AI locally, you need both an AI model and an AI client.
 
-## AI Models
-
-### Find and Choose a Model
+### Choosing a Model
 
 There are many permissively licensed models available to download. **[Hugging Face](https://huggingface.co/models)** is a platform that lets you browse, research, and download models in common formats like [GGUF](https://huggingface.co/docs/hub/en/gguf). Companies that provide good open-weights models include big names like Mistral, Meta, Microsoft, and Google. However, there are also many community models and 'fine-tunes' available. As mentioned above, quantized models offer the best balance between model quality and performance for those using consumer-grade hardware.
 
 To help you choose a model that fits your needs, you can look at leaderboards and benchmarks, of which there are many kinds. The most widely-used leaderboard is the community-driven [LM Arena](https://lmarena.ai). Additionally, the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) focuses on the performance of open-weights models on common benchmarks like [MMLU-Pro](https://arxiv.org/abs/2406.01574). Furthermore, there are also specialized benchmarks which measure factors like [emotional intelligence](https://eqbench.com), ["uncensored general intelligence"](https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard), and [many others](https://www.nebuly.com/blog/llm-leaderboards).
 
-### Model Security
-
-When you have found an AI model to your liking, you should download it in a safe manner. When you use an AI client that maintains their own library of model files (such as [Ollama](#ollama-cli) and [Llamafile](#llamafile)), you should download it from there. However, if you want to download models not present in their library, or use an AI client that doesn't maintain its library (such as [Kobold.cpp](#koboldcpp)), you will need to take extra steps to ensure that the AI model you download is safe and legitimate.
-
-We recommend downloading model files from Hugging Face, as it provides several features to verify that your download is genuine and safe to use.
-
-To check the authenticity and safety of the model, look for:
-
-- Model cards with clear documentation
-- A verified organization badge
-- Community reviews and usage statistics
-- A "Safe" badge next to the model file (Hugging Face only)
-- Matching checksums[^1]
-    - On Hugging Face, you can find the hash by clicking on a model file and looking for the **Copy SHA256** button below it. You should compare this checksum with the one from the model file you downloaded.
-
-A downloaded model is generally safe if it satisfies all of the above checks.
-
 ## AI Chat Clients
 
 | Feature | [Kobold.cpp](#koboldcpp) | [Ollama](#ollama-cli) | [Llamafile](#llamafile) |
@@ -163,6 +146,25 @@ Mozilla has made llamafiles available for only some Llama and Mistral models, wh
 
 To circumvent these issues, you can [load external weights](https://github.com/Mozilla-Ocho/llamafile#using-llamafile-with-external-weights).
 
+## Securely Downloading Models
+
+When you have found an AI model to your liking, you should download it in a safe manner. If you use an AI client that maintains their own library of model files (such as [Ollama](#ollama-cli) and [Llamafile](#llamafile)), you should download it from there.
+
+However, if you want to download models not present in their library, or use an AI client that doesn't maintain its library (such as [Kobold.cpp](#koboldcpp)), you will need to take extra steps to ensure that the AI model you download is safe and legitimate.
+
+We recommend downloading model files from Hugging Face, as it provides several features to verify that your download is genuine and safe to use.
+
+To check the authenticity and safety of the model, look for:
+
+- Model cards with clear documentation
+- A verified organization badge
+- Community reviews and usage statistics
+- A "Safe" badge next to the model file (Hugging Face only)
+- Matching checksums[^1]
+    - On Hugging Face, you can find the hash by clicking on a model file and looking for the **Copy SHA256** button below it. You should compare this checksum with the one from the model file you downloaded.
+
+A downloaded model is generally safe if it satisfies all of the above checks.
+
 ## Criteria
 
 Please note we are not affiliated with any of the projects we recommend. In addition to [our standard criteria](about/criteria.md), we have developed a clear set of requirements to allow us to provide objective recommendations. We suggest you familiarize yourself with this list before choosing to use a project and conduct your own research to ensure it's the right choice for you.
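
Review note: the "Matching checksums" step in the "Securely Downloading Models" section describes comparing the SHA256 value copied from Hugging Face with the hash of the downloaded file, but does not show how to compute the local hash. A minimal sketch of that comparison is given here, assuming Python 3 is available; the function names `sha256_of_file` and `matches_published_hash` are hypothetical helpers made up for this illustration and are not part of the patch or of any AI client.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that multi-gigabyte model files
    do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Check the local file against a published SHA-256 value.

    Surrounding whitespace and letter case in the pasted value are
    ignored, since hex digests are case-insensitive.
    """
    expected = published_hex.strip().lower()
    return sha256_of_file(path) == expected
```

Calling `matches_published_hash("model.gguf", pasted_value)` (with a placeholder file name and the value pasted from the **Copy SHA256** button) would return `True` only when the two digests are identical. A mismatch suggests a corrupted or altered download, so the file should be discarded and fetched again.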