update the README and reorganize the docker guides structure. (intel-analytics#11016)

* update the README and reorganize the docker guides structure.

* modified docker install guide into overview
shane-huang authored May 14, 2024
1 parent 8931974 commit 586a151
Showing 8 changed files with 897 additions and 828 deletions.
711 changes: 47 additions & 664 deletions docker/llm/README.md

Large diffs are not rendered by default.

664 changes: 664 additions & 0 deletions docker/llm/README_backup.md

Large diffs are not rendered by default.

21 changes: 15 additions & 6 deletions docs/readthedocs/source/_templates/sidebar_quicklinks.html
Original file line number Diff line number Diff line change
@@ -25,12 +25,6 @@
<li>
<a href="doc/LLM/Quickstart/install_windows_gpu.html">Install IPEX-LLM on Windows with Intel GPU</a>
</li>
<li>
<a href="doc/LLM/Quickstart/docker_windows_gpu.html">Install IPEX-LLM in Docker on Windows with Intel GPU</a>
</li>
<li>
<a href="doc/LLM/Quickstart/docker_pytorch_inference_gpu.html">Run PyTorch Inference on Intel GPU using Docker (on Linux or WSL)</a>
</li>
<li>
<a href="doc/LLM/Quickstart/chatchat_quickstart.html">Run Local RAG using Langchain-Chatchat on Intel GPU</a>
</li>
@@ -73,6 +67,21 @@
</li>
</ul>
</li>
<li>
<strong class="bigdl-quicklinks-section-title">IPEX-LLM Docker Guides</strong>
<input id="quicklink-cluster-llm-docker" type="checkbox" class="toctree-checkbox" />
<label for="quicklink-cluster-llm-docker" class="toctree-toggle">
<i class="fa-solid fa-chevron-down"></i>
</label>
<ul class="bigdl-quicklinks-section-nav">
<li>
<a href="doc/LLM/Docker/docker_windows_gpu.html">Overview of IPEX-LLM Containers for Intel GPU</a>
</li>
<li>
<a href="doc/LLM/Docker/docker_pytorch_inference_gpu.html">Run PyTorch Inference on an Intel GPU via Docker</a>
</li>
</ul>
</li>
<li>
<strong class="bigdl-quicklinks-section-title">IPEX-LLM Installation</strong>
<input id="quicklink-cluster-llm-installation" type="checkbox" class="toctree-checkbox" />
7 changes: 6 additions & 1 deletion docs/readthedocs/source/_toc.yml
@@ -15,14 +15,19 @@ subtrees:
title: "CPU"
- file: doc/LLM/Overview/install_gpu
title: "GPU"
- file: doc/LLM/Docker/index
title: "Docker Guides"
subtrees:
- entries:
- file: doc/LLM/Docker/docker_windows_gpu
- file: doc/LLM/Docker/docker_pytorch_inference_gpu
- file: doc/LLM/Quickstart/index
title: "Quickstart"
subtrees:
- entries:
- file: doc/LLM/Quickstart/bigdl_llm_migration
- file: doc/LLM/Quickstart/install_linux_gpu
- file: doc/LLM/Quickstart/install_windows_gpu
- file: doc/LLM/Quickstart/docker_windows_gpu
- file: doc/LLM/Quickstart/chatchat_quickstart
- file: doc/LLM/Quickstart/webui_quickstart
- file: doc/LLM/Quickstart/open_webui_with_ollama_quickstart
@@ -1,16 +1,10 @@
# Run PyTorch Inference on Intel GPU using Docker (on Linux or WSL)
# Run PyTorch Inference on an Intel GPU via Docker

You can run the PyTorch inference benchmark, the chat service, and the PyTorch examples on Intel GPUs within Docker (on Linux or WSL).

## Install Docker

1. Linux Installation

Follow the instructions in this [guide](https://www.docker.com/get-started/) to install Docker on Linux.

2. Windows Installation

For Windows installation, refer to this [guide](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/docker_windows_gpu.html#install-docker-on-windows).
Follow the [Docker Installation Guide](./docker_windows_gpu.html#install-docker) to install Docker on either Linux or Windows.

## Launch Docker

@@ -20,26 +14,52 @@ docker pull intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
```

Start ipex-llm-xpu Docker Container:
```bash
export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
export CONTAINER_NAME=my_container
export MODEL_PATH=/llm/models[change to your model path]

docker run -itd \
--net=host \
--device=/dev/dri \
--memory="32G" \
--name=$CONTAINER_NAME \
--shm-size="16g" \
-v $MODEL_PATH:/llm/models \
$DOCKER_IMAGE

```eval_rst
.. tabs::
.. tab:: Linux
.. code-block:: bash
export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
export CONTAINER_NAME=my_container
export MODEL_PATH=/llm/models  # change to your model path
docker run -itd \
--net=host \
--device=/dev/dri \
--memory="32G" \
--name=$CONTAINER_NAME \
--shm-size="16g" \
-v $MODEL_PATH:/llm/models \
$DOCKER_IMAGE
.. tab:: Windows WSL
.. code-block:: bash
#!/bin/bash
export DOCKER_IMAGE=intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT
export CONTAINER_NAME=my_container
export MODEL_PATH=/llm/models  # change to your model path
sudo docker run -itd \
--net=host \
--privileged \
--device /dev/dri \
--memory="32G" \
--name=$CONTAINER_NAME \
--shm-size="16g" \
-v $MODEL_PATH:/llm/llm-models \
-v /usr/lib/wsl:/usr/lib/wsl \
$DOCKER_IMAGE
```


Access the container:
```bash
docker exec -it $CONTAINER_NAME bash
```

To verify the device is successfully mapped into the container, run `sycl-ls` to check the result. In a machine with Arc A770, the sampled output is:

```bash
115 changes: 115 additions & 0 deletions docs/readthedocs/source/doc/LLM/Docker/docker_windows_gpu.md
@@ -0,0 +1,115 @@
# Overview of IPEX-LLM Containers for Intel GPU


An IPEX-LLM container is a pre-configured environment that includes all necessary dependencies for running LLMs on Intel GPUs.

This guide provides general instructions for setting up IPEX-LLM Docker containers with an Intel GPU. It begins with instructions and tips for Docker installation, and then introduces the available IPEX-LLM containers and their uses.

## Install Docker

### Linux

Follow the instructions in the [Official Docker Guide](https://www.docker.com/get-started/) to install Docker on Linux.


### Windows

```eval_rst
.. tip::
    The installation requires at least 35 GB of free disk space on the C drive.
```
```eval_rst
.. note::
    Detailed installation instructions for Windows, including steps for enabling WSL2, can be found on the `Docker Desktop for Windows installation page <https://docs.docker.com/desktop/install/windows-install/>`_.
```

#### Install Docker Desktop for Windows
Follow the instructions in [this guide](https://docs.docker.com/desktop/install/windows-install/) to install **Docker Desktop for Windows**. Restart your machine after the installation is complete.

#### Install WSL2

Follow the instructions in [this guide](https://docs.microsoft.com/en-us/windows/wsl/install) to install **Windows Subsystem for Linux 2 (WSL2)**.

```eval_rst
.. tip::
You may verify WSL2 installation by running the command `wsl --list` in PowerShell or Command Prompt. If WSL2 is installed, you will see a list of installed Linux distributions.
```

#### Enable Docker integration with WSL2

Open **Docker Desktop**, go to `Settings` -> `Resources` -> `WSL integration`, turn on the `Ubuntu` toggle, and click `Apply & restart`.
<a href="https://llm-assets.readthedocs.io/en/latest/_images/docker_desktop_new.png">
  <img src="https://llm-assets.readthedocs.io/en/latest/_images/docker_desktop_new.png" width="100%" />
</a>

```eval_rst
.. tip::
If you encounter **Docker Engine stopped** when opening Docker Desktop, you can reopen it in administrator mode.
```

#### Verify Docker is enabled in WSL2

Execute the following commands in PowerShell or Command Prompt to verify that Docker is enabled in WSL2:
```bash
wsl -d Ubuntu # Run Ubuntu WSL distribution
docker version # Check if Docker is enabled in WSL
```

You should see output similar to the following:

<a href="https://llm-assets.readthedocs.io/en/latest/_images/docker_wsl.png">
  <img src="https://llm-assets.readthedocs.io/en/latest/_images/docker_wsl.png" width="100%" />
</a>

```eval_rst
.. tip::
    Keep Docker Desktop open the entire time you are using Docker in WSL.
```

## IPEX-LLM Docker Containers

We provide several Docker images for running LLMs on Intel GPUs. The following table lists the available images and their uses:

| Image Name | Description | Use Case |
|------------|-------------|----------|
| intelanalytics/ipex-llm-cpu:2.1.0-SNAPSHOT | CPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
| intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT | GPU Inference |For development and running LLMs using llama.cpp, Ollama and Python|
| intelanalytics/ipex-llm-serving-cpu:2.1.0-SNAPSHOT | CPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
| intelanalytics/ipex-llm-serving-xpu:2.1.0-SNAPSHOT | GPU Serving|For serving multiple users/requests through REST APIs using vLLM/FastChat|
| intelanalytics/ipex-llm-finetune-qlora-cpu-standalone:2.1.0-SNAPSHOT | CPU Finetuning via Docker | For fine-tuning LLMs using QLoRA/LoRA, etc. |
| intelanalytics/ipex-llm-finetune-qlora-cpu-k8s:2.1.0-SNAPSHOT | CPU Finetuning via Kubernetes | For fine-tuning LLMs using QLoRA/LoRA, etc. |
| intelanalytics/ipex-llm-finetune-qlora-xpu:2.1.0-SNAPSHOT | GPU Finetuning | For fine-tuning LLMs using QLoRA/LoRA, etc. |
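As an illustration of how the image tags in the table are used, here is a minimal, hypothetical helper that composes (but does not execute) a `docker run` command for one of the inference images. The flag values mirror the container-start examples elsewhere in these guides; the `compose_run_cmd` function name, the container name, and the model path are assumptions made for this sketch only:

```shell
#!/bin/sh
# Hypothetical helper: build the `docker run` command line for an IPEX-LLM
# image from the table above, so the flags can be reviewed before running.
compose_run_cmd() {
  image="$1"; name="$2"; model_path="$3"
  printf 'docker run -itd --net=host --device=/dev/dri --memory=32G --name=%s --shm-size=16g -v %s:/llm/models %s\n' \
    "$name" "$model_path" "$image"
}

# Example: GPU inference image with a local model directory (adjust the path).
compose_run_cmd intelanalytics/ipex-llm-xpu:2.1.0-SNAPSHOT my_container /path/to/models
```

Printing the command before running it makes it easy to confirm the device mapping and volume mount before the container starts.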

We have also provided several quickstarts for various usage scenarios:
- [Run and develop LLM applications in PyTorch](./docker_pytorch_inference_gpu.html)

... to be added soon.

## Troubleshooting


If your machine has both an integrated GPU (iGPU) and a dedicated GPU (dGPU) like ARC, you may encounter the following issue:

```bash
Abort was called at 62 line in file:
./shared/source/os_interface/os_interface.h
LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [Intel(R) Core(TM) i7-14700K]
Registry and code: 13 MB
Command: python chat.py --model-path /llm/llm-models/chatglm2-6b/
Uptime: 29.349235 s
Aborted
```
To resolve this problem, you can disable the iGPU in Device Manager on Windows as follows:

<a href="https://llm-assets.readthedocs.io/en/latest/_images/disable_iGPU.png">
  <img src="https://llm-assets.readthedocs.io/en/latest/_images/disable_iGPU.png" width="100%" />
</a>
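If disabling the iGPU is not desirable, an alternative sketch (an assumption, not part of the guide above) is to pin the workload to a single device via the oneAPI device-selector environment variable before launching the script. The device index `0` here is hypothetical; take the actual index of your dGPU from your own `sycl-ls` output:

```shell
# Sketch: restrict the oneAPI Level Zero runtime to a single GPU instead of
# disabling the iGPU. The index (here 0) is hypothetical -- run `sycl-ls`
# inside the container to find the index of your dGPU.
export ONEAPI_DEVICE_SELECTOR=level_zero:0

echo "Selected device filter: $ONEAPI_DEVICE_SELECTOR"
```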
8 changes: 8 additions & 0 deletions docs/readthedocs/source/doc/LLM/Docker/index.rst
@@ -0,0 +1,8 @@
IPEX-LLM Docker Container User Guides
=====================================

In this section, you will find guides on using IPEX-LLM with Docker:


* `Overview of IPEX-LLM Containers for Intel GPU <./docker_windows_gpu.html>`_
* `Run PyTorch Inference on an Intel GPU via Docker <./docker_pytorch_inference_gpu.html>`_
