Commit

modify
ivy-lv11 committed Jun 4, 2024
1 parent d412ad7 commit 7acc862
Showing 1 changed file with 33 additions and 6 deletions.
39 changes: 33 additions & 6 deletions docs/docs/integrations/llms/ipex_llm_gpu.ipynb
@@ -123,9 +123,7 @@
"> For other GPU type, please refer to [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#runtime-configuration) for Windows users, and [here](https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Overview/install_gpu.html#id5) for Linux users.\n",
"\n",
"\n",
"## Basic Usage\n",
"\n",
"Setting `device` to `\"xpu\"` in `model_kwargs` when initializing `IpexLLM` will put the LLM model on Intel GPU and benefit from IPEX-LLM optimizations. Specify the prompt template for your model. In this example, we use the [vicuna-1.5](https://huggingface.co/lmsys/vicuna-7b-v1.5) model. If you're working with a different model, choose a proper template accordingly."
"## Basic Usage\n"
]
},
{
@@ -140,10 +138,39 @@
"from langchain_community.llms import IpexLLM\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"warnings.filterwarnings(\"ignore\", category=UserWarning, message=\".*padding_mask.*\")\n",
"warnings.filterwarnings(\"ignore\", category=UserWarning, message=\".*padding_mask.*\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Specify the prompt template for your model. In this example, we use the [vicuna-1.5](https://huggingface.co/lmsys/vicuna-7b-v1.5) model. If you're working with a different model, choose a proper template accordingly."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"template = \"USER: {question}\\nASSISTANT:\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Load the model locally using IpexLLM using `IpexLLM.from_model_id`. It will load the model directly in its Huggingface format and convert it automatically to low-bit format for inference. Set `device` to `\"xpu\"` in `model_kwargs` when initializing IpexLLM in order to load the LLM model to Intel GPU."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = IpexLLM.from_model_id(\n",
" model_id=\"lmsys/vicuna-7b-v1.5\",\n",
" model_kwargs={\n",
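The diff is truncated at this point. For readers following along, here is a minimal sketch of the complete `IpexLLM.from_model_id` call the truncated cell is building toward. The `temperature`, `max_length`, and `trust_remote_code` values are illustrative assumptions, not taken verbatim from this commit; `device="xpu"` is the setting the notebook text above describes for Intel GPU.

```python
# A minimal sketch, assuming typical IpexLLM settings; not the literal
# contents of the truncated cell above.
from langchain_community.llms import IpexLLM

llm = IpexLLM.from_model_id(
    model_id="lmsys/vicuna-7b-v1.5",
    model_kwargs={
        "temperature": 0,           # assumed decoding setting
        "max_length": 64,           # assumed generation-length cap
        "trust_remote_code": True,  # assumed, as for many HF checkpoints
        "device": "xpu",            # place the model on Intel GPU
    },
)
```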

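As a follow-up, a hedged usage sketch, not part of this commit, showing how the prompt template defined in the diff combines with the loaded model via LangChain's pipe (LCEL) syntax; the question string is an arbitrary example.

```python
from langchain_core.prompts import PromptTemplate

# Same template as in the diff above.
template = "USER: {question}\nASSISTANT:"
prompt = PromptTemplate(template=template, input_variables=["question"])

# Render the template to inspect the exact string sent to the model.
print(prompt.format(question="What is AI?"))

# Compose the prompt with the model (the `llm` from the sketch above) and invoke.
chain = prompt | llm
print(chain.invoke({"question": "What is AI?"}))
```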