
Commit

update
songhappy committed Oct 27, 2023
1 parent ed35455 commit d299ea8
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions python/llm/example/CPU/applications/streaming-llm/README.md
@@ -1,6 +1,6 @@
# Low-Bit Streaming LLM using BigDL-LLM

-In this example, we apply [Streaming-LLM](https://github.com/mit-han-lab/streaming-llm/tree/main#efficient-streaming-language-models-with-attention-sinks) using BigDL-LLM, which can deploy low-bit(including FP4/INT4/FP8/INT8) LLMs for infinite-length inputs.
+In this example, we apply low-bit optimizations to [Streaming-LLM](https://github.com/mit-han-lab/streaming-llm/tree/main#efficient-streaming-language-models-with-attention-sinks) using BigDL-LLM, which can deploy low-bit(including FP4/INT4/FP8/INT8) LLMs for infinite-length inputs.
Only one code change is needed to load the model using bigdl-llm as follows:
```python
from bigdl.llm.transformers import AutoModelForCausalLM
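# (Continuation sketch, not part of this hunk.) The single change is to load the
# model through bigdl-llm with a low-bit option; this assumes the documented
# `load_in_4bit` argument and a placeholder `model_name_or_path`:
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, load_in_4bit=True)
```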
@@ -18,7 +18,7 @@ pip install --pre --upgrade bigdl-llm[all]

## Run Example
```bash
-python ./run_streaming_llama.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --enable_streaming
+python ./run_streaming_llama.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --enable-streaming
```
arguments info:
- `--repo-id-or-model-path`: str value; the Hugging Face repo id of the large language model to be downloaded, or the path to a local Hugging Face checkpoint folder. Defaults to `'meta-llama/Llama-2-7b-chat-hf'`.
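
For reference, a possible invocation using the default model id from the argument above (this assumes access to the `meta-llama/Llama-2-7b-chat-hf` checkpoint and the flag spelling from the updated command):
```bash
python ./run_streaming_llama.py --repo-id-or-model-path meta-llama/Llama-2-7b-chat-hf --enable-streaming
```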
