
Commit

update
songhappy committed Oct 27, 2023
1 parent ed35455 commit d299ea8
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions python/llm/example/CPU/applications/streaming-llm/README.md
@@ -1,6 +1,6 @@
# Low-Bit Streaming LLM using BigDL-LLM

-In this example, we apply [Streaming-LLM](https://github.com/mit-han-lab/streaming-llm/tree/main#efficient-streaming-language-models-with-attention-sinks) using BigDL-LLM, which can deploy low-bit(including FP4/INT4/FP8/INT8) LLMs for infinite-length inputs.
+In this example, we apply low-bit optimizations to [Streaming-LLM](https://github.com/mit-han-lab/streaming-llm/tree/main#efficient-streaming-language-models-with-attention-sinks) using BigDL-LLM, which can deploy low-bit(including FP4/INT4/FP8/INT8) LLMs for infinite-length inputs.
Only one code change is needed to load the model using bigdl-llm as follows:
```python
from bigdl.llm.transformers import AutoModelForCausalLM
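# (Continuation sketch, not part of this hunk.) The single change is to load the
# model through bigdl-llm with a low-bit option; this assumes the documented
# `load_in_4bit` argument and a placeholder `model_name_or_path`:
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, load_in_4bit=True)
```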
@@ -18,7 +18,7 @@ pip install --pre --upgrade bigdl-llm[all]

## Run Example
```bash
-python ./run_streaming_llama.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --enable_streaming
+python ./run_streaming_llama.py --repo-id-or-model-path REPO_ID_OR_MODEL_PATH --enable-streaming
```
arguments info:
- `--repo-id-or-model-path`: str value; the Hugging Face repo id of the large language model to be downloaded, or the path to a local Hugging Face checkpoint folder. Defaults to `'meta-llama/Llama-2-7b-chat-hf'`.
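
For reference, a possible invocation using the default model id from the argument above (this assumes access to the `meta-llama/Llama-2-7b-chat-hf` checkpoint and the flag spelling from the updated command):
```bash
python ./run_streaming_llama.py --repo-id-or-model-path meta-llama/Llama-2-7b-chat-hf --enable-streaming
```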
