[pull] main from mlc-ai:main #296

Merged
merged 3 commits into from
Oct 15, 2024

@pull pull bot commented Oct 14, 2024

See Commits and Changes for more details.


Created by pull[bot]

rickzx and others added 3 commits October 13, 2024 13:37
This PR implements the DeepSeek-V2 Model architecture:
https://huggingface.co/deepseek-ai/DeepSeek-V2-Lite/blob/main/modeling_deepseek.py.

Notable changes from the common LLM architecture include:
- Multi-head Latent Attention (MLA)
- YaRN rotary positional embeddings
- DeepSeekMoE

Example execution on an M2 Ultra:
```
% mlc_llm chat ../models/DeepSeek-V2-Lite-Chat-MLC-q0f16 --model-lib ../models/DeepSeek-V2-Lite-Chat-MLC-q0f16/model.dylib
>>> who are you?
 I am an AI assistant created by DeepSeek to be helpful and harmless.
```

TODO:
- Currently the model architecture only supports DeepSeek-V2-Lite.
  To support DeepSeek-V2, we also need to support the
  `group_limited_greedy` strategy.
- Support tensor parallelism > 1.
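
For context, the following is a minimal pure-Python sketch of what a group-limited greedy expert selection might look like for a single token, based on one reading of the referenced `modeling_deepseek.py`. It is not code from this PR. The function name, parameter names, and example scores are illustrative assumptions: experts are split into equal groups, only the groups with the highest best-expert score are kept, and the top-k experts are then chosen from those groups alone.

```python
def group_limited_greedy(scores, n_group, topk_group, top_k):
    """Pick top_k expert indices, restricted to the best topk_group groups.

    scores: per-expert routing scores for one token (hypothetical input).
    Assumes len(scores) is divisible by n_group.
    """
    group_size = len(scores) // n_group
    # Score each group by its single best expert.
    group_scores = [
        max(scores[g * group_size:(g + 1) * group_size]) for g in range(n_group)
    ]
    # Keep only the highest-scoring groups.
    kept = sorted(range(n_group), key=lambda g: group_scores[g], reverse=True)
    kept = kept[:topk_group]
    # Candidate experts are those that live in a kept group.
    candidates = [
        i for g in kept for i in range(g * group_size, (g + 1) * group_size)
    ]
    # Plain greedy top-k over the remaining candidates.
    chosen = sorted(candidates, key=lambda i: scores[i], reverse=True)[:top_k]
    return sorted(chosen)


example_scores = [0.1, 0.9, 0.8, 0.2, 0.7, 0.6, 0.05, 0.3]
selected = group_limited_greedy(example_scores, n_group=4, topk_group=2, top_k=3)
```

In this example an unrestricted top-3 would select experts 1, 2, and 4, whereas the group-limited variant drops expert 4 because its group is not among the two best groups.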

This PR fixes a bug in the streamer's handling of UTF-8 characters.
Prior to this PR, the streamer assumed that a replacement
character (`�`) always corresponds to an entire token. However, with
the Qwen2 tokenizer, some tokens decode to ` �` when decoded directly,
which breaks this assumption and causes the streamer to produce
incorrect output.

This PR fixes this issue with a safer behavior that does not rely
on such an assumption.
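
To illustrate the failure mode, here is a minimal sketch of a streamer that does not assume a replacement character covers a whole token. It is not the implementation in this PR. The byte-level `VOCAB`, the `decode` helper, and the `SafeStreamer` class are all hypothetical, and the approach shown (hold back only the trailing replacement characters and emit everything before them) is one possible safe behavior.

```python
REPLACEMENT = "\ufffd"

# Hypothetical byte-level vocabulary. Token 1 ends in the first byte of a
# three-byte UTF-8 sequence, so decoding it alone yields " \ufffd".
VOCAB = {
    0: b"hi",
    1: b" \xe4",
    2: b"\xbd\xa0",  # completes U+4F60 when it follows token 1
}


def decode(token_ids):
    """Decode token ids to text, replacing incomplete UTF-8 sequences."""
    raw = b"".join(VOCAB[t] for t in token_ids)
    return raw.decode("utf-8", errors="replace")


class SafeStreamer:
    """Emit only text that cannot change when later tokens arrive."""

    def __init__(self):
        self.pending = []  # tokens whose decoded text is not fully emitted
        self.emitted = 0   # characters of the pending text already emitted

    def put(self, token_id):
        self.pending.append(token_id)
        text = decode(self.pending)
        if text.endswith(REPLACEMENT):
            # Hold back the unfinished tail, but emit what precedes it.
            stable = text.rstrip(REPLACEMENT)
            out = stable[self.emitted:]
            self.emitted = len(stable)
            return out
        out = text[self.emitted:]
        self.pending, self.emitted = [], 0
        return out

    def finish(self):
        """Flush whatever is left, including genuine replacement characters."""
        out = decode(self.pending)[self.emitted:]
        self.pending, self.emitted = [], 0
        return out


streamer = SafeStreamer()
chunks = [streamer.put(t) for t in [0, 1, 2]]
```

Here the leading space of token 1 is emitted immediately, while its dangling byte is held until token 2 completes the character, so the concatenated chunks match the text obtained by decoding all tokens at once.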
@pull pull bot added the ⤵️ pull label Oct 15, 2024
@pull pull bot merged commit fead3e5 into kp-forks:main Oct 15, 2024
1 check passed