Implement DeepSeekV2 #1010

Merged: 19 commits, Jan 5, 2025
EricLBuehler (Owner) commented Dec 27, 2024

DeepSeek V3 will be implemented after this PR.

Implementation TODOs:

  • Attention
    • Load
    • Forward
  • RoPE (yarn)
    • Load
    • Forward
  • DeepseekV2MoE
    • DeepseekV2MLP
    • MoEGate
      • Forward
    • Forward
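As an illustration of the MoEGate forward item above, here is a minimal sketch of top-k expert routing for a single token. It is not the code from this PR. It assumes softmax scoring followed by plain greedy top-k selection, with optional renormalization of the selected weights. The names `softmax`, `moe_gate_topk`, and `norm_topk_prob` are hypothetical, and the sketch works on plain slices rather than tensors. Grouped expert selection and routed scaling are left out.

```rust
/// Hypothetical helper: numerically stable softmax over one token's router logits.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}

/// Hypothetical gate: returns (expert index, weight) pairs for the `top_k`
/// highest-scoring experts, ordered from highest to lowest score.
fn moe_gate_topk(logits: &[f32], top_k: usize, norm_topk_prob: bool) -> Vec<(usize, f32)> {
    let probs = softmax(logits);
    let mut indexed: Vec<(usize, f32)> = probs.into_iter().enumerate().collect();

    // Sort by descending probability and keep only the top-k experts.
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    indexed.truncate(top_k);

    // Optionally renormalize so the selected weights sum to 1.
    if norm_topk_prob {
        let sum: f32 = indexed.iter().map(|(_, w)| *w).sum();
        for (_, w) in indexed.iter_mut() {
            *w /= sum;
        }
    }
    indexed
}

fn main() {
    // Four experts, route the token to the best two.
    let logits = [0.1_f32, 2.0, -1.0, 1.5];
    let routed = moe_gate_topk(&logits, 2, true);
    for (expert, weight) in &routed {
        println!("expert {expert}: weight {weight:.3}");
    }
}
```

In a full MoE layer, each selected expert's MLP output would be scaled by its weight and summed. That step is omitted here.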

EricLBuehler added the "new feature" label Dec 27, 2024

github-actions bot commented Dec 27, 2024

Code Metrics Report
===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           41           22           10            9
 JSON                   12          105          104            0            1
 Python                 63         2706         2338           71          297
 Shell                   1           57           22           18           17
 Plain Text              3         3723            0         2413         1310
 TOML                   18          605          539            2           64
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          205          178            1           26
 (Total)                            282          210           32           40
-------------------------------------------------------------------------------
 Markdown               43         3333            0         2526          807
 |- BASH                 6          103          100            0            3
 |- JSON                 1           12           12            0            0
 |- Python               7          121          109            0           12
 |- Rust                12          406          344            0           62
 |- TOML                 2           75           63            0           12
 (Total)                           4050          628         2526          896
-------------------------------------------------------------------------------
 Rust                  296        89888        80666         1863         7359
 |- Markdown           143         1593           25         1448          120
 (Total)                          91481        80691         3311         7479
===============================================================================
 Total                 445       100514        83738         6905         9871
===============================================================================
  

EricLBuehler merged commit a562fd0 into master Jan 5, 2025
12 checks passed
EricLBuehler deleted the deepseek2 branch January 5, 2025 00:35
EricLBuehler mentioned this pull request Jan 5, 2025
11 tasks