Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ggml : move rope type enum to ggml.h #8949

Merged
merged 9 commits into from
Aug 13, 2024

Conversation

danbev
Copy link
Collaborator

@danbev danbev commented Aug 9, 2024

This commit moves the llama_rope_type enum from llama.h to ggml.h and changes its name to ggml_rope_type.

The motivation for this change is to address the TODO in llama.h and use the enum in ggml.

Note: This commit does not change the mode parameter to be of type enum ggml_rope_type. The name mode and its usage suggest that it might be more generic and possibly used as a bit field for multiple flags. Further investigation/discussion may be needed to determine if mode should be restricted to RoPE types.

This commit moves the `llama_rope_type` enum from `llama.h` to
`ggml.h` and changes its name to `ggml_rope_type`.

The motivation for this change is to address the TODO in `llama.h` and
use the enum in ggml.

Note: This commit does not change the `mode` parameter to be of type
`enum ggml_rope_type`. The name `mode` and its usage suggest that it
might be more generic and possibly used as a bit field for multiple
flags. Further investigation/discussion may be needed to determine
if `mode` should be restricted to RoPE types.
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Aug 9, 2024
include/llama.h Outdated Show resolved Hide resolved
@slaren
Copy link
Collaborator

slaren commented Aug 9, 2024

If it is a bit field, then my preference would be to declare the different values as constants rather than as an enum. However, I don't think it is a bit field at the moment, even if it used to be in the past. Also, support for GLM rope was removed a while ago, so from what I can tell, the only two valid options are either normal or neox. GGML_ROPE_TYPE_NONE also doesn't really make sense in ggml, I understand that the reason it exists in llama.cpp is to support models that don't use rope, but that shouldn't be carried to ggml.

@ggerganov
Copy link
Owner

Yes make sense. Probably we need ggml_rope_type with "norm" and "neox" in ggml and then another enum llama_rope_type in llama that combines the ggml_rope_type values with an extra "skip" value.

This commit removes GGML_ROPE_TYPE_NONE and GGML_ROPE_TYPE_GLM from
ggml.h, and back the llama_rope_type enum.

I've kept the assert for GGML_ROPE_TYPE_GLM as I'm not sure if it is
safe to remove it yet.
@danbev
Copy link
Collaborator Author

danbev commented Aug 10, 2024

@slaren @ggerganov Thanks for the reviews/feedback! I've updated with a commit containing your suggestions as I understood them.

Would it makes sense to follow up this PR with one that updates the mode parameter to be of type ggml_rope_type for the ggml_rope_* functions?

@ggerganov
Copy link
Owner

If it is a bit field, then my preference would be to declare the different values as constants rather than as an enum.

@slaren Sorry, I overlooked this yesterday. What's the advantage of having the values as constants compared to enum? And to become a bit field, do you mean that the 0 values should be changed to 1?

@danbev It would be nice to make the change in a single PR. But let's first clarify how to express the rope types before we proceed.

@mofosyne mofosyne added the Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix label Aug 10, 2024
@slaren
Copy link
Collaborator

slaren commented Aug 10, 2024

What's the advantage of having the values as constants compared to enum?

I just think it is clearer to define the flags as int constants, because an enum is only meant to be used with one value at a time, not as a combination of flags. The type created by the enum would also not be usable (you can still use it as an int in C due to implicit casts, but in C++ that would require explicit casts).

And to become a bit field, do you mean that the 0 values should be changed to 1?

What I meant is that in practice only one value can be used at the same time, so there is no reason to make this a bit field, and using an enum is fine.

@ggerganov
Copy link
Owner

Ok, got it. So how about we define NeoX type like this and we don't even need the "norm" type within ggml:

diff --git a/ggml/include/ggml.h b/ggml/include/ggml.h
index 22f5dfed..a7ee17d2 100644
--- a/ggml/include/ggml.h
+++ b/ggml/include/ggml.h
@@ -244,6 +244,8 @@
 #define GGML_EXIT_SUCCESS 0
 #define GGML_EXIT_ABORTED 1
 
+#define GGML_ROPE_TYPE_NEOX 2
+
 #define GGUF_MAGIC "GGUF"
 
 #define GGUF_VERSION 3
@@ -437,12 +439,6 @@ extern "C" {
         GGML_FTYPE_MOSTLY_Q4_0_8_8 = 27, // except 1d tensors
     };
 
-    // Rotary Positional Embedding (RoPE) types
-    enum ggml_rope_type {
-        GGML_ROPE_TYPE_NORM =  0,
-        GGML_ROPE_TYPE_NEOX =  2,
-    };
-
     // available tensor operations:
     enum ggml_op {
         GGML_OP_NONE = 0,

And we keep the mode argument ggml_rope as int so that in the future we could use it as a bit field to store multiple options if needed.

@danbev We would need to update mode & 2 -> mode & GGML_ROPE_TYPE_NEOX in the backends as ell.

This commit removes the enum ggml_rope_type from ggml.h and replaces it
with a define (GGML_ROPE_TYPE_NEOX). This define is used in the code to
check if the mode is set to GPT-NeoX. Also the enum llama_rope_type has
been updated to reflect this change.
@github-actions github-actions bot added Nvidia GPU Issues specific to Nvidia GPUs Vulkan Issues specific to the Vulkan backend SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language labels Aug 11, 2024
danbev added 2 commits August 11, 2024 08:16
This commit contains a suggestion enable the GGML_ROPE_TYPE_NEOX
macro/define to be passed to the shader compiler.
This commit fixes the editorconfig-checker warnings.
ggml/include/ggml.h Show resolved Hide resolved
ggml/src/CMakeLists.txt Outdated Show resolved Hide resolved
danbev added 3 commits August 11, 2024 10:00
Update comment for ggml_rope function.
Add GGML_ROPE_TYPE_NEOX to rope_common.comp.
@ggerganov ggerganov requested a review from slaren August 11, 2024 08:17
include/llama.h Outdated Show resolved Hide resolved
@slaren slaren merged commit 06943a6 into ggerganov:master Aug 13, 2024
50 of 51 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
* ggml : move rope type enum to ggml.h

This commit moves the `llama_rope_type` enum from `llama.h` to
`ggml.h` and changes its name to `ggml_rope_type`.

The motivation for this change is to address the TODO in `llama.h` and
use the enum in ggml.

Note: This commit does not change the `mode` parameter to be of type
`enum ggml_rope_type`. The name `mode` and its usage suggest that it
might be more generic and possibly used as a bit field for multiple
flags. Further investigation/discussion may be needed to determine
if `mode` should be restricted to RoPE types.

* squash! ggml : move rope type enum to ggml.h

This commit removes GGML_ROPE_TYPE_NONE and GGML_ROPE_TYPE_GLM from
ggml.h, and back the llama_rope_type enum.

I've kept the assert for GGML_ROPE_TYPE_GLM as I'm not sure if it is
safe to remove it yet.

* squash! ggml : move rope type enum to ggml.h

This commit removes the enum ggml_rope_type from ggml.h and replaces it
with a define (GGML_ROPE_TYPE_NEOX). This define is used in the code to
check if the mode is set to GPT-NeoX. Also the enum llama_rope_type has
been updated to reflect this change.

* squash! ggml : move rope type enum to ggml.h

This commit contains a suggestion enable the GGML_ROPE_TYPE_NEOX
macro/define to be passed to the shader compiler.

* squash! ggml : move rope type enum to ggml.h

This commit fixes the editorconfig-checker warnings.

* squash! ggml : move rope type enum to ggml.h

Update comment for ggml_rope function.

* Revert "squash! ggml : move rope type enum to ggml.h"

This reverts commit 6261222.

* squash! ggml : move rope type enum to ggml.h

Add GGML_ROPE_TYPE_NEOX to rope_common.comp.

* remove extra line

---------

Co-authored-by: slaren <[email protected]>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
* ggml : move rope type enum to ggml.h

This commit moves the `llama_rope_type` enum from `llama.h` to
`ggml.h` and changes its name to `ggml_rope_type`.

The motivation for this change is to address the TODO in `llama.h` and
use the enum in ggml.

Note: This commit does not change the `mode` parameter to be of type
`enum ggml_rope_type`. The name `mode` and its usage suggest that it
might be more generic and possibly used as a bit field for multiple
flags. Further investigation/discussion may be needed to determine
if `mode` should be restricted to RoPE types.

* squash! ggml : move rope type enum to ggml.h

This commit removes GGML_ROPE_TYPE_NONE and GGML_ROPE_TYPE_GLM from
ggml.h, and back the llama_rope_type enum.

I've kept the assert for GGML_ROPE_TYPE_GLM as I'm not sure if it is
safe to remove it yet.

* squash! ggml : move rope type enum to ggml.h

This commit removes the enum ggml_rope_type from ggml.h and replaces it
with a define (GGML_ROPE_TYPE_NEOX). This define is used in the code to
check if the mode is set to GPT-NeoX. Also the enum llama_rope_type has
been updated to reflect this change.

* squash! ggml : move rope type enum to ggml.h

This commit contains a suggestion enable the GGML_ROPE_TYPE_NEOX
macro/define to be passed to the shader compiler.

* squash! ggml : move rope type enum to ggml.h

This commit fixes the editorconfig-checker warnings.

* squash! ggml : move rope type enum to ggml.h

Update comment for ggml_rope function.

* Revert "squash! ggml : move rope type enum to ggml.h"

This reverts commit 6261222.

* squash! ggml : move rope type enum to ggml.h

Add GGML_ROPE_TYPE_NEOX to rope_common.comp.

* remove extra line

---------

Co-authored-by: slaren <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language Vulkan Issues specific to the Vulkan backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants