[feat] on-paper form of RoPE #61

Open
yundai424 opened this issue Aug 24, 2024 · 2 comments
Labels
enhancement New feature or request feature good first issue Good for newcomers

Comments

@yundai424
Collaborator

yundai424 commented Aug 24, 2024

🚀 The feature, motivation and pitch

Right now, our RoPE implementation assumes the rotation matrix is built and applied the way the HuggingFace model code does it. That is, instead of the rotation matrix described in the original RoPE paper (https://arxiv.org/pdf/2104.09864), which rotates adjacent pairs $(q_{2i}, q_{2i+1})$, we assume a layout that pairs $q_i$ with $q_{i+d/2}$:

$$
\begin{pmatrix}
\cos m \theta_0 & 0 & 0 & \dots & 0 & -\sin m \theta_0 & 0 & 0 & \dots & 0 \\
0 & \cos m \theta_1 & 0 & \dots & 0 & 0 & -\sin m \theta_1 & 0 & \dots & 0 \\
0 & 0 & \cos m \theta_2 & \dots & 0 & 0 & 0 & -\sin m \theta_2 & \dots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \dots & \cos m \theta_{d/2-1} & 0 & 0 & 0 & \dots & -\sin m \theta_{d/2-1} \\
\sin m \theta_0 & 0 & 0 & \dots & 0 & \cos m \theta_0 & 0 & 0 & \dots & 0 \\
0 & \sin m \theta_1 & 0 & \dots & 0 & 0 & \cos m \theta_1 & 0 & \dots & 0 \\
0 & 0 & \sin m \theta_2 & \dots & 0 & 0 & 0 & \cos m \theta_2 & \dots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \dots & \sin m \theta_{d/2-1} & 0 & 0 & 0 & \dots & \cos m \theta_{d/2-1}
\end{pmatrix}
\times
\begin{pmatrix} q_0 \\ q_1 \\ q_2 \\ \vdots \\ q_{d/2-1} \\ q_{d/2} \\ q_{d/2+1} \\ q_{d/2+2} \\ \vdots \\ q_{d-1} \end{pmatrix}
$$
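
For comparison, the on-paper form is block-diagonal and rotates adjacent pairs. It is written here with the same 0-based $\theta_i$ indexing as the matrix above, which is a choice made for consistency (the paper itself indexes from 1):

$$
\begin{pmatrix}
\cos m \theta_0 & -\sin m \theta_0 & 0 & 0 & \dots & 0 & 0 \\
\sin m \theta_0 & \cos m \theta_0 & 0 & 0 & \dots & 0 & 0 \\
0 & 0 & \cos m \theta_1 & -\sin m \theta_1 & \dots & 0 & 0 \\
0 & 0 & \sin m \theta_1 & \cos m \theta_1 & \dots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \dots & \cos m \theta_{d/2-1} & -\sin m \theta_{d/2-1} \\
0 & 0 & 0 & 0 & \dots & \sin m \theta_{d/2-1} & \cos m \theta_{d/2-1}
\end{pmatrix}
\times
\begin{pmatrix} q_0 \\ q_1 \\ q_2 \\ q_3 \\ \vdots \\ q_{d-2} \\ q_{d-1} \end{pmatrix}
$$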

We should also support use cases where people create their RoPE cos & sin buffers following the original formula.
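
To make the difference concrete, here is a minimal NumPy sketch of the two layouts. It is not our kernel code: the function names (`rope_angles`, `rope_half_split`, `rope_interleaved`, `interleaved_to_half_split`) are hypothetical, and the base of 10000 is assumed from the paper's default. It suggests that the two forms agree up to a fixed permutation of the head dimension, so supporting the on-paper form may mostly come down to how the elements are indexed.

```python
import numpy as np

def rope_angles(seq_len, dim, base=10000.0):
    """angles[m, i] = m * theta_i with theta_i = base^(-2i/dim), i = 0..dim/2-1."""
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    return np.outer(np.arange(seq_len), inv_freq)  # shape (seq_len, dim/2)

def rope_half_split(x, angles):
    """HuggingFace-style layout: rotates the pairs (x_i, x_{i+dim/2})."""
    cos = np.concatenate([np.cos(angles), np.cos(angles)], axis=-1)
    sin = np.concatenate([np.sin(angles), np.sin(angles)], axis=-1)
    half = x.shape[-1] // 2
    # "rotate_half": (x_lo, x_hi) -> (-x_hi, x_lo)
    rotated = np.concatenate([-x[..., half:], x[..., :half]], axis=-1)
    return x * cos + rotated * sin

def rope_interleaved(x, angles):
    """On-paper layout: rotates the adjacent pairs (x_{2i}, x_{2i+1})."""
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x_even * cos - x_odd * sin
    out[..., 1::2] = x_even * sin + x_odd * cos
    return out

def interleaved_to_half_split(x):
    """Permutation mapping the on-paper layout to the half-split layout."""
    return np.concatenate([x[..., 0::2], x[..., 1::2]], axis=-1)

rng = np.random.default_rng(0)
q = rng.standard_normal((5, 8))   # (seq_len, head_dim)
angles = rope_angles(5, 8)

# Same rotation, different element order.
lhs = rope_half_split(interleaved_to_half_split(q), angles)
rhs = interleaved_to_half_split(rope_interleaved(q, angles))
print(np.allclose(lhs, rhs))
```

Applying the two functions directly to the same unpermuted `q` gives different results, which is the mismatch this issue is about.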

Alternatives

We may also need to consider the complex form, i.e. what the official Meta Llama code does: https://github.com/meta-llama/llama/blob/6c7fe276574e78057f917549435a2554000a876d/llama/model.py#L64-L74
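
As a rough illustration of the complex form, here is a NumPy sketch (not the Meta code, which uses PyTorch). The name `rope_complex` and the base of 10000 are assumptions. Each adjacent pair $(q_{2i}, q_{2i+1})$ is treated as the complex number $q_{2i} + i\,q_{2i+1}$ and multiplied by $e^{i m \theta_i}$, which should give the same result as the on-paper rotation of adjacent pairs.

```python
import numpy as np

def rope_complex(x, angles):
    """Rotate adjacent pairs of x by viewing them as complex numbers.

    x:      (..., seq_len, dim) real array, dim even
    angles: (seq_len, dim/2) array with angles[m, i] = m * theta_i
    """
    pairs = x.reshape(*x.shape[:-1], -1, 2)      # (..., seq_len, dim/2, 2)
    z = pairs[..., 0] + 1j * pairs[..., 1]       # x_{2i} + i * x_{2i+1}
    z_rot = z * np.exp(1j * angles)              # multiply by e^{i m theta_i}
    out = np.stack([z_rot.real, z_rot.imag], axis=-1)
    return out.reshape(x.shape)                  # back to the interleaved layout

seq_len, dim = 4, 6
inv_freq = 10000.0 ** (-np.arange(0, dim, 2) / dim)   # theta_i, assumed base
angles = np.outer(np.arange(seq_len), inv_freq)

rng = np.random.default_rng(0)
q = rng.standard_normal((seq_len, dim))

# Reference: explicit 2x2 rotation of each adjacent pair.
cos, sin = np.cos(angles), np.sin(angles)
expected = np.empty_like(q)
expected[:, 0::2] = q[:, 0::2] * cos - q[:, 1::2] * sin
expected[:, 1::2] = q[:, 0::2] * sin + q[:, 1::2] * cos

print(np.allclose(rope_complex(q, angles), expected))
```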

Additional context

No response

@yundai424 yundai424 added the enhancement New feature or request label Aug 24, 2024
@ByronHsu ByronHsu added the good first issue Good for newcomers label Aug 24, 2024
@yundai424 yundai424 changed the title on-paper form of RoPE [feat] on-paper form of RoPE Aug 24, 2024
@Himanshunitrr

#take @ByronHsu I would like to implement this. Can you assign it to me?

@Comet0322
Contributor

#take I made a PR, please take a look, thanks @ByronHsu
