Is this a typo of rotary embedding (rope)? #184

Open
xilanhua12138 opened this issue Jan 14, 2025 · 0 comments


xilanhua12138 commented Jan 14, 2025

```python
def forward(self, hidden_states: torch.Tensor, frame_id: int = None) -> torch.Tensor:
    batch_size, num_channels, num_frames, height, width = hidden_states.shape
    rope_sizes = [num_frames // self.patch_size_t, height // self.patch_size, width // self.patch_size]

    axes_grids = []
    for i in range(3):
        # Note: The following line diverges from original behaviour. We create the grid on the device, whereas
        # original implementation creates it on CPU and then moves it to device. This results in numerical
        # differences in layerwise debugging outputs, but visually it is the same.
        grid = torch.arange(0, rope_sizes[i], device=hidden_states.device, dtype=torch.float32)
        axes_grids.append(grid)
    grid = torch.meshgrid(*axes_grids, indexing="ij")  # [W, H, T]
    grid = torch.stack(grid, dim=0)  # [3, W, H, T]
```

Since `rope_sizes` is built in `[T, H, W]` order and `torch.meshgrid(..., indexing="ij")` preserves the order of its inputs, shouldn't these comments read `[T, H, W]` rather than `[W, H, T]`? Am I thinking right?
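
For reference, here is a minimal standalone check (with made-up sizes, not from the repo) showing that `torch.meshgrid(..., indexing="ij")` keeps the input axis order:

```python
import torch

# Hypothetical [T, H, W] sizes, standing in for rope_sizes above.
rope_sizes = [4, 6, 8]  # T=4, H=6, W=8
axes_grids = [torch.arange(0, s, dtype=torch.float32) for s in rope_sizes]

# With indexing="ij", each output tensor has shape [T, H, W] -- the input order.
grid = torch.meshgrid(*axes_grids, indexing="ij")
grid = torch.stack(grid, dim=0)
print(grid.shape)  # torch.Size([3, 4, 6, 8]), i.e. [3, T, H, W], not [3, W, H, T]
```

If that's right, the tensors are already ordered `[T, H, W]` and only the comments are mislabeled.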
