llama : advanced batch splits
This includes equal-sequence-length batch splits, which are useful for
simplifying recurrent model operators.

* llama : always make recurrent state slots contiguous

* ggml : simplify mamba operators
compilade committed Jul 17, 2024
1 parent a38b884 commit c51daef
Showing 3 changed files with 1,060 additions and 647 deletions.
9 changes: 3 additions & 6 deletions ggml/include/ggml.h
@@ -1760,10 +1760,8 @@ extern "C" {

     GGML_API struct ggml_tensor * ggml_ssm_conv(
             struct ggml_context * ctx,
-            struct ggml_tensor  * s,
-            struct ggml_tensor  * x,
-            struct ggml_tensor  * c,
-            struct ggml_tensor  * sq);
+            struct ggml_tensor  * sx,
+            struct ggml_tensor  * c);

     GGML_API struct ggml_tensor * ggml_ssm_scan(
             struct ggml_context * ctx,
@@ -1772,8 +1770,7 @@ extern "C" {
             struct ggml_tensor  * dt,
             struct ggml_tensor  * A,
             struct ggml_tensor  * B,
-            struct ggml_tensor  * C,
-            struct ggml_tensor  * sq);
+            struct ggml_tensor  * C);

// partition into non-overlapping windows with padding if needed
// example:
