CUDA: fix MMQ stream-k for --split-mode row (#8167)
JohannesGaessler authored Jun 27, 2024
1 parent f675b20 commit 85a267d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion ggml/src/ggml-cuda/mmq.cuh
@@ -2475,7 +2475,7 @@ static void launch_mul_mat_q(ggml_backend_cuda_context & ctx, const mmq_args & a

const dim3 block_nums_mmq(nsm, 1, 1);

- ggml_cuda_pool & pool = ctx.pool();
+ ggml_cuda_pool & pool = ctx.pool(id);
ggml_cuda_pool_alloc<float> tmp_fixup(pool, block_nums_mmq.x * mmq_x*mmq_y);

if (args.ne01 % mmq_y == 0) {
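
The one-line change above can be illustrated with a small host-only model. This is a sketch under stated assumptions, not the ggml implementation: `toy_pool`, `toy_context`, and `alloc_fixup` are hypothetical names, and the sketch assumes that the no-argument `pool()` resolves to the pool of the context's main device while `id` is the index of the device the kernel runs on (the hunk itself shows neither). Under those assumptions, with `--split-mode row` the stream-k fixup buffer for a non-main device would be taken from the wrong device's pool unless the device index is passed explicitly.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a per-device memory pool (not the real ggml_cuda_pool).
struct toy_pool {
    int         device    = -1; // device that owns this pool's memory
    std::size_t allocated = 0;  // number of floats handed out so far
};

// Hypothetical stand-in for ggml_backend_cuda_context.
struct toy_context {
    static constexpr int max_devices = 4;

    int main_device = 0;
    std::array<toy_pool, max_devices> pools;

    toy_context() {
        for (int i = 0; i < max_devices; ++i) {
            pools[i].device = i;
        }
    }

    // Pool of an explicitly named device.
    toy_pool & pool(int device) { return pools[device]; }

    // ASSUMPTION: the no-argument overload falls back to the main device.
    toy_pool & pool() { return pool(main_device); }
};

// Mimics the shape of the tmp_fixup allocation in the hunk.
// Returns the device whose pool served the request.
int alloc_fixup(toy_context & ctx, int id, bool use_fix, std::size_t n_floats) {
    toy_pool & pool = use_fix ? ctx.pool(id) : ctx.pool();
    pool.allocated += n_floats;
    return pool.device;
}
```

In this model, calling `alloc_fixup(ctx, 2, false, n)` charges the allocation to device 0 even though the work belongs to device 2, whereas `alloc_fixup(ctx, 2, true, n)` charges it to device 2, which is the behavior the `ctx.pool(id)` form asks for.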
