Fix incorrect use of `ctx_split` for bias tensors #9063

suhara · 2024-08-17T06:33:26Z

Creating a separate PR to fix the incorrect use of `ctx_split` for bias tensors, following @slaren's suggestion in #8922.

Please see this message for details.

Citing @slaren's comment:

> `ctx_split` only makes a difference when using tensor parallelism with `-sm row`, which is only supported on the CUDA backend when using multiple GPUs. When using `-sm row`, `ctx_split` splits the rows of the matrix between the available GPUs. This is only supported for matrix multiplication, so it should only be used with the matrix portion of linear/dense layers. The other cases are also wrong and should be corrected as well, but it doesn't need to be done here.

As far as I can see, there are four such lines, all of which are fixed in this PR.
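To illustrate the rule quoted above, here is a minimal sketch and not the actual diff. The names `TensorKind`, `BufferCtx`, and `pick_ctx` are hypothetical and do not exist in llama.cpp; they only model the idea that the split context is reserved for matrix-multiplication weights, while bias tensors go to a non-split context.

```cpp
// Hypothetical model of the rule: only the matrix part of a linear/dense
// layer may be row-split across GPUs. These types are illustrative only
// and are not part of llama.cpp or ggml.
enum class TensorKind { MatmulWeight, Bias, Other };
enum class BufferCtx  { NonSplit, Split };

// Returns the context a tensor should be created in.
// Bias tensors (and anything else not used as a matmul weight) must not
// use the split context, because row splitting is only supported for
// matrix multiplication.
constexpr BufferCtx pick_ctx(TensorKind kind) {
    return kind == TensorKind::MatmulWeight ? BufferCtx::Split
                                            : BufferCtx::NonSplit;
}
```

Under this model, the lines fixed here correspond to bias tensors that were requesting `BufferCtx::Split` when they should have requested `BufferCtx::NonSplit`.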

Special thanks to @slaren !

I have read the contributing guidelines
Self-reported review complexity:
- Low
- Medium
- High

slaren

Thanks for fixing this.

Fix incorrect use of ctx_split for bias tensors

b1c163b

slaren approved these changes Aug 17, 2024


slaren merged commit 2fb9267 into ggerganov:master Aug 17, 2024
51 of 52 checks passed

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024

Fix incorrect use of ctx_split for bias tensors (ggerganov#9063)

4999a33

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024

Fix incorrect use of ctx_split for bias tensors (ggerganov#9063)

b84db8f
