
Unexpected behavior for GGML_MEAN #1005

Open · JohannesGaessler opened this issue Nov 2, 2024 · 8 comments

Labels: good first issue (Good for newcomers)

Comments

@JohannesGaessler (Collaborator)

The usual behavior for the "mean" operation in numerical frameworks is a reduction of the tensor to a single value. In GGML, however, this operation calculates the mean per row. This is, I think, unexpected behavior, and the naming is also inconsistent with GGML_SUM (per tensor) and GGML_SUM_ROWS (per row).
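
For illustration, a minimal sketch (assuming a recent ggml checkout and the public API from ggml.h; untested) that makes the per-row behavior visible:

```c
// Minimal sketch: ggml_mean() on a 4x2 tensor yields one mean per row
// (shape [1, 2]) rather than a single scalar.
#include <stdio.h>
#include "ggml.h"

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // 2 rows of 4 elements each, all set to 1.0f
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 4, 2);
    ggml_set_f32(a, 1.0f);

    struct ggml_tensor * m = ggml_mean(ctx, a);

    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, m);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads =*/ 1);

    // prints "mean shape: [1, 2]" - a per-row mean, not a full reduction
    printf("mean shape: [%lld, %lld]\n", (long long) m->ne[0], (long long) m->ne[1]);

    ggml_free(ctx);
    return 0;
}
```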

@JohannesGaessler added the good first issue label on Nov 2, 2024
@aksh16 commented Nov 25, 2024

Hi, is this issue still up for grabs? If it is, may I take this up?

@slaren (Collaborator) commented Nov 25, 2024

Most operations already operate on a per-row basis. If you want a full reduction, in most cases you can achieve it by creating a 1D view (at worst you might have to make it contiguous first). It might make more sense to standardize on that rather than duplicating the ops between global and per-row versions.
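
For instance (a sketch using names from ggml.h; ggml_cont() is only strictly needed when the tensor is not contiguous):

```c
// Sketch: turn the per-row ggml_mean() into a full reduction by
// flattening the tensor into a single row first.
struct ggml_tensor * full_mean(struct ggml_context * ctx, struct ggml_tensor * a) {
    struct ggml_tensor * flat =
        ggml_reshape_1d(ctx, ggml_cont(ctx, a), ggml_nelements(a));
    return ggml_mean(ctx, flat); // one row -> one mean over all elements
}
```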

@JohannesGaessler (Collaborator, Author)

I am not aware of anyone working on this at the moment. If desirable, we can talk for a few minutes so that I can explain the general structure of the code.

> Most operations already operate on a per-row basis. If you want a full reduction, in most cases you can achieve it by creating a 1D view (at worst you might have to make it contiguous first). It might make more sense to standardize on that rather than duplicating the ops between global and per-row versions.

As of right now a per-tensor mean is not used for anything, so performance would be completely irrelevant anyway. I think ggml_scale(ggml_sum(a), 1.0f/ggml_nelements(a)) would be a simpler workaround than calculating a per-row mean, though. The minimal solution would, I think, be to just rename the op, since the confusing name is in my opinion the biggest issue.
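
Written out with the API's explicit context argument, that workaround might look like this (a sketch, assuming the current float-argument ggml_scale() signature):

```c
// Sketch of the suggested workaround: ggml_sum() reduces the whole
// tensor to a single F32 value; scaling by 1/n then gives the mean.
struct ggml_tensor * mean_all(struct ggml_context * ctx, struct ggml_tensor * a) {
    return ggml_scale(ctx, ggml_sum(ctx, a), 1.0f/ggml_nelements(a));
}
```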

@slaren (Collaborator) commented Nov 25, 2024

It might make more sense to remove ggml_sum entirely and keep only ggml_sum_rows (and rename it to ggml_sum), and change other operations that operate on all the elements to operate per-row. Most operations already operate on rows, so why should sum or mean be different? And again, if you want to operate on all the elements, you can in most cases make a 1D view.

@JohannesGaessler (Collaborator, Author)

I would also be fine with having only a single op for summation and defining reductive GGML ops to be per-row by default (but we should then document this, since the default in e.g. NumPy is per tensor). As far as I can tell, the ops currently doing per-tensor reductions are SUM, COUNT_EQUAL, and CROSS_ENTROPY_LOSS. SUM and COUNT_EQUAL can be removed/converted to per-row logic without issue, but I think CROSS_ENTROPY_LOSS cannot, because for that operation the shape of the input matters.

@aksh16 commented Nov 27, 2024

So, these could be the tasks for this issue?

  • rename ggml_sum_rows to ggml_sum
  • remove the current ggml_sum
  • update count_equal to per-row logic (see the sketch after this list)
  • update documentation to reflect these changes (the per-row ops)
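
To make the count_equal item concrete, here is a hypothetical plain-C reference for the per-row semantics (illustration only; the helper name and layout are assumptions, not the actual ggml implementation):

```c
// Hypothetical reference for per-row count_equal semantics: for two
// tensors with n_rows rows of row_len elements each, produce one count
// per row instead of a single per-tensor count.
#include <stdint.h>

static void count_equal_rows(const int32_t * a, const int32_t * b,
                             int64_t n_rows, int64_t row_len,
                             int64_t * out /* one count per row */) {
    for (int64_t r = 0; r < n_rows; r++) {
        int64_t count = 0;
        for (int64_t c = 0; c < row_len; c++) {
            count += a[r*row_len + c] == b[r*row_len + c];
        }
        out[r] = count;
    }
}
```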

@JohannesGaessler (Collaborator, Author)

That seems to cover it. There are C and CUDA implementations for COUNT_EQUAL that will need to be adjusted. If you are unfamiliar with CUDA I can help with that part.

@aksh16 commented Nov 27, 2024

I have played around a little bit with CUDA, but I might reach out if I feel stuck.
