[Enhancement] Native Convolution Support #907

airMeng · 2024-08-01T02:02:16Z

Hi @ggerganov @slaren, ggml_conv_2d is currently implemented only with the im2col algorithm, which is quite slow in most cases and introduces extra memory consumption*
* source: https://github.com/leejet/stable-diffusion.cpp/blob/4a6e36edc586779918535e12b4fbe0583044ee6f/README.md#L57
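To put the extra memory in perspective, here is a minimal sketch that estimates the size of the temporary im2col buffer from the shape noted in the code comment below, [N, OH, OW, IC * KH * KW]. The helper names `conv_out_size` and `im2col_bytes` are made up for illustration and are not ggml functions; the output-size expression is the usual convolution formula.

```c
#include <assert.h>
#include <stdint.h>

/* Usual convolution output-size formula for one spatial dimension.
   Hypothetical helper, not part of ggml. */
static int64_t conv_out_size(int64_t in, int64_t k, int s, int p, int d) {
    assert(s > 0 && d > 0);
    return (in + 2 * p - d * (k - 1) - 1) / s + 1;
}

/* Bytes needed for the temporary im2col buffer of shape
   [N, OH, OW, IC * KH * KW]; elem_size is 2 for F16. */
static int64_t im2col_bytes(int64_t n, int64_t ic, int64_t ih, int64_t iw,
                            int64_t kh, int64_t kw,
                            int s0, int s1, int p0, int p1, int d0, int d1,
                            int64_t elem_size) {
    const int64_t ow = conv_out_size(iw, kw, s0, p0, d0);
    const int64_t oh = conv_out_size(ih, kh, s1, p1, d1);
    return n * oh * ow * ic * kh * kw * elem_size;
}
```

As an illustrative case (the shape is an assumption, not taken from a measured model), a 3x3 convolution with stride 1 and padding 1 over a 1x320x64x64 input gives `im2col_bytes(1, 320, 64, 64, 3, 3, 1, 1, 1, 1, 1, 1, 2)` = 23,592,960 bytes, i.e. 9 times as many elements as the input itself.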

Shall I make the following changes and leave the convolution implementation to each backend? I believe the backends can provide much more efficient implementations themselves:

struct ggml_tensor * ggml_conv_2d(
        struct ggml_context * ctx,
        struct ggml_tensor  * a,
        struct ggml_tensor  * b,
        int                  s0,
        int                  s1,
        int                  p0,
        int                  p1,
        int                  d0,
        int                  d1) {
-    struct ggml_tensor * im2col = ggml_im2col(ctx, a, b, s0, s1, p0, p1, d0, d1, true, GGML_TYPE_F16); // [N, OH, OW, IC * KH * KW]
-
-    struct ggml_tensor * result =
-        ggml_mul_mat(ctx,
-                ggml_reshape_2d(ctx, im2col, im2col->ne[0],  im2col->ne[3] * im2col->ne[2] * im2col->ne[1]), // [N, OH, OW, IC * KH * KW] => [N*OH*OW, IC * KH * KW]
-                ggml_reshape_2d(ctx, a, (a->ne[0] * a->ne[1] * a->ne[2]),  a->ne[3]));                       // [OC, IC, KH, KW] => [OC, IC * KH * KW]
-
-    result = ggml_reshape_4d(ctx, result, im2col->ne[1], im2col->ne[2], im2col->ne[3], a->ne[3]); // [OC, N, OH, OW]
-    result = ggml_cont(ctx, ggml_permute(ctx, result, 0, 1, 3, 2)); // [N, OC, OH, OW]

+    ne = ...
+    struct ggml_tensor * result = ggml_new_tensor(ctx, a->type, 4, ne);
+    int32_t params[] = { s0, s1, p0, p1, d0, d1};
+    ggml_set_op_params(result, params, sizeof(params));
+    result->op = GGML_OP_CONV_2D;
+    result->grad = is_node ? ggml_dup_tensor(ctx, result) : NULL;
+    result->src[0] = a;
+    result->src[1] = b;

    return result;
}

@luoyu-intel cc our performance expert for awareness
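For reference, this is roughly what a backend would compute for the proposed GGML_OP_CONV_2D without going through im2col. It is a naive, unoptimized sketch that assumes contiguous F32 data with input [N, IC, IH, IW], kernel [OC, IC, KH, KW] and output [N, OC, OH, OW]; the function name `conv2d_direct_f32` is hypothetical and this is not code from any ggml backend.

```c
#include <assert.h>

/* Naive direct 2D convolution (no im2col buffer).
   src:    [N, IC, IH, IW]   kernel: [OC, IC, KH, KW]
   dst:    [N, OC, OH, OW]   all contiguous F32.
   s0/p0/d0 apply to the width axis, s1/p1/d1 to the height axis. */
static void conv2d_direct_f32(const float *src, const float *kernel, float *dst,
                              int n, int ic, int ih, int iw,
                              int oc, int kh, int kw,
                              int s0, int s1, int p0, int p1, int d0, int d1) {
    assert(s0 > 0 && s1 > 0);
    const int ow = (iw + 2 * p0 - d0 * (kw - 1) - 1) / s0 + 1;
    const int oh = (ih + 2 * p1 - d1 * (kh - 1) - 1) / s1 + 1;

    for (int b = 0; b < n; b++) {
        for (int o = 0; o < oc; o++) {
            for (int y = 0; y < oh; y++) {
                for (int x = 0; x < ow; x++) {
                    float acc = 0.0f;
                    for (int c = 0; c < ic; c++) {
                        for (int ky = 0; ky < kh; ky++) {
                            const int iy = y * s1 + ky * d1 - p1;
                            if (iy < 0 || iy >= ih) continue; /* zero padding */
                            for (int kx = 0; kx < kw; kx++) {
                                const int ix = x * s0 + kx * d0 - p0;
                                if (ix < 0 || ix >= iw) continue; /* zero padding */
                                acc += src[((b * ic + c) * ih + iy) * iw + ix]
                                     * kernel[((o * ic + c) * kh + ky) * kw + kx];
                            }
                        }
                    }
                    dst[((b * oc + o) * oh + y) * ow + x] = acc;
                }
            }
        }
    }
}
```

The only working memory is the accumulator, which is the point of the proposal; a real backend would replace the loop nest with a tuned kernel or a vendor library call.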


slaren · 2024-08-01T14:28:13Z

It would be good to allow backends to implement convolutions without im2col, but if you make this change as you are suggesting you are going to break conv2d support in every backend. Do you intend to fix the implementations in the backends as well?

airMeng · 2024-08-01T14:35:08Z

Yes, I will split it into several PRs. The first PR will update ggml_conv_2d as shown above and keep the current im2col-based implementation inside each backend. Does that make sense?

slaren · 2024-08-01T14:42:56Z

Sure, that would be good, but it may require more changes than you expect. For instance, the Metal backend does not have an internal memory pool to allocate memory for the im2col, and the CUDA backend may require significant changes to adapt ggml_cuda_mul_mat to work with a temporary buffer allocated from the pool.

airMeng · 2024-08-05T01:54:12Z

Okay, I spent a weekend trying it on the Metal backend and found that it really does require a lot of changes. Can I dispatch to the new op only on the SYCL backend? That would introduce additional macros in the common files.

slaren · 2024-08-05T19:05:42Z

Until we have a better way to do this, you could add a new function and let the applications choose which one to use depending on the backend they are using. This cannot be done in the common ggml files since ggml does not know what backend will be used at the time ggml_conv_2d is called.
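A minimal sketch of what that application-side choice could look like. Everything here is hypothetical: the enums and `choose_conv_path` are illustration only and are not ggml names, and the assumption that only SYCL implements the native op simply mirrors the discussion above.

```c
#include <assert.h>

/* Hypothetical application-side description of the backend in use. */
enum app_backend {
    APP_BACKEND_CPU,
    APP_BACKEND_CUDA,
    APP_BACKEND_METAL,
    APP_BACKEND_SYCL
};

/* Which graph-building function the application should call. */
enum conv_path {
    CONV_PATH_IM2COL, /* existing ggml_conv_2d (im2col + mul_mat) */
    CONV_PATH_NATIVE  /* a new, separately named native conv function */
};

/* The application knows its backend when it builds the graph, so the
   decision lives here rather than in the common ggml files. */
static enum conv_path choose_conv_path(enum app_backend backend) {
    assert(backend >= APP_BACKEND_CPU && backend <= APP_BACKEND_SYCL);
    switch (backend) {
        case APP_BACKEND_SYCL:
            return CONV_PATH_NATIVE; /* assumed to implement the new op */
        default:
            return CONV_PATH_IM2COL; /* safe fallback for every other backend */
    }
}
```

The application would then branch on the result when building the graph, calling the existing ggml_conv_2d for `CONV_PATH_IM2COL` and the new function otherwise, so backends that lack the native op keep working unchanged.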
