From 8822608fcb4d81dc2d8aa26c04d20d9841d28bd4 Mon Sep 17 00:00:00 2001
From: John Stiles
Date: Mon, 24 Apr 2023 16:42:19 -0400
Subject: [PATCH] Reorder struct for better Neon codegen.

I unexpectedly discovered that we can reduce our splat-copy ops by
one instruction by swapping these struct fields. This is apparently
because we can wedge a right-shift by 32 into an add instruction.
(`uxtw` means zero-extend.)

Before (splat_2_constants):
    bcf4: 28 04 40 f9   ldr     x8, [x1, #8]
    bcf8: 09 fd 60 d3   lsr     x9, x8, #32          <--- eliminated
    bcfc: 30 01 27 1e   fmov    s16, w9
    bd00: 10 06 04 4e   dup.4s  v16, v16[0]
    bd04: 88 40 28 8b   add     x8, x4, w8, uxtw     <--- changed
    bd08: 10 41 00 ad   stp     q16, q16, [x8]
    bd0c: 25 0c 41 f8   ldr     x5, [x1, #16]!
    bd10: a0 00 1f d6   br      x5

After:
    baa0: 28 04 40 f9   ldr     x8, [x1, #8]
    baa4: 10 01 27 1e   fmov    s16, w8
    baa8: 10 06 04 4e   dup.4s  v16, v16[0]
    baac: 88 80 48 8b   add     x8, x4, x8, lsr #32
    bab0: 10 41 00 ad   stp     q16, q16, [x8]
    bab4: 25 0c 41 f8   ldr     x5, [x1, #16]!
    bab8: a0 00 1f d6   br      x5

(This also saves an op on Haswell!)

Change-Id: Icea7196b42bc4057d697bbf049d368193d46f27e
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/679719
Commit-Queue: John Stiles
Auto-Submit: John Stiles
Reviewed-by: Brian Osman
Commit-Queue: Brian Osman
---
 src/core/SkRasterPipelineOpContexts.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/core/SkRasterPipelineOpContexts.h b/src/core/SkRasterPipelineOpContexts.h
index 715728382931..af2d57d65d51 100644
--- a/src/core/SkRasterPipelineOpContexts.h
+++ b/src/core/SkRasterPipelineOpContexts.h
@@ -156,8 +156,8 @@ struct SkRasterPipeline_TablesCtx {
 using SkRPOffset = uint32_t;
 
 struct SkRasterPipeline_ConstantCtx {
-    SkRPOffset dst;
     float value;
+    SkRPOffset dst;
 };
 
 struct SkRasterPipeline_UniformCtx {
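
Note on the change above: the following is a minimal sketch, not Skia code, of why the field order matters. It assumes a little-endian target and that the compiler loads the whole 8-byte context with a single 64-bit load, as the `ldr x8, [x1, #8]` in the listings suggests. The names `ConstantCtxBefore`, `ConstantCtxAfter`, `load_ctx_bits` and `low_half_as_float` are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

using SkRPOffset = uint32_t;

// Hypothetical stand-ins for the two layouts of SkRasterPipeline_ConstantCtx.
struct ConstantCtxBefore { SkRPOffset dst; float value; };
struct ConstantCtxAfter  { float value; SkRPOffset dst; };

// Reads the whole 8-byte context at once, like the single 64-bit `ldr`.
template <typename Ctx>
uint64_t load_ctx_bits(const Ctx& ctx) {
    static_assert(sizeof(Ctx) == sizeof(uint64_t), "expected two packed 32-bit fields");
    uint64_t bits;
    std::memcpy(&bits, &ctx, sizeof(bits));
    return bits;
}

// Reinterprets the low 32 bits of a register as a float, like `fmov s16, w8`.
float low_half_as_float(uint64_t bits) {
    uint32_t low = static_cast<uint32_t>(bits);
    float f;
    std::memcpy(&f, &low, sizeof(f));
    return f;
}

// Old layout (little-endian): dst is the low half, value is the high half.
void check_before() {
    uint64_t bits = load_ctx_bits(ConstantCtxBefore{64, 1.5f});
    assert(static_cast<uint32_t>(bits) == 64);      // dst: zero-extend (uxtw)
    assert(low_half_as_float(bits >> 32) == 1.5f);  // value: separate shift first
}

// New layout (little-endian): value is the low half, dst is the high half.
void check_after() {
    uint64_t bits = load_ctx_bits(ConstantCtxAfter{1.5f, 64});
    assert(low_half_as_float(bits) == 1.5f);  // value: usable without a shift
    assert((bits >> 32) == 64);               // dst: shift can fold into the add
}
```

With the old layout, the float needs its own `lsr` before the `fmov`, because `fmov` cannot shift its source. With the new layout, the only field that needs shifting is `dst`, and it is consumed by an `add`, which accepts a shifted register operand.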