Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stb_image_resize2: SIMD decoder broken when using input callback and unaligned image width #1749

Open
jspohr opened this issue Feb 21, 2025 · 7 comments

Comments

@jspohr
Copy link

jspohr commented Feb 21, 2025

Describe the bug
Under rare circumstances, the SIMD decoder loop in stbir__decode_uint8_linear will run over already decoded data, and produces garbage values.
Conditions where this happens:

  1. Input callback is set, and the callback uses the provided temp scanline buffer
  2. SIMD is enabled
  3. Size of a scanline of the input image is not a multiple of 16 bytes

I'm using stbir__decode_uint8_linear as an example, but the bug can be triggered in stbir__decode_uint16_linear, as well as the ..._scaled variants.
When width_times_channels isn't a multiple of 16, the loop will decode the remainder twice - which is a problem if (and only if) the input buffer and the decode buffer are overlapping. This is the case when an input callback is set, in which case stbir__decode_scanline will shave off a temp buffer at the end of the decode buffer, which makes those two overlap. When decoding the remaining bytes following the 16 byte long sections, the SIMD code will go over the bytes it has already decoded to floats, and decode them again. In the output, you'll see a black bar on the right side of the image, where float data was interpreted as uint8 data and effectively converted to float twice.

To Reproduce
Steps to reproduce the behavior:

  1. Use an input image of size 17x1 pixels, uint8 linear.
  2. Resize it to 16x1, providing an input callback that writes the value 255 to all 17 bytes of the temp buffer.
  3. Make sure SIMD is enabled
  4. The decoder will create a series of values where there's a 0.0f (which is where it read a 0 byte as part of the 1.0f float value) and other "garbage" values.

Expected behavior
The decoder should create a run of 17 times the value 1.0f.
The issue can be worked around by either disabling SIMD, or returning your own temp input buffer from the callback, so the input and decode buffers don't overlap.

Screenshots

@jeffrbig2
Copy link

Hmmm, yeah, that's intentional. For the remnant loops, we just run as far as we can and then back up and do the input again right to the edge. There are other places where the input pixels are read more than once as well (wrap mode filtering). Also, if the buffer is in flipped mode, you'll also overwrite.

You'll have to buffer the double buffer input scanline if it might change.

@jspohr
Copy link
Author

jspohr commented Feb 22, 2025

Sure, I understand that reading the same data twice is ok and by design, but not in this case, where the decoder reads data it had already overwritten. That’s what I really meant by decoding it twice. It would be nice to at least trigger an assert in this case, as right now the resize will silently hand me a broken image.

You'll have to buffer the double buffer input scanline if it might change.

Can you please elaborate? I can use the temp buffer if my scanline width is aligned, and if not, I have to allocate a whole line and return that from the callback?
Thanks!

@jeffrbig2
Copy link

Oh, I see - I thought you were using the same input and output buffer (which would need at least scanline buffering). Let me look at that decoder loop and see how we can pad with the same functionality...

@jspohr
Copy link
Author

jspohr commented Feb 22, 2025

Yes, increasing the width of the padding so that the write pointer doesn't catch up with the read pointer before it enters the remnant loop should fix it. I gave it a try myself but wasn't absolutely sure that it wouldn't overrun the decode buffer if I increased the temp buffer size.
Thanks for looking into it!

@jeffrbig2
Copy link

Yeah, it's complicated with all the weird modes you can run in...

@jeffrbig2
Copy link

BTW, what I meant by double buffering was that you don't use the optional output and instead just fill your own static buffer and return that pointer from the callback...

@jspohr
Copy link
Author

jspohr commented Feb 22, 2025

Ah I see, thanks for the follow-up. That's what I'm doing now, I did figure that's the easiest way to workaround the issue (other than turning off SIMD).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants