Experiment with simd_masked_load to read beyond without undefined behavior #98

Draft: ogxd wants to merge 30 commits into main from read-beyond-no-ub

ogxd
Owner

@ogxd ogxd commented Nov 6, 2024

No description provided.

@ogxd ogxd self-assigned this Nov 6, 2024
@ogxd ogxd force-pushed the read-beyond-no-ub branch from b7705c9 to ce077bf Compare November 6, 2024 13:33

let indices = vld1q_s8([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15].as_ptr());
let mask = vreinterpretq_s8_u8(vcgtq_s8(vdupq_n_s8(len as i8), indices));
std::intrinsics::simd::simd_masked_load(mask, data as *const i8, vdupq_n_s8(len as i8))
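
For readers following the discussion, here is a rough scalar model of what the snippet above appears to compute, lane by lane. It is only a sketch: `masked_load_model` is a hypothetical helper that is not part of this PR, and it assumes that lanes whose index is below `len` are loaded from memory while the remaining lanes take the pass-through value `len as i8`, as the third argument suggests. It also clamps `len` to the slice length and to 16 lanes, which the original snippet does not do.

```rust
/// Scalar model of the masked load in the snippet above.
/// Hypothetical helper for illustration only; not part of the PR.
pub fn masked_load_model(data: &[u8], len: usize) -> [i8; 16] {
    // Assumption: clamp so the model never reads past the slice or past 16 lanes.
    let len = len.min(data.len()).min(16);

    // Lanes that are masked off keep the pass-through value, which the
    // snippet builds with vdupq_n_s8(len as i8).
    let mut lanes = [len as i8; 16];

    // Lanes with index < len are loaded from memory (mask = len > index).
    for (lane, byte) in lanes.iter_mut().zip(&data[..len]) {
        *lane = *byte as i8;
    }
    lanes
}
```

Under these assumptions, `masked_load_model(&[10, 20, 30], 3)` yields `10, 20, 30` in the first three lanes and `3` in the other thirteen. The point of the masked form is that no byte past `len` is ever read, which is what avoids the out-of-bounds read that the earlier approach relied on.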

So do I understand correctly that it would be valid for LLVM to basically emit the same assembly as before, but somehow it doesn't?

I'm not a codegen expert, so I have no idea whether this is just hard for LLVM to do or something they could reasonably fix. Maybe it'd be worth filing an LLVM bug report about this? (We can try to find some people who could help nail down the core issue here, if you are interested.)

@ogxd ogxd force-pushed the read-beyond-no-ub branch from 9a04834 to 57ddc68 Compare November 8, 2024 21:50
@ogxd ogxd force-pushed the read-beyond-no-ub branch from 5cefdf3 to 61da550 Compare November 9, 2024 22:25
@ogxd ogxd force-pushed the read-beyond-no-ub branch from 61da550 to 259e8a4 Compare November 9, 2024 22:25
Development

Successfully merging this pull request may close these issues.

Consider using simd_masked_load for the Read Beyond of Death trick
2 participants