Tracking Issue for Missing BMI1, AVX2, SSE2, SSE4.1, SSE4a and TBM intrinsics #126936
These intrinsics were supposed to be part of an already-stabilized set, but were previously overlooked. @rfcbot fcp merge
Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me.
🔔 This is now entering its final comment period, as per the review above. 🔔
Oh no, more non-temporal operations... quoting from a recent x86 memory model paper:
Are we sure we can just pretend that those are regular loads, for the purpose of language semantics?
Non-temporal loads are not allowed to violate normal memory ordering rules, at least when accessing normal (i.e. write-back cacheable) memory. x86 of course allows some regions of memory to be marked as write-combining, at which point the normal memory ordering rules go out the window, but this only happens for memory-mapped I/O, not normal memory.
The problem with non-temporal stores on x86 is that they violate normal memory ordering rules even when used on normal (write-back) memory. See this answer on SO for more details.
Thanks; I will get in touch with the authors of the paper to clarify whether the architect they spoke with was referring to non-temporal loads behaving in odd ways only for "non-standard" memory regions or also for write-back memory. Meanwhile, would it be worth warning people about using these intrinsics on non-write-back memory? Though that warning is probably better placed at whatever operation creates such memory. It's not really well-defined to access such memory with Rust operations (i.e., outside of inline assembly) anyway...
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. This will be merged soon.
The feature gate is #[feature(simd_x86_updates)].
The public API is 13 new intrinsics (probably overlooked in the simd_x86 feature). See rust-lang/stdarch#1178.
BMI1
_tzcnt_u16
AVX2
_mm_broadcastsi128_si256
_mm256_stream_load_si256
SSE2
_mm_loadu_si16
_mm_storeu_si16
_mm_loadu_si32
_mm_storeu_si32
_mm_storeu_si64
SSE4.1
_mm_stream_load_si128
SSE4a
_mm_extracti_si64
_mm_inserti_si64
TBM
_bextri_u32
_bextri_u64
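As a rough illustration of what two of the scalar intrinsics listed above compute, here is a portable model in plain Rust. This is only a sketch: the helper names tzcnt_u16_model and bextri_u32_model are hypothetical and are not part of stdarch. It assumes the usual BEXTR control layout (bits 7..0 give the start index, bits 15..8 give the field length) and the TZCNT convention of returning the operand width for a zero input. The real intrinsics are unsafe functions gated on the bmi1 and tbm target features, and _bextri_u32 takes its control word as a compile-time constant rather than a runtime argument.

```rust
/// Portable model of TZCNT on a 16-bit operand (hypothetical helper, not a stdarch API).
/// Returns the number of trailing zero bits, or 16 when the input is zero.
fn tzcnt_u16_model(x: u16) -> u16 {
    // u16::trailing_zeros already returns 16 for an input of 0.
    x.trailing_zeros() as u16
}

/// Portable model of BEXTR-style bit-field extraction (hypothetical helper).
/// Assumed control layout: bits 7..=0 hold the start index, bits 15..=8 the field length.
fn bextri_u32_model(a: u32, control: u32) -> u32 {
    let start = control & 0xff;
    let len = (control >> 8) & 0xff;
    // A shift by 32 or more is not allowed on u32 in Rust, so an
    // out-of-range start is modelled as an empty (zero) result.
    let shifted = if start >= 32 { 0 } else { a >> start };
    // A length of 32 or more keeps every remaining bit.
    let mask = if len >= 32 { u32::MAX } else { (1u32 << len) - 1 };
    shifted & mask
}
```

For example, tzcnt_u16_model(0b1000) returns 3, and bextri_u32_model(0xABCD_1234, 0x0804) extracts the 8 bits starting at bit 4, giving 0x23.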
Steps
Implementation History
sse4a and tbm intrinsics stdarch#1607
We cannot add _mm_malloc and _mm_free as they need access to the OS, but core_arch is a no_std environment.