atomics part has misleading statements about hardware reordering caused by cache #400

Open
NikZak opened this issue Feb 8, 2023 · 0 comments

NikZak commented Feb 8, 2023

This is from the Rustonomicon:

On the other hand, even if the compiler totally understood what we wanted and respected our wishes, our hardware might instead get us in trouble. Trouble comes from CPUs in the form of memory hierarchies. There is indeed a global shared memory space somewhere in your hardware, but from the perspective of each CPU core it is so very far away and so very slow. Each CPU would rather work with its local cache of the data and only go through all the anguish of talking to shared memory only when it doesn't actually have that memory in cache.

After all, that's the whole point of the cache, right? If every read from the cache had to run back to shared memory to double check that it hadn't changed, what would the point be? The end result is that the hardware doesn't guarantee that events that occur in some order on one thread, occur in the same order on another thread. To guarantee this, we must issue special instructions to the CPU telling it to be a bit less smart.

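This is misleading and probably incorrect. CPU caches are always coherent: the MESI protocol guarantees that other copies of a cache line are invalidated before a write to that line is committed. A core therefore never reads a stale value from its cache simply because another core's cache holds a newer one.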
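To make the coherence point concrete at the language level, here is a minimal sketch, not taken from the Rustonomicon. The helper name `relaxed_counter` is hypothetical. It assumes only that atomic read-modify-write operations on a single location are never lost, even with `Ordering::Relaxed`, which is the weakest ordering Rust offers. On typical hardware that guarantee rests on the coherence protocol rather than on any extra "go back to shared memory" step.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: `threads` threads each increment one shared
/// counter `per_thread` times using only `Relaxed` operations.
fn relaxed_counter(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Relaxed imposes no ordering on *other* locations,
                    // but no increment of this location is ever lost.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // `join` synchronizes with each finished thread, so this load
    // observes every increment.
    counter.load(Ordering::Relaxed)
}
```

Calling `relaxed_counter(4, 10_000)` returns `40_000` on every run. What `Relaxed` does not promise is anything about the relative order of accesses to different locations, and that is where the reordering discussed in this issue comes from.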

Out-of-order execution is one reason reordering may happen in a CPU core, but it is not the only one. Here are two more examples of what may cause reordering in the CPU. And here is a detailed explanation of one of the examples (with still a lot of discussion afterwards).
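One commonly cited source of reordering that has nothing to do with incoherent caches is the per-core store buffer: a store can sit in the buffer while a later load from a different address completes first. Whether this is one of the examples linked above is not something I can confirm, so treat the following as an illustrative sketch of the classic store-buffering litmus test. The function names `store_buffering` and `count_both_zero` are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// One round of the store-buffering litmus test.
/// Thread 1: x = 1; r1 = y.   Thread 2: y = 1; r2 = x.
/// Returns (r1, r2). Pass only `Relaxed` or `SeqCst`: other orderings
/// are not valid for both `store` and `load`, and would panic.
fn store_buffering(order: Ordering) -> (usize, usize) {
    let x = Arc::new(AtomicUsize::new(0));
    let y = Arc::new(AtomicUsize::new(0));

    let (x1, y1) = (Arc::clone(&x), Arc::clone(&y));
    let t1 = thread::spawn(move || {
        x1.store(1, order);
        y1.load(order)
    });

    let (x2, y2) = (Arc::clone(&x), Arc::clone(&y));
    let t2 = thread::spawn(move || {
        y2.store(1, order);
        x2.load(order)
    });

    (t1.join().unwrap(), t2.join().unwrap())
}

/// Counts how many of `rounds` runs ended with both loads reading 0.
fn count_both_zero(order: Ordering, rounds: usize) -> usize {
    (0..rounds)
        .filter(|_| store_buffering(order) == (0, 0))
        .count()
}
```

With `Ordering::SeqCst` the outcome `(0, 0)` is forbidden, so `count_both_zero(Ordering::SeqCst, n)` is always `0`. With `Ordering::Relaxed` the outcome `(0, 0)` is allowed even though the caches stay fully coherent. In this sketch the thread start-up cost dominates, so the relaxed outcome may rarely or never show up in practice; the point is what the memory model permits, not how often it happens.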

Suggestion: rewrite the parts about caches. I could propose a rewrite myself, but my knowledge of a CPU's internal memory management is limited, and I don't want to inadvertently introduce more confusion or incorrect statements.