AoS-to-SoA conversion #4
base: master
…he default appears optimal
@jeffhammond Thanks. Regarding SoA versus AoS, I was thinking that AoS would be more efficient, since I loop over each atom during the simulations and access all the data for a given atom inside the loop. AoS makes the memory for a given atom contiguous, which is why I would expect AoS to be more efficient. Or is SoA more efficient here because of padding/alignment of memory?
I don't see any performance benefit associated with this optimization right now, so empirically it is not significant. However, SoA is more amenable to vectorization, because each of the five struct members becomes its own contiguous memory stream. In the AoS version, accesses to any one member across atoms are strided rather than contiguous, and they are not aligned. I suspect that the effects of SoA will be more visible on an architecture like Xeon Phi, but I have not measured this yet.
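To make the "contiguous memory streams" point concrete, here is a minimal sketch in C that measures the byte distance between the `x` values of two consecutive atoms in each layout. The type names, the five field names (`x`, `y`, `z`, `charge`, `radius`), and `N_ATOMS` are assumptions for illustration only; they are not taken from the actual code in this PR.

```c
#include <assert.h>
#include <stddef.h>

enum { N_ATOMS = 8 };  /* hypothetical system size */

/* AoS: one record per atom (field names are illustrative assumptions). */
typedef struct {
    double x, y, z, charge, radius;
} AtomAoS;

/* SoA: one array per member, each a contiguous stream. */
typedef struct {
    double x[N_ATOMS];
    double y[N_ATOMS];
    double z[N_ATOMS];
    double charge[N_ATOMS];
    double radius[N_ATOMS];
} AtomsSoA;

/* Bytes between the x of atom 0 and the x of atom 1 in the AoS layout. */
static size_t aos_x_stride(const AtomAoS *atoms)
{
    assert(atoms != NULL);
    return (size_t)((const char *)&atoms[1].x - (const char *)&atoms[0].x);
}

/* Bytes between the x of atom 0 and the x of atom 1 in the SoA layout. */
static size_t soa_x_stride(const AtomsSoA *atoms)
{
    assert(atoms != NULL);
    return (size_t)((const char *)&atoms->x[1] - (const char *)&atoms->x[0]);
}
```

With these definitions, `aos_x_stride` returns `sizeof(AtomAoS)` (the whole record), while `soa_x_stride` returns `sizeof(double)`. A vector load of consecutive `x` values is therefore a unit-stride load in SoA but a strided gather in AoS, which is the property that matters for vectorization.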
Feel free to reject this one, since it does not make a huge difference in performance. However, the AoS-to-SoA conversion is a well-known optimization that is worth illustrating in a code like this.
This optimization should also help in the CUDA version, but I don't plan to submit those changes, since Intel employees are discouraged from writing CUDA ;-)
AoS = array of structs, the way your code was before.
SoA = struct of arrays, as demonstrated in these changes.
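For readers unfamiliar with the two layouts, the following is a rough sketch of what such a conversion can look like in C. It is not the code from these changes: the type names, the field names, the fixed `MAX_ATOMS` capacity, and the `aos_to_soa` helper are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_ATOMS = 64 };  /* hypothetical fixed capacity */

/* AoS element: all data for one atom is adjacent in memory. */
typedef struct {
    double x, y, z, charge, radius;  /* illustrative field names */
} Atom;

/* SoA container: all values of one member are adjacent in memory. */
typedef struct {
    size_t n;                        /* number of atoms in use */
    double x[MAX_ATOMS];
    double y[MAX_ATOMS];
    double z[MAX_ATOMS];
    double charge[MAX_ATOMS];
    double radius[MAX_ATOMS];
} AtomsSoA;

/* Copy n atoms from an array of structs into a struct of arrays. */
static void aos_to_soa(const Atom *in, size_t n, AtomsSoA *out)
{
    assert(in != NULL && out != NULL && n <= MAX_ATOMS);
    out->n = n;
    for (size_t i = 0; i < n; ++i) {
        out->x[i]      = in[i].x;
        out->y[i]      = in[i].y;
        out->z[i]      = in[i].z;
        out->charge[i] = in[i].charge;
        out->radius[i] = in[i].radius;
    }
}
```

In a real code base the conversion is usually done once at the source level (the data is allocated as SoA from the start) rather than by copying at run time; the copy here only serves to show how `atoms[i].x` in the AoS version maps to `atoms.x[i]` in the SoA version.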