Add synchronization for multicore #782

Draft: wants to merge 2 commits into main

soyersoyer
Contributor

There is currently no memory synchronization between the processor cores, which could in theory cause problems. I added memory barriers and made access to the cores' status atomic.

I also suspend the cores while they are waiting. I expected this to make the Pi run a little cooler, but I haven't been able to measure it. Could someone measure it with a USB power meter?

Not all of these changes may be necessary. I'll read up on the barriers to find out.

github-actions bot commented Jan 3, 2025

Build for testing:
MiniDexed_2025-01-03-0b26d6c
Use at your own risk.

@probonopd
Owner

Thanks @soyersoyer.
@rsta2, maybe you could have a quick glance at these changes to advise us whether we are on the right track here? Thank you very much!

github-actions bot commented Jan 3, 2025

Build for testing:
MiniDexed_2025-01-03-40e62e2
Use at your own risk.

github-actions bot commented Jan 6, 2025

Build for testing:
MiniDexed_2025-01-06-9e9d74e
Use at your own risk.

@soyersoyer
Contributor Author

The explicit barriers are not needed, because the spinlock's Acquire () / Release () calls in CDexedAdapter::getSamples () already contain them.

Volatile variables are written and read with plain STR/LDR ARM instructions, which are atomic. That would be enough if m_CoreStatus were the only shared variable, but there is also m_nFramesToProcess, and nothing guarantees that its new value becomes visible to the other cores before the new value of m_CoreStatus does. Do I understand that correctly?
With std::atomic, STLR/LDAR instructions are used for release/acquire and for seq_cst (the default) ordering, which already ensures that the new value of m_nFramesToProcess is visible as well.
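
To make the ordering argument concrete, here is a minimal sketch of the release/acquire hand-off in portable C++. It is not the MiniDexed code: SharedState, publish and consume are hypothetical names, and the struct merely mirrors the roles of m_nFramesToProcess and m_CoreStatus from the discussion.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the states discussed above.
enum CoreStatus { CoreStatusIdle, CoreStatusBusy };

struct SharedState {
    unsigned framesToProcess = 0;             // plain payload, like m_nFramesToProcess
    std::atomic<int> status{CoreStatusIdle};  // flag, like m_CoreStatus[nCore]
};

// Producer side: write the payload first, then publish it with a release store.
// The release store keeps the payload write from being reordered after the flag.
void publish(SharedState &s, unsigned frames) {
    s.framesToProcess = frames;
    s.status.store(CoreStatusBusy, std::memory_order_release);
}

// Consumer side: spin until the acquire load observes the flag.
// Once it does, the payload written before the release store is visible too.
unsigned consume(SharedState &s) {
    while (s.status.load(std::memory_order_acquire) != CoreStatusBusy) {
        // busy-wait; a real core would sleep here instead
    }
    unsigned frames = s.framesToProcess;
    s.status.store(CoreStatusIdle, std::memory_order_release);
    return frames;
}
```

In real use publish () and consume () would run on different cores. With two volatile variables instead of the atomic flag, the compiler and the CPU are not required to keep the two stores in this order as seen from another core.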

If the volatiles stay, I think another solution would be to drop the nFrames variable in ProcessSound and use m_nFramesToProcess instead.
Or could n_samples be passed as a pointer to CDexedAdapter::getSamples and dereferenced only after m_SpinLock.Acquire () (DataMemoryBarrier)?

The other thing I don't know is whether Core1 can end up waiting indefinitely: it reads m_CoreStatus[nCore] != CoreStatusIdle and then starts waiting for the interrupt, but between the read and the wait m_CoreStatus is changed (by Core2) and the interrupt has already been delivered. If that can happen, how can it be prevented?
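
The race described here is usually called a lost wake-up. As an analogy only, the sketch below shows the usual cure in hosted C++: the notification is latched in a flag, and the flag is checked under the same lock that protects the wait, so a signal that arrives before the wait is not lost. StatusGate and its methods are hypothetical; bare-metal code has no std::condition_variable and would need an equivalent latch (as far as I know, the event register used by WFE/SEV plays that role on ARM).

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: a one-shot gate that remembers a signal until it is consumed.
class StatusGate {
public:
    // Signalling side: set the latched flag under the lock, then notify.
    void signal_ready() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ready_ = true;
        }
        cv_.notify_one();
    }

    // Waiting side: the predicate is evaluated while holding the lock, so if
    // the signal came first, wait() returns immediately without blocking.
    void wait_ready() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return ready_; });
        ready_ = false;  // consume the signal
    }

    // True while a signal has been sent but not yet consumed.
    bool pending() {
        std::lock_guard<std::mutex> lock(m_);
        return ready_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    bool ready_ = false;
};
```

Calling signal_ready () before wait_ready () reproduces the ordering from the question: the wake-up arrives before the waiter goes to sleep, and the waiter still returns because the flag was latched.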

I haven't yet been able to measure whether this really draws less power and whether it is worth it. I'll report back when the meter arrives.
