Upstream improvements needed for MRA on devices #298

Merged
merged 8 commits into from
Nov 12, 2024

@devreal devreal commented Oct 16, 2024

Set of changes made while implementing the first MRA kernels with device support. More to come but I wanted to get these out.

Summary:

  • Allow specifying the scope (allocate/syncin) on buffers. This is useful for data that is first produced on the device, since it avoids the initial transfer.
  • Allow const T on ttg::Buffer to specify that the content of the buffer is immutable.
  • Add device-side bindings for broadcastk.
  • Defer errors as long as possible if a type is not copyable. We try to defer the mutating task until all consumers of the object have completed. Only if we find a second mutating task on the same object do we have to give up and throw an exception.
  • Introduce ttg::device::Stream, representing a device stream object.
  • A couple of minor fixes.
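
To illustrate the buffer scope item in the list above, here is a minimal, self-contained sketch of the idea. It is not the ttg::Buffer interface: `Scope`, `ToyBuffer`, `prepare_on_device`, and the transfer counter are hypothetical names invented for this example, and a second host-side vector stands in for device memory so it runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the ttg::Buffer interface.
enum class Scope {
  Allocate,  // reserve device memory only, skip the host-to-device copy
  SyncIn     // reserve device memory and copy the host contents in
};

// Toy buffer: a second host-side vector stands in for device memory,
// so the example runs without a GPU.
template <typename T>
class ToyBuffer {
 public:
  ToyBuffer(std::vector<T> host, Scope scope)
      : host_(std::move(host)), scope_(scope) {}

  // Called before the first device task uses the buffer.
  void prepare_on_device() {
    if (scope_ == Scope::SyncIn) {
      device_ = host_;  // stands in for a host-to-device transfer
      ++h2d_transfers_;
    } else {
      device_.resize(host_.size());  // allocation only, no transfer
    }
    prepared_ = true;
  }

  std::vector<T>& device_data() {
    assert(prepared_ && "prepare_on_device() must run first");
    return device_;
  }

  // Number of simulated host-to-device transfers performed so far.
  std::size_t h2d_transfers() const { return h2d_transfers_; }

 private:
  std::vector<T> host_;
  std::vector<T> device_;
  Scope scope_;
  bool prepared_ = false;
  std::size_t h2d_transfers_ = 0;
};
```

In this sketch, a buffer created with `Scope::Allocate` reports zero transfers after `prepare_on_device()`, while one created with `Scope::SyncIn` reports one. That is the saving the scope is meant to provide when the device overwrites the data anyway.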

I tried to break down the changes into individual commits for easier review.

devreal commented Nov 12, 2024

Rebased on top of current master. I need a review on this so I can merge and make progress on other PRs.

Similar to device scratch, we may want to specify
whether the buffer should be synchronized to the device or not.
This can help reduce superfluous transfers in cases where the
data is overwritten on the device anyway.

Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Signed-off-by: Joseph Schuchart <[email protected]>
Also more aggressive nulling of data_out pointers

Signed-off-by: Joseph Schuchart <[email protected]>
Try to defer tasks whenever possible if a conflict is detected
and the type is not copyable. Only throw an exception
if we run out of options (e.g., if there are two competing tasks
mutating a value).

Signed-off-by: Joseph Schuchart <[email protected]>
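
The deferral rule in the commit message above can be sketched as a small bookkeeping class. This is an illustration under assumptions, not the code in this PR: `DeferralTracker` and its member functions are hypothetical, tasks are modeled as plain `std::function` callbacks, and everything runs synchronously on one thread.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical tracker for one non-copyable object.
class DeferralTracker {
 public:
  // A read-only consumer starts using the object.
  void add_reader() { ++readers_; }

  // Request mutable access. Returns true if the task ran immediately,
  // false if it was deferred until all readers have finished.
  bool request_mutation(std::function<void()> task) {
    if (has_mutator_) {
      // A second mutating task would need its own copy, which is impossible.
      throw std::logic_error("two mutating tasks on a non-copyable object");
    }
    has_mutator_ = true;
    if (readers_ == 0) {
      task();
      return true;
    }
    deferred_ = std::move(task);
    return false;
  }

  // A reader finished; run the deferred mutating task once no readers remain.
  void release_reader() {
    assert(readers_ > 0);
    --readers_;
    if (readers_ == 0 && deferred_) {
      std::function<void()> task = std::move(deferred_);
      deferred_ = nullptr;
      task();
    }
  }

 private:
  int readers_ = 0;
  bool has_mutator_ = false;
  std::function<void()> deferred_;
};
```

With two readers registered, a call to `request_mutation` returns false and the task runs only after the second `release_reader()`. A further `request_mutation` on the same tracker throws, which corresponds to the case of two competing tasks mutating a value.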

@therault therault left a comment

LGTM...

2 questions, the second is not directly related to this PR:

  • This does not change the PaRSEC dependency. Does this work with PaRSEC master (or at least commit 58f8f3089eca, which is selected in TTG master)?
  • We don't have TTG tests that are compiled by default, so we need to look through the test output to see whether these .h files compile without problems. Should we change that at some point?

devreal commented Nov 12, 2024

Yes, this does not yet need a new PaRSEC version. That will be a separate PR.

The CI should compile all examples and tests, so all headers are compiled multiple times. Is that what you mean?

therault commented Nov 12, 2024 via email

@devreal devreal merged commit c22faf9 into TESSEorg:master Nov 12, 2024
4 checks passed