Provide collective operation TTs #276

Open
devreal opened this issue Mar 11, 2024 · 0 comments

There are applications that require scalable global collective communication (e.g., allreduce for matrix-vector multiplication, as in CG). Currently, these reductions are not efficient in TTG: the reduction terminals form a star, and they carry no notion of collectiveness. TTG should expand its set of collective operations and could even integrate MPI collectives for scalability. It could look like this:

```cpp
ttg::Edge<void, double> rin, rout;
auto reduce_tt = ttg::coll::reduce(MPI_COMM_WORLD, rin, rout, 1, MPI_SUM, root); // sum over 1 element of type double
auto producer_tt = ttg::make_tt(..., ttg::edges(), ttg::edges(rin));
auto consumer_tt = ttg::make_tt(..., ttg::edges(rout), ...); // may distribute the value further
```

The input and output edges must have key type void because there can be only one concurrent instance per collective TT. When creating the TT, we duplicate the communicator so that multiple collective TTs can exist at the same time. The backend will need a way to suspend the task and poll for the operation to complete, so as not to block the thread inside MPI.

Straightforward operations to consider:

  • Reduce and allreduce
  • Broadcast (we have ttg::bcast, but it does not use the underlying MPI collective)

There should probably be an overload for std::vector for count > 1.

Operations that would need some more thought on how to describe the difference between input and output counts (and a use case):

  • Gather and scatter
  • Alltoall