Is your feature request related to a problem? Please describe.
Without a column-parallel linear layer that all-gathers its output, there is no way to shard a standalone linear layer.
I can see the row-parallel linear layer with all-reduce, i.e. `LinearAllreduce`, but I can't find any implementation of a column-parallel linear layer with all-gather.
How should I set the linear type when running DiT models?
I can set `attn.to_out.0`, `attn.to_add_out`, `ff.net.2`, and `ff_context.net.2` to `LinearAllreduce`, but how should I handle `norm1.linear` and `norm1_context.linear`? I need to all-gather the output of each of these single linear layers; otherwise it causes an error, because both `norm1` and `norm1_context` take the whole `hidden_states` as input.
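To make the request concrete, here is a minimal single-process sketch of what I mean by a column-parallel linear layer with all-gather. It uses plain Python lists and only simulates the ranks; no real communication happens. All function names (`column_parallel_linear`, `all_gather_last_dim`, and so on) are hypothetical and not part of any existing API. The weight uses the `[out_features][in_features]` layout of `torch.nn.Linear.weight`, so "column-parallel" here means splitting along the output features.

```python
# Single-process simulation of a column-parallel linear layer + all-gather.
# "Ranks" are just list entries; nothing here is a real library API.

def matmul(x, w):
    """x: [batch][in], w: [out][in] -> [batch][out] (computes x @ w^T)."""
    return [[sum(xi * wi for xi, wi in zip(row, w_row)) for w_row in w]
            for row in x]

def add_bias(y, b):
    """Add a per-output-feature bias to every row of y."""
    return [[v + bj for v, bj in zip(row, b)] for row in y]

def shard(seq, world_size):
    """Split output features (weight rows or bias entries) evenly across ranks."""
    assert len(seq) % world_size == 0, "out_features must divide evenly"
    n = len(seq) // world_size
    return [seq[r * n:(r + 1) * n] for r in range(world_size)]

def all_gather_last_dim(parts):
    """Simulated all-gather: concatenate per-rank outputs along the feature dim."""
    batch = len(parts[0])
    return [[v for part in parts for v in part[i]] for i in range(batch)]

def column_parallel_linear(x, weight, bias, world_size):
    w_shards = shard(weight, world_size)
    b_shards = shard(bias, world_size)
    # Every rank receives the full, replicated input x and produces
    # only its own slice of the output features.
    partial = [add_bias(matmul(x, w), b) for w, b in zip(w_shards, b_shards)]
    # The all-gather restores the full output on every rank.
    return all_gather_last_dim(partial)

x = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]   # full hidden_states: [batch=2][in=3]
weight = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0],
          [1.0, 1.0, 1.0]]                # [out=4][in=3]
bias = [0.0, 0.1, 0.2, 0.3]

reference = add_bias(matmul(x, weight), bias)   # unsharded linear layer
gathered = column_parallel_linear(x, weight, bias, world_size=2)
```

The gathered output matches the unsharded layer, which is the behavior I would need for `norm1.linear` and `norm1_context.linear`: the input stays whole, only the weight is split, and the full output is reassembled afterwards.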