Message passing operation #59
base: master
Conversation
raimis commented Jun 7, 2022 (edited)
- Move the code from TorchMD-NET (Nearest neighbor operation #58)
- Integration
- Tests
- Documentation
@peastman could you review?
details.
messages: `torch.Tensor`
    Atom pair messages. The shape of the tensor is `(num_pairs, num_features)`.
    For efficiency, `num_features` has to be a multiple of 32 and <= 1024.
Are those limitations really necessary? It's very common for the number of features not to be a multiple of 32.
On the contrary, the number of internal features is always a multiple of 32 (e.g. in TorchMD-NET, I have seen 64, 128, and 256 used). The GPU computes in warps of 32 threads, so it is best to match that pattern for computational efficiency.
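For context, a minimal sketch of the thread mapping this constraint assumes: one block per atom pair and one thread per feature. The kernel name, neighbor layout, and flat indexing below are illustrative assumptions, not the code in this PR.

```cuda
#include <cstdint>

// Sketch only: one block per atom pair, one thread per feature, row-major
// layout. With num_features a multiple of 32 and <= 1024, blockDim.x can be
// set to num_features, so each warp touches 32 consecutive features and the
// loads/stores are coalesced.
__global__ void passMessagesSketch(const int32_t* __restrict__ neighbors, // (num_pairs): receiving atom of each pair
                                   const float* __restrict__ messages,    // (num_pairs, num_features)
                                   float* __restrict__ new_states,        // (num_atoms, num_features)
                                   int32_t num_features) {
    const int32_t i_neig = blockIdx.x;   // one atom pair per block
    const int32_t i_feat = threadIdx.x;  // one feature per thread
    const int32_t i_atom = neighbors[i_neig];
    atomicAdd(&new_states[i_atom * num_features + i_feat],
              messages[i_neig * num_features + i_feat]);
}

// Hypothetical launch: passMessagesSketch<<<num_pairs, num_features>>>(...);
```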
I frequently create models that don't satisfy those requirements, including in TorchMD-Net. For example, I've trained models with 48 or 80 features per layer.
Why?
I don't understand your question. Why not? The number of features is a hyperparameter. It's one of many hyperparameters you tune to balance training accuracy, overfitting, speed, and memory use. Why place arbitrary limits on it when there's no need to?
const int32_t i_feat = threadIdx.x;
atomicAdd(&new_states[i_atom][i_feat], messages[i_neig][i_feat]);
You can eliminate the limitations on number of features by just rewriting this as a loop.
for (int32_t i_feat = threadIdx.x; i_feat < num_features; i_feat += blockDim.x)
atomicAdd(&new_states[i_atom][i_feat], messages[i_neig][i_feat]);
Apart from solving a non-existent problem, this would make the memory access non-coalesced and reduce speed.
It would have no effect on speed at all. If the number of features happens to satisfy your current requirement, the behavior would be identical to what it currently does. The `atomicAdd()` would be executed once by every thread, with `i_feat` equal to `threadIdx.x`. The only change would be if the number doesn't satisfy your current requirements, either because it's not a multiple of 32 or it's more than 1024. In that case it would produce correct behavior, unlike the current code. So there's no downside at all.
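Reading the suggested loop with hypothetical feature counts makes this concrete (`i_atom`, `i_neig`, `new_states`, and `messages` are as in the snippet under review; the block sizes are assumed, not taken from the PR):

```cuda
// The suggested loop, annotated with hypothetical sizes (assuming the kernel
// is still launched with blockDim.x = min(num_features, 1024)):
//  - num_features = 128, blockDim.x = 128:  each thread runs exactly one
//    iteration with i_feat == threadIdx.x, identical to the current code.
//  - num_features = 80,  blockDim.x = 80:   still one iteration per thread;
//    the last warp is only partially filled, but the result is correct.
//  - num_features = 2048, blockDim.x = 1024: each thread runs two iterations,
//    so all features are covered, which the current code cannot do.
for (int32_t i_feat = threadIdx.x; i_feat < num_features; i_feat += blockDim.x)
    atomicAdd(&new_states[i_atom][i_feat], messages[i_neig][i_feat]);
```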
It will reduce the number of threads by the number of features. The reduced parallelism would result in reduced speed.
I'm not suggesting any change to the number of threads. The only thing I'm suggesting is wrapping the `atomicAdd()` in a loop as shown above. If `num_features` happens to match your current restrictions, nothing will change. Every thread will still call it exactly once.
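Putting the thread-mapping sketch and the suggested loop together, a self-contained sketch of the looped variant might look like this (names, neighbor layout, and launch shape are assumptions carried over from the earlier sketch, not the PR's actual code):

```cuda
#include <cstdint>

// Sketch of the looped variant. The launch is unchanged: one block per atom
// pair, blockDim.x threads per block (e.g. min(num_features, 1024)). When
// num_features <= blockDim.x, the loop body runs once per thread, exactly
// like the unlooped version; when num_features is larger, threads stride
// over the remaining features.
__global__ void passMessagesLoopSketch(const int32_t* __restrict__ neighbors, // (num_pairs)
                                       const float* __restrict__ messages,    // (num_pairs, num_features)
                                       float* __restrict__ new_states,        // (num_atoms, num_features)
                                       int32_t num_features) {
    const int32_t i_neig = blockIdx.x;
    const int32_t i_atom = neighbors[i_neig];
    for (int32_t i_feat = threadIdx.x; i_feat < num_features; i_feat += blockDim.x)
        atomicAdd(&new_states[i_atom * num_features + i_feat],
                  messages[i_neig * num_features + i_feat]);
}
```

Within each loop iteration, consecutive threads still access consecutive features, so the coalesced access pattern of the original mapping is preserved.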