Multihead committees #800

beckobert · 2025-01-23T08:51:55Z

Hello everyone,

This pull request makes it possible to use the multihead mechanism to build computationally efficient committees of MACE models by sharing the large atomic descriptor part of the MLP and using different output blocks only for the individual committee members. The spread of the committee predictions can then be used as an uncertainty measure for the MLP.
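As a rough illustration of how a committee yields an uncertainty estimate, the sketch below takes per-member energies and forces and returns their mean and sample standard deviation. The function name `committee_statistics` and the array layout are assumptions made for this example; they are not taken from the PR or from the MACE API.

```python
import numpy as np


def committee_statistics(energies, forces):
    """Mean and sample standard deviation over committee members.

    energies: array-like of shape (n_members,)
    forces:   array-like of shape (n_members, n_atoms, 3)
    (hypothetical layout, chosen only for this sketch)
    """
    energies = np.asarray(energies, dtype=float)
    forces = np.asarray(forces, dtype=float)
    return {
        "energy_mean": energies.mean(axis=0),
        # ddof=1 gives the unbiased sample estimate over the members
        "energy_std": energies.std(axis=0, ddof=1),
        "forces_mean": forces.mean(axis=0),
        "forces_std": forces.std(axis=0, ddof=1),
    }


# Usage with made-up numbers: 4 committee members, 5 atoms
rng = np.random.default_rng(0)
example = committee_statistics(
    energies=rng.normal(loc=-10.0, scale=0.05, size=4),
    forces=rng.normal(scale=0.1, size=(4, 5, 3)),
)
print(example["energy_mean"], example["energy_std"])
```

A large standard deviation on a configuration would then flag it as poorly covered by the training data, for example as a selection criterion in active learning.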

This PR aims to reuse and keep as much of the original infrastructure as possible. The code currently works well and the results are promising, but at the moment there are two main problems, for which I would be glad of any help and recommendations, and a few items that are still on my to-do list.

Problems

In theory, this multihead committee should predict energies and forces at negligible additional computational cost, but in practice this is not yet true. While the autodifferentiation graph is retained when computing the forces, MACE still has to repeat the actual calculation of the values for every committee member separately, even though most of it is redundant.
I had to change the MLP output layer away from the original masking set-up towards more structured connections between the nodes. This works very well for e3nn, but I haven't figured out how to do this with cuEquivariance.
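One way to picture the first problem is the toy PyTorch sketch below. It is not MACE code: the "descriptor" is a single made-up `tanh` layer and all names (`W_desc`, `W_heads`) are hypothetical. The forward pass through the shared descriptor happens once, but obtaining per-member forces by reverse-mode autodiff still needs one backward pass per committee member, even with the graph retained. Only the committee-mean force comes from a single backward pass, because the gradient is linear in the energies.

```python
import torch

torch.manual_seed(0)

n_atoms, n_features, n_members = 4, 6, 3
positions = torch.randn(n_atoms, 3, requires_grad=True)
W_desc = torch.randn(3, n_features)           # stand-in for the shared descriptor
W_heads = torch.randn(n_features, n_members)  # one readout column per member

# Shared forward pass: the descriptor is evaluated once for all members
descriptor = torch.tanh(positions @ W_desc)
energies = (descriptor @ W_heads).sum(dim=0)  # shape (n_members,)

# Per-member forces: one backward pass each, reusing the retained graph
per_member_forces = []
for energy in energies:
    (grad,) = torch.autograd.grad(energy, positions, retain_graph=True)
    per_member_forces.append(-grad)
per_member_forces = torch.stack(per_member_forces)  # (n_members, n_atoms, 3)

# Committee-mean force: a single backward pass is enough
(grad_mean,) = torch.autograd.grad(energies.mean(), positions)
mean_forces = -grad_mean
```

In this toy setting the force spread across members still requires the per-member loop, which is where the extra cost comes from.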

To-Do list

Adapt ASE calculator (should be straightforward)
Write tests
Write documentation (once the code is in its final form)

Please let me know your opinions, whether you require any explanations and, of course, whether you have an idea how to tackle the two key problems.

ilyes319 · 2025-02-03T11:14:05Z

@beckobert, Hey, thank you for the well-structured PR!!
Could you tell me in more detail what is happening with cuequivariance? Also, can you explain what these "structured connections" that you changed to are?

beckobert · 2025-02-04T09:47:30Z

The changes are in the NonLinearReadoutBlock. In the old version, the input nodes were connected to all hidden nodes, and those were connected to all output nodes. During prediction, a mask blocked out all "unwanted" hidden nodes so that only the nodes corresponding to a certain output head contributed to the prediction. However, this does not work for the multihead committee, where I want the model to return the correct predictions for all heads at once.
My change ("structured connections") takes advantage of the instructions keyword in the e3nn layers, which allows specifying which nodes of the layers are connected to each other. So now, from each node in the hidden layer there is only one connection to the correct output node, and there is no need for any masking any more. (This also makes it easier to scale the normalization when loading a model and adding more heads for foundation model fine-tuning; see the related changes in that part of the code.)
However, I have not found a feature similar to e3nn's instructions in cuEquivariance, so I am still looking for an "elegant" implementation of the NonLinearReadoutBlock when using that module.
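The difference between the masked and the structured readout can be sketched with plain NumPy on scalar features. This is only an illustration of the connectivity pattern: it does not use e3nn, its `instructions` keyword, or the actual NonLinearReadoutBlock, and all sizes and names (`W1`, `W2`, `masked_readout`, `structured_readout`) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_heads, n_in, n_hidden_per_head = 3, 8, 4
n_hidden = n_heads * n_hidden_per_head

x = rng.normal(size=(5, n_in))             # 5 atoms, shared descriptor features
W1 = rng.normal(size=(n_in, n_hidden))     # input -> hidden, fully connected
W2 = rng.normal(size=(n_hidden, n_heads))  # hidden -> output, fully connected


def silu(z):
    return z / (1.0 + np.exp(-z))


def masked_readout(x, head):
    """Masking set-up: one head per call, other heads' hidden nodes zeroed."""
    mask = np.zeros(n_hidden)
    mask[head * n_hidden_per_head:(head + 1) * n_hidden_per_head] = 1.0
    hidden = silu(x @ W1) * mask
    return (hidden @ W2)[:, head]


# Structured set-up: keep only the weights that link each group of hidden
# nodes to its own output node, so no mask is needed at prediction time.
connectivity = np.kron(np.eye(n_heads), np.ones((n_hidden_per_head, 1)))
W2_structured = W2 * connectivity


def structured_readout(x):
    """Returns the predictions of all heads at once, shape (n_atoms, n_heads)."""
    return silu(x @ W1) @ W2_structured


all_heads = structured_readout(x)
```

With this connectivity, column `h` of `all_heads` matches `masked_readout(x, h)`, but all heads are evaluated in a single call.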

tisabe · 2025-02-05T15:35:09Z

Hi, this PR looks very interesting, as I am currently also working with MACE committees to get uncertainty predictions. However, it is very slow to train multiple models for multiple iterations (in an active learning scenario).
Unfortunately, I don't think I can help with your issues right now, but I am curious about the application and performance of this method. I assume the committee takes some advantages (other than performance) from the multihead functionality, e.g. using different bootstrapped training/validation splits for the different heads?

beckobert · 2025-02-07T15:13:18Z

Yes, that is pretty much the basic idea of the PR.
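A minimal sketch of what giving each head its own training subset could look like is shown below. The helper `split_for_heads` and its `mode` values are hypothetical and are not the PR's implementation: "disjoint" partitions the shuffled configurations across heads, while "bootstrap" draws an independent resample with replacement for each head.

```python
import numpy as np


def split_for_heads(n_configs, n_heads, mode="disjoint", seed=0):
    """Return one array of configuration indices per committee head.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    if mode == "disjoint":
        # Shuffle once, then cut into n_heads non-overlapping parts
        permutation = rng.permutation(n_configs)
        return [np.sort(part) for part in np.array_split(permutation, n_heads)]
    if mode == "bootstrap":
        # Each head samples n_configs indices with replacement
        return [rng.integers(0, n_configs, size=n_configs) for _ in range(n_heads)]
    raise ValueError(f"unknown mode: {mode!r}")


disjoint_sets = split_for_heads(n_configs=10, n_heads=3, mode="disjoint")
bootstrap_sets = split_for_heads(n_configs=10, n_heads=3, mode="bootstrap")
```

Disjoint subsets maximise the diversity between heads at the price of less data per head, whereas bootstrapped subsets keep the per-head data size but overlap with each other.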

beckobert added 28 commits October 16, 2024 17:43

Prototype for multihead committee MACE

194602a

prototype to train and eval committees

0a5b2d2

make committee training sets disjoint

c6c46e9

always shuffle training data

b6cce01

Fix head specification for specified validation file

b650cd1

diverse gates for non-linear output block

eaf6042

choice between joint and disjoint train set

4772aee

bugfix

c5436d5

return energy contributions in scale_shift_mace

56b1f78

print out features in eval_configs

e75f7fe

cleanup

75e4a47

bug fixes

4d29a4e

fix eval configs

78bb5b8

diverse gates

b686869

Adapt finetuning to new readout

9f438b8

comment out return features

a3c9a2c

Adapt finetuning to multihead committees

4ebab3f

botch to freeze message-passing

0d7d14e

add argument to freeze message passing

8889e15

Adapt preprocessed datasets for MHC

49aa58e

Improve error tables

dd2c042

clean up

24d4a8c

Merge branch 'develop' into multihead-committee

ae99351

clean up eval configs

7b5d000

clean up error tables

ff348a1

bug fix

a3ccd74

skip cuequivariance with multihead for now

76308b2

Merge branch 'develop' into multihead-committee

842a7d0
