
Allow user to control extent of weight sharing for spherical targets #458

Open
jwa7 opened this issue Jan 28, 2025 · 2 comments
Labels: Discussion, Infrastructure: Miscellaneous, SOAP BPNN

Comments

jwa7 (Member) commented Jan 28, 2025

When learning multiple angular channels of a target decomposed on a spherical basis, it is often useful to train independent models for each o3_lambda, with little to no weight sharing. It would therefore be useful to be able to control how the per-block models are initialised.

As far as I understand, the different L channels of a spherical target currently share the same weights up to the head, where the TensorBasis builds the specific equivariant of order L via a transformation that contains (relative to the rest of the architecture) very few weights.

This is also connected to learning on a basis that depends on atomic types (e.g. for the electron density), where each combination of o3_lambda and center_type (and, in general, o3_sigma) can be learned with a different model, and is therefore linked to issue #444.

For example, we could include an option along the lines of "weight_share_by" (or, alternatively, a negation such as "separate_weights_for") that takes a list of key dimensions. Passing a target TensorMap with keys ["o3_lambda", "o3_sigma", "center_type"] together with "weight_share_by": [] would then mean that a separate SOAP-BPNN model is initialised for each block.
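
To make this concrete, here is a rough sketch of the grouping logic such an option could imply. Everything in it is hypothetical (the `assign_models` helper, the plain-dict stand-ins for TensorMap keys, the `make_block_model` placeholder); it is not existing metatrain API:

```python
def assign_models(block_keys, weight_share_by, make_block_model):
    """Blocks whose keys differ only in the `weight_share_by` dimensions
    share one model; with weight_share_by=[], every block gets its own."""
    models = {}       # one model per group of blocks
    assignment = {}   # block key -> model
    for key in block_keys:
        # the dimensions we do NOT share over identify the model to use
        group = tuple(sorted(
            (name, value) for name, value in key.items()
            if name not in weight_share_by
        ))
        if group not in models:
            models[group] = make_block_model()
        assignment[tuple(sorted(key.items()))] = models[group]
    return assignment

# Example: sharing over "center_type" only gives one model per
# (o3_lambda, o3_sigma) combination.
keys = [
    {"o3_lambda": 0, "o3_sigma": 1, "center_type": 1},
    {"o3_lambda": 0, "o3_sigma": 1, "center_type": 8},
    {"o3_lambda": 1, "o3_sigma": 1, "center_type": 1},
]
assignment = assign_models(keys, ["center_type"], make_block_model=object)
assert assignment[(("center_type", 1), ("o3_lambda", 0), ("o3_sigma", 1))] \
    is assignment[(("center_type", 8), ("o3_lambda", 0), ("o3_sigma", 1))]
```

With "weight_share_by": [] the group is the full key, so every block gets its own model; listing all key dimensions gives maximal sharing (one model for all blocks).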

Luthaf (Member) commented Jan 28, 2025

So this feels like a question about the strategy for equivariant models in SOAP-BPNN. The current strategy adds a small final layer that learns equivariants on top of an invariant architecture, while (if I understand correctly) what you are asking for would be better served by a fully equivariant NN. Is that right?

It might make sense to define the equivariant strategy in more detail, but this could also be a separate architecture (instead of more options in the same architecture).

jwa7 (Member, Author) commented Jan 28, 2025

I thought the idea (and correct me if I'm wrong @MichelangeloDomina) was that the TensorBasis allows you to define an arbitrary scalar function that builds an internal invariant representation, which is then projected onto an equivariant basis at/near the output layer.

This means you're not constrained to equivariant operations before that point, nor do you suffer the dimensionality problems that come from generating and transforming equivariant descriptors at higher L. With a plain fully equivariant NN you don't get these benefits - or is there something I'm missing here?
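
Schematically, the pattern is something like the sketch below (just an illustration of the idea, not the actual TensorBasis code; all shapes and names are made up):

```python
import torch

n_samples, n_invariants, Q, L = 10, 64, 8, 2

# arbitrary scalar network: inputs and outputs are invariant, so any
# nonlinearity is allowed without breaking equivariance
scalar_net = torch.nn.Sequential(
    torch.nn.Linear(n_invariants, 128),
    torch.nn.SiLU(),
    torch.nn.Linear(128, Q),
)

x = torch.randn(n_samples, n_invariants)      # invariant features per sample
basis = torch.randn(n_samples, Q, 2 * L + 1)  # precomputed equivariant basis T_q^{Lm}

coeffs = scalar_net(x)  # (n_samples, Q), still invariant
# only the basis carries the m-dependence, so the output transforms as order L
prediction = torch.einsum("nq,nqm->nm", coeffs, basis)  # (n_samples, 2L+1)
```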

In any case, regardless of what the scalar function is, it might be useful to define models with independent weights for each L channel, whether it be SOAP-BPNN, PET, or whatever.
