Molecule graph environment #139

michalkoziarski · 2023-06-16T15:22:47Z

First round of changes for the sake of conformer experiments.

Works with the MLP policy (https://wandb.ai/michalkoziarski/GFlowNet/reports/Conformers-v5-beta-32---Vmlldzo1MTAwNTA1) and contains an initial implementation of the graph policy.

We tried to make it as small as possible!

…ni_proxy # Conflicts: # gflownet/envs/alaninedipeptide_mixture.py # gflownet/utils/molecule/datasets.py # gflownet/utils/molecule/dgl_conformer.py # gflownet/utils/molecule/old_conformer.py # gflownet/utils/molecule/rdkit_conformer.py # gflownet/utils/molecule/torsions.py # tests/gflownet/utils/molecule/test_torsions.py


michalkoziarski · 2023-09-07T20:17:29Z

gflownet/utils/molecule/rotatable_bonds.py

+# Taken from https://pyxtal.readthedocs.io/en/latest/_modules/pyxtal/molecule.html.
+
+from operator import itemgetter
+
+
+def find_rotor_from_smile(smile):
+    """
+    Find the positions of rotatable bonds in the molecule.
+    """
+
+    def cleaner(list_to_clean, neighbors):
+        """
+        Remove duplicate torsions from a list of atom index tuples.
+        """
+
+        for_remove = []
+        for x in reversed(range(len(list_to_clean))):
+            ix0 = itemgetter(0)(list_to_clean[x])
+            ix3 = itemgetter(3)(list_to_clean[x])
+            # for a torsion i-j-k-l, neither i nor l may be a terminal atom;
+            # e.g. C-C-S=O is not a good choice since O is only 1-coordinated
+            if neighbors[ix0] > 1 and neighbors[ix3] > 1:
+                for y in reversed(range(x)):
+                    ix1 = itemgetter(1)(list_to_clean[x])
+                    ix2 = itemgetter(2)(list_to_clean[x])
+                    iy1 = itemgetter(1)(list_to_clean[y])
+                    iy2 = itemgetter(2)(list_to_clean[y])
+                    if [ix1, ix2] == [iy1, iy2] or [ix1, ix2] == [iy2, iy1]:
+                        for_remove.append(y)
+            else:
+                for_remove.append(x)
+        clean_list = []
+        for i, v in enumerate(list_to_clean):
+            if i not in set(for_remove):
+                clean_list.append(v)
+        return clean_list
+
+    if smile in ["Cl-", "F-", "Br-", "I-", "Li+", "Na+"]:
+        return []
+    else:
+        from rdkit import Chem
+
+        smarts_torsion1 = "[*]~[!$(*#*)&!D1]-&!@[!$(*#*)&!D1]~[*]"
+        smarts_torsion2 = "[*]~[^2]=[^2]~[*]"  # C=C bonds
+        # smarts_torsion2="[*]~[^1]#[^1]~[*]" # C-C triple bonds, to be fixed
+
+        mol = Chem.MolFromSmiles(smile)
+        N_atom = mol.GetNumAtoms()
+        neighbors = [len(a.GetNeighbors()) for a in mol.GetAtoms()]
+        # make sure that the ending members will be counted
+        neighbors[0] += 1
+        neighbors[-1] += 1
+        patn_tor1 = Chem.MolFromSmarts(smarts_torsion1)
+        torsion1 = cleaner(list(mol.GetSubstructMatches(patn_tor1)), neighbors)
+        patn_tor2 = Chem.MolFromSmarts(smarts_torsion2)
+        torsion2 = cleaner(list(mol.GetSubstructMatches(patn_tor2)), neighbors)
+        tmp = cleaner(torsion1 + torsion2, neighbors)
+        torsions = []
+        for t in tmp:
+            (i, j, k, l) = t
+            b = mol.GetBondBetweenAtoms(j, k)
+            if not b.IsInRing():
+                torsions.append(t)
+        # if len(torsions) > 6: torsions[1] = (4, 7, 10, 15)
+        return torsions


This was copy-pasted from the source cited at the top of the file, as suggested by Chenghao. It will likely not be used once we move to DGLConformer.
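
For readers following the `cleaner` helper in the diff above, here is a minimal sketch of what it appears to compute: every torsion with a terminal end atom is dropped, and for each central bond only the last remaining match is kept. The function name `dedupe_torsions` and the toy inputs are hypothetical and not part of the PR or of pyxtal; the equivalence with `cleaner` is my reading of the loop, not something the PR states.

```python
def dedupe_torsions(torsions, neighbors):
    """Keep the last valid torsion for each central bond (hypothetical helper).

    `torsions` is a list of 4-tuples of atom indices (a, b, c, d) and
    `neighbors[n]` is the neighbor count of atom n, as in the diff above.
    """
    last_valid = {}
    for index, (a, b, c, d) in enumerate(torsions):
        # Skip torsions whose end atoms are terminal (only one neighbor).
        if neighbors[a] > 1 and neighbors[d] > 1:
            # The central bond is unordered, so b-c and c-b share one key.
            last_valid[frozenset((b, c))] = index
    # Restore the original ordering of the surviving matches.
    return [torsions[index] for index in sorted(last_valid.values())]


# Toy input (assumed values, not derived from a real molecule).
toy_neighbors = [2, 3, 3, 2, 1, 1, 2]
toy_torsions = [(0, 1, 2, 3), (4, 1, 2, 5), (3, 2, 1, 0), (1, 2, 3, 6)]
kept = dedupe_torsions(toy_torsions, toy_neighbors)
```

In this toy case `(4, 1, 2, 5)` is dropped because atoms 4 and 5 are terminal, and `(0, 1, 2, 3)` is dropped in favour of the later `(3, 2, 1, 0)`, which describes the same central bond.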

michalkoziarski · 2023-09-07T20:21:12Z

gflownet/utils/molecule/torsions.py

+import torch
+import networkx as nx
+import numpy as np
+
+from pytorch3d.transforms import axis_angle_to_matrix
+
+from gflownet.utils.molecule import constants
+
+
+def get_rotation_masks(dgl_graph):
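+    """
+    Compute masks of rotatable edges and of the nodes affected by each rotation, plus the rotation signs.
+    :param dgl_graph: the dgl.Graph object with bidirected edges in the order: [e_1_fwd, e_1_bkw, e_2_fwd, e_2_bkw, ...]
+    :return: tuple (edges_mask, nodes_mask, rotation_signs), each repeated for both directions of every bond
+    """
+    nx_graph = nx.DiGraph(dgl_graph.to_networkx())
+    # bonds are undirected edges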
+    bonds = torch.stack(dgl_graph.edges()).numpy().T[::2]
+    bonds_mask = np.zeros(bonds.shape[0], dtype=bool)
+    nodes_mask = np.zeros((bonds.shape[0], dgl_graph.num_nodes()), dtype=bool)
+    rotation_signs = np.zeros(bonds.shape[0], dtype=float)
+    # fill in masks for bonds
+    for bond_idx, bond in enumerate(bonds):
+        modified_graph = nx_graph.to_undirected()
+        modified_graph.remove_edge(*bond)
+        if not nx.is_connected(modified_graph):
+            smallest_component_nodes = sorted(
+                nx.connected_components(modified_graph), key=len
+            )[0]
+            if len(smallest_component_nodes) > 1:
+                bonds_mask[bond_idx] = True
+                rotation_signs[bond_idx] = (
+                    -1 if bond[0] in smallest_component_nodes else 1
+                )
+                affected_nodes = np.array(list(smallest_component_nodes - set(bond)))
+                nodes_mask[bond_idx, affected_nodes] = np.ones_like(
+                    affected_nodes, dtype=bool
+                )
+
+    # broadcast bond masks to edges masks
+    edges_mask = torch.from_numpy(bonds_mask.repeat(2))
+    rotation_signs = torch.from_numpy(rotation_signs.repeat(2))
+    nodes_mask = torch.from_numpy(nodes_mask.repeat(2, axis=0))
+    return edges_mask, nodes_mask, rotation_signs
+
+
+def apply_rotations(graph, rotations):
+    """
+    Apply rotations (torsion angle updates) to the atom positions stored in the graph.
+    :param graph: bidirectional dgl.Graph
+    :param rotations: a sequence of torsion angle updates of length = number of bonds in the molecule.
+    The order corresponds to the order of edges in the graph, such that rotations[i] is
+    an update for the torsion angle corresponding to edges[2i]
+    """
+    pos = graph.ndata[constants.atom_position_name]
+    edge_mask = graph.edata[constants.rotatable_edges_mask_name]
+    node_mask = graph.edata[constants.rotation_affected_nodes_mask_name]
+    rot_signs = graph.edata[constants.rotation_signs_name]
+    edges = torch.stack(graph.edges()).T
+    # TODO check how slow it is and whether it's possible to vectorise this loop
+    for idx_update, update in enumerate(rotations):
+        # import ipdb; ipdb.set_trace()
+        idx_edge = idx_update * 2
+        if edge_mask[idx_edge]:
+            begin_pos = pos[edges[idx_edge][0]]
+            end_pos = pos[edges[idx_edge][1]]
+            rot_vector = end_pos - begin_pos
+            rot_vector = (
+                rot_vector
+                / torch.linalg.norm(rot_vector)
+                * update
+                * rot_signs[idx_edge]
+            )
+            rot_matrix = axis_angle_to_matrix(rot_vector)
+            x = pos[node_mask[idx_edge]]
+            pos[node_mask[idx_edge]] = (
+                torch.matmul((x - begin_pos), rot_matrix.T) + begin_pos
+            )
+    graph.ndata[constants.atom_position_name] = pos
+    return graph


This is currently being significantly changed by @AlexandraVolokhova in #201, so I'd suggest not reviewing it for now.
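
To make the idea behind the two functions in the diff above easier to follow, here is a dependency-free sketch, assuming a connected molecular graph. It is not the PR's implementation: it swaps networkx for a breadth-first search and `axis_angle_to_matrix` for Rodrigues' rotation formula, and the names `rotation_masks`, `rotate_about_bond` and the toy coordinates are hypothetical. When both fragments have the same size the sketch picks the side of the first atom, which is an arbitrary choice.

```python
import math
from collections import deque


def rotation_masks(num_nodes, bonds):
    """Return (rotatable, sign, affected_nodes) for every bond."""
    adjacency = {n: set() for n in range(num_nodes)}
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    def component(start, removed):
        # Breadth-first search that never crosses the removed bond.
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if {node, nxt} != removed and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    masks = []
    for a, b in bonds:
        side_a = component(a, {a, b})
        if b in side_a:  # graph stays connected, e.g. a ring bond
            masks.append((False, 0.0, set()))
            continue
        side_b = component(b, {a, b})
        smallest = side_a if len(side_a) <= len(side_b) else side_b
        if len(smallest) > 1:
            # Same sign convention as the diff: -1 if the bond's first atom
            # lies in the smaller fragment, +1 otherwise.
            sign = -1.0 if a in smallest else 1.0
            masks.append((True, sign, smallest - {a, b}))
        else:
            masks.append((False, 0.0, set()))
    return masks


def rotate_about_bond(point, begin, end, angle):
    """Rotate `point` by `angle` radians about the axis begin -> end."""
    axis = [e - b for b, e in zip(begin, end)]
    norm = math.sqrt(sum(c * c for c in axis))
    k = [c / norm for c in axis]
    v = [p - b for p, b in zip(point, begin)]
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(kc * vc for kc, vc in zip(k, v))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Rodrigues: v*cos + (k x v)*sin + k*(k.v)*(1 - cos), shifted back by begin.
    return [
        b + vc * cos_a + kv * sin_a + kc * k_dot_v * (1 - cos_a)
        for b, vc, kv, kc in zip(begin, v, k_cross_v, k)
    ]


# Toy 5-atom chain 0-1-2-3-4 with made-up coordinates (not a real molecule).
bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
positions = [
    (-1.0, 1.0, 0.0),
    (-1.0, 0.0, 0.0),
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (1.0, 1.0, 0.0),
]
masks = rotation_masks(len(positions), bonds)
rotatable, sign, affected = masks[2]  # bond (2, 3)
begin, end = positions[2], positions[3]
moved = {
    n: rotate_about_bond(positions[n], begin, end, sign * math.pi / 2)
    for n in affected
}
```

In the toy chain only the two inner bonds are rotatable, since cutting either outer bond leaves a single-atom fragment. Only the atoms of the smaller fragment, excluding the bond's own atoms, are moved, which matches the role of `nodes_mask` in the diff.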


AlexandraVolokhova and others added 30 commits February 28, 2023 17:59

implemented rotation masks

130338d

add masks to featuraser

fe9d5ac

implemented apply rotations

6c9de56

add simple tests for apply rotations

99fb7c7

fix bug in torsios, add test with AD

d1713cb

add a comment to ConformerDataset

217099c

implemented rotation masks

b000332

add masks to featuraser

03a445c

implemented apply rotations

b9be53c

add simple tests for apply rotations

c80b2f8

fix bug in torsios, add test with AD

ea8e410

add a comment to ConformerDataset

7cf2f3e

updated setup

38afb29

fixing tests

f0d8225

black

8b7502e

Merge pull request #120 from alexhernandezgarcia/torch_ani_proxy

6f7b738

Merge main into molecule_graph_env

WiP molecule TorchANI proxy

f0d53ab

added configs

e87e441

updated proxy defaults

1d095be

overwritten deepcopy

879dd38

optional batching

c2897b7

scaled energy

e51b8f0

updated config

1465827

fixed docstring

4b8816a

energy divider as an argument

9b4fa15

removed unused import

feee48b

Merge pull request #124 from alexhernandezgarcia/torch_ani_proxy

38fc208

Using TorchANI proxy for AlanineDipeptide

added aromatic bond type

004f905

XTB proxy

6a908de

remove old_conformer

0897394

michalkoziarski commented Sep 7, 2023


moved install script

d3bf9f4

michalkoziarski requested a review from alexhernandezgarcia September 7, 2023 20:23

michalkoziarski changed the title ~~[WIP] Molecule graph environment~~ Molecule graph environment Sep 7, 2023

michalkoziarski marked this pull request as ready for review September 7, 2023 20:24

michalkoziarski and others added 23 commits September 7, 2023 18:35

Merge branch 'main' into molecule_graph_env

98d154e

Merge branch 'main' of https://github.com/alexhernandezgarcia/gflownet

699bee7

fixed rotatatable bonds

e3c6aac

add test

3dda1f8

isort fix

e792e49

formatting changes & typo fixes

34d70a0

black

dc7233e

Merge pull request #213 from alexhernandezgarcia/torsion_angles_detec…

e5aa6dd

…tion_fix_mk Formatting changes & typo fixes for TA fix

fixed function name

4384c4e

Merge branch 'torsion_angles_detection_fix' of https://github.com/ale…

9d7e02c

…xhernandezgarcia/gflownet into torsion_angles_detection_fix

Merge branch 'torsion_angles_detection_fix_mk' into torsion_angles_de…

d1d0e1b

…tection_fix

Merge github.com:alexhernandezgarcia/gflownet into torsion_angles_det…

f4c2f35

…ection_fix

Merge branch 'main' into molecule_graph_env

286c5ec

Merge branch 'torsion_angles_detection_fix' of github.com:alexhernand…

9032cb0

…ezgarcia/gflownet into torsion_angles_detection_fix

Merge branch 'molecule_graph_env' into torsion_angles_detection_fix

582f95a

Merge branch 'torsion_angles_detection_fix' of github.com:alexhernand…

65c8482

…ezgarcia/gflownet into torsion_angles_detection_fix

add hydrogens fix

00f614a

fix ordering bug, add check for hydrogen ta

0ae2a8a

Merge branch 'torsion_angles_detection_fix' of https://github.com/ale…

d38aab8

…xhernandezgarcia/gflownet into torsion_angles_detection_fix

fix another bug

2875505

Merge branch 'torsion_angles_detection_fix' of https://github.com/ale…

23149eb

…xhernandezgarcia/gflownet into torsion_angles_detection_fix

black & isort

9b45f15

Merge pull request #212 from alexhernandezgarcia/torsion_angles_detec…

c2fb4e1

…tion_fix Torsion angles detection fix
