Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce overhead in preperation #275

Open
rwxayheee opened this issue Dec 10, 2024 · 3 comments · Fixed by #276
Open

Reduce overhead in preperation #275

rwxayheee opened this issue Dec 10, 2024 · 3 comments · Fixed by #276
Assignees
Labels
refactor Improvements to the code without necessarily adding features

Comments

@rwxayheee
Copy link
Contributor

          @rwxayheee we might be able to remove those lines of code. They compute a graph that is stored in instances of `Ring`. I think those graphs are never used, along with other attributes that are probably also never used, see https://github.com/forlilab/Meeko/pull/109#discussion_r1588546970

Originally posted by @diogomart in #269 (comment)

Currently looking at molsetup.py to see if we could reduce overhead during molecule preparation, or remove unused functions to make it more light-weighted

@rwxayheee rwxayheee self-assigned this Dec 10, 2024
@rwxayheee
Copy link
Contributor Author

rwxayheee commented Dec 10, 2024

Hi @diogomart

This is not very urgent. Thanks for your time and kind advice in advance.

I went through the conversation in #109. Started in a new PR #276, I removed graph and flip_corners from class Ring. The remaining attributes are still used in other codes.

Current ring finding method in molsetup is perceive_rings:

def perceive_rings(self, keep_chorded_rings: bool, keep_equivalent_rings: bool):

When typing macrocycles, function FlexMacrocycle.collect_rings evaluates existing rings that were found by perceive_rings.

Current ring breaking method forbids breaking of aromatic rings:

if setup.rings[ring_id].is_aromatic:
rigid_rings.append(ring_id)

Which depends on the is_aromatic attribute of rings.

To further simplify the preparation, I have a few quick questions:

  1. Is there other simplication you would like to make about rings?

  2. Can they be removed: attribute rmsd_symmetry_indices and function get_symmetries_for_rmsd

    Meeko/meeko/molsetup.py

    Lines 2065 to 2069 in a4e45d7

    def get_symmetries_for_rmsd(mol, max_matches=17):
    mol_noHs = Chem.RemoveHs(mol)
    matches = mol.GetSubstructMatches(
    mol_noHs, uniquify=False, maxMatches=max_matches
    )

    They are relatively easy to re-compute with a greater max_matches or in a custom manner, if needed.

  3. Can atom typing use a bottom-up approach (based on atom's properties and attribute graph) instead of substructure match (top-down, or is it??)? Bottom-up might (or might not, I'm not sure) be more efficient for typing very large systems.

@diogomart
Copy link
Contributor

Thanks!

  1. Yes. We can just remove the code in macrocycle.py that uses the is_aromatic attribute. With min_ring_size <= 6 aromatic rings like benzene will end up in breakable_rings instead of in rigid_rings, but the bond typer sets aromatic bonds as not rotatable, so the rings will never be opened/broken because only rotatable bonds can be broken to open a ring. In other words, aromatic rings are just a subset of the set of rings without rotatable bonds, and do not require custom code just because they happen to be aromatic. I'll add a test to the PR soon. Then, the class Ring has only one attribute: a tuple of atom indices. We might as well use a tuple of atom indices instead of instances of Ring and remove the Ring class and its JSON encoder and decoder.

  2. Don't remove them: they are used elsewhere. But let's turn off their calculation in meeko, i.e. remove call to get_symmetries_for_rmsd() from RDKitMoleculeSetup.from_mol(). Downstream code will call get_symmetries_for_rmsd() when needed.

  3. Good question, I don't know. The current spec is a list of SMARTS in order of ascending priority and descending generality: the last SMARTS to type an atom is the one that prevails, and thus the more general types are assigned first. OpenFF forcefields work the same way.

@rwxayheee
Copy link
Contributor Author

Got it, thanks! I pushed the commits to remove is_aromatic attribute and the call get_symmetries_for_rmsd() from RDKitMoleculeSetup.from_mol().

@rwxayheee rwxayheee added the refactor Improvements to the code without necessarily adding features label Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Improvements to the code without necessarily adding features
Projects
None yet
2 participants