-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Matching atoms and merging ring systems (fused rings, different hybridisation states) - issues #103
Comments
Thanks, I'll take a closer look when I get a chance. We had quite a few tests for weird ring-size and ring-breaking edge cases, but unfortunately most of these now need to be marked as Out of interest, do you have the original SDF files used to load the ligands, i.e. prior to parameterisation? I wonder whether these would give the same mappings. (The SDF specific attributes would be retained as properties of the molecules.) We now use Sire's built in Just to note that when converting to RDKit with Sire (say for ligands 27 and 42) I get valence warnings, and if I attempt to compute a mapping using the In [1]: import sire as sr
In [2]: from gufe import SmallMoleculeComponent
In [3]: from openfe.setup.atom_mapping import LomapAtomMapper
In [4]: m0 = sr.load("ligands/*27*")
In [5]: m1 = sr.load("ligands/*42*")
In [6]: rdm0 = sr.convert.to_rdkit(m0)
[15:10:12] Explicit valence for atom # 1 C, 5, is greater than permitted
In [7]: rdm1 = sr.convert.to_rdkit(m1)
[15:10:14] Explicit valence for atom # 8 C, 5, is greater than permitted
In [8]: openfe_m0 = SmallMoleculeComponent(rdm0)
In [9]: openfe_m1 = SmallMoleculeComponent(rdm1)
In [10]: mapper = LomapAtomMapper()
In [11]: mapping_gen = mapper.suggest_mappings(openfe_m0, openfe_m1)
In [12]: mapping = next(mapping_gen)
[15:10:53] Explicit valence for atom # 1 C, 5, is greater than permitted
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:1 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
StopIteration |
Yep, here are the sdf files: I just tried running the above with the sdf files and didn't get any errors when converting them to rdkit or mapping. I guess I've also been using the older BSS; I'll run the above again through the newer BSS and see if any of the mapping problems persist! |
I've run the initial code with the most recent version of the devel, which should use the # Generate the MCS match.
mcs = _rdFMCS.FindMCS(
mols,
atomCompare=_rdFMCS.AtomCompare.CompareAny,
bondCompare=_rdFMCS.BondCompare.CompareAny,
completeRingsOnly=complete_rings_only,
ringMatchesRingOnly=True,
matchChiralTag=False,
matchValences=False,
maximizeBonds=False,
timeout=timeout,
) as # bss
{0: 34, 1: 10, 2: 9, 3: 8, 4: 7, 5: 12, 6: 11, 7: 35, 8: 36, 37: 33}
# atom mapper
LigandAtomMapping(componentA=SmallMoleculeComponent(name=MOL), componentB=SmallMoleculeComponent(name=MOL), componentA_to_componentB={0: 39, 1: 19, 2: 18, 3: 17, 4: 16, 5: 25, 6: 20, 7: 21, 8: 24, 9: 0, 10: 1, 11: 26, 12: 27, 13: 2, 14: 28, 15: 29, 16: 3, 17: 30, 18: 31, 19: 4, 20: 5, 21: 6, 22: 7, 23: 8, 24: 9, 25: 10, 26: 11, 27: 12, 28: 36, 29: 35, 30: 34, 31: 33, 32: 32, 33: 13, 34: 14, 35: 15, 36: 37, 37: 38}, annotations={}) |
Okay, thanks for letting me know. I'll compare options with the |
(Sorry, forgot to say that LOMAP calls the same RDKit MCS matcher internally, but likely using different options.) |
When I have some time I can also try cycling some of my perturbations through different options to see how that changes the perturbed region. I'm wondering if it would be worth checking the number of perturbing atoms after the MCS has run, and if this is above a certain number (which is generally how I noticed that the mapping was not ideal) running it through the MCS again with different options? Depending on how the other options impact the mapped region |
Hello! I have an update - I've managed to look through some of the different options here:
I've found that, for the perturbations above (lig_27~lig_42, lig_27~lig_43, lig_43~lig_45), having Using |
Thanks for the update. I've looked into swappaing out our MCS with the |
Hello! I'm currently matching and merging molecules for MCL1 ligands. The basic code used to replicate these is:
With the files named as the ligands attached.
ligands.zip
There are some different situations that occur in relation to ring systems, outlined below.
Case 1
In the most basic case, as in the case of some ligands (lig_27~lig_42), this does not proceed correctly initially. Many of the atoms are included in the perturbable region and some of the shared atoms are not included in the MCS.
When the mapping dictionary
{21:6}
is added asprematch
, this is then able to proceed okay. Prematch for this ligand series is chosen so the N in the core region match.Notably, this is able to proceed with
ring_size=False
andring_break=False
and the ring is mapped correctly to the fused ring system, which is not the case in Case 2.Case 2
However in a another case, lig_27~lig_43, when the mapping dictionary
{ 21:26 }
which is also the N is added, this is not able to proceed without settingring_size=True
,ring_break=True
. In this case then the resulting perturbed region is like below:I think ideally, with the introduced mapping, this should still be able to proceed with ring_size and ring_break set as False, and the whole ring region in this case is taken as the perturbed region in each case. So for lig_27 the phenyl ring and for lig_43 the entire fused ring system. Then if these were both set to True, in that case the phenyl ring should ideally be mapped onto an entire ring in the fused phenyl as opposed to split between the two fused rings.
The difference between lig_42 and lig_43 in this case is minor, with lig_42 being a 1-napthyl group and lig_43 being a 2-napthyl group, but this causes a large difference in the mapping.
Case 3
Here, lig_43~lig_45, there is a change of hybridisation state with the rings. Again, without prematch, the core region of the molecule is ignored:
And when prematch
{26:6}
is introduced, withring_size=False
,ring_break=False
it then maps the differently hybridised C to each other. The differently hybridised atoms should not map to each other as this is not great for the free energy perturbation.I was wondering if there would be some way to programmatically address the mapping and merging ring perturbations so the intended perturbable molecule can be obtained? I guess the atoms could also be removed manually from the generated mapping that is passed into the merge function, however this requires quite some time per perturbation to do manually as each atom needs to be inspected then.
Thank you!
The text was updated successfully, but these errors were encountered: