Update nucleotide templates, add NAKB_templates.json #203

Merged
rwxayheee merged 5 commits into forlilab:develop from update_NA_templates on Oct 22, 2024

rwxayheee (Contributor) commented Oct 17, 2024

This PR includes two changes:

  • Minor updates to the nucleotide templates in residue_chem_templates.json
    The phosphate capping groups (p) and the two molecular (ligand) forms of the nucleotides are dropped because they aren't currently used for template matching.

  • Add NAKB_templates.json
    This is a ready-to-use additional chemical template file. It includes up to four variants of the 5',3'-linking fragments for each standard and nonstandard nucleotide.

Currently, the additional JSON file has 1778 templates for 453 residues.

All 824 nucleotides from the NAKB modified nucleotide pool were considered. The Python script used to generate the file is here. A nonstandard nucleotide (or an individual template) is dropped if it:

  • Is tagged as not mappable by NAKB
  • Contains unsupported elements
  • Has a bad definition CIF file, so that the resulting RDKit molecule can't be sanitized or has implicit hydrogens
  • Fails the safety checks that ensure unique matching and guard against deleterious editing
  • Is redundant (has the same smiles and atom_name as another template) under the same parent
  • Fails the Meeko template check for other reasons (very few cases)
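
As a minimal sketch of the redundancy criterion above (not the actual generation script), duplicates can be filtered by keying each template on its parent, SMILES, and atom names. The record layout and the keys `name`, `parent`, `smiles`, and `atom_name` are assumptions made for illustration, and the SMILES strings are placeholders rather than real nucleotides.

```python
def drop_redundant(templates):
    """Keep the first template for each (parent, smiles, atom_name) combination.

    `templates` is a list of dicts with the hypothetical keys
    "name", "parent", "smiles", and "atom_name" (a list of atom names).
    """
    seen = set()
    kept = []
    for tpl in templates:
        # Lists are not hashable, so the atom names are stored as a tuple.
        key = (tpl["parent"], tpl["smiles"], tuple(tpl["atom_name"]))
        if key in seen:
            continue  # same SMILES and atom names under the same parent
        seen.add(key)
        kept.append(tpl)
    return kept


# Placeholder records: X2 duplicates X1, while X3 and X4 differ in
# parent and atom names respectively, so they are kept.
records = [
    {"name": "X1", "parent": "A", "smiles": "CO", "atom_name": ["C1", "O1"]},
    {"name": "X2", "parent": "A", "smiles": "CO", "atom_name": ["C1", "O1"]},
    {"name": "X3", "parent": "G", "smiles": "CO", "atom_name": ["C1", "O1"]},
    {"name": "X4", "parent": "A", "smiles": "CO", "atom_name": ["C2", "O1"]},
]
unique = drop_redundant(records)
```

Comparing SMILES as plain strings only works if they were all written the same way (for example, canonical SMILES from one toolkit); that is also an assumption of this sketch.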

Without further optimization, it takes about 0.15 seconds per nucleotide on a Mac to fetch the definition, process it, and write a template file (1-4 variants). It might be possible to integrate the process with prepare_receptor in the future. For the time being, I hope the additional chemical template file helps users who work with RNA/DNA systems that contain nonstandard nucleotides.

08d7575 Added a few templates that went missing because of a network issue.

add NAKB_templates.json
rwxayheee requested a review from diogomart on October 17, 2024 00:56
diogomart (Contributor) commented

Looks great. If I understand correctly, these new templates aren't loaded automatically. Should we add a command-line option and a Python option for that?

rwxayheee (Contributor, Author) commented Oct 18, 2024

Hi @diogomart
The new template file can be used via --add_templates in mk_receptor_preparation.py. I added it to meeko/data so that it's distributed with Meeko. What do you think?

I haven't checked for name conflicts yet, but I will do some more work to make sure it doesn't overwrite the existing templates.
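
A name-conflict check of this kind could look roughly like the sketch below, which compares the template names in two loaded JSON files. The top-level key `residue_templates` is an assumption about the file layout, and `find_name_conflicts` and `load_templates` are hypothetical helpers, not part of Meeko.

```python
import json


def load_templates(path):
    """Read a template JSON file into a plain dictionary."""
    with open(path) as fh:
        return json.load(fh)


def find_name_conflicts(existing, additional, section="residue_templates"):
    """Return the sorted template names defined in both files under `section`.

    `section` is an assumed top-level key; adjust it to the real layout.
    """
    existing_names = set(existing.get(section, {}))
    additional_names = set(additional.get(section, {}))
    return sorted(existing_names & additional_names)


# In-memory placeholders standing in for the two loaded JSON files.
existing = {"residue_templates": {"A": {}, "G": {}}}
additional = {"residue_templates": {"G": {}, "5MC": {}}}
conflicts = find_name_conflicts(existing, additional)
```

An empty result means that no name in the additional file would shadow an existing template under the assumed key.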

rwxayheee force-pushed the update_NA_templates branch from 5fbbe6c to 08d7575 on October 19, 2024 02:03
rwxayheee marked this pull request as draft on October 19, 2024 02:53
rwxayheee (Contributor, Author) commented Oct 19, 2024

898a29c Regenerated the templates with the new code. Checked for name conflicts. Fixed some incorrect templates and added some templates that were previously rejected because of mistakes in the checks. Re-counted the number of templates and residues.

This should be ready to merge, but again, it's not urgent.

rwxayheee marked this pull request as ready for review on October 19, 2024 03:12
rwxayheee (Contributor, Author) commented

Merging the edited template file and the additional template file.

rwxayheee merged commit 269b227 into forlilab:develop on Oct 22, 2024
1 check passed
rwxayheee deleted the update_NA_templates branch on October 22, 2024 20:50