enhancement: Creating a PDB output file with all atoms that are part of patches labeled in b-factor column #35

Open
sgrannem opened this issue Jan 14, 2025 · 11 comments

Comments

@sgrannem

I am very keen to use this package, but I could not figure out which residues actually belong to the patches. Ideally, each residue that is part of an electrostatic patch would be labeled in the original PDB file. Perhaps the B-factor column could be used for this? Then I could start integrating the tool into my pipeline.
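
To make the request concrete, here is a minimal sketch of the kind of output meant here, written against the fixed-column PDB format only. It assumes a mapping from (chain, residue number) to a numeric patch label is already available; the function name `label_patches_in_bfactor` and the mapping layout are hypothetical and not part of the package.

```python
def label_patches_in_bfactor(pdb_lines, residue_to_patch, default=0.0):
    """Return PDB lines with the B-factor column set to a patch label.

    residue_to_patch maps (chain_id, residue_number) -> numeric label.
    This mapping layout is an assumption made for the sketch.
    """
    out = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            # Pad so that the B-factor columns (61-66, 1-based) always exist.
            line = line.ljust(80)
            key = (line[21], int(line[22:26]))  # chain ID, residue number
            label = residue_to_patch.get(key, default)
            # Overwrite only the B-factor field; keep everything else.
            line = f"{line[:60]}{label:6.2f}{line[66:]}".rstrip()
        out.append(line)
    return out
```

Residues that are not in any patch get the default value, so the patches can be shown in a viewer by coloring on the B-factor.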

@fwaibl
Contributor

fwaibl commented Jan 16, 2025

Hello,
Thanks for your interest in using our tool!
We have an open pull request (#32) that might be related to your request. Specifically, it would implement a --resout option that writes the residues involved in each patch to a CSV file. Would this solve your problem?
@vhoer should we merge this PR?

Best,
Franz

@sgrannem
Author

sgrannem commented Jan 16, 2025 via email

@fwaibl
Contributor

fwaibl commented Jan 17, 2025

Dear Sander,
I just had a look at #32 and saw that it is no longer compatible with our current main branch, because we changed some units in the last PR that we merged. To be consistent, I just made another change (already merged) that updates the units in pep_patch_electrostatics to be nanometer-based. I am aware that this is an unpleasant change for existing users, but it increases our internal consistency, and it means that PR #32 works again. Do you want to try out the Patch_Residues_Logging branch, or should I simply merge it?
Best,
Franz

@sgrannem
Author

sgrannem commented Jan 17, 2025 via email

@sgrannem
Author

sgrannem commented Jan 17, 2025 via email

@fwaibl
Contributor

fwaibl commented Jan 17, 2025

Dear Sander,
when I install the branch of #32, I get the following output (note the -ro or --resout option). Are you sure that you installed the package correctly in your environment?
I am still waiting for a comment from @vhoer; after that, we'll merge this PR.

pep_patch_electrostatic starting at 2025-01-17 17:41:30.984986
Command line arguments:
/localhome/fwaibl/.conda/envs/py310_2/bin/pep_patch_electrostatic --help
usage: pep_patch_electrostatic [-h] [--ref REF] [--protein_ref PROTEIN_REF] [--stride STRIDE] [--dx [DX]] [--apbs_dir APBS_DIR] [--probe_radius PROBE_RADIUS] [-o OUT] [-ro RESOUT] [-c PATCH_CUTOFF PATCH_CUTOFF]
                               [-ic INTEGRAL_CUTOFF INTEGRAL_CUTOFF] [--surface_type {sas,ses,gauss}] [--ply_out PLY_OUT] [--pos_patch_cmap POS_PATCH_CMAP] [--neg_patch_cmap NEG_PATCH_CMAP] [--ply_cmap PLY_CMAP]
                               [--ply_clim PLY_CLIM PLY_CLIM] [--check_cdrs] [-n N_PATCHES] [-s SIZE_CUTOFF] [--gauss_shift GAUSS_SHIFT] [--gauss_scale GAUSS_SCALE] [--pH PH] [--ion_species [ION_SPECIES ...]]
                               parm trajs [trajs ...]

positional arguments:
  parm
  trajs

options:
  -h, --help            show this help message and exit
  --ref REF             Reference structure with the SAME atoms (default: None)
  --protein_ref PROTEIN_REF
                        Reference structure for protein alignment using TMalign (default: None)
  --stride STRIDE
  --dx [DX]             Optional dx file with the electrostatic potential. If this is omitted, you must specify --apbs_dir (default: None)
  --apbs_dir APBS_DIR   Directory in which intermediate files are stored when running APBS. Will be created if it does not exist. (default: None)
  --probe_radius PROBE_RADIUS
                        Probe radius in nm (default: 0.14)
  -o OUT, --out OUT     Output csv file. (default: None)
  -ro RESOUT, --resout RESOUT
                        Residue csv file. (default: None)
  -c PATCH_CUTOFF PATCH_CUTOFF, --patch_cutoff PATCH_CUTOFF PATCH_CUTOFF
                        Cutoff for positive and negative patches. (default: (2.0, -2.0))
  -ic INTEGRAL_CUTOFF INTEGRAL_CUTOFF, --integral_cutoff INTEGRAL_CUTOFF INTEGRAL_CUTOFF
                        Cutoffs for "high" and "low" integrals. (default: (0.3, -0.3))
  --surface_type {sas,ses,gauss}
                        Which type of molecular surface to produce. (default: sas)
  --ply_out PLY_OUT     Base name for .ply output for PyMOL. Will write BASE-pos.ply and BASE-neg.ply. (default: None)
  --pos_patch_cmap POS_PATCH_CMAP
                        Matplotlib colormap for .ply positive patches output. (default: tab20c)
  --neg_patch_cmap NEG_PATCH_CMAP
                        Matplotlib colormap for .ply negative patches output. (default: tab20c)
  --ply_cmap PLY_CMAP   Matplotlib colormap for .ply potential output. (default: coolwarm_r)
  --ply_clim PLY_CLIM PLY_CLIM
                        Colorscale limits for .ply output. (default: None)
  --check_cdrs          For an antibody Fv region as input: check whether patches belong to CDRs. (default: False)
  -n N_PATCHES, --n_patches N_PATCHES
                        Restrict output to n patches. Positive values output n largest patches, negative n smallest patches. (default: 0)
  -s SIZE_CUTOFF, --size_cutoff SIZE_CUTOFF
                        Restrict output to patches with an area of over s A^2. If s = 0, no cutoff is applied (default). (default: 0.0)
  --gauss_shift GAUSS_SHIFT
  --gauss_scale GAUSS_SCALE
  --pH PH               Specify pH for pdb2pqr calculation. If None, no protonation is performed. (default: None)
  --ion_species [ION_SPECIES ...]
                        Specify ion species and their properties (charge, concentration, and radius). Provide values for multiple ion species as charge1, conc1, radius1, charge2, conc2, radius2, etc. (default: None)
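
For calling this from a Python pipeline, the command line could be assembled roughly as follows. This is only a sketch: the flags are taken from the help text above, while `build_command` and every file name passed to it are hypothetical.

```python
def build_command(parm, trajs, out_csv, resout_csv, apbs_dir):
    """Assemble an argument list for pep_patch_electrostatic.

    Only options shown in the help text are used. No --dx file is passed,
    so --apbs_dir is given, as the help text requires in that case.
    """
    return [
        "pep_patch_electrostatic",
        parm,
        *trajs,
        "--apbs_dir", apbs_dir,
        "-o", out_csv,
        "--resout", resout_csv,
    ]

# The list can then be handed to subprocess.run(cmd, check=True).
```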

@vhoer
Collaborator

vhoer commented Jan 17, 2025

Hey all,
I'd like to include the fix for #34 as well. I'll add it tomorrow, along with fixes for some small issues with the tests.
Then I think we can merge the PR.

@sgrannem
Author

sgrannem commented Jan 21, 2025 via email

@fwaibl
Contributor

fwaibl commented Jan 29, 2025

Hi,

I just merged #32; please let me know if this helps. I think the residue numbering is consistent with the MDTraj object rather than with the input PDB. Would it be possible to work with these (continuous) residue numbers? For example, it would be easy to write another PDB file that also has this numbering. In general, though, it is not easy to retain all residue information after processing with MDTraj, especially with insertion codes etc.
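
A rough sketch of what such a renumbered PDB file could look like, working on the PDB text directly rather than through MDTraj. It assumes that the continuous numbering simply follows the order of residues in the file and starts at 1; `renumber_continuous` is a hypothetical helper, not something that exists in the package.

```python
def renumber_continuous(pdb_lines, start=1):
    """Renumber residues consecutively in file order.

    Returns (new_lines, mapping), where mapping takes the original
    (chain, residue number, insertion code) to the new residue number.
    Assumes fewer than 10000 residues, so the number fits the 4 columns.
    """
    mapping = {}
    new_lines = []
    for line in pdb_lines:
        line = line.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            key = (line[21], int(line[22:26]), line[26])
            if key not in mapping:
                mapping[key] = start + len(mapping)
            # Write the new number and blank the insertion code column.
            line = f"{line[:22]}{mapping[key]:4d} {line[27:]}"
        new_lines.append(line)
    return new_lines, mapping
```

The returned mapping also allows going back from the continuous numbers in the residue CSV to the numbering of the input PDB.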

@sgrannem
Author

sgrannem commented Jan 29, 2025 via email

@fwaibl
Contributor

fwaibl commented Jan 31, 2025

Hi,
in the current code, we renumber the residues starting from 1. I believe this was added in the past to avoid ending up with multiple residues that have exactly the same number, because the insertion codes are lost in the MDTraj object.
As far as I can see, there are multiple options. @sgrannem @vhoer what is your preference?

  • We can retain the current behavior and force the user to re-number the PDB. This is always possible because the current numbering is unique, but might be tedious.
  • We can remove the re-numbering, which would mostly give the expected result, but might be more tedious in cases where the insertion codes are relevant.
  • We can think of a more elaborate solution. E.g., we could re-assign insertion codes when multiple residues have the same number.
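
To illustrate the third option, here is a minimal sketch of re-assigning insertion codes when several residues share a number. It assumes the input is the list of residue numbers of a single chain in chain order, and that the first residue with a given number keeps a blank code. The helper name is hypothetical, and the assigned codes are not guaranteed to match those in the original PDB.

```python
import string


def reassign_insertion_codes(residue_numbers):
    """Give repeated residue numbers the insertion codes A, B, C, ...

    residue_numbers: one entry per residue of a single chain, in order.
    Returns a list of (number, insertion_code) pairs.
    """
    seen = {}  # residue number -> how often it has occurred so far
    result = []
    for number in residue_numbers:
        count = seen.get(number, 0)
        if count > len(string.ascii_uppercase):
            raise ValueError(f"too many residues numbered {number}")
        # First occurrence stays blank, later ones get A, B, C, ...
        code = " " if count == 0 else string.ascii_uppercase[count - 1]
        seen[number] = count + 1
        result.append((number, code))
    return result
```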
