Skip to content

Mroyyy/GUI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

56 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Graphical User Interface

Here is the source code for an application about fpegenaute Master Thesis for the MSc in Bioinformatics for Health Sciences using Python and Pyqt5 library. The idea of his project ("Automated structural information retrieval for Integrative Modeling") was to create a tool that combines information from already existing and experimental data in order to predict the dynamics of proteins. Basically, we gather information from PDB (Protein Data Bank) and AlphaFold or RoseTTaFold and use it as an input for IMP (Integrative Modeling Platform), which gives us an hybrid approach integrating data from diverse biochemical and biophysical experiments.

Installation and Requirements

The Graphical User Interface, as the original program, uses external programs to work:

Also, to install the Python packages you will need Conda and pip.

Set up the program

1.- Open a Terminal

2.- Download this repository

git clone https://github.com/Mroyyy/GUI.git

3.- Create a conda environment to avoid dependency issues

conda create --name your_env python=3.8

4.- Install the Python packages needed by the program in the working directory

cd path/to/working/directory
pip install -r requirements.txt
  • What is recommended, if you have conda installed, you can create the environment with all the dependencies as:
conda env create -f environment.yaml --name your_env

5.- Configure the program. Open the file config.py inside the bin/ folder, and in the line 6 change:

"blastdb" : "/path/BLAST/database"

Practical example: SEC3

Sec3 is an essential protein which participates in exocytosis, whose full structure is still unkown. Its sequence in FASTA format is on "input_fasta" directory. All sequences you want to study have to be in the same directory as the program to work

There are instructions in a Help Window, but to start you will only need two arguments:

  • Input sequence in FASTA format
  • Output directory to store the retrieved PDBs

helpred

help

1.- Activate your environment

conda activate your_env

2.- Run GUI

python3 gui.py

Screenshot from 2022-08-26 16-01-57

3.- Output

  • Coverage: Which parts of the FASTA are covered
  • Hinges and flexibilility: Hinges are regions of the protein that allow it to move and change conformations. In this plot you can observe its prediction
  • Composite and Topology File: From all the structures retrieved by the program, a composite is generated, trying to cover as much of the reference sequence as possible, avoiding overlaps.
  • Custom hinges: In this section, hinge regions can be introduced, allowing the movement of the more rigid domains. An example is shown below:

hinges

This custom topology file will be the input file for IMP:

top

IMP Output

Screencast 2022-08-29 10_23_15

At this pont, the output directory should contain the files and directories described bellow:

SEC3/
 BLAST/
  SEC3_blast.out
 FASTA/
  5LG4.fa
  5YFP.fa
 HINGES/
  5lg4_B.hng
  5lg4_B_packman_output.txt
  5yfp_A.hng
  5yfp_A_packman_output.txt
 IMP/
  SEC3.topology
  *SEC3_custom.topology*
 LOG/
  SEC3.log
 PDB/
  CHAINS/
  partial/
   5lg4_B.pdb
   5yfp_A.pdb
  5lg4.cif
  5yfp.cif
 PLOTS/
  coverage_plot.html
  coverage_plot.png
  hinges_prediction.html
  hinges_prediction.png
  structure_plot.html
  structure_plot.png
 REPORT/
  COVERAGE/
   5lg4_B_coverage.csv
   5yfp_A_coverage.csv
   SEC3_composite_coverage.csv
  DFI/
   5lg4_B_DFI_coverage.csv
   5yfp_A_DFI_coverage.csv

FAQS

Why in the main window there are more labels than used? The idea is to implement AlphaFold and RoseTTaFold models (or to run the server directly) and obtain the most complete structure as possible. If it was the case in the Composite and Topology File we could also observe and select fragments from AlphaFold model for example.

Can I only run it on Linux? Currently you have to run the GUI from the terminal, but the future idea is to obtain an .exe one file with all packages and needed files.

It needs to have the same file name and header e. g. the file SEC3.fasta starts with the line ">SEC3"

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages