Metabolic Modelling of AKR1A1 Deficiency

Description

This project analyzes bulk RNA-seq data and builds and simulates genome-scale metabolic models in MATLAB to study AKR1A1 deficiency.

Installation

Requires MATLAB R2019b along with the required toolboxes. Dependencies include rFASTCORMICS and R functions for gene length normalization.

Data Pre-processing and Descriptive Statistics

Normalized RNA-seq counts are processed to assess the data distribution using boxplots, histograms, and kernel density estimates (MATLAB ksdensity). PCA is performed for dimensionality reduction, followed by data discretization and metabolic modeling.

Usage

Run the masterdriver.m script to execute the pre-processing and analysis pipeline, which covers data loading, preprocessing, discretization, model building, and initial model analysis. For detailed metabolic flux analysis after the HPC sampling tests, run masterdriverAnalysisSampling.m.

Scripts and Analysis Workflow

  1. driverData.m: Loads RNA-seq data and performs initial preprocessing and descriptive statistics.
    • Boxplots: Visual representations of the distribution across different samples.
      • h_boxplot_all.png, r_boxplot_all.png
    • Histograms: Distributions of gene expression levels.
      • h_gene_distribution_all.png, r_gene_distribution_all.png
    • Cumulative Distribution Functions (CDFs): Density plots for gene expression.
      • h_cdf_all.png, r_cdf_all.png
    • PCA Plots: Scatter plots from PCA analysis showing data clustering.
      • h_pca_score_1.png, r_pca_score_1.png
  2. driverModel_withoutO2S.m: Sets up medium composition and constructs genome-scale metabolic models.
  3. setMediumConstraints_Chiara.m: Applies medium constraints to models based on experimental conditions.
    • Histograms illustrating the distribution of non-zero exchange reaction fluxes for all conditions (sc1, sc2, sc12) in both 769-P and Huh7 type models.
  4. KO_GLO1_treatment.m: Simulates genetic alterations or treatment effects like GLO1 gene knockout.
  5. removeunusedgen.m: Removes unused genes to optimize model efficiency.
  6. analysis.m: Conducts preliminary analysis on refined models.
    • Jaccard Similarity Heatmaps: Visual representations of model similarities.
      • ModelsimilaritybasedonJaccarddistance_H.png
      • ModelsimilaritybasedonJaccarddistance_7.png
    • Pathway Activity Clustergrams: Show the activity of different metabolic pathways across models.
      • Pathwayactivityforallmodels_H.png
      • Pathwayactivityforallmodels_7.png
    • Flux Variability Analysis (FVA) Heatmaps: Model similarity based on FVA results.
      • FVA_similarity_heatmap_7_.png
      • FVA_similarity_heatmap_H_.png
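
The Jaccard comparison behind the model similarity heatmaps can be illustrated with a short Python sketch. It is not the project's MATLAB code: it assumes each model is reduced to the set of reaction IDs it contains, and the model names and reaction IDs are made up for illustration. Jaccard distance, as used in the output file names, is 1 minus the similarity computed here.

```python
from itertools import combinations


def jaccard_similarity(a, b):
    """Return |A intersect B| / |A union B| for two collections of reaction IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty models count as identical
    return len(a & b) / len(union)


def similarity_matrix(models):
    """Build a symmetric table of pairwise Jaccard similarities.

    `models` maps a model name to the reaction IDs present in that model.
    """
    names = sorted(models)
    matrix = {name: {name: 1.0} for name in names}
    for first, second in combinations(names, 2):
        score = jaccard_similarity(models[first], models[second])
        matrix[first][second] = score
        matrix[second][first] = score
    return matrix


# Hypothetical models and reaction IDs, for illustration only.
models = {
    "control": {"R1", "R2", "R3", "R4"},
    "treated": {"R2", "R3", "R4", "R5"},
    "knockout": {"R1", "R2"},
}

matrix = similarity_matrix(models)
for name in sorted(matrix):
    row = ", ".join(f"{other}={matrix[name][other]:.2f}" for other in sorted(matrix))
    print(f"{name}: {row}")
```

A value of 1 means two models contain exactly the same reactions, and 0 means they share none.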

High-Performance Computing (HPC) Sampling Tests

After the initial modeling and analysis have been run on your local machine, the models can be passed to more computationally intensive tasks such as sampling tests, which are typically run on a High-Performance Computing (HPC) system. The HPC steps follow a separate procedure. Once the HPC sampling tests are complete, the results are collected and used for further analysis, which may involve additional scripts.

Sampling Analysis

Sampling analysis is run by masterdriverAnalysisSampling.m, which performs detailed metabolic flux analysis.

Scripts Managed by masterdriverAnalysisSampling

  1. performAnalysisSampling: Performs metabolic flux sampling analysis, comparing control and treated models to highlight significant metabolic differences.
  2. performAnalysisFluxSum: Performs flux sum analysis on the sampling results (flux sum = metabolite turnover rate).
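
The flux sum of a metabolite is commonly defined as half the sum of the absolute fluxes that produce or consume it, that is 0.5 * sum_j |S[i][j] * v[j]| for stoichiometric matrix S and flux vector v. The Python sketch below illustrates that definition on a toy network and averages it over flux samples. It is not the project's MATLAB implementation, which may differ in detail; the network, the sample values, and the function names are hypothetical.

```python
def flux_sum(stoichiometry, fluxes):
    """Return the flux sum of every metabolite for one flux vector.

    `stoichiometry` is a list of rows (one per metabolite), `fluxes` has one
    value per reaction. Flux sum of metabolite i = 0.5 * sum_j |S[i][j] * v[j]|.
    """
    for row in stoichiometry:
        if len(row) != len(fluxes):
            raise ValueError("each stoichiometry row needs one entry per reaction")
    return [
        0.5 * sum(abs(coeff * flux) for coeff, flux in zip(row, fluxes))
        for row in stoichiometry
    ]


def mean_flux_sum(stoichiometry, samples):
    """Average the flux sum of every metabolite over a list of flux samples."""
    if not samples:
        raise ValueError("at least one flux sample is required")
    per_sample = [flux_sum(stoichiometry, sample) for sample in samples]
    return [sum(values) / len(samples) for values in zip(*per_sample)]


# Toy network (hypothetical): R1: -> A, R2: A -> B, R3: A ->, R4: B ->
S = [
    [1, -1, -1, 0],  # metabolite A
    [0, 1, 0, -1],   # metabolite B
]
samples = [
    [3.0, 2.0, 1.0, 2.0],
    [4.0, 2.0, 2.0, 2.0],
]

print(flux_sum(S, samples[0]))
print(mean_flux_sum(S, samples))
```

Comparing the per-metabolite averages between control and treated samples is one way to see which metabolite pools turn over faster or slower.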

AKR1A1 Metabolic Pathway Visualization

Metabolic pathway visualization is performed with the R script AKR1A1_exploration5.R, which visualizes metabolic pathway alterations in AKR1A1 deficiency as heatmaps. The script generates heatmaps for selected subsystems under various conditions, highlighting significant metabolic shifts.

Output

The script generates a series of PDF files, each containing a heatmap of signal-to-noise ratio (SNR) changes across different conditions for selected subsystems:

Flux Sampling

  • File Naming Convention: Sampling_heatmap_[subsystem]_dir_[direction].pdf
    • Example files:
      • Sampling_heatmap_Glycolysis_gluconeogenesis_dir_0.pdf
      • Sampling_heatmap_Pentose_phosphate_pathway_dir_1.pdf
      • Sampling_heatmap_Pyruvate_metabolism_dir_-1.pdf

Each PDF file corresponds to a specific direction of change:

  • dir_0: No significant change
  • dir_1: Positive change
  • dir_-1: Negative change

These heatmaps help identify key metabolic changes. Grouping them by direction separates reactions whose flux increased, decreased, or remained unchanged.
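
When collecting the generated PDFs for a report, the naming convention above can be parsed programmatically. The following Python sketch is one way to do that; it is a helper written for illustration and not part of the repository, and the function and constant names are hypothetical. It relies only on the file name pattern and direction codes listed above.

```python
import re

# Direction codes and their meaning, as listed in this README.
DIRECTION_LABELS = {
    "0": "no significant change",
    "1": "positive change",
    "-1": "negative change",
}

# Sampling_heatmap_[subsystem]_dir_[direction].pdf
FILENAME_PATTERN = re.compile(
    r"^Sampling_heatmap_(?P<subsystem>.+)_dir_(?P<direction>-1|0|1)\.pdf$"
)


def parse_sampling_heatmap(filename):
    """Return (subsystem, direction label) for a flux sampling heatmap file name."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a flux sampling heatmap file name: {filename!r}")
    return match.group("subsystem"), DIRECTION_LABELS[match.group("direction")]


def group_by_direction(filenames):
    """Map each direction label to the subsystems that have a heatmap for it."""
    groups = {label: [] for label in DIRECTION_LABELS.values()}
    for name in filenames:
        subsystem, label = parse_sampling_heatmap(name)
        groups[label].append(subsystem)
    return groups


example_files = [
    "Sampling_heatmap_Glycolysis_gluconeogenesis_dir_0.pdf",
    "Sampling_heatmap_Pentose_phosphate_pathway_dir_1.pdf",
    "Sampling_heatmap_Pyruvate_metabolism_dir_-1.pdf",
]

for label, subsystems in group_by_direction(example_files).items():
    print(f"{label}: {subsystems}")
```

File names that do not match the pattern raise a ValueError, so unrelated PDFs in the same folder are flagged rather than silently skipped.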

Flux Sum Sampling

  • File Naming Convention: heatmap_fluxsum_[subsystem]_5_[condition].pdf
    • Example: heatmap_fluxsum_ppp_5_0.pdf

Contributors

Evelyn Gonzalez, Chiara Pecorari, Maria Pires Pacheco, Thomas Sauter

04/24