scRNA-seq Scripts

This repository contains scripts necessary for running scRNA-seq analyses related to the manuscript:

Marderstein, A.R. et al. (2024). Single-cell multi-omics map of human foetal blood in Down's Syndrome. Nature.

Contact: [email protected]

Introduction

This repository provides scripts for many of the scRNA-seq analyses described in our manuscript. The scripts cover a wide range of tasks, from data conversion and quality control to differential expression analysis and visualization.

Data Availability

The scripts take as input the following datasets, which have been deposited on ArrayExpress:

  • scRNA-seq FASTQ raw data and CellRanger count matrices (accession number E-MTAB-13067)
  • 10x Visium FASTQ raw data, SpaceRanger count matrices, run summary metrics, and spatiality outputs (E-MTAB-13062)
  • Multiome snRNA-seq and snATAC-seq FASTQ raw data, CellRanger ARC count matrices, and ATAC fragment files (E-MTAB-13070).

Other Information

Before running a script, install the packages listed in its header.

Can't find code relevant to the analysis that you are interested in? Please look here first:

Please reach me at [email protected] if there are questions about the analysis.

Scripts

Data Conversion

  • anndata_to_seurat.R: Converts anndata objects to Seurat format for downstream analysis in R.
  • seurat_to_sce.R: Converts Seurat objects to SingleCellExperiment (SCE) format, which is often used for integration with other R packages.

Quality Control (QC) and Clustering

Marker Genes

  • cluster_markers_v2.py: Identifies cell type-specific marker genes using Seurat’s clustering and marker identification tools.
  • create_barcode_cluster_supp.R: Generates a supplementary table of cell barcodes associated with identified clusters.
  • create_cluster_markers_supp.R: Compiles the identified marker genes into a supplementary table for publication.
  • plots_v4.py: Generates UMAPs and dot plots for various cell types.

Differential Expression (DE) Analysis

  • 230108_cellbender_pseudobulks.R: Performs differential expression analysis on pseudobulks generated from CellBender-corrected data to evaluate DE robustness.
  • 230108_analysis_cellbender_pseudobulks.R: Analyzes the differential expression results obtained from the CellBender-corrected data.
  • sex_de1.R: Evaluates the effect of sex and age on differential expression results by adding/dropping these variables from the DE analysis.
  • testing_de_mods_v2.R: Tests various differential expression analysis choices to assess their impact on results.
  • 240226_pb_fold_changes.R: Analyzes the fold changes from differential expression analyses with different configurations.
  • spiking_chr21.R: Assesses the impact of chromosome 21 on log-fold change correlations.
  • pseudobulk_de_v3_partB.062422update.R: Conducts DE analysis comparing liver and femur tissues.
  • downsample_de_v3.t21_v_healthy.R: Downsamples T21 samples to match D21 sample size for liver versus femur DE analysis.
  • analy_diff_exp_v4b_makingplots.062422update.R: Produces plots related to liver versus femur DE analysis.
  • go_enrich_plots.R: Conducts GO enrichment analysis on liver versus femur DE results.
  • gxe_plots.R: Generates GxE interaction plots for liver and femur DE analysis.
  • gxe_plots.makingdata.R: Prepares data for GxE interaction analysis and plotting.
  • de_hsc_vs_cyclingHSC.R: Performs differential expression analysis between cycling HSCs and less cycling HSCs.
  • cycling_HSC_de_volcano_v3.R: Creates volcano plots for DE results between cycling and less cycling HSCs.
  • cycling_HSC_go_enrich.R: Conducts GO enrichment analysis on DE results between cycling and less cycling HSCs.
  • 240227_diff_exp_other_cell_types.R: Conducts differential expression analysis on additional cell types to compare with the main dataset.
  • logfc_chr21_vs_other.R: Compares log-fold changes of DE genes on chromosome 21 with other chromosomes.
  • scrna_seq_plot.R: Creates various plots related to scRNA-seq analyses, mostly focused on DE results.
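
The downsampling step mentioned above (matching the number of T21 samples to the number of D21 samples before DE) can be illustrated with a minimal Python sketch. This is not the code in downsample_de_v3.t21_v_healthy.R. The function name `downsample_to_match`, the sample IDs, and the seed are hypothetical, and it assumes that T21 and D21 denote the trisomy 21 and disomic sample groups and that downsampling means drawing samples at random without replacement.

```python
import random


def downsample_to_match(larger, smaller, seed=0):
    """Randomly keep len(smaller) IDs from `larger`, without replacement.

    Hypothetical helper for illustration only; the repository's R script
    may select samples differently.
    """
    if len(larger) < len(smaller):
        raise ValueError("cannot downsample: 'larger' has fewer samples")
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return sorted(rng.sample(list(larger), len(smaller)))


# Toy sample IDs (made up for this example)
t21_samples = ["T21_1", "T21_2", "T21_3", "T21_4", "T21_5", "T21_6"]
d21_samples = ["D21_1", "D21_2", "D21_3"]

t21_subset = downsample_to_match(t21_samples, d21_samples, seed=1)
print(len(t21_subset), "T21 samples kept to match", len(d21_samples), "D21 samples")
```

Repeating the draw with several seeds and rerunning the DE analysis each time would show how sensitive the results are to which samples are kept.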

Integration and Mapping

  • integration_combine.R: Integrates multiple datasets and combines metadata for downstream analyses.
  • ref_vs_separate.R: Compares cell type frequencies between reference and integrated datasets.
  • new_pseudobulks.R: Generates pseudobulk data from integrated cell type labels.
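
As a rough illustration of what pseudobulk generation involves, the sketch below sums per-cell gene counts within each (sample, cell type) group. It is not the code in new_pseudobulks.R. The function name `pseudobulk`, the barcodes, the donor labels, and the counts are hypothetical, and it assumes that pseudobulks are formed by summing raw counts, which is a common choice but not confirmed here.

```python
from collections import defaultdict


def pseudobulk(counts, sample_of, cell_type_of):
    """Sum gene counts over cells sharing a (sample, cell type) label.

    counts:       dict of barcode -> {gene: count}
    sample_of:    dict of barcode -> sample ID
    cell_type_of: dict of barcode -> cell type label
    Hypothetical helper for illustration only.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for barcode, gene_counts in counts.items():
        key = (sample_of[barcode], cell_type_of[barcode])
        for gene, n in gene_counts.items():
            totals[key][gene] += n
    # Convert nested defaultdicts to plain dicts for easier inspection
    return {key: dict(genes) for key, genes in totals.items()}


# Toy data (made up for this example)
counts = {
    "AAAC-1": {"GATA1": 3, "RUNX1": 1},
    "AAAG-1": {"GATA1": 2},
    "AACT-1": {"RUNX1": 4},
}
sample_of = {"AAAC-1": "donor_A", "AAAG-1": "donor_A", "AACT-1": "donor_B"}
cell_type_of = {"AAAC-1": "HSC", "AAAG-1": "HSC", "AACT-1": "HSC"}

pb = pseudobulk(counts, sample_of, cell_type_of)
for key in sorted(pb):
    print(key, pb[key])
```

Each resulting profile then acts as one observation per sample and cell type in a bulk-style DE model, rather than treating individual cells as replicates.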

CellPhoneDB Analysis

  • run_cpdb.sh: Runs the CellPhoneDB analysis pipeline.
  • create_cpdb_input.R: Prepares input files for CellPhoneDB.
  • input_cellphone_db.py: Processes input data and initiates the CellPhoneDB run.
  • input_cellphone_db.subset.sh: Subsets the data for focused CellPhoneDB analysis.
  • cellphonedb_output_analy.R: Analyzes the output from CellPhoneDB to identify significant ligand-receptor interactions.
  • reformat_cpdb_input.R: Reformats input data for compatibility with CellPhoneDB.

Mitochondria-Related Analysis

  • mitosox.R: Generates data tables related to mitochondrial ROS (mitosox) for barplot visualization.
  • mitotracker.R: Creates data tables for mitochondrial tracking experiments.
  • mito.R: Generates barplots for mitochondrial-related analyses.

Somatic Enrichment Analysis

  • 291123_somatic_v3.R: Performs somatic enrichment analysis on chromosome 21 and other relevant regions.