Cortexa SplicePCA Example

This tutorial walks you through preparing your own files for the SplicePCA tool and analyzing them with it.

Use SplicePCA

As a first step, to get familiar with the SplicePCA tool, you can use the analyzed files deposited in data/. These are control and Nova2-KD samples from the developing embryonic neocortex, published in:

Saito, Yuhki, et al. "Differential NOVA2-mediated splicing in excitatory and inhibitory neurons regulates cortical development and cerebellar function." Neuron 101.4 (2019): 707-720.

After downloading these files, you can use them for a custom analysis in SplicePCA:

1. Browse to Cortexa's SplicePCA.
2. Select the relevant datasets for the PCA analysis; in this case, Development and NPC/neuron could be a good choice.
3. Upload the alternative splicing files by checking Include own datasets.
4. Define the genes on which the analysis will be done:
   a. To run the analysis on all available genes, select the option Use all Genes.
   b. To restrict it to a subset of genes, enter the gene symbols in the Add Gene mask.
5. Press Start PCA. The process can take a while.

Result of SplicePCA using the cassette exon file (SE.MATS.JCEC.txt), analyzed together with the Development and NPC/neuron data and plotted with matplotlib after Download PCA.

Analyze your own files

To minimize technical effects, process your own data as described in the manuscript (see How to Cite).

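Tools:

- BBDuk (version 39.01)
- STAR (version 2.7.10b)
- rMATS turbo (version 4.1.2)
- Samtools (version 1.18)
- RSeQC (version 5.0.1)

Reference Genome:

- Gencode mm39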

Analysis pipeline for alternative splicing

Prototypic Pipeline

Caution

This is only a general outline of a pipeline; adapt it to your own data and parameters.

Follow these steps to set up and run the analysis pipeline:

1. Install required tools

BBDuk (version 39.01)

Download the BBMap package, which contains BBDuk, from SourceForge.

Extract the downloaded archive:
```
tar -xvzf BBMap_39.01.tar.gz
```
Add the BBDuk directory to your PATH:
```
export PATH=$PATH:/path/to/bbmap
```

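STAR (version 2.7.10b)

Download the STAR source archive:
```
wget https://github.com/alexdobin/STAR/archive/2.7.10b.tar.gz
```
Extract the archive:
```
tar -xzf 2.7.10b.tar.gz
```
Compile STAR:
```
cd STAR-2.7.10b/source
make STAR
```

Add the STAR directory to your PATH. The compiled executable is written to the source directory, while bin/Linux_x86_64 holds the precompiled Linux binary shipped with the archive; add whichever one you intend to use:
```
export PATH=$PATH:/path/to/STAR-2.7.10b/bin/Linux_x86_64
```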

rMATS turbo (version 4.1.2)

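Clone the rMATS-turbo repository. The clone contains the latest development state, so check out the release tag for version 4.1.2 if you want to match the version listed above:
```
git clone https://github.com/Xinglab/rmats-turbo.git
```
Build rMATS (make sure Python 3.6+ and GCC are installed):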
```
cd rmats-turbo
./build_rmats
```
Add the rMATS directory to your PATH:
```
export PATH=$PATH:/path/to/rmats-turbo
```

Samtools (version 1.18)

Download the Samtools 1.18 release archive from GitHub.

Extract Samtools:
```
tar -xvjf samtools-1.18.tar.bz2
```

Install Samtools:
```
cd samtools-1.18
./configure --prefix=/where/to/install
make
make install
```

RSeQC (version 5.0.1)

Install RSeQC using pip:
```
pip install RSeQC
```
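
After the installation steps, it can help to confirm that every executable used later in the pipeline is actually found on your PATH. The following is a minimal sketch; check_tools is a hypothetical helper name, and the list of executables in the example call is taken from the commands in this README (FastQC and featureCounts are used below but are not part of the installation steps above).

```bash
# Report which of the given command names cannot be found on PATH.
# Prints one missing name per line and returns non-zero if any are missing.
check_tools() {
  ct_missing=0
  for ct_tool in "$@"; do
    if ! command -v "$ct_tool" >/dev/null 2>&1; then
      echo "$ct_tool"
      ct_missing=1
    fi
  done
  return "$ct_missing"
}

# Example call with the executables used in this pipeline:
# check_tools bbduk.sh STAR rmats.py samtools infer_experiment.py fastqc featureCounts
```

An empty output means all tools were found; any name that is printed still needs to be installed or added to PATH.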

2. Prepare reference genome

Download the GRCm39 (mm39) reference from Gencode:
- Download the genome FASTA file and the GTF annotation file

Index the genome for STAR:
```
STAR --runMode genomeGenerate --genomeDir /path/to/star_index \
     --genomeFastaFiles /path/to/GRCm39.primary_assembly.genome.fa \
     --sjdbGTFfile /path/to/gencode.vM33.annotation.gtf \
     --sjdbOverhang 100
```
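
The value passed to --sjdbOverhang is ideally the maximum read length minus 1, and the same read length is needed again for the rMATS step. The following is a minimal sketch for looking it up from a gzipped FASTQ file. The function names max_read_length and sjdb_overhang are hypothetical helpers, and inspecting only the first 10,000 reads is an assumption made to keep the check fast.

```bash
# Print the longest read length among the first N reads of a gzipped FASTQ file.
# N defaults to 10000; sequence lines are every 4th line, starting at line 2.
max_read_length() {
  gzip -dc "$1" | awk -v n="${2:-10000}" '
    NR % 4 == 2 { if (length($0) > max) max = length($0) }
    NR >= 4 * n { exit }
    END { print max + 0 }'
}

# Print the --sjdbOverhang value for a given read length (read length minus 1).
sjdb_overhang() {
  echo $(( $1 - 1 ))
}

# Example:
# len=$(max_read_length sample_R1.fastq.gz)
# STAR --runMode genomeGenerate ... --sjdbOverhang "$(sjdb_overhang "$len")"
```

Run this on the untrimmed reads, because trimming shortens some of them.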

3. Process raw data and perform analysis

Follow these steps for each sample:

Quality control with FastQC:
```
fastqc -o /path/to/fastqc_output -t <threads> sample_R1.fastq.gz sample_R2.fastq.gz
```

Adapter trimming with BBDuk (the BBMap package ships an adapter collection as resources/adapters.fa):
```
bbduk.sh in1=sample_R1.fastq.gz in2=sample_R2.fastq.gz \
         out1=sample_trimmed_R1.fastq.gz out2=sample_trimmed_R2.fastq.gz \
         ref=/path/to/adapters.fa \
         ktrim=r k=23 mink=11 hdist=1 tpe tbo \
         qtrim=rl trimq=10 minlen=25
```

Alignment with STAR:
```
STAR --genomeDir /path/to/star_index \
     --readFilesIn sample_trimmed_R1.fastq.gz sample_trimmed_R2.fastq.gz \
     --readFilesCommand zcat \
     --outFileNamePrefix sample_ \
     --outSAMtype BAM SortedByCoordinate \
     --limitBAMsortRAM 10000000000 \
     --runThreadN <threads>
```

Index the sorted BAM file with Samtools:
```
samtools index -@ <threads> sample_Aligned.sortedByCoord.out.bam
```
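
STAR writes a summary of each alignment run to a file named after the output prefix followed by Log.final.out (sample_Log.final.out for the prefix used here). As a rough sanity check before continuing, the share of uniquely mapped reads can be read from that file. This is a sketch only: uniquely_mapped_pct and mapping_ok are hypothetical helper names, and the default threshold of 70 percent is an arbitrary assumption, not a recommendation from the manuscript.

```bash
# Print the percentage of uniquely mapped reads from a STAR Log.final.out file,
# without the percent sign.
uniquely_mapped_pct() {
  awk -F '|' '/Uniquely mapped reads %/ {
    gsub(/[ \t%]/, "", $2)   # strip whitespace and the percent sign
    print $2
  }' "$1"
}

# Succeed if the uniquely mapped percentage is at least the threshold
# (second argument, default 70; the default is an assumption of this sketch).
mapping_ok() {
  mo_pct=$(uniquely_mapped_pct "$1")
  awk -v p="$mo_pct" -v t="${2:-70}" 'BEGIN { exit !(p + 0 >= t + 0) }'
}

# Example:
# mapping_ok sample_Log.final.out 70 || echo "low mapping rate for sample" >&2
```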

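Infer strandedness with RSeQC. The -r option expects the gene annotation in 12-column BED format, so convert the Gencode annotation first if you only have the GTF file:
```
infer_experiment.py -r /path/to/genome.bed -i sample_Aligned.sortedByCoord.out.bam > sample_strandedness.txt
```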
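
The report written by infer_experiment.py lists the fraction of reads explained by each strand configuration, and this has to be translated into the strandedness values used in the next steps. The sketch below shows one way to do that for paired-end data. The helper name strandedness_from_rseqc and the cutoff of 0.8 are assumptions of this example; it prints the featureCounts -s value followed by the matching rMATS --libType value.

```bash
# Map a paired-end infer_experiment.py report to "<featureCounts -s> <rMATS --libType>".
# The cutoff (second argument, default 0.8) is an assumption of this sketch.
strandedness_from_rseqc() {
  awk -v cutoff="${2:-0.8}" '
    index($0, "1++,1--") { sense = $NF }       # read 1 on the gene strand
    index($0, "1+-,1-+") { antisense = $NF }   # read 1 opposite to the gene strand
    END {
      if (sense + 0 >= cutoff)          print "1 fr-secondstrand"
      else if (antisense + 0 >= cutoff) print "2 fr-firststrand"
      else                              print "0 fr-unstranded"
    }' "$1"
}

# Example:
# strandedness_from_rseqc sample_strandedness.txt
```

If neither fraction clearly dominates, inspect the report by hand rather than relying on the fallback to unstranded.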

Count features with featureCounts:
```
featureCounts -s <strandedness> \
              -p --countReadPairs \
              -t exon \
              -g gene_name \
              -T <threads> \
              -a /path/to/gencode.vM33.annotation.gtf \
              -o sample_counts.tab \
              sample_Aligned.sortedByCoord.out.bam
```
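
Because the steps up to this point are repeated for each sample, it is convenient to collect the sample names once and loop over them. The following is a minimal sketch that assumes the <sample>_R1.fastq.gz and <sample>_R2.fastq.gz naming used in the commands above; list_samples is a hypothetical helper name.

```bash
# List the sample name of every <sample>_R1.fastq.gz in a directory that has a
# matching <sample>_R2.fastq.gz. Unpaired files are reported on stderr and skipped.
list_samples() {
  for ls_r1 in "$1"/*_R1.fastq.gz; do
    [ -e "$ls_r1" ] || continue            # the glob matched nothing
    ls_name=$(basename "$ls_r1" _R1.fastq.gz)
    if [ -e "$1/${ls_name}_R2.fastq.gz" ]; then
      echo "$ls_name"
    else
      echo "skipping $ls_name: no R2 file" >&2
    fi
  done
}

# Example loop; replace the body with the per-sample commands from this section:
# list_samples /path/to/fastq | while read -r sample; do
#   fastqc -o /path/to/fastqc_output \
#     "/path/to/fastq/${sample}_R1.fastq.gz" "/path/to/fastq/${sample}_R2.fastq.gz"
# done
```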

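Perform alternative splicing analysis with rMATS. The --b1 option expects a text file that lists the BAM files of the group as one comma-separated line, and a second group (for example the knockdown samples) is passed the same way with --b2. The --libType option expects fr-unstranded, fr-firststrand, or fr-secondstrand. Because trimming produces reads of different lengths, consider adding --variable-read-length so that reads whose length differs from <read_length> are still processed:
```
rmats.py --b1 /path/to/b1.txt \
         --gtf /path/to/gencode.vM33.annotation.gtf \
         --od /path/to/rmats_output \
         --tmp /path/to/rmats_tmp \
         -t paired \
         --libType <strandedness> \
         --readLength <read_length> \
         --nthread <threads>
```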
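
rMATS reads the BAM files of each group from a text file containing a single comma-separated line, and that file is what --b1 (and --b2 for a second group) should point to. The sketch below writes such a file; write_bam_list is a hypothetical helper name, and the file names in the example are placeholders.

```bash
# Write the given BAM paths to a file as one comma-separated line,
# the format rMATS reads for --b1 and --b2.
# Usage: write_bam_list <output file> <bam> [<bam> ...]
write_bam_list() {
  wbl_out=$1
  shift
  wbl_line=""
  for wbl_bam in "$@"; do
    wbl_line="${wbl_line:+$wbl_line,}$wbl_bam"   # add a comma only between entries
  done
  printf '%s\n' "$wbl_line" > "$wbl_out"
}

# Example:
# write_bam_list b1.txt sample1_Aligned.sortedByCoord.out.bam sample2_Aligned.sortedByCoord.out.bam
# rmats.py --b1 b1.txt ...
```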

4. Adjust parameters

Ensure you adjust the following parameters according to your experimental setup:

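- <threads>: Number of threads to use for the various processes
- <strandedness>: Strandedness of the library. featureCounts takes 0 (unstranded), 1 (stranded), or 2 (reversely stranded); rMATS takes the corresponding library type fr-unstranded, fr-secondstrand, or fr-firststrand
- <read_length>: Read length of your sequencing data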

5. Prepare files for SplicePCA

Use the SE.MATS.JCEC.txt file from the rMATS output for upload to SplicePCA.
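
Before uploading, a quick look at the file can catch an empty or truncated rMATS result. The sketch below counts the events and the distinct gene symbols in a tab-separated rMATS table by locating the geneSymbol column in the header. summarize_se_file is a hypothetical helper name, and which columns SplicePCA itself requires is not covered by this check.

```bash
# Print "<number of events> <number of distinct gene symbols>" for an rMATS table.
# Returns non-zero if the header has no geneSymbol column.
summarize_se_file() {
  awk -F '\t' '
    NR == 1 {
      for (i = 1; i <= NF; i++) if ($i == "geneSymbol") col = i
      if (!col) { print "no geneSymbol column" > "/dev/stderr"; exit 1 }
      next
    }
    {
      gsub(/"/, "", $col)               # rMATS quotes gene symbols
      if (!($col in seen)) { seen[$col] = 1; genes++ }
      events++
    }
    END { if (col) print events + 0, genes + 0 }' "$1"
}

# Example:
# summarize_se_file /path/to/rmats_output/SE.MATS.JCEC.txt
```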

6. Analyze with SplicePCA

Follow the steps in the "Use SplicePCA" section to upload and analyze your processed files.

How to Cite

Weißbach, S., Milkovits, J., Pastore, S. et al. Cortexa: a comprehensive resource for studying gene expression and alternative splicing in the murine brain. BMC Bioinformatics 25, 293 (2024). https://doi.org/10.1186/s12859-024-05919-y

Please also cite the data sets that you used; they are listed at Cortexa - About the data.
