In some cases you may want to use your own custom control database instead of our GTEx control. The recommended workflow is as follows:
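First, put both your tumor bam files and your control bam files into a single /bam folder and run AltAnalyze, so that you get a combined counts.original.pruned.txt count matrix covering both your tumor and control samples.
Next, run the differential alternative splicing (DAS) mode of AltAnalyze, following this section of the tutorial: https://snaf.readthedocs.io/en/latest/tutorial.html#differential-gene-splicing-analysis-and-gene-enrichment-analysis. Remember that groups.txt and comps.txt need to sit at the same level as the altanalyze_output folder, and it is advisable to be in that folder when you run the singularity command.
After that run finishes, open altanalyze_output/AlternativeOutput/Events-dPSI_0.0_rawp/PSI.tumor_vs_control.txt and select the events that are specific to your tumor samples and absent from your own control. The ave-tumor, ave-control, dPSI and adjp columns let you customize the selection criteria. ave-tumor and ave-control are Percent Spliced In (PSI) values ranging from 0 to 1, where 0 means not present and 1 means fully present; dPSI is the difference between the two group averages, on the same PSI scale. adjp is the adjusted p-value (the significance level).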
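The selection step above can be sketched with the standard library alone. This is only an illustration: the toy rows, the helper name select_tumor_specific and the default thresholds are assumptions, and the real PSI file may contain more columns than the four named above.

```python
import csv
import io

# Toy stand-in for PSI.tumor_vs_control.txt (tab-delimited). The rows and
# values are made up; only the column names come from the text above.
toy_psi = (
    "UID\tave-tumor\tave-control\tdPSI\tadjp\n"
    "GENEA:ENSG01:E2.1-E4.1|ENSG01:E2.1-E3.1\t0.80\t0.00\t0.80\t0.001\n"
    "GENEB:ENSG02:E5.1-E7.1|ENSG02:E5.1-E6.1\t0.60\t0.50\t0.10\t0.300\n"
    "GENEC:ENSG03:E1.1-E3.1|ENSG03:E1.1-E2.1\t0.40\t0.01\t0.39\t0.020\n"
)

def select_tumor_specific(handle, min_tumor=0.3, max_control=0.05, max_adjp=0.05):
    """Return UIDs that are present in tumor, near-absent in control, and significant.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    selected = []
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            tumor = float(row["ave-tumor"])
            control = float(row["ave-control"])
            adjp = float(row["adjp"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values
        if tumor >= min_tumor and control <= max_control and adjp <= max_adjp:
            selected.append(row["UID"])
    return selected

uids = select_tumor_specific(io.StringIO(toy_psi))
print(uids)
```

To run it on real output, pass an open file handle for PSI.tumor_vs_control.txt instead of the in-memory toy table, and pick thresholds that suit your cohort.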
Once you have a list of events from the UID column, you only need the first junction of each UID, for example:
CASK:ENSG00000147044:E21.1-E23.1|ENSG00000147044:E21.1-E22.1
gene symbol: CASK
junction: ENSG00000147044:E21.1-E23.1 (what you need)
background junction: ENSG00000147044:E21.1-E22.1 (the competing junction, you don't need)
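Splitting a UID like the one above can be done with two string splits. A minimal sketch, assuming every UID follows the SYMBOL:JUNCTION|BACKGROUND layout of the CASK example; split_uid is a hypothetical helper name, not part of SNAF or AltAnalyze.

```python
def split_uid(uid):
    """Split 'SYMBOL:JUNCTION|BACKGROUND' into (symbol, junction, background)."""
    foreground, background = uid.split("|", 1)
    # Split only on the first colon: the junction itself contains one.
    symbol, junction = foreground.split(":", 1)
    return symbol, junction, background

symbol, junction, background = split_uid(
    "CASK:ENSG00000147044:E21.1-E23.1|ENSG00000147044:E21.1-E22.1"
)
print(junction)  # the part to keep for subsetting the count matrix
```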
Use these junctions to subset the rows of your combined counts.original.pruned.txt dataframe, then keep only the tumor columns. The result is the final df for the SNAF run.
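The row and column subsetting might look like the dependency-free sketch below. The toy matrix, sample names and the helper subset_counts are all made up for illustration, and the code assumes the row IDs in the count matrix match the selected junction IDs exactly.

```python
import csv
import io

# Toy count matrix (tab-delimited); junction IDs, sample names and counts are made up.
toy_counts = (
    "\ttumor1.bed\ttumor2.bed\tcontrol1.bed\n"
    "ENSG01:E2.1-E4.1\t25\t31\t0\n"
    "ENSG02:E5.1-E6.1\t12\t9\t14\n"
    "ENSG03:E1.1-E3.1\t40\t0\t1\n"
)

def subset_counts(handle, junctions, tumor_samples):
    """Keep only the selected junction rows and the tumor sample columns."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    keep_cols = [i for i, name in enumerate(header) if name in tumor_samples]
    wanted = set(junctions)
    rows = {}
    for record in reader:
        if record and record[0] in wanted:
            rows[record[0]] = [int(record[i]) for i in keep_cols]
    return [header[i] for i in keep_cols], rows

columns, rows = subset_counts(
    io.StringIO(toy_counts),
    junctions=["ENSG01:E2.1-E4.1", "ENSG03:E1.1-E3.1"],
    tumor_samples={"tumor1.bed", "tumor2.bed"},
)
print(columns, rows)
```

If the matrix is already loaded as a pandas dataframe indexed by junction, the equivalent one-liner is df.loc[junctions, tumor_samples] with both arguments given as lists.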
Finally, how do you turn off the GTEx filter? Set n_max to a very large number such as 10000 and set t_min=-10000, which effectively retains all of your junctions. As a test, the log from a run with these settings shows the NeoJunction count unchanged:
2024-03-26 20:55:28 starting initialization
Current loaded gtex cohort with shape (12827, 2629)
2024-03-26 20:55:50 finishing initialization
reduce valid NeoJunction from 12927 to 12927 because they are present in GTEx
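To see why extreme thresholds disable a filter of this kind, here is a toy model. It is an assumed simplification (keep a junction when its tumor count exceeds t_min and its mean normal count stays below n_max), not SNAF's actual implementation; the junction names, counts and the strict values 20 and 3 are arbitrary illustration values.

```python
def passes_filter(tumor_count, normal_mean, t_min, n_max):
    """Toy threshold rule, assumed for illustration only."""
    return tumor_count > t_min and normal_mean < n_max

# Made-up junctions: name -> (tumor count, mean count in normal controls)
junctions = {"j1": (30, 0.5), "j2": (25, 40.0), "j3": (2, 0.0)}

def surviving(t_min, n_max):
    """Names of junctions that pass the toy filter at the given thresholds."""
    return [name for name, (t, n) in junctions.items()
            if passes_filter(t, n, t_min, n_max)]

strict = surviving(t_min=20, n_max=3)            # only j1 passes
disabled = surviving(t_min=-10000, n_max=10000)  # every junction passes
print(strict, disabled)
```

With t_min far below any possible count and n_max far above it, both comparisons are always true, which matches the unchanged junction count in the log above.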
Hope this helps, and feel free to reach out if you need further clarification,
Frank