Skip to content

Commit

Permalink
Updated vignette to use limma-trend in differential pathway analysis
Browse files Browse the repository at this point in the history
  • Loading branch information
rcastelo committed Apr 2, 2024
1 parent 929329c commit 27d70c0
Show file tree
Hide file tree
Showing 3 changed files with 36 additions and 3 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
Package: GSVA
Version: 1.51.7
Version: 1.51.8
Title: Gene Set Variation Analysis for Microarray and RNA-Seq Data
Authors@R: c(person("Robert", "Castelo", role=c("aut", "cre"), email="[email protected]"),
person("Justin", "Guinney", role="aut", email="[email protected]"),
Expand Down
25 changes: 24 additions & 1 deletion vignettes/GSVA.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -632,6 +632,29 @@ res <- decideTests(fit, p.value=0.01)
summary(res)
```

As Figure \@ref(fig:setsizebysigma) below shows, GSVA scores have higher
precision for larger gene sets^[Thanks to Gordon Smyth for point this out to us.].

```{r setsizebysigma, height=700, width=500, fig.cap="Residual standard deviation of GSVA scores as function of gene set size."}
gssizes <- lengths(geneIds(cgpC2BroadSets)) ## fetch gene set sizes
mt <- match(rownames(leukemia_es), names(gssizes))
gssizes <- gssizes[mt] ## rearrange gene set sizes to the rows of GSVA scores
plot(sqrt(gssizes), sqrt(fit$sigma), xlab="Sqrt(gene sets sizes)",
ylab="Sqrt(standard deviation)", las=1, pch=".", cex=3)
```

In such a setting, we can improve the analysis of differentially expressed
pathways by using the limma-trend approach [@phipson2016robust] setting the
`trend` parameter in the call to the `eBayes()` function to the vector of gene
set sizes, in one-to-one correspondence with the gene sets in the rows of the
GSVA scores container, as calculated before in `gssizes`.

```{r}
fit <- eBayes(fit, trend=gssizes)
res <- decideTests(fit, p.value=0.01)
summary(res)
```

There are `r sum(res[, 2] != 0)` MSigDB C2 differentially expressed pathways
with FDR < 5%. Figure \@ref(fig:leukemiavolcano) below shows a volcano plot
of the expression changes.
Expand All @@ -640,7 +663,7 @@ of the expression changes.
tt <- topTable(fit, coef=2, n=Inf)
DEpwys <- rownames(tt)[tt$adj.P.Val <= 0.01]
plot(tt$logFC, -log10(tt$P.Value), pch=".", cex=4, col=grey(0.75),
main="", xlab="GSVA enrichment score difference",
main="", xlab="GSVA enrichment score difference", las=1,
ylab=expression(-log[10]~~Raw~P-value))
abline(h=-log10(max(tt$P.Value[tt$adj.P.Val <= 0.01])),
col=grey(0.5), lwd=1, lty=2)
Expand Down
12 changes: 11 additions & 1 deletion vignettes/GSVA.bib
Original file line number Diff line number Diff line change
Expand Up @@ -447,5 +447,15 @@ @article{tomfohr_pathway_2005
{PMCID:} {PMC1261155}},
pages = {225},
file = {PubMed Central Full Text PDF:/home/sonja/.mozilla/firefox/m25utlhy.default/zotero/storage/CQQRZ2UB/Tomfohr et al. - 2005 - Pathway level analysis of gene expression using si.pdf:application/pdf}
},
}

@article{phipson2016robust,
title={Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression},
author={Phipson, Belinda and Lee, Stanley and Majewski, Ian J and Alexander, Warren S and Smyth, Gordon K},
journal={The annals of applied statistics},
volume={10},
number={2},
pages={946},
year={2016},
doi={10.1214/16-AOAS920}
}

0 comments on commit 27d70c0

Please sign in to comment.