Home
Joshua edited this page Apr 18, 2020
·
65 revisions
This repository wiki is the main location for wiki updates by students in the course BCB420.
Please submit an issue on this repository with your GitHub name for access to course materials here
Student Name | GEO ID and link | Dataset Name | Notes from RI or OW |
---|---|---|---|
Example Student | GSE70072 | Apoptosis enhancing drugs overcome innate platinum resistance in CA125 negative tumor initiating populations of high grade serous ovarian cancer | |
Mit Patel | GSE108539 | Transcriptomic analyses reveal rhythmic and CLOCK-driven pathways in human skeletal muscle | Changed dataset to one with processed raw counts (Mit) |
Shiyun Tang | GSE41816 | Gene expression profiling of MDA231, BT549, and SUM159PT cells after selumetinib treatment or DUSP4 siRNA knockdown | Does this data set have processed data or only RAW? - I changed to this dataset; it has processed data at the bottom of the page - Shiyun |
Bruno Pereira | GSE31729 | Lack of effect in desensitization with intravenous immunoglobulin and rituximab in highly-sensitized patients | ✓ |
Gang Peng | GSE109161 | Comparison of transcriptional changes after CD28/CD3z and 4-1BB/CD3z chimeric antigen receptor ligation | Does this data set have processed data or only RAW? (Owen) From a quick look it looks like they have raw counts, which can be dealt with given what was covered in lecture |
Dina Issakova | GSE77938 | Comparison of transcriptional changes in human corneas with and without keratoconus, a common cause of nearsightedness | ✓ |
Yiqiu Tang | GSE107637 | RUVBL1/RUVBL2 ATPase Activity Drives PAQosome Maturation, DNA Replication and Radioresistance in Lung Cancer | ✓ |
Dong Hoon Han | GSE96578 | Transcriptional profiles of CD8+ T cells from peripheral blood of melanoma patients before and after anti-PD1 therapy | Changed to Human |
Jongmin Lim | GSE116124 | Patient-Specific iPSC-Derived Astrocytes Contribute to Non-Cell-Autonomous Neurodegeneration in Parkinson’s Disease | ✓ |
Joelle Jee | GSE121992 | Disruption of the MBD2-NuRD complex but not MBD3-NuRD induces high level HbF expression in human adult erythroid cells | ✓ |
Michael Apostolides | GSE77108 | HDAC inhibitor SAHA reverses inflammatory gene expression in diabetic endothelial cells | Good to go |
Dianna McAllister | GSE66306 | Impact of bariatric surgery on RNA-seq gene expression profiles of peripheral monocytes in humans | Good to go |
Justin Chee | GSE111972 | Transcriptional profiling of human microglia reveals grey-white matter heterogeneity and multiple sclerosis-associated changes | Good to go. |
Yuexin Yu | GSE66486 | Response of IRF7-deficient peripheral blood mononuclear cells to pH1N1 influenza virus infection | Good to go, but why not RNA-seq? |
Daniel Fusca | GSE125066 | Effect of Toxoplasma gondii effector TgIST on global transcriptome of human foreskin fibroblasts (HFFs) upon type I IFN activation | Looks like they put in several condition combinations. Looks like the IFN-B signal is strong enough to have that as your 2 class comparison, but you’ll have to take note of how the other conditions affect your results. Good to go, but feel free to choose a somewhat simpler dataset if you’d like |
Yining Ding | GSE84054 | Transcriptome profiling of ER+ breast cancer primary tumor and its tumorsphere derivative | ✓ |
Luke Zhang | GSE110021 | RNA-Seq analysis of genes and pathways involved in the TGF-β-driven transformation of fibroblasts to myofibroblasts | Good to go. Looks good, but why not use RNA-seq? Luke: Updated – switched to an RNAseq experiment. |
Jiayan Wang | GSE113964 | Sequencing based maternal whole blood expression changes with gestational age and labor in normal pregnancy | Check to make sure there are at least two conditions that can be compared in this dataset. They have multiple platforms that they test on but it is unclear if they have any different states. |
Priyanka Narasimhan | GSE136864 | Cellular response to protein-conjugated nanoparticles | There is no raw counts data associated with this dataset that I can see. Pri: Just switched over to another dataset, is this one okay? |
Alison Wu | GSE87517 | Gene expression profiles of leukocytes in normal breast tissues, DCIS, and HER2+ and IDC during breast tumor progression | ✓ |
Sotaro Hirai | GSE125150 | RNA-seq of human iPS derived macrophages with or without KLF1- transcription factor Activation | ✓ |
Minh An Ho | GSE113165 | Using RNA sequencing to examine age-dependent skeletal muscle transcriptome response to bed rest-induced atrophy, and age independent disuse-induced insulin resistance | ✓ |
Yi Fei Huang | GSE120200 | Gene expression profiling of neural crest progenitor cultures derived from human embryonic stem cells carrying nonsense mutations in the Polycomb gene ASXL1 [HOM] | |
Dae-Won Gong | GSE106169 | Polyol pathway links glucose metabolism to the aggressiveness of cancer cells | |
Yuexin Yu | GSE64744 | Small RNA profiling reveals deregulated PTEN/PI3K/Akt pathway in asthmatic bronchial smooth muscle cells | |
Joshua Efe | GSE72055 | Human telomerase RNA processing and quality control | |
Darren Chan | GSE120891 | Differential expression of genes in fibroblasts and epithelial cells infected with dsDNA viruses | |
Dimitrije Ratkov | GSE141220 | Nascent transcriptomics reveal cellular pro-lytic factors upregulated upstream of the latency-to-lytic switch protein of Epstein-Barr virus | |
Jiayan Wang | GSE113493 | Global transcriptional profiling changes upon knockdown of G9a in human non-small cell lung cancer cells | |
Fanxing Bu | GSE111958 | Transcriptional profile of human STAT1-/- fibroblasts expressing LY6E or empty control vector | |
Haoan Wang | GSE125664 | Serotonin-induced hyperactivity in SSRI-resistant major depressive disorder patient-derived neurons | |
Arshia Mahmoodi | GSE135511 | Gene expression profiling of multiple sclerosis brain samples | |
Bihan Zhu | GSE139242 | Transcriptome profiling of human thymic and peripheral blood CD4+ and CD8+ T cells, using RNA-seq | |
Emily Ayala | GSE81475 | Zika Virus Disrupts Phospho-TBK1 Localization and Mitosis in Human Neural Stem Cell Model Systems | |
Yuhan Hu | GSE114260 | ERα-mediated cell cycle progression is an important requisite for CDK4/6 inhibitor response in HR+ breast cancer | |
- Find an annotation data set for human genes (excluding GO and Reactome, which I have outlined below as an example): any data set that adds functional, process, location, disease status ... to a set of genes.
- Record this annotation source in your journal and add it to the list of annotation sources below.
- Find out the following information:
  - What sort of data is it? What sort of information does it offer us?
  - When and where was it published? Was it published?
  - Is this annotation set updated regularly or is it a static source?
  - Where can I find this data? (link to the download web address or ftp site or publication where it can be found)
  - How is the data formatted and released? Does it exist in some sort of standard file format?
  - What identifiers are associated with these annotations?
Annotation Resource | Student | |
---|---|---|
Gene Ontology(GO) | Example Student | |
GENCODE | Yi Fei Huang | |
Ensembl | Alison Wu | |
Kegg | Yining Ding | |
GSEA and MSigDB | Michael Apostolides | |
UCSC | Dianna McAllister | |
OrthoDB | Daniel Fusca | |
OMIM | Dina Issakova | |
refTSS | Gang Peng | |
BioDataome | Jongmin Lim | |
UniProt | Yuexin Yu | |
HCSGD | Yuhan Hu | |
CTD | Minh An Ho | |
TCGA | Fanxing Bu | |
RefEx | Dong Hoon Han | |
RCSB_PDB | Haoan Wang | |
SIGNOR | Sotaro Hirai | |
Segway | Mit Patel | |
Allen Brain Institute | Justin Chee | |
NucMap | Darren Chan | |
HumanProteinAtlas | Emily Ayala | |
COSMIC | Luke Zhang | |
The Chromosome 7 Annotation Project | Arshia Mahmoodi | |
MalaCards | Jiayan Wang | |
CCDS | Bihan Zhu | |
KEGG | Shiyun Tang | |
GeneCards | Priyanka Narasimhan | |
KAAS (KEGG Automatic Annotation Server) | Joshua Efe | |
Use this list of genes: genelist.txt as your query set and run a g:Profiler enrichment analysis with the following parameters:
- Data sources: Reactome, GO biological process, and WikiPathways
- Multiple hypothesis testing: Benjamini-Hochberg
- What is the top term returned in each data source?
- How many genes are in each of the above genesets returned? (Hint: in the Detailed results tab of the g:Profiler results, if you click on the arrows next to the stats heading you will be able to see the number of genes in a term, the number of genes in your query, and the number of genes in your query that are also in your term.)
- How many genes from our query are found in the above genesets?
- Change the g:Profiler settings so that you limit the size of the returned genesets. Make sure the returned genesets are between 5 and 200 genes in size. Did that change the results?
- Which of the 4 ovarian cancer expression subtypes do you think this list represents?
- Bonus: The top gene returned for this comparison is TFEC (Ensembl gene ID: ENSG00000105967). Is it found annotated in any of the pathways returned by g:Profiler for our query? What terms is it associated with in g:Profiler?
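To make the three numbers in the hint concrete (term size, query size, and their overlap), here is a minimal Python sketch of an over-representation test with Benjamini-Hochberg correction and a geneset size filter. It is an illustration under stated assumptions, not g:Profiler's implementation: the function names (`hypergeom_pvalue`, `benjamini_hochberg`, `enrich`), the toy gene identifiers, and the background size of 50 are all made up for this example, and g:Profiler's own statistics and background handling may differ.

```python
from math import comb


def hypergeom_pvalue(term_size, query_size, overlap, background_size):
    """One-sided hypergeometric tail: P(X >= overlap) when query_size genes
    are drawn from a background in which term_size genes belong to the term."""
    upper = min(term_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(query, genesets, background_size, min_size=5, max_size=200):
    """Test every geneset whose size is within [min_size, max_size].
    Rows are (name, term_size, query_size, overlap, p, adjusted_p)."""
    query = set(query)
    rows = []
    for name, genes in genesets.items():
        genes = set(genes)
        if not (min_size <= len(genes) <= max_size):
            continue  # size filter, as in the 5 to 200 exercise above
        overlap = len(query & genes)
        p = hypergeom_pvalue(len(genes), len(query), overlap, background_size)
        rows.append((name, len(genes), len(query), overlap, p))
    adjusted = benjamini_hochberg([row[4] for row in rows])
    result = [row + (q,) for row, q in zip(rows, adjusted)]
    return sorted(result, key=lambda row: row[5])


# Toy data, invented purely for illustration.
toy_sets = {
    "pathway_A": ["g1", "g2", "g3", "g4", "g5"],
    "pathway_B": ["g6", "g7", "g8", "g9", "g10", "g11"],
    "tiny_set": ["g1", "g2"],  # dropped by the size filter
}
for row in enrich(["g1", "g2", "g3", "g6"], toy_sets, background_size=50):
    print(row)
```

Because the size filter changes how many genesets are tested, it also changes the multiple-testing correction, which is one reason the results can shift when the 5 to 200 limit is applied.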
Practise using GSEA
Given the ranked list comparing mesenchymal and immunoreactive ovarian cancer subtypes (mesenchymal genes have positive scores, immunoreactive genes have negative scores), perform a GSEA preranked analysis using the following parameters:
- mesenchymal vs immuno rank file
- genesets from the baderlab geneset collection from February 1, 2020 containing GO biological process, no IEA and pathways.
- maximum geneset size of 200
- minimum geneset size of 15
- gene set permutation
- Explain the reasons for using each of the above parameters.
- What is the top gene set returned for the Mesenchymal subtype? What is the top gene set returned for the Immunoreactive subtype? For each of these gene sets, answer the questions below:
  - What are its p-value, ES, NES and FDR?
  - How many genes are in its leading edge?
  - What is the top gene associated with this gene set?
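As a rough aid for interpreting ES and the leading edge, the following Python sketch computes a weighted running-sum enrichment score over a preranked list. It is a simplified illustration, not the GSEA software itself: the function name `enrichment_score`, the `weight` parameter default, and the toy genes and scores are assumptions for this example, and the permutation step that produces NES, p-values and FDR is not shown.

```python
def enrichment_score(ranked, geneset, weight=1.0):
    """Running-sum enrichment score over a preranked list.

    ranked  : list of (gene, score), sorted from most positive to most negative
    geneset : iterable of gene names
    Returns (ES, leading_edge_genes).
    """
    geneset = set(geneset)
    # Each hit contributes |score| ** weight; each miss contributes a fixed step.
    hit_weights = [
        abs(score) ** weight if gene in geneset else 0.0 for gene, score in ranked
    ]
    n_hits = sum(1 for gene, _ in ranked if gene in geneset)
    n_miss = len(ranked) - n_hits
    total_hit_weight = sum(hit_weights)
    if n_hits == 0 or n_miss == 0 or total_hit_weight == 0:
        raise ValueError("gene set must partly overlap the ranked list")

    running, best, best_idx = 0.0, 0.0, -1
    for i, (gene, _) in enumerate(ranked):
        if gene in geneset:
            running += hit_weights[i] / total_hit_weight
        else:
            running -= 1.0 / n_miss
        # ES is the running-sum value with the largest deviation from zero.
        if abs(running) > abs(best):
            best, best_idx = running, i

    if best >= 0:
        # Positive ES: set members ranked at or before the peak.
        leading = [g for g, _ in ranked[: best_idx + 1] if g in geneset]
    else:
        # Negative ES: set members ranked after the trough.
        leading = [g for g, _ in ranked[best_idx:] if g in geneset]
    return best, leading


# Toy ranked list, invented for illustration: positive scores stand in for
# mesenchymal genes, negative scores for immunoreactive genes.
toy_ranked = [("A", 3.0), ("B", 2.0), ("C", 1.0), ("D", -1.0), ("E", -2.0), ("F", -3.0)]
print(enrichment_score(toy_ranked, {"A", "B"}))
print(enrichment_score(toy_ranked, {"E", "F"}))
```

In this sketch a gene set concentrated at the top of the list yields a positive ES and one concentrated at the bottom yields a negative ES, which mirrors how the mesenchymal and immunoreactive results are separated in the exercise.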