Joshua edited this page Apr 18, 2020 · 65 revisions

Student Wiki

This repository wiki is the main location for wiki updates by students in the course BCB420.

To get access to the course materials, please submit an issue on this repository that includes your GitHub username.

Student Pages

Student Name | Link to Wiki/Journal | Link to Repo
Example Student | https://github.com/bcb420-2020/ExampleStudent/wiki | https://github.com/bcb420-2020/ExampleStudent
Dina Issakova | https://github.com/bcb420-2020/student_dinaIssakova/wiki | https://github.com/bcb420-2020/student_dinaIssakova
Mit Patel | https://github.com/bcb420-2020/student_mitp19/wiki | https://github.com/bcb420-2020/student_mitp19
Dianna McAllister | https://github.com/bcb420-2020/student_diannamcallister/wiki | https://github.com/bcb420-2020/student_diannamcallister
Shiyun Tang | https://github.com/bcb420-2020/student_sealigu/wiki | https://github.com/bcb420-2020/student_sealigu
Yining Ding | https://github.com/bcb420-2020/student_helen307/wiki | https://github.com/bcb420-2020/student_helen307
Sotaro Hirai | https://github.com/bcb420-2020/student_sotaro0214/wiki | https://github.com/bcb420-2020/student_sotaro0214
Bihan Zhu | https://github.com/bcb420-2020/student_Norisama/wiki | https://github.com/bcb420-2020/student_Norisama
Dong Hoon Han | https://github.com/bcb420-2020/student_Crystalizeee-/wiki | https://github.com/bcb420-2020/student_Crystalizeee-
Jongmin Lim | https://github.com/bcb420-2020/student_jmlim2/wiki | https://github.com/bcb420-2020/student_jmlim2
Minh An Ho | https://github.com/bcb420-2020/student_minhanho/wiki | https://github.com/bcb420-2020/student_minhanho
Michael Apostolides | https://github.com/bcb420-2020/student_apostonaut/wiki | https://github.com/bcb420-2020/student_apostonaut
Chris Fernandez | https://github.com/bcb420-2020/student_c-fern/wiki | https://github.com/bcb420-2020/student_c-fern
Gang Peng | https://github.com/bcb420-2020/student_Kevinpen/wiki | https://github.com/bcb420-2020/student_Kevinpen
Jiayan Wang | https://github.com/bcb420-2020/student_dxjasmine/wiki | https://github.com/bcb420-2020/student_dxjasmine
Darren Chan | https://github.com/bcb420-2020/student_DC123456789/wiki | https://github.com/bcb420-2020/student_DC123456789
Luka Trkla | https://github.com/bcb420-2020/student_lukatrkla/wiki | https://github.com/bcb420-2020/student_lukatrkla
Fanxing Bu | https://github.com/bcb420-2020/student_LoadingBFX/wiki | https://github.com/bcb420-2020/student_LoadingBFX
Daniel Fusca | https://github.com/bcb420-2020/student_fuscada2/wiki | https://github.com/bcb420-2020/student_fuscada2
Yuexin Yu | https://github.com/bcb420-2020/student_MichelleMengzhi/wiki | https://github.com/bcb420-2020/student_MichelleMengzhi
Yi Fei Huang | https://github.com/bcb420-2020/student_hyf97ca/wiki | https://github.com/bcb420-2020/student_hyf97ca
Justin Chee | https://github.com/bcb420-2020/student_cheejus2/wiki | https://github.com/bcb420-2020/student_cheejus2
Dae-Won (Sean) Gong | https://github.com/bcb420-2020/student_sgong101/wiki | https://github.com/bcb420-2020/student_sgong101
Joshua Efe | https://github.com/bcb420-2020/student_joshua/wiki | https://github.com/bcb420-2020/student_joshua
Bruno Pereira | https://github.com/bcb420-2020/student_brucosper/wiki | https://github.com/bcb420-2020/student_brucosper
Luke Zhang | https://github.com/bcb420-2020/student_LZhang98/wiki | https://github.com/bcb420-2020/student_LZhang98
Haoan Wang | https://github.com/bcb420-2020/student_wangtiananyiyi/wiki | https://github.com/bcb420-2020/student_wangtiananyiyi
Joelle Jee | https://github.com/bcb420-2020/student_JoelleJee/wiki | https://github.com/bcb420-2020/student_JoelleJee
Yuhan Hu | https://github.com/bcb420-2020/student_HU-YH/wiki | https://github.com/bcb420-2020/student_HU-YH
Arshia Mahmoodi | https://github.com/bcb420-2020/student_ArshiaMahmoodi/wiki | https://github.com/bcb420-2020/student_ArshiaMahmoodi
Priyanka Narasimhan | https://github.com/bcb420-2020/student_narasi15/wiki | https://github.com/bcb420-2020/student_narasi15
Emily Ayala | https://github.com/bcb420-2020/student_ayalaemmylou/wiki | https://github.com/bcb420-2020/student_ayalaemmylou
Alison Wu | https://github.com/bcb420-2020/student_alisonwu19/wiki | https://github.com/bcb420-2020/student_alisonwu19
Yiqiu Tang | https://github.com/bcb420-2020/student_yiqiutang-/wiki | https://github.com/bcb420-2020/student_yiqiutang-
Dimitrije Ratkov | https://github.com/bcb420-2020/student_ratkovdi/wiki | https://github.com/bcb420-2020/student_ratkovdi

Assignment #1 - Sign up page

Student Name | GEO ID and link | Dataset Name | Notes from RI or OW
Example Student | GSE70072 | Apoptosis enhancing drugs overcome innate platinum resistance in CA125 negative tumor initiating populations of high grade serous ovarian cancer |
Mit Patel | GSE108539 | Transcriptomic analyses reveal rhythmic and CLOCK-driven pathways in human skeletal muscle | Changed dataset to one with processed raw counts (Mit)
Shiyun Tang | GSE41816 | Gene expression profiling of MDA231, BT549, and SUM159PT cells after selumetinib treatment or DUSP4 siRNA knockdown | Does this data set have processed data or only RAW? - I changed to this dataset; it has processed data at the bottom of the page - Shiyun
Bruno Pereira | GSE31729 | Lack of effect in desensitization with intravenous immunoglobulin and rituximab in highly-sensitized patients |
Gang Peng | GSE109161 | Comparison of transcriptional changes after CD28/CD3z and 4-1BB/CD3z chimeric antigen receptor ligation | Does this data set have processed data or only RAW? (Owen) From a quick look, it looks like they have raw counts, which can be dealt with given what was covered in lecture
Dina Issakova | GSE77938 | Comparison of transcriptional changes in human corneas with and without keratoconus, a common cause of nearsightedness |
Yiqiu Tang | GSE107637 | RUVBL1/RUVBL2 ATPase Activity Drives PAQosome Maturation, DNA Replication and Radioresistance in Lung Cancer |
Dong Hoon Han | GSE96578 | Transcriptional profiles of CD8+ T cells from peripheral blood of melanoma patients before and after anti-PD1 therapy | Changed to Human
Jongmin Lim | GSE116124 | Patient-Specific iPSC-Derived Astrocytes Contribute to Non-Cell-Autonomous Neurodegeneration in Parkinson’s Disease |
Joelle Jee | GSE121992 | Disruption of the MBD2-NuRD complex but not MBD3-NuRD induces high level HbF expression in human adult erythroid cells |
Michael Apostolides | GSE77108 | HDAC inhibitor SAHA reverses inflammatory gene expression in diabetic endothelial cells | Good to go
Dianna McAllister | GSE66306 | Impact of bariatric surgery on RNA-seq gene expression profiles of peripheral monocytes in humans | Good to go
Justin Chee | GSE111972 | Transcriptional profiling of human microglia reveals grey-white matter heterogeneity and multiple sclerosis-associated changes | Good to go.
Yuexin Yu | GSE66486 | Response of IRF7-deficient peripheral blood mononuclear cells to pH1N1 influenza virus infection | Good to go, but why not RNA-seq?
Daniel Fusca | GSE125066 | Effect of Toxoplasma gondii efector TgIST on global transcriptome of human foreskin fibroblasts (HFFs) upon type I IFN activation | Looks like they put in several condition combinations. Looks like the IFN-B signal is strong enough to have that as your 2 class comparison, but you’ll have to take note of how the other conditions affect your results. Good to go, but feel free to choose a somewhat simpler dataset if you’d like
Yining Ding | GSE84054 | Transcriptome profiling of ER+ breast cancer primary tumor and its tumorsphere derivative |
Luke Zhang | GSE110021 | RNA-Seq analysis of genes and pathways involved in the TGF-β-driven transformation of fibroblasts to myofibroblasts | Good to go. Looks good, but why not use RNA-seq? Luke: Updated - switched to an RNAseq experiment.
Jiayan Wang | GSE113964 | Sequencing based maternal whole blood expression changes with gestational age and labor in normal pregnancy | Check to make sure there are at least two conditions that can be compared in this dataset. They have multiple platforms that they test on but it is unclear if they have any different states.
Priyanka Narasimhan | GSE136864 | Cellular response to protein-conjugated nanoparticles | There is no raw count data associated with this dataset that I can see. Pri: Just switched over to another dataset, is this one okay?
Alison Wu | GSE87517 | Gene expression profiles of leukocytes in normal breast tissues, DCIS, and HER2+ and IDC during breast tumor progression |
Sotaro Hirai | GSE125150 | RNA-seq of human iPS derived macrophages with or without KLF1- transcription factor Activation |
Minh An Ho | GSE113165 | Using RNA sequencing to examine age-dependent skeletal muscle transcriptome response to bed rest-induced atrophy, and age independent disuse-induced insulin resistance |
Yi Fei Huang | GSE120200 | Gene expression profiling of neural crest progenitor cultures derived from human embryonic stem cells carrying nonsense mutations in the Polycomb gene ASXL1 [HOM] |
Dae-Won Gong | GSE106169 | Polyol pathway links glucose metabolism to the aggressiveness of cancer cells |
Yuexin Yu | GSE64744 | Small RNA profiling reveals deregulated PTEN/PI3K/Akt pathway in asthmatic bronchial smooth muscle cells |
Joshua Efe | GSE72055 | Human telomerase RNA processing and quality control |
Darren Chan | GSE120891 | Differential expression of genes in fibroblasts and epithelial cells infected with dsDNA viruses |
Dimitrije Ratkov | GSE141220 | Nascent transcriptomics reveal cellular pro-lytic factors upregulated upstream of the latency-to-lytic switch protein of Epstein-Barr virus |
Jiayan Wang | GSE113493 | Global transcriptional profiling changes upon knockdown of G9a in human non-small cell lung cancer cells |
Fanxing Bu | GSE111958 | Transcriptional profile of human STAT1-/- fibroblasts expressing LY6E or empty control vector |
Haoan Wang | GSE125664 | Serotonin-induced hyperactivity in SSRI-resistant major depressive disorder patient-derived neurons |
Arshia Mahmoodi | GSE135511 | Gene expression profiling of multiple sclerosis brain samples |
Bihan Zhu | GSE139242 | Transcriptome profiling of human thymic and peripheral blood CD4 + and CD8+ T cells, using RNA-seq |
Emily Ayala | GSE81475 | Zika Virus Disrupts Phospho-TBK1 Localization and Mitosis in Human Neural Stem Cell Model Systems |
Yuhan Hu | GSE114260 | ERα-mediated cell cycle progression is an important requisite for CDK4/6 inhibitor response in HR+ breast cancer |

Homework Assignment - Due February 11, 2020

  1. Find an annotation data set for human genes (excluding GO and Reactome, which I have outlined below as an example) - any data set that adds functional, process, location, disease status ... to a set of genes.
  2. Record this annotation source in your journal and add it to the list of annotation sources below.
  3. Find out the following information:
    • What sort of data is it? What sort of information does it offer us?
    • When and where was it published? Was it published?
    • Is this annotation set updated regularly or is it a static source?
    • Where can I find this data? (link to the download web address or ftp site or publication where it can be found)
    • How is the data formatted and released? Does it exist in some sort of standard file format?
    • What identifiers are associated with these annotations?

Annotation Resources

Annotation Resource | Student
Gene Ontology (GO) | Example Student
GENCODE | Yi Fei Huang
Ensembl | Alison Wu
KEGG | Yining Ding
GSEA and MSigDB | Michael Apostolides
UCSC | Dianna McAllister
OrthoDB | Daniel Fusca
OMIM | Dina Issakova
refTSS | Gang Peng
BioDataome | Jongmin Lim
UniProt | Yuexin Yu
HCSGD | Yuhan Hu
CTD | Minh An Ho
TCGA | Fanxing Bu
RefEx | Dong Hoon Han
RCSB_PDB | Haoan Wang
SIGNOR | Sotaro Hirai
Segway | Mit Patel
Allen Brain Institute | Justin Chee
NucMap | Darren Chan
HumanProteinAtlas | Emily Ayala
COSMIC | Luke Zhang
The Chromosome 7 Annotation Project | Arshia Mahmoodi
MalaCards | Jiayan Wang
CCDS | Bihan Zhu
KEGG | Shiyun Tang
GeneCards | Priyanka Narasimhan
KAAS (KEGG Automatic Annotation Server) | Joshua Efe

Homework Assignment - Due February 25, 2020

Use this list of genes (genelist.txt) as your query set and run a g:Profiler enrichment analysis with the following parameters:

  • Data sources: Reactome, GO biological process, and WikiPathways
  • Multiple hypothesis testing: Benjamini-Hochberg

Answer the questions below:
  1. What is the top term returned in each data source?
  2. How many genes are in each of the above gene sets returned? (Hint: in the Detailed Results tab of the g:Profiler results, if you click on the arrows next to the stats heading you will be able to see the number of genes in a term, the number of genes in your query, and the number of genes in your query that are also in your term.)
  3. How many genes from our query are found in the above gene sets?
  4. Change the g:Profiler settings so that you limit the size of the returned gene sets. Make sure the returned gene sets are between 5 and 200 genes in size. Did that change the results?
  5. Which of the 4 ovarian cancer expression subtypes do you think this list represents?
  6. Bonus: The top gene returned for this comparison is TFEC (Ensembl gene ID: ENSG00000105967). Is it found annotated in any of the pathways returned by g:Profiler for our query? What terms is it associated with in g:Profiler?
Don't forget to update your journal with your results.
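
To build intuition for the numbers in questions 2 to 4, here is a minimal sketch of an over-representation test followed by a Benjamini-Hochberg correction. It is only an illustration and may differ from what g:Profiler does internally. The term names, counts, query size, and background size below are all made up, and the helper functions `hypergeom_pvalue` and `benjamini_hochberg` are hypothetical names written for this example.

```python
from math import comb


def hypergeom_pvalue(overlap, term_size, query_size, background_size):
    """P(X >= overlap) when query_size genes are drawn from a background
    of background_size genes, term_size of which belong to the term."""
    upper = min(term_size, query_size)
    total = comb(background_size, query_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up terms: name -> (number of genes in term, genes shared with query)
terms = {"term_A": (50, 10), "term_B": (150, 6), "term_C": (1200, 30), "term_D": (3, 2)}
QUERY_SIZE = 100          # assumed number of genes in the query
BACKGROUND_SIZE = 20000   # assumed number of annotated background genes

# Keep only terms with 5 to 200 genes, as in question 4.
kept = {name: sizes for name, sizes in terms.items() if 5 <= sizes[0] <= 200}
names = list(kept)
pvals = [
    hypergeom_pvalue(kept[n][1], kept[n][0], QUERY_SIZE, BACKGROUND_SIZE)
    for n in names
]
for name, p, q in zip(names, pvals, benjamini_hochberg(pvals)):
    print(f"{name}: p={p:.3g}, BH-adjusted={q:.3g}")
```

Note that the size filter changes how many terms are tested, so the adjusted p-values of the remaining terms can change even though their raw p-values do not.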

Homework Assignment - Due March 10, 2020

Practise using GSEA

Given the ranked list comparing mesenchymal and immunoreactive ovarian cancer subtypes (mesenchymal genes have positive scores, immunoreactive genes have negative scores), perform a GSEA preranked analysis using the following parameters:

and answer the following questions in your journal:
  1. Explain the reasons for using each of the above parameters.
  2. What is the top gene set returned for the Mesenchymal subtype? What is the top gene set returned for the Immunoreactive subtype? For each of the gene sets, answer the questions below:
    1. What are its p-value, ES, NES, and FDR?
    2. How many genes are in its leading edge?
    3. What is the top gene associated with this gene set?
Don't forget to update your journal with your results.
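
As a rough aid for interpreting ES and the leading edge, the sketch below computes a weighted running-sum enrichment score over a preranked list and reports the leading-edge genes. It is a simplified illustration, not the GSEA software's implementation: the gene names, scores, and the `enrichment_score` helper are hypothetical, and NES, p-value, and FDR (which depend on permutations) are not computed.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """ranked: list of (gene, score) pairs sorted by score, highest first.
    Returns (ES, leading_edge_genes) for gene_set."""
    hit_weights = [
        abs(score) ** weight if gene in gene_set else 0.0
        for gene, score in ranked
    ]
    total_hit_weight = sum(hit_weights)
    n_miss = sum(1 for gene, _ in ranked if gene not in gene_set)
    if total_hit_weight == 0 or n_miss == 0:
        raise ValueError("gene set must partly, but not fully, cover the list")

    running = 0.0
    best = 0.0
    best_idx = -1
    for i, (gene, _) in enumerate(ranked):
        if gene in gene_set:
            running += hit_weights[i] / total_hit_weight  # step up on a hit
        else:
            running -= 1.0 / n_miss                       # step down on a miss
        if abs(running) > abs(best):
            best, best_idx = running, i

    if best >= 0:
        # Positive ES: set members at or before the peak.
        leading = [g for g, _ in ranked[: best_idx + 1] if g in gene_set]
    else:
        # Negative ES: set members after the trough.
        leading = [g for g, _ in ranked[best_idx + 1:] if g in gene_set]
    return best, leading


# Made-up ranked list: positive scores stand in for mesenchymal genes,
# negative scores for immunoreactive genes.
ranked = [("G1", 3.0), ("G2", 2.0), ("G3", 1.0),
          ("G4", -1.0), ("G5", -2.0), ("G6", -3.0)]

for label, genes in [("top of list", {"G1", "G2"}), ("bottom of list", {"G5", "G6"})]:
    es, edge = enrichment_score(ranked, genes)
    print(f"{label}: ES={es:.2f}, leading edge={edge}")
```

A gene set concentrated at the top of the list gets a positive ES and one concentrated at the bottom gets a negative ES, which is why the two subtypes are read from opposite ends of the results.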