10 — Assignment #2

Table of Contents: Objectives, Progress, Conclusions and Outlook, External Resources

Objectives

This journal entry will outline my progress while completing assignment 2.
Time estimated: 2 d; taken: 2+ d; date started: 2020-03-01; date completed:

Progress

Here is some information regarding the dataset that was used in this assignment. The dataset had to be re-created because I was not able to produce a version in which every row has a unique HUGO symbol, with the symbols set as the rownames of the data frame.
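A minimal sketch of how duplicated HUGO symbols could be detected and resolved before using them as rownames. The toy data, the gene identifiers, and the "keep the row with the highest mean count" policy are all assumptions for illustration, not the method used in the assignment.

```r
# Hypothetical toy data: two Ensembl IDs map to the same HUGO symbol.
counts <- data.frame(
  ensembl_gene_id = c("ENSG01", "ENSG02", "ENSG03", "ENSG04"),
  hgnc_symbol     = c("GENE_A", "GENE_B", "GENE_A", "GENE_C"),
  sample_1        = c(10, 5, 30, 7),
  sample_2        = c(12, 4, 28, 9),
  stringsAsFactors = FALSE
)

# Symbols that occur more than once cannot be used directly as rownames.
dup_symbols <- unique(counts$hgnc_symbol[duplicated(counts$hgnc_symbol)])

# Assumed policy: for each symbol, keep the row with the highest mean count.
sample_cols <- c("sample_1", "sample_2")
row_means <- rowMeans(counts[, sample_cols])
ordered <- counts[order(row_means, decreasing = TRUE), ]
counts_unique <- ordered[!duplicated(ordered$hgnc_symbol), ]

# Every remaining symbol is unique, so this assignment is now valid.
rownames(counts_unique) <- counts_unique$hgnc_symbol
```

Assigning rownames that contain duplicates raises an error in R, so checking with duplicated() first makes the cause visible before that point.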

Steps to get RStudio up and running (review):

Open up Docker QuickStart Terminal
docker pull risserlin/bcb420-base-image
docker run -e PASSWORD=pass --rm -p 8787:8787 risserlin/bcb420-base-image
Go to ip_address_of_the_docker_machine:8787 in a browser to open RStudio, and log in with username rstudio and password pass
Use docker ps to list the running containers and their port mappings
Use Control + C to terminate the docker run process
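The steps above can be parameterized in a small R sketch that only builds the command string and the RStudio URL, without running Docker. The helper names rstudio_docker_command and rstudio_url and the example IP address are hypothetical; the image name, password, and port come from the steps above.

```r
# Hypothetical helper: assemble the `docker run` command from the steps above.
rstudio_docker_command <- function(password = "pass", port = 8787,
                                   image = "risserlin/bcb420-base-image") {
  sprintf("docker run -e PASSWORD=%s --rm -p %d:8787 %s",
          password, as.integer(port), image)
}

# Hypothetical helper: the address to open in the browser.
# `host` is the IP address of the Docker machine.
rstudio_url <- function(host, port = 8787) {
  sprintf("http://%s:%d", host, as.integer(port))
}

cmd <- rstudio_docker_command()
url <- rstudio_url("192.168.99.100")  # example address only
```

Changing the port argument changes only the host side of the mapping; RStudio inside the container still listens on 8787.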

Dataset: GSE136864
Note: Refer to journal entry Assignment 1 for details on the experiment.

Problem 1: Could not read the normalized data CSV file into a table in R

I executed the R Notebook from A1 and added the following line to output my collected, cleaned, normalized data as a CSV file. I also had to set the working directory to the Files Pane Location from Session -> Set Working Directory. Once I had exported and downloaded the file to my local directory, I had to manually move the HUGO symbol and Ensembl gene identifier columns into the preferred positions.

write.table(counts_filtered, file="counts_filtered.csv", sep=",")

I saw this error when I tried to load my CSV into R:
Error in `[.data.frame`(normalized_count_data, , 3:ncol(normalized_count_data)) : undefined columns selected

That didn't work, so I switched to a txt file instead, and was able to modify the columns within R.

# Modify the table: reorder and rename some columns
counts_filt <- counts_filtered
counts_filt
# Drop the existing row names
rownames(counts_filt) <- NULL
# Move column 19 to the second position (it is renamed to hgnc_symbol below)
counts_filt <- counts_filt[, c(1, 19, 2:18)]
names(counts_filt)[1] <- "ensembl_gene_id"
names(counts_filt)[2] <- "hgnc_symbol"
#counts_filt$ensembl_gene_id <- rownames(counts_filt)
counts_filt

# Write to a tab-delimited txt file (sep="" would write the fields with no separator between them)
write.table(counts_filt, file="counts_filtered.txt", sep="\t")

I finally realized that I had been using read.table when I should have used read.csv. The latter let me load my normalized data.
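A small sketch of why the choice of reader matters for a comma-separated file. The toy table and its gene identifiers are hypothetical stand-ins for the normalized data, and this is one plausible cause of an "undefined columns selected" error rather than a confirmed diagnosis.

```r
# Hypothetical stand-in for the normalized count table.
counts_filtered <- data.frame(
  hgnc_symbol = c("GENE_A", "GENE_B"),
  sample_1    = c(1.5, 2.5),
  sample_2    = c(3.0, 4.0),
  row.names   = c("ENSG01", "ENSG02")
)

csv_path <- tempfile(fileext = ".csv")
write.table(counts_filtered, file = csv_path, sep = ",")

# read.table() splits on whitespace by default, so the commas are not
# treated as column separators and the columns are not split as intended.
as_table <- read.table(csv_path, header = TRUE)

# read.csv() defaults to sep = "," and header = TRUE.
as_csv <- read.csv(csv_path)

# Passing sep = "," to read.table() is an equivalent fix.
as_table_sep <- read.table(csv_path, header = TRUE, sep = ",")
```

Because write.table writes a header with one fewer field than the data rows, both comma-aware readers use the first column as row names. A table with fewer columns than expected makes an index such as 3:ncol(x) point at columns that do not exist.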
I had some issues using model.matrix to assemble the factors that I wanted to use.
My heatmap also seemed to be too large, as I kept getting errors about running out of space.
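For the model.matrix step mentioned above, here is a minimal sketch of building a design matrix from sample factors. The sample names, the treatment and batch factors, and their levels are hypothetical and are not taken from GSE136864.

```r
# Hypothetical sample annotation: one row per sample, factors as columns.
samples <- data.frame(
  sample    = paste0("s", 1:6),
  treatment = factor(rep(c("control", "treated"), each = 3)),
  batch     = factor(rep(c("b1", "b2", "b3"), times = 2))
)

# Additive design: intercept, batch effects, then the treatment effect.
# The first level of each factor is absorbed into the intercept.
design <- model.matrix(~ batch + treatment, data = samples)
design_cols <- colnames(design)
```

The order of terms in the formula determines the column order, and the reference level of each factor can be changed with relevel() if a different baseline is wanted.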

Conclusions and Outlook

I found the majority of the steps and procedures easy to understand.
Unfortunately, due to a lack of time and the need to balance other course assignments and midterms, I was not able to complete the assignment. However, I have finished the majority of it.

External Resources

This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License. Follow the link to learn more.
