10 — Assignment #2
narasi15 edited this page Mar 4, 2020 · 16 revisions
This journal entry will outline my progress while completing assignment 2.
Time estimated: 2 d; taken: 2+ d; date started: 2020-03-01; date completed:
Here is some information regarding the dataset that was used in this assignment. The dataset had to be re-created because I was not able to produce one in which every row has a unique HUGO symbol; the HUGO symbols are used as the rownames of the dataframe.
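A quick way to see whether a symbol column can serve as rownames is to look for duplicates first. The sketch below uses a small made-up data frame: the column names ensembl_gene_id and hgnc_symbol match those used later in this entry, but the values are hypothetical, and keeping the first row per symbol is only one possible strategy (an assumption, not necessarily what was done for the assignment).

```r
# Toy data: hypothetical identifiers and counts, for illustration only
counts <- data.frame(
  ensembl_gene_id = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"),
  hgnc_symbol     = c("GENE1", "GENE2", "GENE1", "GENE3"),
  sample_1        = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Symbols that occur more than once cannot be used as rownames as-is
dup_symbols <- unique(counts$hgnc_symbol[duplicated(counts$hgnc_symbol)])
print(dup_symbols)

# One possible strategy (an assumption): keep the first row for each symbol
counts_unique <- counts[!duplicated(counts$hgnc_symbol), ]
rownames(counts_unique) <- counts_unique$hgnc_symbol
print(counts_unique)
```

Whether dropping the later duplicates is acceptable depends on the data; averaging or summing the duplicated rows would be alternatives.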
Steps to get RStudio up and running (review):
- Open up Docker QuickStart Terminal
docker pull risserlin/bcb420-base-image
docker run -e PASSWORD=pass --rm -p 8787:8787 risserlin/bcb420-base-image
- Go to ip_address_of_the_docker_machine:8787 in a browser to see RStudio, and log in with username rstudio and password pass
- Use docker ps to check all the running containers and their ports
- Use control + c to terminate the docker run process
Dataset: GSE136864
Note: Refer to journal entry Assignment 1 for details on the experiment.
Problem 1: Could not read the normalized data csv file into a table in R
- I executed the R Notebook from A1 and added the following line to output my collected, cleaned, normalized data as a csv file. I also had to set the working directory to the Files pane location via Session -> Set Working Directory. Once I had exported and downloaded the file to my local directory, I had to manually move the HUGO symbol and Ensembl gene identifier columns to the preferred positions.
write.table(counts_filtered, file="counts_filtered.csv", sep=",")
Error in `[.data.frame`(normalized_count_data, , 3:ncol(normalized_count_data)) : undefined columns selected
- That didn't work, so I changed the output to a txt file and was able to modify the columns within R.
# Copy the table, then reorder and rename some columns
counts_filt <- counts_filtered
head(counts_filt)
# Drop the row names
rownames(counts_filt) <- NULL
# Move column 19 to the second position; columns 2 to 18 keep their order
counts_filt <- counts_filt[, c(1, 19, 2:18)]
names(counts_filt)[1] <- "ensembl_gene_id"
names(counts_filt)[2] <- "hgnc_symbol"
#counts_filt$ensembl_gene_id <- rownames(counts_filt)
head(counts_filt)
# Write to a tab-delimited txt file (sep="" would write the fields with nothing between them)
write.table(counts_filt, file="counts_filtered.txt", sep="\t")
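The same reordering can also be written without hard-coded column positions, so it keeps working if the number of sample columns changes. This is a minimal sketch on a made-up data frame; the columns gene_id, s1, s2 and symbol are hypothetical stand-ins for the real ones.

```r
# Toy data frame standing in for counts_filtered (hypothetical columns)
df <- data.frame(
  gene_id = c("id1", "id2"),
  s1      = c(1.5, 2.5),
  s2      = c(3.5, 4.5),
  symbol  = c("A", "B"),
  stringsAsFactors = FALSE
)

# Put the two identifier columns first, then all other columns in their original order
id_cols <- c("gene_id", "symbol")
df <- df[, c(id_cols, setdiff(names(df), id_cols))]

# Rename the identifier columns to the names used in this entry
names(df)[names(df) == "gene_id"] <- "ensembl_gene_id"
names(df)[names(df) == "symbol"]  <- "hgnc_symbol"
print(names(df))
```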
- Finally realized that I was using read.table when I should have used the read.csv method. The latter helped me load my normalized data.
- I had some issues with using model.matrix to assemble the factors that I wanted to use.
- My heatmap also seemed to be too big, as I kept running into issues with there being no space.
- I found the majority of the steps and procedures easy to understand.
- Unfortunately, due to lack of time and balancing other course assignments and midterms, I was not able to complete the assignment. I have finished the majority of it, however.
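For reference, the read.table versus read.csv difference noted under Problem 1 comes down to default arguments: read.table splits on whitespace and read.csv splits on commas and expects a header row. The sketch below is an assumption about how the "undefined columns selected" error may have come about, since the original read call is not shown in this entry. The toy table and temporary file are made up, and quote = FALSE is used only to keep the toy file simple.

```r
# Toy table (hypothetical values) written as a comma-separated file
toy <- data.frame(
  ensembl_gene_id = c("id1", "id2"),
  hgnc_symbol     = c("A", "B"),
  s1              = c(1.5, 2.5),
  stringsAsFactors = FALSE
)
path <- tempfile(fileext = ".csv")
write.table(toy, file = path, sep = ",", row.names = FALSE, quote = FALSE)

# read.table splits on whitespace by default, so each comma-separated
# line is read as one field and the result has a single column
as_table <- read.table(path)
print(ncol(as_table))

# Selecting columns 3:ncol(...) from that single-column table fails
msg <- tryCatch(
  {
    as_table[, 3:ncol(as_table)]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# read.csv uses sep = "," and header = TRUE by default
as_csv <- read.csv(path)
print(names(as_csv))

# read.table gives the same columns once both arguments are spelled out
as_table_fixed <- read.table(path, header = TRUE, sep = ",")
print(names(as_table_fixed))
```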
This copyrighted material is licensed under a Creative Commons Attribution 4.0 International License.