Deep Learning Intro
06/01/2022, 7:45 am
Synopsis: I will work through the Deep Learning Intro module.
Data: I watched the introduction video of the Deep Learning module I am following. The video covered the basics of convolutional neural networks, how they work, and visualizing them. Since the module is related to computational neuroscience, I also learned about performing analyses with fMRI data. The video also discussed comparisons between artificial and natural neural networks (animal brains) and how the two may assist each other in developments.
I also started the first part of tutorial 1, and I created a linear deep network of depth 1 and width 200 using PyTorch.
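A minimal sketch of a network with this shape, assuming PyTorch is installed. Only the depth (one hidden layer) and width (200) come from this entry; the input and output sizes and the names N_INPUTS, N_OUTPUTS, and linear_net are placeholders, not the tutorial's actual values.

```python
import torch
from torch import nn

# Only N_HIDDEN comes from the log entry; the other sizes are assumed.
N_INPUTS = 50    # hypothetical number of input features
N_HIDDEN = 200   # width of the single hidden layer
N_OUTPUTS = 1    # hypothetical: one decoded value per example

# Two linear maps with no activation in between: a "deep linear" network.
linear_net = nn.Sequential(
    nn.Linear(N_INPUTS, N_HIDDEN),
    nn.Linear(N_HIDDEN, N_OUTPUTS),
)

# Forward pass on a random batch of 8 examples.
batch = torch.randn(8, N_INPUTS)
predictions = linear_net(batch)
print(predictions.shape)
```

Without a non-linearity between the layers, the two linear maps compose into a single linear map, so the hidden layer adds parameters but not expressive power.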
Files:
9:20 am, 100 minutes
Decoding Neural Responses Part 1
06/06/2022, 7:45 am
Synopsis: I will start working through tutorial 1 of the deep learning training module.
Data: I wrote the code necessary for loading the dataset, visualizing the data, and splitting into train and test sets. I learned about the ReLU function and other non-linear activation functions. I added a ReLU layer to my deep network from last class. I also started working with the loss function and gradient descent in PyTorch.
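A rough sketch of how a ReLU layer, a loss function, and gradient descent fit together in PyTorch. The toy data, the mean squared error loss, the learning rate, and the step count are all assumptions for illustration and are not the tutorial's settings.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy data standing in for the tutorial's dataset.
inputs = torch.randn(64, 10)
targets = inputs.sum(dim=1, keepdim=True).abs()  # a non-linear target

# One hidden layer of width 200 followed by a ReLU non-linearity.
relu_net = nn.Sequential(
    nn.Linear(10, 200),
    nn.ReLU(),
    nn.Linear(200, 1),
)

loss_fn = nn.MSELoss()  # assumed loss; the log does not name one
optimizer = torch.optim.SGD(relu_net.parameters(), lr=0.01)

losses = []
for step in range(50):
    optimizer.zero_grad()                        # clear old gradients
    loss = loss_fn(relu_net(inputs), targets)    # forward pass and loss
    loss.backward()                              # backpropagation
    optimizer.step()                             # gradient descent update
    losses.append(loss.item())

print(losses[0], losses[-1])
```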
Files:
9:20 am, 100 minutes
Decoding Neural Responses Part 2
06/06/2022, 7:45 am
Synopsis: I will finish working through tutorial 1 of the deep learning training module.
Data: I successfully trained my model with the parameters given in the tutorial. I also learned about neural network expressivity and the role that depth and width play in transforming data. I reviewed the calculations that go into gradient descent/backpropagation and the difference between gradient descent and stochastic gradient descent. Finally, I learned about convolutional neural networks.
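To make the difference between gradient descent and stochastic gradient descent concrete, here is a small sketch on a one-parameter least-squares problem. The data, learning rates, and function names are made up for illustration. Batch gradient descent averages the gradient over every example before each update, while the stochastic version updates from one randomly chosen example at a time.

```python
import random

# Hypothetical noise-free data generated by y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def grad(w, x, y):
    """Gradient of the squared error (w*x - y)**2 with respect to w."""
    return 2.0 * (w * x - y) * x

def batch_gd(w, lr, steps):
    """Each update uses the gradient averaged over the whole dataset."""
    for _ in range(steps):
        g = sum(grad(w, x, y) for x, y in zip(xs, ys)) / len(xs)
        w -= lr * g
    return w

def sgd(w, lr, steps, rng):
    """Each update uses the gradient from a single random example."""
    for _ in range(steps):
        i = rng.randrange(len(xs))
        w -= lr * grad(w, xs[i], ys[i])
    return w

w_gd = batch_gd(0.0, lr=0.05, steps=200)
w_sgd = sgd(0.0, lr=0.02, steps=500, rng=random.Random(0))
print(w_gd, w_sgd)
```

Both runs approach w = 2 here because the data has no noise. On noisy data the stochastic updates would keep fluctuating around the minimum unless the learning rate is reduced over time.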
Files:
9:20 am, 100 minutes
Jupyter Notebook Setup
06/20/2022, 9:15 am
Synopsis: I will set up my Jupyter Notebook.
Data: I created and built my Jupyter Notebook with the help of the Jupyter docs. This notebook will contain both my daily log for SRP as well as all work done for the internship. I also transferred previous entries, notes, and code to my notebook.
Files:
12:34 pm, 199 minutes
Logistic Regression
06/20/2022, 1:45 pm
Synopsis: I will learn about logistic regression.
Data: I looked into logistic regression materials, including a video lecture series by Andrew Ng on YouTube and an article published by Towards Data Science. Topics covered included the definition of a classification problem, the sigmoid activation function, the cost function, and gradient descent. I first learned about logistic regression for a standard binary classification problem before extending it to multi-class classification problems. I also published my Jupyter notebook.
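For reference, a small standard-library sketch of the pieces named above: the sigmoid function, the binary cross-entropy cost, and softmax as one common way to extend to multiple classes. The choice of softmax is an assumption; the materials I studied may instead have used a one-vs-rest scheme.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # avoids overflow for large negative z
    return ez / (1.0 + ez)

def binary_cross_entropy(y_true, y_prob):
    """Average logistic cost over 0/1 labels and predicted probabilities."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def softmax(scores):
    """Turn a list of class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))
print(binary_cross_entropy([1, 0], [0.9, 0.2]))
print(softmax([1.0, 2.0, 3.0]))
```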
Files:
5:20 pm, 215 minutes
Logistic Regression Code
06/20/2022, 8:00 pm
Synopsis: I will apply logistic regression on a dataset with Python.
Data: I imported a diabetes prediction dataset from Kaggle for logistic regression. In Python, I created functions for calculating cost, gradients, and final accuracy, as well as initializing the datasets. My code also loops through and performs gradient descent a set number of times (10000) and graphs the change in cost from iteration to iteration.
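The structure described above (cost, gradient, and accuracy functions plus a fixed-length gradient descent loop) can be sketched roughly as follows. This is not my actual program: it uses synthetic two-feature data instead of the Kaggle dataset, 1000 iterations instead of 10000 to keep it quick, and an assumed learning rate of 0.1. All function and variable names are illustrative.

```python
import math
import random

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(weights, bias, row):
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)

def cost(weights, bias, X, y):
    """Average binary cross-entropy over the dataset."""
    eps = 1e-12
    total = 0.0
    for row, label in zip(X, y):
        p = min(max(predict_proba(weights, bias, row), eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(y)

def gradients(weights, bias, X, y):
    """Gradient of the cost with respect to the weights and the bias."""
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for row, label in zip(X, y):
        error = predict_proba(weights, bias, row) - label
        for j, x in enumerate(row):
            grad_w[j] += error * x
        grad_b += error
    n = len(y)
    return [g / n for g in grad_w], grad_b / n

def accuracy(weights, bias, X, y):
    """Fraction of examples classified correctly at a 0.5 threshold."""
    hits = sum(
        (predict_proba(weights, bias, row) >= 0.5) == bool(label)
        for row, label in zip(X, y)
    )
    return hits / len(y)

# Synthetic stand-in data: label is 1 when the two features sum above 0.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1 if x1 + x2 > 0 else 0 for x1, x2 in X]

weights, bias = [0.0, 0.0], 0.0
learning_rate = 0.1  # assumed value
cost_history = []
for _ in range(1000):
    grad_w, grad_b = gradients(weights, bias, X, y)
    weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    bias -= learning_rate * grad_b
    cost_history.append(cost(weights, bias, X, y))

print(cost_history[0], cost_history[-1], accuracy(weights, bias, X, y))
```

Plotting cost_history against the iteration number gives the cost curve mentioned above.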
Files:
9:10 pm, 70 minutes
Logistic Regression Code Analysis
06/21/2022, 9:20 am
Synopsis: I will analyze the results of my logistic regression diabetes prediction program.
Data: I continued working on my diabetes classification code. I created a function to split the dataset into training and testing sets to better evaluate the accuracy of the models. I also added code to graph side-by-side scatterplots of two selected features, with one plot colored by the observed diabetes labels from the dataset and the other by the model's predicted labels. Using this, I ran logistic regression on several combinations of two features; I chose the features whose weights in the initial model were furthest from 0, treating those as the most significant.
Then, I tried to create a parallel coordinates plot to visualize higher-dimensional data. I initially tried to do this with just subplots in matplotlib, but it was difficult to set the xtick labels correctly and get the legend to show, and this approach was also relatively slow. As a result, I converted my NumPy arrays into a pandas DataFrame and used pandas.plotting.parallel_coordinates(). This produced a cleaner visual, but the figure still looks cluttered, and the shared normalized y-axis scale makes it difficult to immediately notice differences in most features between subjects with diabetes and those without.
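A rough standard-library sketch of the two helper steps from this entry: splitting a dataset into training and testing sets, and ranking features by the absolute size of their weights. The function names, the 20% test fraction, and the example weights are hypothetical and are not taken from my program.

```python
import random

def train_test_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle the row indices, then carve off a fraction for testing."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test

def rank_features_by_weight(names, weights):
    """Order feature names by |weight|, largest first."""
    pairs = sorted(zip(names, weights), key=lambda p: abs(p[1]), reverse=True)
    return [name for name, _ in pairs]

# Hypothetical data: 10 rows with one feature each, labels alternate.
X = [[float(i)] for i in range(10)]
y = [i % 2 for i in range(10)]
X_train, X_test, y_train, y_test = train_test_split(X, y)
print(len(X_train), len(X_test))

# Hypothetical feature names and weights.
ranking = rank_features_by_weight(
    ["feature_a", "feature_b", "feature_c"], [0.1, -0.9, 0.4]
)
print(ranking)
```

Weight magnitudes are only comparable across features when the features are on similar scales, so this ranking assumes the inputs were normalized before training.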
Files:
5:00 pm, 460 minutes
Logistic Regression Code Analysis and Bootstrap Methods and Permutation Testing
06/22/2022, 8:28 am
Synopsis: I will continue working on my analysis of the results of my logistic regression diabetes prediction program.
Data: I edited my analysis of the diabetes prediction program. I mainly worked on including more information about each of the models I tested. This involved listing inputs and outputs clearly and labeling provided figures/graphs. I also did some editing to clarify parts of my summary that were too vague or confusing.
Afterwards, I read about and took notes on bootstrap and permutation testing as methods of computational statistical inference. More specifically, the chapter included details about bootstrap distributions, resampling, accuracy, bootstrap confidence intervals, and permutation (significance) testing.
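As a note to self, a minimal sketch of a bootstrap percentile confidence interval using only the standard library. The sample values, the number of resamples, and the function names are made up for illustration. The percentile interval is one of several bootstrap interval types and may not be the exact variant the chapter emphasizes.

```python
import random
import statistics

def bootstrap_distribution(sample, statistic, n_resamples=2000, seed=0):
    """Resample with replacement and recompute the statistic each time."""
    rng = random.Random(seed)
    n = len(sample)
    return [
        statistic([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    ]

def percentile_interval(values, confidence=0.95):
    """Take the middle `confidence` share of the bootstrap distribution."""
    ordered = sorted(values)
    alpha = (1.0 - confidence) / 2.0
    lower = ordered[int(alpha * len(ordered))]
    upper = ordered[int((1.0 - alpha) * len(ordered)) - 1]
    return lower, upper

# Hypothetical measurements.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 5.6]
boot_means = bootstrap_distribution(sample, statistics.mean)
low, high = percentile_interval(boot_means)
print(low, high)
```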
Files:
- Diabetes Prediction Code
- Kaggle Dataset
- Diabetes Prediction Analysis
- Bootstrap Methods and Permutation Testing Notes
3:23 pm, 415 minutes
Logistic Regression Model Permutation Test
06/23/2022, 8:39 am
Synopsis: I will create a program in Python that performs a permutation test on the diabetes data with my logistic regression model.
Data: I conducted a permutation test on the 6-dimensional logistic regression model I had previously trained. In Python, I wrote functions to generate permutations of the dataset by shuffling the data within each feature and to calculate the p-value given a distribution and a value of interest. I also wrote code to generate a histogram depicting the permutation distribution in comparison to the original test value. Afterwards, I summarized my findings and conclusions.
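The two functions described above might look roughly like this. This is a simplified sketch rather than my actual code: the test statistic is a stand-in threshold rule instead of the trained logistic regression model, the data is synthetic, and the "+1" adjustment in the p-value is one common convention that I am assuming here.

```python
import random

def shuffle_within_features(X, rng):
    """Return a copy of X with each feature column shuffled independently."""
    columns = [list(col) for col in zip(*X)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]

def p_value(null_distribution, observed):
    """One-sided p-value: share of permuted statistics >= the observed one."""
    extreme = sum(1 for value in null_distribution if value >= observed)
    return (extreme + 1) / (len(null_distribution) + 1)

def threshold_accuracy(X, y):
    """Stand-in for model accuracy: predict 1 when feature 0 is positive."""
    hits = sum((row[0] > 0) == bool(label) for row, label in zip(X, y))
    return hits / len(y)

# Synthetic data in which feature 0 fully determines the label.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
y = [1 if row[0] > 0 else 0 for row in X]

observed = threshold_accuracy(X, y)
null = [
    threshold_accuracy(shuffle_within_features(X, rng), y)
    for _ in range(500)
]
p = p_value(null, observed)
print(observed, p)
```

Shuffling each feature independently breaks the link between features and labels while keeping each feature's own distribution, so the permuted accuracies show what the statistic looks like when the features carry no information about the outcome.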
Files:
- Diabetes Prediction Code
- Kaggle Dataset
- Diabetes Prediction Analysis
- Bootstrap Methods and Permutation Testing Notes
12:00 pm, 201 minutes
Deep Learning Tutorials
06/24/2022, 7:30 am
Synopsis: I will learn about convolutional neural networks and normative encoding models.
Data: I continued working on Deep Learning Tutorial 2 of the Neuromatch Academy material. Initially, I had some issues installing and running PyTorch in my notebook. I will finish up and start Tutorial 3 this afternoon.
Files:
11:00 am, 210 minutes
HCP Dataset Set Up and Software Installation
06/24/2022, 1:30 pm
Synopsis: I will work on gaining access to the HCP dataset. I will also finish up the Deep Learning tutorials from Neuromatch Academy.
Data: I was able to log in to the virtual machine and access the README file and the dataset itself. I also worked on installing Anaconda, though I ran into some difficulties with that.
Files:
4:30 pm, 180 minutes
HCP Dataset
06/26/2022, 7:38 pm
Synopsis: I will work on accessing and formatting the HCP data.
Data: After trying several methods, I was able to access the remote dataset and copy it onto my local device with sshfs. I will work on formatting and outputting the dataset next.
Files:
10:26 pm, 168 minutes