This repo holds project reports and resumes for my data science work.
- Resume holds the latest copy of my resume.
- Reports holds project reports for my academic project work at Northwestern. Here's a summary of the projects:
- Sale price prediction for the Ames housing dataset using multiple linear regression with lm
- Predictive modeling of wins per game for 186 years of baseball data using regression and decision trees, with glm and broom
- Predictive modeling of wine case purchase volume using Poisson, negative binomial, and hurdle models, with glm, caret, and broom
- Model to predict donation amounts for a not-for-profit marketing campaign using several machine learning approaches, including boosting, bagging, random forest, principal component regression (PCR), and elastic net, with caret
- Miles per gallon prediction on the ISLR::Auto dataset using mixture modeling with flexmix
- Item-level time series forecasting for a competition hosted by the Russian software firm 1C. Top-down approach using TSLM, ARIMA, Prophet, and STLF models, with xts, forecast, TSclust, and mice
- Time series forecasting for the DengAI disease spread competition using transfer entropy, the method of analogues, and single-layer LSTM models, with xts, forecast, TransferEntropy, and keras
- Text analysis of aviation safety data in R using t-SNE, TF-IDF, and structural topic modeling, with tm, topicmodels, stm, and tidytext
- Fully connected neural network built with numpy and pandas to classify the MNIST dataset
- Customer segmentation modeling using t-SNE, hierarchical agglomerative clustering, and k-means, followed by market segmentation profiling
- Discrete choice experiment modeling using a hierarchical Bayes multinomial logit model to select a product design
- Model to predict the target market for a campaign using random forest and naïve Bayes models
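
For readers unfamiliar with the numpy-only neural network item above, here is a minimal sketch of what such a model can look like. It is not the code from the report: the single hidden layer, ReLU activation, softmax output, layer sizes, learning rate, and the helper names `init_params`, `forward`, and `train_step` are all assumptions made for illustration, and it trains on random data shaped like flattened MNIST images (784 features, 10 classes) so that it runs without downloading anything.

```python
import numpy as np


def init_params(n_in, n_hidden, n_out, rng):
    # He-style initialization; the layer sizes are illustrative assumptions.
    return {
        "W1": rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, np.sqrt(2.0 / n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, X):
    # Hidden layer with ReLU, then a numerically stable softmax output.
    a1 = np.maximum(X @ params["W1"] + params["b1"], 0.0)
    logits = a1 @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return a1, exp / exp.sum(axis=1, keepdims=True)


def train_step(params, X, y, lr):
    # One full-batch gradient descent step on the cross-entropy loss.
    n = X.shape[0]
    a1, probs = forward(params, X)
    loss = -np.log(probs[np.arange(n), y] + 1e-12).mean()

    # Gradient of the mean cross-entropy with respect to the logits.
    d_logits = probs.copy()
    d_logits[np.arange(n), y] -= 1.0
    d_logits /= n

    # Backpropagate through the output layer and the ReLU hidden layer.
    d_z1 = (d_logits @ params["W2"].T) * (a1 > 0)
    grads = {
        "W2": a1.T @ d_logits,
        "b2": d_logits.sum(axis=0),
        "W1": X.T @ d_z1,
        "b1": d_z1.sum(axis=0),
    }
    for name in params:
        params[name] -= lr * grads[name]
    return loss


# Synthetic stand-in for MNIST: 256 random "images" with random labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 784))
y = rng.integers(0, 10, size=256)

params = init_params(784, 32, 10, rng)
losses = [train_step(params, X, y, lr=0.1) for _ in range(50)]
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

To try this on real digits, replace `X` and `y` with flattened, scaled MNIST pixels and integer labels loaded however you prefer (for example through pandas), and add a held-out set to measure accuracy.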