Machine Learning Projects

A collection of R and Python ML projects performed on real-world and practical datasets.

What is this?

This is a collection of R and Python of various ML projects. Widely used and practical algorithms are developed with minimum dependency. These projects are conducted as assignments for the "Applied Machine Learning" Course taught at UIUC by Dr. Forsyth. The projects include a wide variety major ML algorithms conducted on real-world and practical datasets. This README only shows some examples of these projects. For more details and source codes check the page for each project individually.

Naive Bayesian Analysis & Random Forrest

UC Irvine data on diabetes patient are classified using a manually developed naive Bayes classifier as well as klaR and SVMLight packages for R. The case of patients with missing data are dealt using various methods and results are compared.

Classifier	Accuracy	Note
Naive Bayes	0.752	Using dnorm () Function
Naive Bayes	0.748	Replacing 0 values with "NA"
Naive Bayes	0.778	Using klaR Library
SVM	0.810	Using SVMLight Library

MNIST images are classified using Random Forrest method by h2o R Package.

# of Trees	Depth = 4	Depth = 8	Depth = 16
10	0.85	0.93	0.96
20	0.87	0.94	0.96
30	0.87	0.94	0.97

Source code and details

Support Vector Machine

A support vector machine is developed and trained on "UC Irvine data on adult income" using stochastic gradient descent with various regularization constants. 0.001 is chosen as the best regularization constant w/ an accuracy of 87% on test data.

Principal Components Analysis

A PCA was performed on CIFAR-10 dataset. The first 20 principal components were computed and used for representing the images of each category. Each class is tried to be represented by PC of other classes and therefore, the similarities of all 10 classes are measured.

.

Linear and Logistic Regression

UCI dataset on features of music and latitude and longitude of their origins is analyzed. A linear regression is developed for predicting the latitudes against features and is improved using Box-Cox transformation. Also, regularized regressions are developed using glmnet Package. Logistic regression are also developed on another UCI dataset to predict a binary logistic classification.

Expectation Maximization

Images are segmented using clustering. For this image pixels are clustered and then images are segmented by mapping each pixel to the cluster center with the highest value of the posterior probability for that pixel.

Authors

Behrooz Bajestani, [email protected]
In cooperation of two other teammates from UIUC

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
CS-498-AML-HW5 @ b591fbb		CS-498-AML-HW5 @ b591fbb
Images		Images
cs-498-hw6 @ f8e85fc		cs-498-hw6 @ f8e85fc
cs-498-hw7 @ 32e69db		cs-498-hw7 @ 32e69db
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Machine Learning Projects

Table of Contents

What is this?

Naive Bayesian Analysis & Random Forrest

Support Vector Machine

Principal Components Analysis

Linear and Logistic Regression

Expectation Maximization

Authors

About

Releases

Packages

behroozmrd47/Applied-Machine-Learning-2018

Folders and files

Latest commit

History

Repository files navigation

Machine Learning Projects

Table of Contents

What is this?

Naive Bayesian Analysis & Random Forrest

Support Vector Machine

Principal Components Analysis

Linear and Logistic Regression

Expectation Maximization

Authors

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages