Python implementations of some of the fundamental Machine Learning models and algorithms from scratch.
The purpose of this project is not to produce the most optimized or computationally efficient algorithms, but rather to present their inner workings in a transparent way.
$ git clone https://github.com/eriklindernoren/ML-From-Scratch
$ cd ML-From-Scratch
$ python setup.py install
$ python mlfromscratch/supervised_learning/regression.py
Figure: Polynomial ridge regression of temperature data measured in
Linköping, Sweden 2016.
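As a rough illustration of what polynomial ridge regression does, here is a minimal NumPy sketch. It is not the code in `regression.py`: it uses the closed-form ridge solution, and the data is a synthetic seasonal curve standing in for the temperature measurements. The helper names `polynomial_features` and `fit_ridge`, the degree, and the penalty `alpha` are all assumptions chosen for the example.

```python
import numpy as np


def polynomial_features(x, degree):
    """Design matrix with columns [1, x, x^2, ..., x^degree] for a 1-D input."""
    return np.vander(x, degree + 1, increasing=True)


def fit_ridge(X, y, alpha):
    """Closed-form ridge solution; the bias column (index 0) is not penalized."""
    penalty = alpha * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


# Synthetic stand-in for one year of daily temperatures (NOT the real data).
rng = np.random.default_rng(0)
day = np.linspace(0.0, 1.0, 366)  # time rescaled to [0, 1]
temperature = 8.0 - 12.0 * np.cos(2.0 * np.pi * day) + rng.normal(0.0, 2.0, day.size)

X = polynomial_features(day, degree=4)
weights = fit_ridge(X, temperature, alpha=1e-3)
prediction = X @ weights
mse = float(np.mean((prediction - temperature) ** 2))
print(f"Training MSE: {mse:.2f}")
```

Rescaling the time axis to [0, 1] keeps the powers of `x` on comparable scales, which matters because the ridge penalty treats all non-bias weights equally.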
$ python mlfromscratch/supervised_learning/neural_network.py
+---------+
| ConvNet |
+---------+
Input Shape: (1, 8, 8)
+----------------------+------------+--------------+
| Layer Type | Parameters | Output Shape |
+----------------------+------------+--------------+
| Conv2D | 160 | (16, 8, 8) |
| Activation (ReLU) | 0 | (16, 8, 8) |
| Dropout | 0 | (16, 8, 8) |
| BatchNormalization | 2048 | (16, 8, 8) |
| Conv2D | 4640 | (32, 8, 8) |
| Activation (ReLU) | 0 | (32, 8, 8) |
| Dropout | 0 | (32, 8, 8) |
| BatchNormalization | 4096 | (32, 8, 8) |
| Flatten | 0 | (2048,) |
| Dense | 524544 | (256,) |
| Activation (ReLU) | 0 | (256,) |
| Dropout | 0 | (256,) |
| BatchNormalization | 512 | (256,) |
| Dense | 2570 | (10,) |
| Activation (Softmax) | 0 | (10,) |
+----------------------+------------+--------------+
Total Parameters: 538570
Training: 100% [------------------------------------------------------------------------] Time: 0:01:55
Accuracy: 0.987465181058
Figure: Classification of the digit dataset using CNN.
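The parameter counts in the summary table above can be reproduced by hand. The sketch below assumes 3x3 convolution kernels and a batch normalization layer that stores one scale and one shift value per input element; both assumptions are inferred from the printed numbers rather than stated in this README, and the helper functions are hypothetical.

```python
from math import prod


def conv2d_params(in_channels, n_filters, kernel_size):
    """One kernel per (input channel, filter) pair, plus one bias per filter."""
    return n_filters * in_channels * kernel_size * kernel_size + n_filters


def dense_params(n_inputs, n_units):
    """Weight matrix plus one bias per unit."""
    return n_inputs * n_units + n_units


def batchnorm_params(input_shape):
    """Assumed: one scale and one shift value per input element."""
    return 2 * prod(input_shape)


counts = [
    conv2d_params(1, 16, 3),       # 160
    batchnorm_params((16, 8, 8)),  # 2048
    conv2d_params(16, 32, 3),      # 4640
    batchnorm_params((32, 8, 8)),  # 4096
    dense_params(2048, 256),       # 524544
    batchnorm_params((256,)),      # 512
    dense_params(256, 10),         # 2570
]
total = sum(counts)
print("Total Parameters:", total)  # 538570
```

Activation, Dropout, and Flatten layers contribute no trainable parameters, which is why they appear with 0 in the table.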
$ python mlfromscratch/unsupervised_learning/dbscan.py
Figure: Clustering of the moons dataset using DBSCAN.
$ python mlfromscratch/unsupervised_learning/generative_adversarial_network.py
+-----------+
| Generator |
+-----------+
Input Shape: (100,)
+------------------------+------------+--------------+
| Layer Type | Parameters | Output Shape |
+------------------------+------------+--------------+
| Dense | 25856 | (256,) |
| Activation (LeakyReLU) | 0 | (256,) |
| BatchNormalization | 512 | (256,) |
| Dense | 131584 | (512,) |
| Activation (LeakyReLU) | 0 | (512,) |
| BatchNormalization | 1024 | (512,) |
| Dense | 525312 | (1024,) |
| Activation (LeakyReLU) | 0 | (1024,) |
| BatchNormalization | 2048 | (1024,) |
| Dense | 803600 | (784,) |
| Activation (TanH) | 0 | (784,) |
+------------------------+------------+--------------+
Total Parameters: 1489936
+---------------+
| Discriminator |
+---------------+
Input Shape: (784,)
+------------------------+------------+--------------+
| Layer Type | Parameters | Output Shape |
+------------------------+------------+--------------+
| Dense | 401920 | (512,) |
| Activation (LeakyReLU) | 0 | (512,) |
| Dropout | 0 | (512,) |
| Dense | 131328 | (256,) |
| Activation (LeakyReLU) | 0 | (256,) |
| Dropout | 0 | (256,) |
| Dense | 514 | (2,) |
| Activation (Softmax) | 0 | (2,) |
+------------------------+------------+--------------+
Total Parameters: 533762
Figure: Training progress of a MNIST Generative Adversarial Network.
$ python mlfromscratch/reinforcement_learning/deep_q_learning.py
+----------------+
| Deep Q-Network |
+----------------+
Input Shape: (4,)
+-------------------+------------+--------------+
| Layer Type | Parameters | Output Shape |
+-------------------+------------+--------------+
| Dense | 320 | (64,) |
| Activation (ReLU) | 0 | (64,) |
| Dense | 130 | (2,) |
+-------------------+------------+--------------+
Total Parameters: 450
Figure: Deep Q-Network solution to the CartPole-v1 environment in OpenAI gym.
$ python mlfromscratch/unsupervised_learning/genetic_algorithm.py
+--------+
| GA |
+--------+
Description: Implementation of a Genetic Algorithm which aims to produce
the user specified target string. This implementation calculates each
candidate's fitness based on the alphabetical distance between the candidate
and the target. A candidate is selected as a parent with probabilities proportional
to the candidate's fitness. Reproduction is implemented as a single-point
crossover between pairs of parents. Mutation is done by randomly assigning
new characters with uniform probability.
Parameters
----------
Target String: 'Genetic Algorithm'
Population Size: 100
Mutation Rate: 0.05
[0 Closest Candidate: 'CJqlJguPlqzvpoJmb', Fitness: 0.00]
[1 Closest Candidate: 'MCxZxdr nlfiwwGEk', Fitness: 0.01]
[2 Closest Candidate: 'MCxZxdm nlfiwwGcx', Fitness: 0.01]
[3 Closest Candidate: 'SmdsAklMHn kBIwKn', Fitness: 0.01]
[4 Closest Candidate: ' lotneaJOasWfu Z', Fitness: 0.01]
...
[292 Closest Candidate: 'GeneticaAlgorithm', Fitness: 1.00]
[293 Closest Candidate: 'GeneticaAlgorithm', Fitness: 1.00]
Answer: 'Genetic Algorithm'
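The steps in the description above (distance-based fitness, fitness-proportional parent selection, single-point crossover, uniform mutation) can be sketched as follows. This is a simplified illustration, not the code in `genetic_algorithm.py`: the alphabet ordering, the exact fitness formula, and all function names are assumptions made for the example.

```python
import random
import string

# Assumed alphabet: a space followed by the ASCII letters.
ALPHABET = " " + string.ascii_letters
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def fitness(candidate, target):
    """Inverse of the summed alphabet-index distance (assumed formula)."""
    distance = sum(abs(INDEX[a] - INDEX[b]) for a, b in zip(candidate, target))
    return 1.0 / (distance + 1e-6)


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: swap the tails of the two parents."""
    point = rng.randint(0, len(parent_a))
    return parent_a[:point] + parent_b[point:], parent_b[:point] + parent_a[point:]


def mutate(candidate, rate, rng):
    """Replace each character with a uniformly drawn one with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in candidate
    )


def best_of(population, target):
    return max(population, key=lambda c: fitness(c, target))


def evolve(target, population_size=100, mutation_rate=0.05,
           max_generations=1000, seed=0):
    """Return (closest candidate, generations used). Target must use ALPHABET."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in target)
        for _ in range(population_size)
    ]
    for generation in range(max_generations):
        scores = [fitness(c, target) for c in population]
        best = population[scores.index(max(scores))]
        if best == target:
            return best, generation
        children = []
        while len(children) < population_size:
            # Parents are drawn with probability proportional to fitness.
            parent_a, parent_b = rng.choices(population, weights=scores, k=2)
            for child in crossover(parent_a, parent_b, rng):
                children.append(mutate(child, mutation_rate, rng))
        population = children[:population_size]
    return best_of(population, target), max_generations


if __name__ == "__main__":
    closest, generations = evolve("Genetic Algorithm")
    print(f"[{generations} Closest Candidate: '{closest}']")
```

With this fitness formula, a candidate that is one alphabet step away from the target scores roughly 1.00, and the search may stop at `max_generations` without an exact match.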
$ python mlfromscratch/unsupervised_learning/apriori.py
+-------------+
| Apriori |
+-------------+
Minimum Support: 0.25
Minimum Confidence: 0.8
Transactions:
[1, 2, 3, 4]
[1, 2, 4]
[1, 2]
[2, 3, 4]
[2, 3]
[3, 4]
[2, 4]
Frequent Itemsets:
[1, 2, 3, 4, [1, 2], [1, 4], [2, 3], [2, 4], [3, 4], [1, 2, 4], [2, 3, 4]]
Rules:
1 -> 2 (support: 0.43, confidence: 1.0)
4 -> 2 (support: 0.57, confidence: 0.8)
[1, 4] -> 2 (support: 0.29, confidence: 1.0)
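The support and confidence values in the rules above can be checked directly from the transaction list. The following sketch only recomputes those two measures for the three printed rules; it does not implement the Apriori candidate generation itself, and the `support` and `confidence` helpers are hypothetical names, not the API of `apriori.py`.

```python
transactions = [
    {1, 2, 3, 4},
    {1, 2, 4},
    {1, 2},
    {2, 3, 4},
    {2, 3},
    {3, 4},
    {2, 4},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


rules = [({1}, {2}), ({4}, {2}), ({1, 4}, {2})]
for antecedent, consequent in rules:
    s = support(antecedent | consequent, transactions)
    c = confidence(antecedent, consequent, transactions)
    print(f"{sorted(antecedent)} -> {sorted(consequent)} "
          f"(support: {s:.2f}, confidence: {c:.2f})")
```

For example, items 2 and 4 appear together in 4 of the 7 transactions (support 0.57), and item 4 appears in 5 of them, so the rule 4 -> 2 has confidence 4/5 = 0.8.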
- Adaboost
- Bayesian Regression
- Decision Tree
- Deep Learning
  - Layers
    - Activation Layer
    - Average Pooling Layer
    - Batch Normalization Layer
    - Constant Padding Layer
    - Convolutional Layer
    - Dropout Layer
    - Flatten Layer
    - Fully-Connected (Dense) Layer
    - Max Pooling Layer
    - Reshape Layer
    - Up Sampling Layer
    - Zero Padding Layer
  - Model Types
    - Convolutional Neural Network
    - Multilayer Perceptron
- Gradient Boosting
- K Nearest Neighbors
- Linear Discriminant Analysis
- Linear Regression
- Logistic Regression
- Multi-class Linear Discriminant Analysis
- Naive Bayes
- Perceptron
- Polynomial Regression
- Random Forest
- Ridge Regression
- Support Vector Machine
- XGBoost
- Apriori
- DBSCAN
- FP-Growth
- Gaussian Mixture Model
- Generative Adversarial Network
- Genetic Algorithm
- K-Means
- Partitioning Around Medoids
- Principal Component Analysis
Feel free to reach out if there's some implementation you would like to see here, or if you're just feeling social.