AutoML

Supervised and unsupervised learning utilizing feature engineering and using an exhaustive search of ideal pre-processing (if any), algorithms, and hyper-parameters.

This program is currently tuned towards binary classification such as those seen in medicine (eg. disease or no disease).

Environment Setup

Below are the steps to setup a new environment for running MILO-ML on a Debian based machine. The below steps are based on a fresh Debian 10.1 install. If you already have a working Python environment, you can skip this section.

Linux (Debian)

# Install dependencies
apt install build-essential nodejs libpython3.7-dev python-virtualenv rabbitmq-server libomp-dev

# Setup virtual environment
virtualenv -p python3 milo-env

# Change to this environment
source milo-env/bin/activate

# Always source this environment (optional)
printf "\nsource /home/<username>/milo-env/bin/activate" >> .bashrc

MacOSX

# Install brew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

# Install dependencies
brew install rabbitmq gcc freetype node
pip3 install virtualenv

# Setup virtual environment
virtualenv -p python3 milo-env

# Change to this environment
source milo-env/bin/activate

# Always source this environment (optional)
printf "\nsource /home/<username>/milo-env/bin/activate" >> .bash_profile

Install

Perform the following steps to install the application:

git clone [email protected]:v3/milo-ml/MILO-ML/AutoML
cd AutoML
npm install

# For future updates, simply run:
npm run update

Data / Input

Currently data is expected to be in CSV format in two files:

train.csv: The complete training set which will be split into a train and test set. This data set should be balanced 50/50. If the data set is not balanced a warning will be shown (when run in via the CLI).

test.csv: A secondary data set which will be used independent of model generation and will be used to determine the generalizability of the model. This data set reflects the prevalence of disease being classified (eg. not balanced).

Command Line Interface

To run the program simply execute the following command:

# Replace TARGET with the name of the column you are targeting
python cli.py train.csv test.csv TARGET | tee report.txt

This will execute the program and send the output to both the terminal and the file report.txt. It will also output report.csv which contains the summary of all models generated.

If you do not pass a train and test spreadsheet, sample data contained within sample-data will be used.

Web Service

Running the application as a service with an HTTP API and Angular SPA front end can be done by using the following command:

npm run serve

Unit Testing

# Run all unit tests
npm test

Documentation

VuePress is used to generate the documentation which can be run as follows:

cd docs
npm run dev

Git Submodules

This repository uses a git submodule.

If you cloned the repository without using --recursive, then you can initialize and clone the submodule with the following steps.

Init the submodule
```
git submodule init
```
Update the submodule
```
git submodule update --remote
```

For more advanced usage, please refer to the git documentation: https://git-scm.com/book/en/v2/Git-Tools-Submodules

Name		Name	Last commit message	Last commit date
Latest commit History 1,519 Commits
.github/workflows		.github/workflows
.vscode		.vscode
api		api
build		build
client		client
data		data
docs		docs
ml		ml
preprocessor @ 7f5084c		preprocessor @ 7f5084c
sample-data		sample-data
scripts		scripts
ui		ui
.dockerignore		.dockerignore
.editorconfig		.editorconfig
.firebaserc		.firebaserc
.gitignore		.gitignore
.gitmodules		.gitmodules
Dockerfile		Dockerfile
GitVersion.yml		GitVersion.yml
LICENSE.md		LICENSE.md
README.md		README.md
azure-pipelines.yml		azure-pipelines.yml
cli.py		cli.py
firebase.json		firebase.json
firestore.indexes.json		firestore.indexes.json
firestore.rules		firestore.rules
notebook.ipynb		notebook.ipynb
package-lock.json		package-lock.json
package.json		package.json
public_key.pem		public_key.pem
pylintrc		pylintrc
requirements.txt		requirements.txt
server.py		server.py
uwsgi.ini		uwsgi.ini
worker.py		worker.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AutoML

Environment Setup

Linux (Debian)

MacOSX

Install

Data / Input

Command Line Interface

Web Service

Unit Testing

Documentation

Git Submodules

About

Releases

Packages

Languages

License

MILO-ML/AutoML

Folders and files

Latest commit

History

Repository files navigation

AutoML

Environment Setup

Linux (Debian)

MacOSX

Install

Data / Input

Command Line Interface

Web Service

Unit Testing

Documentation

Git Submodules

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages