This is an experimental, platform-independent machine learning library. Born from the desire to implement modern machine learning algorithms by hand, the project has grown considerably and now provides basic algorithms for various classification, regression and clustering tasks.
For further information on implemented algorithms and usage examples, please consult the project's website.
Please consult the API documentation for detailed and up-to-date information on the algorithms, e.g. the implemented hyperparameters.
- random
- logistic regression
- perceptron
- k-nearest neighbors
- decision tree
- multilayer neural network
- naive Bayes
- boosted decision tree
- random forest
- SVM with linear kernel
- SVM with non-linear kernel
- convolutional neural network
- recurrent neural network
- random
- linear regression
- decision tree
- Bayes
- k-nearest neighbors
- neural network regression
- SVM
- k-means
- self-organizing map
- hierarchical clustering
- expectation-maximization
- mean-shift
- extension of linear models to polynomial dependencies via feature transformation
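The feature transformation mentioned in the last item can be illustrated with a short Python sketch. It is not MLlab's implementation; the function name `polynomial_features` and its signature are assumptions made for this example. The idea is to map each input vector to all monomials of its entries up to a chosen degree, so that a model which is linear in the transformed features is polynomial in the original ones.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(x, degree):
    """Return all monomials of the entries of x with total degree 1..degree.

    Hypothetical helper for illustration, not part of MLlab's API.
    """
    features = []
    for d in range(1, degree + 1):
        # Each index tuple (i, j, ...) selects one monomial x[i] * x[j] * ...
        for idx in combinations_with_replacement(range(len(x)), d):
            features.append(prod(x[i] for i in idx))
    return features


# For x = (2, 3) and degree 2 the monomials are x1, x2, x1^2, x1*x2, x2^2.
print(polynomial_features([2.0, 3.0], 2))  # prints [2.0, 3.0, 4.0, 6.0, 9.0]
```

Feeding the transformed vectors into an unchanged linear regression or linear classifier then fits a polynomial dependency without touching the model itself.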
This app uses sbt as its build tool.
Install a Java JDK, then install sbt:
echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
sudo apt-get update
sudo apt-get install sbt
# sbt new sbt/scala-seed.g8 # set up a dummy project
From sbt:
cd mllab
sbt
# compile and run the app, use ~ for automatic updates and recompilation
[~]run # run the default random classifier
run --help # get more information on options and commands
test # compile and execute tests
compile # only compile the app
console # start scala console for this project
Create test data in the data directory:
python3 bin/create_data.py --reg linear # create dummy regression data
python3 bin/create_data.py --clf circles # create dummy classification data
Then run MLlab on it, e.g. with sbt run --clf DecisionTree --data data
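To give an idea of what dummy classification data of the "circles" kind could look like, here is a minimal Python sketch. It does not reproduce `bin/create_data.py`, whose implementation and output format are not described here. The function `make_circles`, its parameters and the two-ring layout are assumptions made for illustration.

```python
import math
import random


def make_circles(n_per_class, radii=(1.0, 2.0), noise=0.1, seed=0):
    """Sample points on concentric rings; the ring index is the class label.

    Hypothetical generator for illustration, not the project's script.
    """
    rng = random.Random(seed)
    samples = []
    for label, radius in enumerate(radii):
        for _ in range(n_per_class):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            # Gaussian noise on the radius blurs the ring
            r = radius + rng.gauss(0.0, noise)
            samples.append((r * math.cos(angle), r * math.sin(angle), label))
    return samples


data = make_circles(100)
print(len(data), "samples, first one:", data[0])
```

Data of this shape is not linearly separable, which makes it a useful test case for non-linear classifiers such as a decision tree.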
Build the Docker image yourself and publish it:
docker build -t mllab . # build the image
docker run -it mllab bash # run it interactively
docker login
docker tag mllab andbot/mllab # add optional tag with `:tag`
docker push andbot/mllab
Or download the latest version from Docker Hub:
docker pull andbot/mllab # pull it
docker run andbot/mllab # pull & run it
docker run -it andbot/mllab bash # open interactive shell - don't forget to run `./init.sh` by hand!
This will package everything in a fat jar, using sbt-assembly.
sbt assembly
Run the assembled jar, e.g. from Python as in examples/run_jar.py.
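A rough Python sketch of launching the jar could look like the following. It is not the content of `examples/run_jar.py`. The jar path is hypothetical, since the actual file name under `target/` depends on the build settings, and the sketch assumes the jar accepts the same `--clf` and `--data` options as `sbt run`.

```python
import subprocess
from pathlib import Path

# Hypothetical path: adjust it to the jar that `sbt assembly` actually produced
JAR_PATH = Path("target/mllab-assembly.jar")


def build_command(jar_path, args):
    """Assemble the `java -jar` command line for the given jar and options."""
    return ["java", "-jar", str(jar_path), *args]


# Assumption: the jar takes the same options as `sbt run`
command = build_command(JAR_PATH, ["--clf", "DecisionTree", "--data", "data"])

if JAR_PATH.exists():
    # check=True raises CalledProcessError if the JVM exits with an error
    subprocess.run(command, check=True)
else:
    print("jar not found, would run:", " ".join(command))
```

Passing the arguments as a list avoids shell quoting issues and keeps the call portable across platforms.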
sbt doc
This will check the code style, using scalastyle and the Linter compiler plugin.
sbt
scalastyle # style check
[compile, run] # linter runs as compilation hook
Everyone is welcome to contribute! I would especially appreciate help with
- a web interface for trying out the library: select datasets, algorithms and hyperparameters, run the analysis and perform a hyperparameter grid search
PRs and issues are always welcome.
Please write unit tests for your methods.
This code is developed and maintained by me. List of contributors in alphabetical order:
- Simon Spannagel
- maybe you? 😆