- Upgrade python version to include 3.12 and 3.13 - Issue #144 by @sarahmish
- Add python 3.11 to MLBlocks - Issue #143 by @sarahmish
- Support python 3.9 and 3.10 - Issue #141 by @sarahmish
- Update `numpy` dependency and isolate tests - Issue #139 by @sarahmish
- Update NumPy dependency - Issue #136 by @sarahmish
- Support dynamic inputs and outputs - Issue #134 by @pvk-developer
- Stop pipeline fitting after the last block - Issue #131 by @sarahmish
- Add memory debug and profiling - Issue #130 by @pvk-developer
- Update Python support - Issue #129 by @csala
- Get execution time for each block - Issue #127 by @sarahmish
- Allow loading a primitive or pipeline directly from the JSON path - Issue #114 by @csala
- Pipeline Diagrams - Issue #113 by @erica-chiu
- Get Pipeline Inputs - Issue #112 by @erica-chiu
- Ability to return intermediate context - Issue #110 by @csala
- Support for static or class methods - Issue #107 by @csala
- Improved intermediate outputs management - Issue #105 by @csala
- Allow passing fit and produce arguments as `init_params` - Issue #96 by @csala
- Support optional fit and produce args and arg defaults - Issue #95 by @csala
- Isolate primitives from their hyperparameters dictionary - Issue #94 by @csala
- Add functions to explore the available primitives and pipelines - Issue #90 by @csala
- Add primitive caching - Issue #22 by @csala
- Support flat hyperparameter dictionaries - Issue #92 by @csala
- Load pipelines by name and register them as `entry_points` - Issue #88 by @csala
- Implement partial re-fit - Issue #61 by @csala
- Move argument parsing to MLBlock - Issue #86 by @csala
- Allow getting intermediate outputs - Issue #58 by @csala
- New primitives discovery system based on `entry_points`.
- Conditional Hyperparameters filtering in MLBlock initialization.
- Improved logging and exception reporting.
- Add a new multi-table dataset.
- Add Unit Tests up to 50% coverage.
- Improve documentation.
- Fix minor bug in newsgroups dataset.
- Add new methods to Dataset class.
- Add documentation for the datasets module.
- Implement save and load methods for MLPipelines
- Add more datasets
- Add mlblocks.datasets module with demo data download functions.
- Extensive documentation, including multiple pipeline examples.
A new MLBlocks API and Primitive format.
This is a summary of the changes:
- Primitive JSONs and Python code have been moved to a different repository, called MLPrimitives.
- Optional usage of multiple JSON primitive folders.
- JSON format has been changed to allow more flexibility and features:
  - input and output arguments, as well as argument types, can be specified for each method
  - both classes and functions are supported as primitives
  - multitype and conditional hyperparameters fully supported
  - data modalities and primitive classifiers introduced
  - metadata such as documentation, description and author fields added
- Parsers are removed, and now the MLBlock class is responsible for loading and reading the JSON primitive.
- Multiple blocks of the same primitive are supported within the same pipeline.
- Arbitrary inputs and outputs for both pipelines and blocks are allowed.
- Shared variables during pipeline execution, usable by multiple blocks.
- Disable some NetworkX functions due to incompatibilities with some types of graphs.
- Improve the NetworkX primitives.
- Add String Vectorization and Datetime Featurization primitives.
- Refactor some Keras primitives to work with single dimension `y` arrays and be compatible with `pickle`.
- Add XGBClassifier and XGBRegressor primitives.
- Add some `keras.applications` pretrained networks as preprocessing primitives.
- Add helper class to allow function primitives.
- Support passing hyperparams as nested dicts.
- Add LSTM classifier and regressor primitives.
- Add OneHotEncoder and MultiLabelEncoder primitives.
- Add several NetworkX graph featurization primitives.
- Add `community.best_partition` primitive.
- Add LightFM primitive.
- Allow passing `init_params` on `MLPipeline` creation.
- Fix bug with MLHyperparam types and Keras.
- Rename `produce_params` as `predict_params`.
- Add SingleCNN Classifier and Regressor primitives.
- Simplify and improve Trivial Predictor
- Improve RandomForest primitive ranges
- Improve DFS primitive
- Add Tree Based Feature Selection primitives
- Fix bugs in TrivialPredictor
- Improved documentation
- Fix bug in TrivialMedianPredictor
- Fix bug in OneHotLabelEncoder
- New project structure and primitives for integration into MIT-TA2.
- MIT-TA2 default pipelines and single table pipelines fully working.
- First release on PyPI.