Releases: dotnet/machinelearning

ML.NET 1.3.1

06 Aug 10:50
d1d5e1f

New Features

  • Deep Neural Networks Training (PREVIEW) (#4057)
    Introduces the in-preview 0.15.1 Microsoft.ML.DNN package, which enables full DNN model retraining and transfer learning in .NET using the C# bindings for TensorFlow provided by TensorFlow.NET. The goal of this package is to support high-level DNN training and scoring tasks such as image classification, text classification, and object detection through simple yet powerful APIs that are framework agnostic, although they currently use only TensorFlow as the backend. The APIs below are in early preview, and we hope to get customer feedback that we can incorporate in the next iteration.

    DNN stack

    public static DnnEstimator RetrainDnnModel(
              this ModelOperationsCatalog catalog,
              string[] outputColumnNames,
              string[] inputColumnNames,
              string labelColumnName,
              string tensorFlowLabel,
              string optimizationOperation,
              string modelPath,
              int epoch = 10,
              int batchSize = 20,
              string lossOperation = null,
              string metricOperation = null,
              string learningRateOperation = null,
              float learningRate = 0.01f,
              bool addBatchDimensionInput = false,
              DnnFramework dnnFramework = DnnFramework.Tensorflow)
    
    public static DnnEstimator ImageClassification(
              this ModelOperationsCatalog catalog,
              string featuresColumnName,
              string labelColumnName,
              string outputGraphPath = null,
              string scoreColumnName = "Score",
              string predictedLabelColumnName = "PredictedLabel",
              string checkpointName = "_retrain_checkpoint",
              Architecture arch = Architecture.InceptionV3,
              DnnFramework dnnFramework = DnnFramework.Tensorflow,
              int epoch = 10,
              int batchSize = 20,
              float learningRate = 0.01f,
              bool measureTrainAccuracy = false)

    Design specification

    Image classification (Inception V3) sample

    Image classification (Resnet V2 101) sample

  • Database Loader (PREVIEW) (#4035)
    Introduces a database loader that enables training on databases. The loader supports any relational database supported by System.Data in .NET Framework or .NET Core, so you can use many RDBMSs such as SQL Server, Azure SQL Database, Oracle, PostgreSQL, and MySQL. This feature is in early preview and is available through the Microsoft.ML.Experimental NuGet package.

    Design specification

    Sample

    public static DatabaseLoader CreateDatabaseLoader(this DataOperationsCatalog catalog,
              params DatabaseLoader.Column[] columns)

Bug Fixes

Serious

  • SaveOnnxCommand appears to ignore predictors when saving a model to ONNX format: This broke the export-to-ONNX functionality. (#3974)

  • Unable to use the fasterrcnn ONNX model. (#3963)

  • PredictedLabel is always true for Anomaly Detection: This bug disabled scenarios like fraud detection using binary classification/PCA. (#4039)

  • Update build certifications: This bug broke the official builds because of outdated certificates that were being used. (#4059)

Other

  • Stop LightGbm Warning for Default Metric Input: Fixes the warning "LightGBM Warning Unknown parameter metric=" that was produced when the default metric was used. (#3965)

Samples

Breaking Changes

None

Enhancements

  • Farewell to the Static API (#4009)

  • AVX and FMA intrinsics in Factorization Machine (#3940)

CLI and AutoML API

  • Bug fixes.

Remarks

ML.NET v1.2.0

03 Jul 05:44
1c1d3a4

General Availability

  • Microsoft.ML.TimeSeries

    • Anomaly detection algorithms (Spike and Change Point):
      • Independent and identically distributed.
      • Singular spectrum analysis.
      • Spectral residual from Azure Anomaly Detector/Kensho team.
    • Forecasting models:
      • Singular spectrum analysis.
    • Prediction Engine for online learning
      • Enables updating a time series model with new observations at scoring time, so that the user does not have to retrain the model on old data each time.

    Samples

  • Microsoft.ML.OnnxTransformer
    Enables scoring of ONNX models in the learning pipeline. Uses ONNX Runtime v0.4.

    Sample

  • Microsoft.ML.TensorFlow
    Enables scoring of TensorFlow models in the learning pipeline. Uses TensorFlow v1.13. Very useful for image and text classification. Users can featurize images or text using DNN models and feed the result into a classical machine learning model like a decision tree or logistic regression trainer.

    Samples

New Features

  • Tree-based featurization (#3812)

    Generating features from tree structure is a popular technique in data mining. It is useful for capturing feature interactions when creating a stacked model, for dimensionality reduction, or for featurizing towards an alternative label. ML.NET's tree featurization trains a tree-based model and then maps each input feature vector to several non-linear feature vectors. The generated feature vectors are:

    • The leaves the input falls into: a binary vector with ones at the indexes of the reached leaves,
    • The paths that the input vector passes along before reaching the leaves, and
    • The values of the reached leaves.
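
    The first of these outputs, the leaf indicator vector, can be sketched in plain C# without any ML.NET dependency. This is only an illustration of the idea, not ML.NET's implementation: the ToyTree type, the LeafIndicators helper, and the two hand-written trees are all hypothetical stand-ins for a trained ensemble.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical illustration only: hand-written "trees" stand in for a trained
    // ensemble. None of these types are part of ML.NET.
    public static class LeafFeaturizerSketch
    {
        // A toy tree: the number of leaves plus a function that maps an input
        // vector to the index of the leaf it reaches.
        public sealed class ToyTree
        {
            public int LeafCount { get; }
            public Func<float[], int> FindLeaf { get; }

            public ToyTree(int leafCount, Func<float[], int> findLeaf)
            {
                LeafCount = leafCount;
                FindLeaf = findLeaf;
            }
        }

        // Concatenates one one-hot block per tree; a 1 marks the leaf the input reached.
        public static float[] LeafIndicators(IReadOnlyList<ToyTree> trees, float[] input)
        {
            int total = 0;
            foreach (var tree in trees) total += tree.LeafCount;

            var result = new float[total];
            int offset = 0;
            foreach (var tree in trees)
            {
                result[offset + tree.FindLeaf(input)] = 1f;
                offset += tree.LeafCount;
            }
            return result;
        }

        public static void Main()
        {
            var trees = new List<ToyTree>
            {
                // Tree 0: two leaves, split on feature 0.
                new ToyTree(2, x => x[0] <= 0.5f ? 0 : 1),
                // Tree 1: three leaves, split on feature 1 and then on feature 0.
                new ToyTree(3, x => x[1] <= 1.0f ? 0 : (x[0] <= 2.0f ? 1 : 2)),
            };

            float[] features = LeafIndicators(trees, new[] { 0.7f, 3.0f });
            Console.WriteLine(string.Join(", ", features)); // prints: 0, 1, 0, 1, 0
        }
    }
    ```

    The resulting vector has one slot per leaf across all trees, which is why it can serve as a sparse, non-linear input to a downstream linear model.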

    Here are two references.

    Samples

  • Microsoft.Extensions.ML integration package. (#3827)

    This package makes it easier to use ML.NET with app models that support Microsoft.Extensions, such as ASP.NET and Azure Functions.

    Specifically it contains functionality for:

    • Dependency Injection
    • Pooling PredictionEngines
    • Reloading models when the file or URI has changed
    • Hooking ML.NET logging to Microsoft.Extensions.Logging

Bug Fixes

Serious

  • Time series Sequential Transform needs to have a binding mechanism: This bug made it impossible to use time series in NimbusML. (#3875)

  • Build errors resulting from upgrading to VS2019 compilers: The default CMAKE_C_FLAG for debug configuration sets /ZI to generate a PDB capable of edit and continue. In the new compilers, this is incompatible with /guard:cf which we set for security reasons. (#3894)

  • LightGBM Evaluation metric parameters: If a user specified EvaluateMetricType.Default in LightGbm, the metric was not added to the options Dictionary, and LightGbmWrappedTraining threw as a result. (#3815)

  • Change default EvaluationMetric for LightGbm: In ML.NET, the default EvaluationMetric for LightGbm was set to EvaluateMetricType.Error for multiclass, EvaluateMetricType.LogLoss for binary, and so on. This led to inconsistent behavior from the user's perspective. (#3859)

Other

  • CustomGains should allow multiple values in argument attribute. (#3854)

Breaking Changes

None

Enhancements

  • Replaces the hardcoded sigmoid value of -0.5 with the value specified during training. (#3850)

  • Fix TextLoader constructor and add exception message. (#3788)

  • Introduce the FixZero argument to the LogMeanVariance normalizer. (#3916)

  • Ensemble trainers now work with ITrainerEstimators instead of ITrainers. (#3796)

  • LightGBM Unbalanced Data Argument. (#3925)

  • Tree based trainers implement ICanGetSummaryAsIDataView. (#3892)

  • CLI and AutoML API

    • Internationalization fixes to generate proper ML.NET C# code. (#3725)
    • Automatic Cross Validation for small datasets, and CV stability fixes. (#3794)
    • Code cleanup to match .NET style. (#3823)

Documentation and Samples

  • Samples for applying ONNX model to in-memory images. (#3851)
  • Reformatted all ~200 samples to an 85-character width so that the horizontal scrollbar does not appear on the docs webpage. (#3930, #3941, #3949, #3950, #3947, #3943, #3942, #3946, #3948)

Remarks

  • Roughly 200 GitHub issues were closed; the open issue count decreased from ~550 to 351. Most of these issues were resolved by the release of the stable API and the availability of samples.

ML.NET v1.1.0

04 Jun 22:41
d5c4e94

New Features

  • Image type support in IDataView
    PR#3263 added support for in-memory images as a type in IDataView. Previously it was not possible to use an image directly in IDataView; the user had to specify the file path as a string and load the image using a transform. The feature resolved the following issues: 3162, 3723, 3369, 3274, 445, 3460, 2121, 2495, 3784.

    Image type support in IDataView was a feature much requested by users.

    Sample to convert gray scale image in-Memory | Sample for custom mapping with in-memory using custom type

  • Super-Resolution based Anomaly Detector (preview, please provide feedback)
    PR#3693 adds a new anomaly detection algorithm to the Microsoft.ML.TimeSeries NuGet package. This algorithm is based on Super-Resolution using Deep Convolutional Networks and was accepted at the KDD 2019 conference as an oral presentation. One advantage of this algorithm is that it does not require any prior training. In benchmarks that used a grid parameter search to find upper bounds, it outperforms the Independent and Identically Distributed (IID) and Singular Spectrum Analysis (SSA) based anomaly detection algorithms on recall and F1, as the table below shows. This contribution comes from the Azure Anomaly Detector team.

    Algo                    | Precision | Recall | F1    | #TruePositive | #Positives | #Anomalies | Fine tuned parameters
    SSA (requires training) | 0.582     | 0.585  | 0.583 | 2290          | 3936       | 3915       | Confidence=99, PValueHistoryLength=32, Season=11, and use half the data of each series to do the training.
    IID                     | 0.668     | 0.491  | 0.566 | 1924          | 2579       | 3915       | Confidence=99, PValueHistoryLength=56
    SR                      | 0.601     | 0.670  | 0.634 | 2625          | 4370       | 3915       | WindowSize=64, BackAddWindowSize=5, LookaheadWindowSize=5, AveragingWindowSize=3, JudgementWindowSize=64, Threshold=0.45
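
    For readers comparing the rows, the F1 column is consistent with the usual harmonic mean of the Precision and Recall columns. The worked line below uses the SR row's published values; it assumes the standard F1 definition, which the release notes do not state explicitly.

    ```latex
    F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}},
    \qquad
    F_1^{\text{SR}} = \frac{2 \cdot 0.601 \cdot 0.670}{0.601 + 0.670} = \frac{0.80534}{1.271} \approx 0.634
    ```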

    Sample for anomaly detection by SRCNN | Sample for anomaly detection by SRCNN using batch prediction

  • Time Series Forecasting (preview, please provide feedback)
    PR#1900 introduces a framework for time series forecasting models and exposes an API for a Singular Spectrum Analysis (SSA) based forecasting model in the Microsoft.ML.TimeSeries NuGet package. This framework lets you forecast with or without confidence intervals, update the model with new observations, and save and load the model to and from persistent storage. This closes issues 929 and 3151 and has been a much-requested feature in the GitHub community since September 2018. With this change, the Microsoft.ML.TimeSeries NuGet package is feature complete for RTM.

    Sample for forecasting | Sample for forecasting using confidence intervals

Bug Fixes

Serious

  • Math Kernel Library fails to load with latest libomp: Fixed by PR#3721. This bug made it impossible for anyone to check code into the master branch because it caused build failures.

  • Transform Wrapper fails at deserialization: Fixed by PR#3700. This bug affected a first-party (1P) customer. A model trained using NimbusML (Python bindings for ML.NET) and then loaded for scoring/inferencing using ML.NET hit this bug.

  • Index out of bounds exception in KeyToVector transformer: Fixed by PR#3763, which closes the following GitHub issues: 3757, 1751, 2678. The bug affected a first-party customer as well as GitHub users.

Other

  • Download images only when not present on disk and print warning messages when converting unsupported pixel format by PR#3625
  • ML.NET source code does not build in VS2019 by PR#3742
  • Fix SoftMax precision by utilizing double in the internal calculations by PR#3676
  • Fix to the official build due to API Compat tool change by PR#3667
  • Check for number of input columns in concat transform by PR#3809
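
The SoftMax precision item above describes doing the internal calculations in double. The sketch below shows that general idea in plain C#; it is not ML.NET's internal code, and the SoftMaxSketch class and its SoftMax method are hypothetical names chosen for illustration. Inputs and outputs stay float, while the exponentials and their sum are accumulated in double.

```csharp
using System;

// Sketch only: not ML.NET's internal code. It takes float scores, performs the
// exponentials and the normalizing sum in double, and converts back to float.
public static class SoftMaxSketch
{
    public static float[] SoftMax(float[] scores)
    {
        if (scores == null || scores.Length == 0)
            throw new ArgumentException("At least one score is required.", nameof(scores));

        // Subtracting the maximum keeps every exponent at or below zero,
        // so Math.Exp cannot overflow.
        double max = scores[0];
        foreach (float s in scores) max = Math.Max(max, s);

        var exps = new double[scores.Length];
        double sum = 0.0;
        for (int i = 0; i < scores.Length; i++)
        {
            exps[i] = Math.Exp(scores[i] - max);
            sum += exps[i];
        }

        var result = new float[scores.Length];
        for (int i = 0; i < scores.Length; i++)
            result[i] = (float)(exps[i] / sum);
        return result;
    }

    public static void Main()
    {
        // Large scores like these would overflow a naive exp() in single precision.
        float[] probabilities = SoftMax(new[] { 1000f, 1001f, 1002f });
        foreach (float p in probabilities) Console.WriteLine(p);
    }
}
```

Converting to float only at the end means rounding error is introduced once per output rather than at every intermediate step.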

Breaking Changes

None

Enhancements

  • API Compat tool by PR#3623 ensures future changes to ML.NET will not break the stable API released in 1.0.0.
  • Upgrade the TensorFlow version from 1.12.0 to 1.13.1 by PR#3758
  • API for saving time series model to stream by PR#3805

Documentation and Samples

  • L1-norm and L2-norm regularization documentation by PR#3586
  • Sample for data save and load from text and binary files by PR#3745
  • Sample for LoadFromEnumerable with a SchemaDefinition by PR#3696
  • Sample for LogLossPerClass metric for multiclass trainers by PR#3724
  • Sample for WithOnFitDelegate by PR#3738
  • Sample for loading data using text loader using various techniques by PR#3793

Remarks

ML.NET v1.0.0

03 May 23:34

ML.NET is now 1.0.0. 🍰

This is our stable API. In this final sprint we have worked mainly on improving the documentation. Please let us know what you like about ML.NET and what we can improve to make your use of machine learning easier in .NET. With this release we are committed to staying backward compatible.

Release Notes
Download and Install

ML.NET v1.0.0-preview

03 Apr 00:21
62a5b34
Pre-release

This is the RC1 release for ML.NET version 1.0.0. The work on the API project has concluded. The focus before releasing version 1.0.0 will be on enhancing documentation and samples and on addressing any critical issues. Please note that the NuGet packages now have 1.0.0-preview as well as 0.12.0-preview versions, depending on whether the package will become part of the stable release. Also, IDataView is now in the Microsoft.ML namespace. As always, thank you so much for being an awesome community of machine learning enthusiasts.

Release Notes
Download and Install

ML.NET v0.11

06 Mar 00:22
f7043df
Pre-release

A lot more API cleanup as well as many fixes are packed into this release! We are quickly approaching the RC1 release for ML.NET, and our first priority is to complete the API-related work. Thank you for being patient while we get closer to our stable surface. We are super excited to work through the remaining issues and ship V1.0. In fact, we are so excited that the release notes mentioned that FastTree has a new package now. That is partially true, as you can see in our nightly builds, but 0.11 still does not have a separate package. Oh well! :)

Release Notes
Download and Install

ML.NET v0.10

05 Feb 21:13
f41f774
Pre-release

More API cleanup as well as many fixes are in this release. We are preparing for our stable API in the 1.0 release and greatly appreciate the community feedback and engagement. Please note that IDataView is now in Microsoft.Data.DataView (#2220). Also note that #2239 changed the order of parameters, so your existing code needs to be updated.

Release Notes
Download and Install

ML.NET v0.9

09 Jan 00:18
941d9fc
Pre-release

This release brings many fixes as well as significant API cleanup. We have removed the API that was marked obsolete. The explainability features of ML.NET have also been improved, as originally planned. Thanks for all the great support as we improve the API for the 1.0 release.

Release Notes
Download and Install

ML.NET v0.8

04 Dec 18:02
51ea627
Pre-release

ML.NET 0.8 is here with some very exciting features. Explainability, stateful time series, implicit feedback in recommendations and better debuggability as well as many bug fixes are included in this release. Please note that the legacy API has been marked obsolete and will be removed in the next release. Many thanks to the awesome users and community contributors for your continuous support.

Release Notes
Download and Install

ML.NET v0.7

06 Nov 21:52
abba48a
Pre-release

ML.NET 0.7 brings multiple enhancements such as anomaly detection, matrix factorization, x86 builds, as well as custom transforms. We continue to refine our API with many exciting extensions. Thanks to everyone for your massive support and contributions in this release.

Release Notes
Download and Install