Release Version 2016.10.05.0 · mldbai/mldb

MLDB is the Machine Learning Database. It's the best way to get machine learning or AI into your applications or personal projects. Head on over to MLDB.ai to try it right now or see Running MLDB for installation details.

We're happy to announce the immediate availability of MLDB version 2016.10.05.0.

This release contains 141 new commits and modifies 903 files. On top of many bug fixes and performance improvements, here are some of the highlights of this release:

New MongoDB interface

A big new feature is support for importing and exporting data to and from MongoDB, a popular NoSQL database. Although MongoDB can be very useful for certain use cases, it doesn't have any machine learning capabilities. We want to make it as easy as possible for our users to get their data into MLDB, so we have added the following new MLDB entities that make it easy to interface with MongoDB:

mongodb.import procedure: used to import a MongoDB collection into an MLDB dataset
mongodb.dataset dataset: read-only MLDB dataset based on a MongoDB collection
mongodb.record dataset: write-only MLDB dataset that writes to a MongoDB collection
mongodb.query function: function to perform an MLDB SQL query against a MongoDB collection
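
As a rough illustration of how one of these entities might be configured, the sketch below builds a JSON payload for a mongodb.import procedure. The parameter names (uriConnectionScheme, collection, outputDataset, runOnCreation), the connection string, the collection name and the dataset id are all assumptions made for this example and are not confirmed by these notes, so check them against the MongoDB plugin documentation before use. The code only builds and prints the payload; sending it to MLDB is left out.

```python
import json

# Hypothetical configuration for a mongodb.import procedure.
# Every parameter name and value below is an assumption for illustration.
import_config = {
    "type": "mongodb.import",
    "params": {
        # Assumed connection string pointing at a local MongoDB database.
        "uriConnectionScheme": "mongodb://localhost:27017/exampledb",
        # Assumed name of the MongoDB collection to import.
        "collection": "users",
        # Assumed id and type of the MLDB dataset that receives the rows.
        "outputDataset": {"id": "users_from_mongo", "type": "sparse.mutable"},
        # Assumed flag asking MLDB to run the procedure as soon as it is created.
        "runOnCreation": True,
    },
}

# Serialize the configuration as it would appear in a request body.
payload = json.dumps(import_config, indent=2)
print(payload)
```

Once imported, the resulting dataset would be queried like any other MLDB dataset.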

Updated TensorFlow to 0.10.0

We updated the TensorFlow version shipped with MLDB to version 0.10.0. The new version includes many bug fixes and performance improvements. We're now also shipping MLDB with different TensorFlow kernels, each optimized for a different instruction set. So, for instance, the kernel with AVX2 instructions will be used if the processor on which MLDB is running supports them.

If you're interested in deep learning, make sure to check out the Tensorflow Image Recognition Tutorial and the Transfer Learning with Tensorflow demo to see how easy it is to run trained models with MLDB.

Updated V8 to Release 5.0

We have updated V8, the JavaScript engine used in MLDB, to Release 5.0. This brings in many improvements and new features, such as improved ECMAScript 2015 (ES6) support and better performance. It now also compiles for the ARM architecture, which is an important step as we work towards having MLDB run on embedded architectures.

One feature that benefits from this update is the jseval function, which makes it possible to execute arbitrary JavaScript code inline in an SQL query.
Check out the Executing JavaScript Code Directly in SQL Queries Using the jseval Function Tutorial for great examples of how jseval can be used.
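
To give a feel for what an inline jseval call looks like, here is a minimal sketch that builds a query URL containing one. It assumes that jseval takes the script, a comma-separated string of argument names, and then the argument values, and that queries can be sent to a /v1/query endpoint through a q parameter. The host, port and column alias are made up for the example. Nothing is sent over the network; the code only assembles and prints the URL.

```python
from urllib.parse import urlencode

# Assumed call shape: jseval(<script>, <argument names>, <argument values>...).
# The script doubles its single argument x, which is bound to the literal 21.
sql = "SELECT jseval('return x * 2;', 'x', 21) AS doubled"

# Hypothetical host and port; the /v1/query path is an assumption as well.
base_url = "http://localhost:8080/v1/query"

# URL-encode the SQL so it can travel safely in the query string.
query_url = base_url + "?" + urlencode({"q": sql})
print(query_url)
```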

Fixes and improvements to import procedures

The SELECT statement of the import.text procedure has been improved to support the CASE keyword. This adds extra flexibility to process data as it is being imported.
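
The sketch below shows what an import.text configuration using CASE in its select parameter might look like. The file URL, the column names (name, age), the derived column age_group and the dataset id are invented for illustration, and the parameter names are assumptions that should be checked against the import.text documentation. The code only builds and prints the configuration.

```python
import json

# CASE expression evaluated for each row while the file is being imported.
# Column names are hypothetical.
select_expr = (
    "name, "
    "CASE WHEN age >= 18 THEN 'adult' ELSE 'minor' END AS age_group"
)

# Hypothetical import.text configuration; parameter names are assumptions.
import_config = {
    "type": "import.text",
    "params": {
        "dataFileUrl": "file:///data/people.csv",  # made-up input file
        "outputDataset": {"id": "people", "type": "tabular"},
        "select": select_expr,
        "runOnCreation": True,
    },
}

print(json.dumps(import_config, indent=2))
```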

We also fixed a bug when using the NAMED clause with the import.json procedure that could cause undesired behaviour.

Updates to the classifier configuration

We have improved the user experience around configuring supervised algorithms in two ways.

First, we have clarified the documentation by creating a new Classifier configuration section that contains the information related to the configuration of supervised models. When using either of the two procedures that train models, classifier.train and classifier.experiment, all the information you need to configure your algorithm now lives in one place.

Second, we have made the training more robust to configuration errors by having better validation of elements meant to control hyper-parameters. Incorrect parameters will now trigger errors.

New vector space functions

We added two new vector space functions:

First, the new reshape(val, shape) function takes an n-dimensional embedding and reinterprets it as an N-dimensional embedding of the provided shape containing all of the elements. This allows, for example, a 1-dimensional vector to be reinterpreted as a 2-dimensional array. The shape argument is an embedding containing the size of each dimension.

Second, the new shape(val) function takes an n-dimensional embedding and returns the size of each dimension as an array.
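
The behaviour of the two functions can be modelled in a few lines of plain Python using nested lists. This is a sketch of the semantics described above, not MLDB's implementation, and it assumes that elements are laid out in row-major order, which these notes do not state.

```python
from math import prod


def shape(val):
    """Return the size of each dimension of a rectangular nested list."""
    dims = []
    while isinstance(val, list):
        dims.append(len(val))
        val = val[0] if val else None
    return dims


def flatten(val):
    """Return every scalar of a nested list in row-major order."""
    if not isinstance(val, list):
        return [val]
    out = []
    for item in val:
        out.extend(flatten(item))
    return out


def reshape(val, new_shape):
    """Reinterpret val with new_shape, keeping all elements (row-major assumed)."""
    flat = flatten(val)
    if prod(new_shape) != len(flat):
        raise ValueError("new shape must contain all of the elements")

    def build(offset, dims):
        # No dimensions left: return the scalar stored at this offset.
        if not dims:
            return flat[offset]
        # Number of scalars covered by one step along the current dimension.
        stride = prod(dims[1:])
        return [build(offset + i * stride, dims[1:]) for i in range(dims[0])]

    return build(0, list(new_shape))


matrix = reshape([1, 2, 3, 4, 5, 6], [2, 3])
print(matrix)         # [[1, 2, 3], [4, 5, 6]]
print(shape(matrix))  # [2, 3]
```

Here a 1-dimensional vector of six elements is reinterpreted as a 2-by-3 array, and shape recovers the size of each dimension.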

Other changes and fixes

The COLUMN EXPR expression now supports the STRUCTURED keyword. By default, COLUMN EXPR returns a flattened representation. Adding the STRUCTURED keyword will return the structured representation.
The tsne.train procedure now has a learningRate configuration option.
Improved speed and fixes to JOIN operations.
Columns and rows can now be named with an empty string.
When evaluating a model using the classifier.test or the classifier.experiment procedure, the F1-score was returned in a key named f. The key has been renamed to f1Score.
The HTTP layer now correctly handles the HTTP/1.1 Expect: 100-continue request header.
Fixed the ordering of paths when mixing Unicode and digits.
The user function fetcher is now available as a built-in function fetch.
Fixed a bug with the levenshtein_distance() function where it did not work properly with UTF-8 characters.
Fixed an issue with the JavaScript plugin's serveStaticFolder() function where the path to serve would not be considered relative to the plugin's installation directory.
Added the optional argument sortField to the string_agg(expr, separator [, sortField]) function, which allows the aggregated values to be sorted by sortField.
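
As an illustration of the string_agg change listed above, the following plain-Python model shows the effect an optional sort field has on the aggregated string. It is only a sketch of the described behaviour, not MLDB code. The rows, the column names (name, rank) and the Python signature are invented for the example.

```python
def string_agg(rows, expr, separator, sort_field=None):
    """Join the expr column of rows, optionally ordering rows by sort_field."""
    if sort_field is not None:
        # Order the rows by the sort field before concatenating.
        rows = sorted(rows, key=lambda row: row[sort_field])
    return separator.join(str(row[expr]) for row in rows)


# Hypothetical rows, stored in no particular order.
rows = [
    {"name": "c", "rank": 3},
    {"name": "a", "rank": 1},
    {"name": "b", "rank": 2},
]

print(string_agg(rows, "name", ", "))          # c, a, b
print(string_agg(rows, "name", ", ", "rank"))  # a, b, c
```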
