
BigDL release 0.5.0

@liu-shaojun released this 07 Mar 02:16
a6c583c

Highlights

  • Introduce a Keras-like API (Scala and Python). Users can easily run their Keras code (training and inference) on Apache Spark through BigDL. For more details, see this link.
  • Support loading TensorFlow dynamic models (e.g. LSTM, RNN) in BigDL, and support more TensorFlow operations; see this page.
  • Support combining data preprocessing and neural network layers in the same model (to simplify model deployment)
  • Speed up various modules in BigDL (BCECriterion, RMSprop, LeakyReLU, etc.)
  • Add a DataFrame-based image reader and transformer
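Among the modules sped up above, BCECriterion computes the binary cross-entropy loss. As a point of reference for what that criterion computes (a plain-Python sketch, not BigDL's optimized implementation):

```python
import math

def binary_cross_entropy(predictions, targets, eps=1e-12):
    """Mean binary cross-entropy: -mean(y*log(p) + (1-y)*log(1-p)).

    `eps` clamps predictions away from 0 and 1 to avoid log(0); this is
    a reference sketch only, not BigDL's implementation.
    """
    total = 0.0
    for p, y in zip(predictions, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp into (0, 1)
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total / len(predictions)

# A confident correct prediction yields a small loss:
loss = binary_cross_entropy([0.9, 0.1], [1.0, 0.0])  # == -log(0.9)
```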

New Features

  • Tensor can be converted to OpenCVMat
  • Introduce a new Keras-like API for Scala and Python
  • Support loading TensorFlow dynamic models (e.g. LSTM, RNN)
  • Support loading more TensorFlow operations (InvertPermutation, ConcatOffset, Exit, NextIteration, Enter, RefEnter, LoopCond, ControlTrigger, TensorArrayV3, TensorArrayGradV3, TensorArrayGatherV3, TensorArrayScatterV3, TensorArrayConcatV3, TensorArraySplitV3, TensorArrayReadV3, TensorArrayWriteV3, TensorArraySizeV3, StackPopV2, StackPop, StackPushV2, StackPush, StackV2, Stack)
  • ResizeBilinear supports NCHW
  • ImageFrame supports loading Hadoop sequence files
  • ImageFrame supports grayscale images
  • Add Kv2Tensor operation (Scala)
  • Add PGCriterion to compute the negative policy gradient given the action distribution, sampled action, and reward
  • Support gradually increasing the learning rate in LearningrateScheduler
  • Add FixExpand and add more options to AspectScale for image preprocessing
  • Add RowTransformer (Scala)
  • Support adding preprocessors to Graph, which allows users to combine preprocessing and a trainable model into one model
  • The ResNet on CIFAR-10 example supports loading images from HDFS
  • Add CategoricalColHashBucket operation (Scala)
  • Predictor supports Table as output
  • Add BucketizedCol operation (Scala)
  • Support using DenseTensor and SparseTensor together to create a Sample
  • Add CrossProduct layer (Scala)
  • Provide an option that allows users to bypass exceptions in a transformer
  • DenseToSparse layer supports disabling backward propagation
  • Add CategoricalColVocaList operation (Scala)
  • Support ImageFrame in the Python optimizer
  • Support getting the executor number and executor cores in Python
  • Add IndicatorCol operation (Scala)
  • Add TensorOp, an operation with Tensor[T]-formatted input and output that provides shortcuts to build Operations for tensor transformation via closures (Scala)
  • Provide a Dockerfile to make it easy to set up a BigDL testing environment
  • Add CrossCol operation (Scala)
  • Add MkString operation (Scala)
  • Add a prediction service interface that supports concurrent calls and accepts bytes input
  • Add SparseTensor.cast & SparseTensor.applyFun
  • Add a DataFrame-based image reader and transformer
  • Support loading TensorFlow model files saved by the tf.saved_model API
  • SparseMiniBatch supports multiple TensorDataTypes
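Several of the new operations above (BucketizedCol, CategoricalColHashBucket, CrossCol) follow the familiar feature-column pattern. A rough plain-Python illustration of the underlying transforms (function names and the hashing scheme are illustrative, not BigDL's API):

```python
import bisect
import hashlib

def bucketize(value, boundaries):
    """BucketizedCol-style transform: map a continuous value to a
    bucket index based on sorted boundaries."""
    return bisect.bisect_right(boundaries, value)

def hash_bucket(category, num_buckets):
    """CategoricalColHashBucket-style transform: hash a string category
    into a fixed number of buckets (md5 used here for determinism)."""
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

def cross_col(values, num_buckets):
    """CrossCol-style transform: cross several categorical values into
    a single hashed feature."""
    return hash_bucket("_X_".join(values), num_buckets)

age_bucket = bucketize(34, [18, 25, 35, 50])       # -> 2 (25 <= 34 < 35)
country_id = hash_bucket("US", 100)
crossed = cross_col(["US", "engineer"], 1000)
```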
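The Kv2Tensor operation above converts key-value feature strings into tensors. A hypothetical plain-Python sketch of that kind of transform (the delimiters and dense output format are assumptions, not BigDL's exact contract):

```python
def kv_to_dense(kv_string, dim, item_delim=",", kv_delim=":"):
    """Parse a "key:value,key:value" feature string into a dense
    vector of length `dim`; missing keys default to 0.0.

    Illustrative only -- Kv2Tensor's real signature may differ.
    """
    vec = [0.0] * dim
    if not kv_string:
        return vec
    for item in kv_string.split(item_delim):
        key, value = item.split(kv_delim)
        vec[int(key)] = float(value)
    return vec

dense = kv_to_dense("0:0.5,3:1.25", dim=5)  # -> [0.5, 0.0, 0.0, 1.25, 0.0]
```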
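The gradual learning-rate increase mentioned above is commonly implemented as linear warmup. A small schedule sketch in plain Python (parameter names are illustrative, not the LearningrateScheduler API):

```python
def warmup_lr(step, base_lr, target_lr, warmup_steps):
    """Linearly increase the learning rate from base_lr to target_lr
    over warmup_steps iterations, then hold it at target_lr."""
    if step >= warmup_steps:
        return target_lr
    delta = (target_lr - base_lr) / warmup_steps
    return base_lr + delta * step

# e.g. ramp from 0.01 to 0.1 over 5 steps, then stay at 0.1
schedule = [warmup_lr(s, 0.01, 0.1, 5) for s in range(7)]
```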

Enhancement

  • ImageFrame supports serialization
  • A default implementation of zeroGradParameter is added to AbstractModule
  • Improve the style of the documentation website
  • Models in different threads share weights during model training
  • Speed up LeakyReLU
  • Speed up RMSprop
  • Speed up BCECriterion
  • Support calling Java functions in the Python executor, and ModelBroadcast in Python
  • Add detailed instructions for run-on-ec2
  • Optimize the padding mechanism
  • Fix Maven compilation warnings
  • Check for duplicate layers in a container
  • Refine the document that introduces how to automatically deploy BigDL on a Dataproc cluster
  • Refactor adding extra JARs/Python packages for Python users; now only the environment variables BIGDL_JARS & BIGDL_PACKAGES need to be set
  • Implement appendColumn and avoid errors caused by API mismatches between different Spark versions
  • Add a Python Inception-training-on-ImageNet example
  • Downgrade "can't find locality partition for partition ..." to a warning message
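The duplicate-layer check above guards against adding the same layer instance to a container twice. A conceptual plain-Python sketch of such a check (not BigDL's actual implementation):

```python
class Container:
    """Minimal container that rejects duplicate layer instances,
    mirroring the duplicate-layer check described above."""
    def __init__(self):
        self.layers = []

    def add(self, layer):
        # Identity comparison: the same object added twice is an error,
        # while two distinct-but-equal layers are fine.
        if any(existing is layer for existing in self.layers):
            raise ValueError("layer already added to this container")
        self.layers.append(layer)
        return self

net = Container()
linear = object()   # stand-in for a layer
net.add(linear)     # ok; net.add(linear) again would raise ValueError
```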

API change

  • Move the DataFrame-based API to the dlframe package
  • Refine the Container hierarchy: the add method (used in Sequential, Concat, …) is moved to a subclass, DynamicContainer
  • Refine the serialization code hierarchy
  • Dynamic Graph has become an internal class that is only used to run TensorFlow models
  • Operation is not allowed to be used outside Graph
  • Make the getParameters method final and private[bigdl]; it should only be used in model training
  • Remove the updateParameter method, which was only used in internal tests
  • Some TensorFlow-related operations are marked as internal and should only be used when running TensorFlow models
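The Container refinement above moves the mutable add method off the base class, so containers that are not meant to grow expose no mutation API. A rough Python analogy of that hierarchy (the class names follow the release notes; everything else is illustrative):

```python
class Container:
    """Base container: holds layers but exposes no add method,
    like the refined Container described above."""
    def __init__(self, layers=None):
        self.layers = list(layers or [])

class DynamicContainer(Container):
    """Subclass that owns the add method, so only containers meant to
    be extended (e.g. Sequential-style ones) can grow."""
    def add(self, layer):
        self.layers.append(layer)
        return self  # allow chaining: c.add(a).add(b)

seq = DynamicContainer().add("linear").add("relu")
```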

Bug Fix

  • Fix a sparse Sample batch bug: it should add another dimension instead of concatenating the original tensor
  • Fix some activations and layers that don't work in TimeDistributed and RnnCell
  • Fix a bug in the SparseTensor resize method
  • Fix a bug when converting SparseTensor to DenseTensor
  • Fix a bug in SpatialFullConvolution
  • Fix a bug in the Cosine equal method
  • Fix optimization state getting mixed up when optimizer.optimize() is called multiple times
  • Fix a bug in Recurrent forward after invoking reset
  • Fix a bug in in-place LeakyReLU
  • Fix a bug when saving/loading bidirectional RNN layers
  • Fix getParameters() in a submodule creating new storage when parameters are shared by the parent module
  • Fix some incompatible syntax between Python 2.7 and 3.6
  • Fix save/load of a graph losing stop-gradient information
  • Fix a bug in SReLU
  • Fix a bug in DLModel
  • Fix a sparse tensor dot product bug
  • Fix a Maxout serialization issue
  • Fix serialization issues in some customized Faster R-CNN models
  • Fix and refine some example documentation instructions
  • Fix a bug in the export_tf_checkpoint.py script
  • Fix a bug in setting up the Python package
  • Fix pickler initialization issues
  • Fix a race condition in Spark 1.6 when broadcasting a model
  • Fix Model.load in Python returning the wrong type
  • Fix a bug when using pyspark-with-bigdl.sh to run jobs on YARN
  • Fix size and stride calls on an empty tensor not throwing an exception