Release BigDL release 0.3.0 · intel-analytics/BigDL-2.x

Highlights

New protobuf-based model storage format
Support model quantization
Support sparse tensor and model
Easier and broader Tensorflow model load support
More layers/operations
Apache Spark 2.2 support

New Features

Models & Layers & Operations & Loss function
- Support convlstm3D model
- Support Variational Auto Encoder
- Support Unet
- Support PTB model
- Add SpatialWithinChannelLRN layer
- Add 3D-deconv layer
- Add BifurcateSplitTable layer
- Add KLD criterion
- Add Gaussian layer
- Add Sampler layer
- Add RNN decoder layer
- Support NHWC data format in 2D-conv, 2D-pooling layers
- Support same/valid padding type in 2D-conv and 2D-pooling layers
- Support dynamic execution flow in Graph
- Graph node can pass nested tensors
- Layers/Operations can have input and output tensors of different numeric types
- Start to support operations in BigDL; add the following operations: LogicalNot, LogicalOr, LogicalAnd, 1D Max Pooling, Squeeze, Prod, Sum, Reshape, Identity, ReLU, Equals, Greater, Less, Switch, Merge, Floor, L2Loss, RandomUniform, Rank, MatMul, SoftMax, Conv2d, Add, Assert, Onehot, Assign, Cast, ExpandDims, MaxPool, Realdiv, BiasAdd, Pad, Tile, StridedSlice, Transpose, Negative, AssignGrad, BiasAddGrad, Deconv2D, Conv2DBackFilter, CrossEntropy, MaxPoolGrad, NoOp, ReluGrad, Select, Pow, BroadcastGradientArgs, Control Dependency
- Start to support sparse layers in BigDL; add the following sparse layers: SparseLinear, SparseJoinTable, DenseToSparse
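The same/valid padding types listed above determine the output size of a 2D-conv or 2D-pooling layer along each spatial dimension. The release notes do not spell out the formula, so the sketch below assumes the TensorFlow-style definition; `conv_output_size` is a hypothetical helper for illustration and is not part of the BigDL API.

```python
import math


def conv_output_size(in_size, kernel, stride, padding):
    """Return (output_size, pad_before, pad_after) for one spatial dimension.

    Assumes the TensorFlow-style meaning of "same" and "valid" padding.
    """
    if padding == "valid":
        # No padding: only windows that fit entirely inside the input count.
        if in_size < kernel:
            raise ValueError("kernel is larger than the input")
        return (in_size - kernel) // stride + 1, 0, 0
    if padding == "same":
        # Output size depends only on the stride; pad just enough to reach it.
        out = math.ceil(in_size / stride)
        total = max((out - 1) * stride + kernel - in_size, 0)
        before = total // 2
        # Any odd leftover pixel of padding goes after the data.
        return out, before, total - before
    raise ValueError("padding must be 'same' or 'valid'")
```

Under this assumption, "same" padding with stride 1 keeps the spatial size unchanged, while "valid" padding shrinks it by `kernel - 1`.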
Tensor
- Support sparse tensor
- Support scalar (0-D tensor)
- Tensor supports more element types: boolean, short, int, long, string, char, bytestring
- Tensor no longer displays its full content in toString when there are too many elements
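As a rough illustration of what sparse tensor support buys, the sketch below stores only the non-zero entries of a 2-D matrix in coordinate (COO) form. The release notes do not describe BigDL's internal sparse layout, so COO is an assumption made for illustration, and `dense_to_coo` / `coo_to_dense` are hypothetical helpers, not BigDL functions.

```python
def dense_to_coo(dense):
    """Convert a 2-D list of numbers into (indices, values, shape)."""
    rows = len(dense)
    cols = len(dense[0]) if rows else 0
    indices, values = [], []
    for r, row in enumerate(dense):
        for c, v in enumerate(row):
            if v != 0:
                # Keep only the coordinates and values of non-zero entries.
                indices.append((r, c))
                values.append(v)
    return indices, values, (rows, cols)


def coo_to_dense(indices, values, shape):
    """Rebuild the dense 2-D list from its COO representation."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for (r, c), v in zip(indices, values):
        dense[r][c] = v
    return dense
```

For inputs that are mostly zeros, such as one-hot or bag-of-words features, the COO form holds far fewer numbers than the dense matrix.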
API change
- Expose evaluate API to Python
- Add a predictClass API to model to simplify the code when users want to use a model for classification
- Change model.test to model.evaluate in Python
- Refine Recurrent, BiRecurrent and RnnCell API
- Change Sample.features from ndarray to JTensor/List[JTensor]
- Change Sample.label from ndarray to JTensor
Install & Deploy
- Support Apache Spark 2.2
- Add script to run BigDL on Google DataProc platform
- Refine run-example.sh scripts to run BigDL examples on AWS with built-in Spark
- Pip install now automatically installs spark-2.2
- Add a docker file
Model Save/Load
- New protobuf-based model persistence format to provide a better user experience when saving/loading BigDL models
- Support loading more operations from Tensorflow
- Support reading tensor content from Tensorflow checkpoints
- Support loading a subset of a Tensorflow graph
- Support loading Tensorflow preprocessing graphs (read/parse tfrecord data, image decoders and queues)
- Automatically convert data in a Tensorflow queue to an RDD and feed it to model training in BigDL
- Support loading deconv layers from Caffe and Tensorflow
- Support saving/loading the SpatialCrossLRN Torch module
Training
- Allow users to modify the optimization algorithm status when resuming training in Python
- Allow users to specify optimization algorithms, learning rate and learning rate decay when using BigDL in a Spark ML pipeline
- Allow users to stop gradients at some layers in backpropagation
- Allow users to freeze layer parameters in training
- Add ML pipeline Python API, so users can use BigDL with ML pipelines in Python code

Enhancement

Support model quantization. Users can speed up model inference by quantizing the model
Display BigDL models in Tensorboard
Users can easily convert a sequential model to a graph model by invoking the newly added toGraph method
Remove unnecessary contiguous check in 3D conv
Support global average pooling
Support regularizer in 3D convolution layer
Add regularizer for convlstmpeephole3d
Throw more meaningful messages in layers and criterions
Migrate GRU/LSTM/RNN/LSTM-Peephole definition from sequence to graph
Switch to pytest for python unit tests
Speed up tanh layer
Speed up sigmoid layer
Speed up recurrent layer
Support batch normalization in recurrent
Speed up Python ndarray to Scala tensor conversion
Improve gradient sync performance in distributed training
Speed up tensor dot operation with MKL dot
Speed up copy operation in recurrent container
Speed up logsoftmax
Move classes.lst and img_class.lst to the model example folder, so users can find them more easily
Ensure spark.speculation is set to false to get better performance in training
Make it easier to turn on performance data in the distributed training log
Optimize memory usage when broadcasting the model
Support MLlib vectors as features for BigDL
Support creating a Sample with multiple tensors in Python
Support resizing in BytesToBGRImg
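To make the model quantization item above more concrete, here is a minimal sketch of symmetric 8-bit linear quantization of a weight list. It is a generic illustration of the idea (store small integers plus one scale factor), not a description of the scheme BigDL actually implements; `quantize_int8` and `dequantize` are hypothetical names.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid dividing by zero when every weight is zero (or the list is empty).
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]
```

The speed-up generally comes from doing arithmetic on narrow integers instead of 32-bit floats, at the cost of a rounding error of at most half a quantization step per weight.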

Bug Fix

Fix TemporalConv layer failing to return its parameter table
Fix some bugs when loading dilated group convolution from caffe
Fix some bugs when loading caffe v1 layers
Fix a bug in TimeDistributed layer
Fix incorrect execution time reported in recurrent layers
Fix inplace layer clear state bug
Fix incorrect training data sample count for some inputs
Remove label check in BytesToGreyImg
Fix a bug in concat table when it contains no layer
Fix a bug in maptable
Fix some typos in the documentation
Use newInstance method to obtain FileSystem
