Releases
v0.3.0
Highlights
New protobuf-based model storage format
Support model quantization
Support sparse tensor and model
Easier and broader TensorFlow model load support
More layers/operations
Apache Spark 2.2 support
New Features
Models & Layers & Operations & Loss function
Support convlstm3D model
Support Variational Auto Encoder
Support Unet
Support PTB model
Add SpatialWithinChannelLRN layer
Add 3D-deconv layer
Add BifurcateSplitTable layer
Add KLD criterion
Add Gaussian layer
Add Sampler layer
Add RNN decoder layer
Support NHWC data format in 2D-conv, 2D-pooling layers
Support same/valid padding type in 2D-conv and 2D-pooling layers
Support dynamic execution flow in Graph
Graph node can pass nested tensors
Layers/operations can have input and output tensors of different numeric types
Start to support operations in BigDL; the following operations are added: LogicalNot, LogicalOr, LogicalAnd, 1D Max Pooling, Squeeze, Prod, Sum, Reshape, Identity, ReLU, Equals, Greater, Less, Switch, Merge, Floor, L2Loss, RandomUniform, Rank, MatMul, SoftMax, Conv2d, Add, Assert, Onehot, Assign, Cast, ExpandDims, MaxPool, Realdiv, BiasAdd, Pad, Tile, StridedSlice, Transpose, Negative, AssignGrad, BiasAddGrad, Deconv2D, Conv2DBackFilter, CrossEntropy, MaxPoolGrad, NoOp, ReluGrad, Select, Pow, BroadcastGradientArgs, Control Dependency
Start to support sparse layers in BigDL; the following sparse layers are added: SparseLinear, SparseJoinTable, DenseToSparse
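The same/valid padding types listed above determine the output size of a 2D-conv or 2D-pooling layer. The sketch below computes that size along one spatial dimension, assuming the TensorFlow convention for "same" and "valid"; these notes do not spell out BigDL's exact rule, and the function name conv_output_size is hypothetical.

```python
import math


def conv_output_size(in_size, kernel, stride, padding):
    """Return (out_size, pad_before, pad_after) along one spatial dimension.

    Assumes the TensorFlow convention for "same"/"valid" padding; this is an
    illustration, not BigDL's implementation.
    """
    if padding == "valid":
        # No padding: only windows that fit entirely inside the input count.
        out = math.ceil((in_size - kernel + 1) / stride)
        return out, 0, 0
    if padding == "same":
        # Pad so that the output size depends only on the stride.
        out = math.ceil(in_size / stride)
        total = max((out - 1) * stride + kernel - in_size, 0)
        before = total // 2  # any odd extra element goes after the input
        return out, before, total - before
    raise ValueError("padding must be 'same' or 'valid'")


print(conv_output_size(28, 3, 1, "same"))
print(conv_output_size(28, 3, 1, "valid"))
```

With stride 1, "same" keeps the input size unchanged, while "valid" shrinks it by kernel - 1.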
Tensor
Support sparse tensor
Support scalar (0-D tensor)
Tensor supports more numeric types: boolean, short, int, long, string, char, bytestring
Tensor no longer displays its full content in toString when there are too many elements
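To illustrate what a sparse tensor stores, here is a minimal sketch of a 2-D tensor in coordinate (COO) form, keeping only the non-zero entries and their positions. The class SparseCOO and its methods are hypothetical and written for this example; they are not BigDL's sparse tensor API or storage layout.

```python
class SparseCOO:
    """Hypothetical 2-D sparse tensor in coordinate (COO) form."""

    def __init__(self, shape, indices, values):
        self.shape = shape      # (rows, cols) of the equivalent dense tensor
        self.indices = indices  # list of (row, col) for each non-zero entry
        self.values = values    # values[i] is stored at indices[i]

    @classmethod
    def from_dense(cls, dense):
        """Build from a non-empty list of equal-length rows."""
        rows, cols = len(dense), len(dense[0])
        indices, values = [], []
        for r in range(rows):
            for c in range(cols):
                if dense[r][c] != 0:
                    indices.append((r, c))
                    values.append(dense[r][c])
        return cls((rows, cols), indices, values)

    def to_dense(self):
        rows, cols = self.shape
        dense = [[0] * cols for _ in range(rows)]
        for (r, c), v in zip(self.indices, self.values):
            dense[r][c] = v
        return dense

    def matvec(self, vec):
        """Multiply by a dense vector, touching only the non-zero entries."""
        out = [0] * self.shape[0]
        for (r, c), v in zip(self.indices, self.values):
            out[r] += v * vec[c]
        return out


dense = [[0, 0, 3], [0, 0, 0], [1, 0, 0]]
sparse = SparseCOO.from_dense(dense)
print(len(sparse.values), sparse.matvec([1, 2, 3]))
```

The matrix-vector product visits two stored entries instead of nine, which is the kind of saving that layers such as SparseLinear are meant to exploit.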
API change
Expose evaluate API to Python
Add a predictClass API to the model to simplify code when the model is used for classification
Change model.test to model.evaluate in Python
Refine Recurrent, BiRecurrent and RnnCell API
Change Sample.features from ndarray to JTensor/List[JTensor]
Change Sample.label from ndarray to JTensor
Install & Deploy
Support Apache Spark 2.2
Add script to run BigDL on Google DataProc platform
Refine run-example.sh scripts to run BigDL examples on AWS with built-in Spark
Pip install now automatically installs Spark 2.2
Add a Dockerfile
Model Save/Load
New protobuf-based model persistence format that provides a better user experience when saving/loading BigDL models
Support loading more operations from TensorFlow
Support reading tensor content from a TensorFlow checkpoint
Support loading a subset of a TensorFlow graph
Support loading a TensorFlow preprocessing graph (read/parse tfrecord data, image decoders and queues)
Automatically convert data in a TensorFlow queue to an RDD and feed it to model training in BigDL
Support loading the deconv layer from Caffe and TensorFlow
Support saving/loading the SpatialCrossLRN Torch module
Training
Allow users to modify the optimization algorithm state when resuming training in Python
Allow users to specify optimization algorithms, learning rate and learning rate decay when using BigDL in a Spark ML pipeline
Allow users to stop gradient on some layers in backpropagation
Allow users to freeze layer parameters in training
Add an ML pipeline Python API, so users can use BigDL with ML pipelines in Python code
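Freezing layer parameters means the optimizer skips those parameters during the update step, so they keep their current values while the rest of the model continues to train. The sketch below shows the idea with a plain SGD step over named scalar parameters; the function sgd_step and the layer names are hypothetical and do not reflect BigDL's API.

```python
def sgd_step(params, grads, lr, frozen=()):
    """Return updated parameters, leaving frozen ones untouched.

    params and grads map a parameter name to a scalar value; frozen is a
    collection of names to exclude from the update. Illustration only.
    """
    frozen = set(frozen)
    updated = {}
    for name, value in params.items():
        if name in frozen:
            updated[name] = value  # frozen: keep the current value
        else:
            updated[name] = value - lr * grads[name]  # plain SGD update
    return updated


params = {"conv1": 1.0, "fc": 2.0}
grads = {"conv1": 1.0, "fc": 1.0}
print(sgd_step(params, grads, lr=0.5, frozen=["conv1"]))
```

Here only "fc" changes. A typical use is fine-tuning, where pretrained feature layers are frozen and only the final layers are trained.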
Enhancement
Support model quantization. Users can speed up model inference by quantizing the model
Display BigDL models in TensorBoard
Users can easily convert a sequential model to a graph model by invoking the newly added toGraph method
Remove unnecessary contiguous check in 3D conv
Support global average pooling
Support regularizer in 3D convolution layer
Add regularizer for convlstmpeephole3d
Throw more meaningful messages in layers and criterions
Migrate GRU/LSTM/RNN/LSTM-Peephole definition from sequence to graph
Switch to pytest for Python unit tests
Speed up tanh layer
Speed up sigmoid layer
Speed up recurrent layer
Support batch normalization in recurrent
Speed up Python ndarray to Scala tensor conversion
Improve gradient sync performance in distributed training
Speed up tensor dot operation with MKL dot
Speed up copy operation in recurrent container
Speed up logsoftmax
Move classes.lst and img_class.lst to the model example folder, so users can find them more easily
Ensure spark.speculation is set to false to get better performance in training
Make it easier to turn on performance data in the distributed training log
Optimize memory usage when broadcasting the model
Support MLlib vector as feature for BigDL
Support creating a Sample with multiple tensors in Python
Support resizing in BytesToBGRImg
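The model quantization item in the list above speeds up inference by storing and computing weights at lower precision. These notes do not describe the scheme BigDL uses, so the following is only a generic sketch of symmetric 8-bit linear quantization; the function names are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale.

    Generic symmetric linear quantization; not BigDL's actual scheme.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(codes)
print(max(abs(w - r) for w, r in zip(weights, restored)))
```

Each weight is recovered to within half a quantization step (scale / 2), which is the accuracy given up in exchange for smaller, faster integer arithmetic.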
Bug Fix
Fix TemporalConv layer being unable to return its parameter table
Fix some bugs when loading dilated group convolution from Caffe
Fix some bugs when loading Caffe v1 layers
Fix a bug in TimeDistributed layer
Fix incorrect execution time reported in recurrent layers
Fix inplace layer clear state bug
Fix incorrect training data sample count for some inputs
Remove label check in BytesToGreyImg
Fix a bug in concat table when it contains no layer
Fix a bug in maptable
Fix some typos in the documentation
Use newInstance method to obtain FileSystem