
BigDL release 0.5.0

@liu-shaojun released this 07 Mar 02:16
a6c583c

Highlights

  • Introduce a Keras-like API (Scala and Python). Users can easily run their Keras code (training and inference) on Apache Spark through BigDL. For more details, see this link.
  • Support loading TensorFlow dynamic models (e.g. LSTM, RNN) in BigDL, and support more TensorFlow operations; see this page.
  • Support combining data preprocessing and neural network layers in the same model (to simplify model deployment)
  • Speed up various modules in BigDL (BCECriterion, RMSprop, LeakyReLU, etc.)
  • Add a DataFrame-based image reader and transformer
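Among the modules sped up above, BCECriterion computes the binary cross-entropy loss. As a point of reference for what that criterion computes (a plain-Python sketch, not BigDL's optimized implementation):

```python
import math

def binary_cross_entropy(predictions, targets, eps=1e-12):
    """Mean binary cross-entropy: -mean(y*log(p) + (1-y)*log(1-p)).

    `eps` clamps predictions away from 0 and 1 to avoid log(0); this is
    a reference sketch only, not BigDL's implementation.
    """
    total = 0.0
    for p, y in zip(predictions, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp into (0, 1)
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total / len(predictions)

# A confident correct prediction yields a small loss:
loss = binary_cross_entropy([0.9, 0.1], [1.0, 0.0])  # == -log(0.9)
```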

New Features

  • Tensor can be converted to OpenCVMat
  • Introduce a new Keras-like API for Scala and Python
  • Support loading TensorFlow dynamic models (e.g. LSTM, RNN)
  • Support loading more TensorFlow operations (InvertPermutation, ConcatOffset, Exit, NextIteration, Enter, RefEnter, LoopCond, ControlTrigger, TensorArrayV3, TensorArrayGradV3, TensorArrayGatherV3, TensorArrayScatterV3, TensorArrayConcatV3, TensorArraySplitV3, TensorArrayReadV3, TensorArrayWriteV3, TensorArraySizeV3, StackPopV2, StackPop, StackPushV2, StackPush, StackV2, Stack)
  • ResizeBilinear supports NCHW
  • ImageFrame supports loading Hadoop sequence files
  • ImageFrame supports grayscale images
  • Add Kv2Tensor operation (Scala)
  • Add PGCriterion to compute the negative policy gradient given the action distribution, sampled action, and reward
  • Support gradually increasing the learning rate in LearningrateScheduler
  • Add FixExpand and add more options to AspectScale for image preprocessing
  • Add RowTransformer (Scala)
  • Support adding preprocessors to Graph, which allows users to combine preprocessing and a trainable model into one model
  • The ResNet on CIFAR-10 example supports loading images from HDFS
  • Add CategoricalColHashBucket operation (Scala)
  • Predictor supports Table as output
  • Add BucketizedCol operation (Scala)
  • Support using DenseTensor and SparseTensor together to create a Sample
  • Add CrossProduct layer (Scala)
  • Provide an option that allows users to bypass exceptions in a transformer
  • DenseToSparse layer supports disabling backward propagation
  • Add CategoricalColVocaList operation (Scala)
  • Support ImageFrame in the Python optimizer
  • Support getting the executor number and executor cores in Python
  • Add IndicatorCol operation (Scala)
  • Add TensorOp, an operation with Tensor[T]-formatted input and output that provides shortcuts to build Operations for tensor transformation via closures (Scala)
  • Provide a Dockerfile to make it easy to set up a BigDL testing environment
  • Add CrossCol operation (Scala)
  • Add MkString operation (Scala)
  • Add a prediction service interface that supports concurrent calls and accepts bytes input
  • Add SparseTensor.cast & SparseTensor.applyFun
  • Add a DataFrame-based image reader and transformer
  • Support loading TensorFlow model files saved by the tf.saved_model API
  • SparseMiniBatch supports multiple TensorDataTypes
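Several of the new operations above (BucketizedCol, CategoricalColHashBucket, CrossCol) follow the familiar feature-column pattern. A rough plain-Python illustration of the underlying transforms (function names and the hashing scheme are illustrative, not BigDL's API):

```python
import bisect
import hashlib

def bucketize(value, boundaries):
    """BucketizedCol-style transform: map a continuous value to a
    bucket index based on sorted boundaries."""
    return bisect.bisect_right(boundaries, value)

def hash_bucket(category, num_buckets):
    """CategoricalColHashBucket-style transform: hash a string category
    into a fixed number of buckets (md5 used here for determinism)."""
    digest = hashlib.md5(category.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

def cross_col(values, num_buckets):
    """CrossCol-style transform: cross several categorical values into
    a single hashed feature."""
    return hash_bucket("_X_".join(values), num_buckets)

age_bucket = bucketize(34, [18, 25, 35, 50])       # -> 2 (25 <= 34 < 35)
country_id = hash_bucket("US", 100)
crossed = cross_col(["US", "engineer"], 1000)
```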
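The Kv2Tensor operation above converts key-value feature strings into tensors. A hypothetical plain-Python sketch of that kind of transform (the delimiters and dense output format are assumptions, not BigDL's exact contract):

```python
def kv_to_dense(kv_string, dim, item_delim=",", kv_delim=":"):
    """Parse a "key:value,key:value" feature string into a dense
    vector of length `dim`; missing keys default to 0.0.

    Illustrative only -- Kv2Tensor's real signature may differ.
    """
    vec = [0.0] * dim
    if not kv_string:
        return vec
    for item in kv_string.split(item_delim):
        key, value = item.split(kv_delim)
        vec[int(key)] = float(value)
    return vec

dense = kv_to_dense("0:0.5,3:1.25", dim=5)  # -> [0.5, 0.0, 0.0, 1.25, 0.0]
```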
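The gradual learning-rate increase mentioned above is commonly implemented as linear warmup. A small schedule sketch in plain Python (parameter names are illustrative, not the LearningrateScheduler API):

```python
def warmup_lr(step, base_lr, target_lr, warmup_steps):
    """Linearly increase the learning rate from base_lr to target_lr
    over warmup_steps iterations, then hold it at target_lr."""
    if step >= warmup_steps:
        return target_lr
    delta = (target_lr - base_lr) / warmup_steps
    return base_lr + delta * step

# e.g. ramp from 0.01 to 0.1 over 5 steps, then stay at 0.1
schedule = [warmup_lr(s, 0.01, 0.1, 5) for s in range(7)]
```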

Enhancement

  • ImageFrame supports serialization
  • A default implementation of zeroGradParameter is added to AbstractModule
  • Improve the style of the documentation website
  • Models in different threads share weights during model training
  • Speed up LeakyReLU
  • Speed up RMSprop
  • Speed up BCECriterion
  • Support calling Java functions in the Python executor, and ModelBroadcast in Python
  • Add detailed instructions for run-on-ec2
  • Optimize the padding mechanism
  • Fix Maven compilation warnings
  • Check for duplicate layers in a container
  • Refine the document that introduces how to automatically deploy BigDL on a Dataproc cluster
  • Refactor adding extra JARs/Python packages for Python users; now only the environment variables BIGDL_JARS & BIGDL_PACKAGES need to be set
  • Implement appendColumn and avoid errors caused by API mismatches between different Spark versions
  • Add a Python Inception-training-on-ImageNet example
  • Downgrade "can't find locality partition for partition ..." to a warning message
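The duplicate-layer check above guards against adding the same layer instance to a container twice. A conceptual plain-Python sketch of such a check (not BigDL's actual implementation):

```python
class Container:
    """Minimal container that rejects duplicate layer instances,
    mirroring the duplicate-layer check described above."""
    def __init__(self):
        self.layers = []

    def add(self, layer):
        # Identity comparison: the same object added twice is an error,
        # while two distinct-but-equal layers are fine.
        if any(existing is layer for existing in self.layers):
            raise ValueError("layer already added to this container")
        self.layers.append(layer)
        return self

net = Container()
linear = object()   # stand-in for a layer
net.add(linear)     # ok; net.add(linear) again would raise ValueError
```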

API change

  • Move the DataFrame-based API to the dlframe package
  • Refine the Container hierarchy: the add method (used in Sequential, Concat, …) is moved to a subclass, DynamicContainer
  • Refine the serialization code hierarchy
  • Dynamic Graph has become an internal class that is only used to run TensorFlow models
  • Operation is not allowed to be used outside Graph
  • Make the getParameters method final and private[bigdl]; it should only be used in model training
  • Remove the updateParameter method, which was only used in internal tests
  • Some TensorFlow-related operations are marked as internal and should only be used when running TensorFlow models
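The Container refinement above moves the mutable add method off the base class, so containers that are not meant to grow expose no mutation API. A rough Python analogy of that hierarchy (the class names follow the release notes; everything else is illustrative):

```python
class Container:
    """Base container: holds layers but exposes no add method,
    like the refined Container described above."""
    def __init__(self, layers=None):
        self.layers = list(layers or [])

class DynamicContainer(Container):
    """Subclass that owns the add method, so only containers meant to
    be extended (e.g. Sequential-style ones) can grow."""
    def add(self, layer):
        self.layers.append(layer)
        return self  # allow chaining: c.add(a).add(b)

seq = DynamicContainer().add("linear").add("relu")
```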

Bug Fix

  • Fix a sparse Sample batch bug: it should add another dimension instead of concatenating the original tensor
  • Fix some activations and layers that don't work in TimeDistributed and RnnCell
  • Fix a bug in the SparseTensor resize method
  • Fix a bug when converting SparseTensor to DenseTensor
  • Fix a bug in SpatialFullConvolution
  • Fix a bug in the Cosine equal method
  • Fix optimization state getting mixed up when optimizer.optimize() is called multiple times
  • Fix a bug in Recurrent forward after invoking reset
  • Fix a bug in in-place LeakyReLU
  • Fix a bug when saving/loading bidirectional RNN layers
  • Fix getParameters() in a submodule creating new storage when parameters are shared by the parent module
  • Fix some incompatible syntax between Python 2.7 and 3.6
  • Fix save/load of a graph losing stop-gradient information
  • Fix a bug in SReLU
  • Fix a bug in DLModel
  • Fix a sparse tensor dot product bug
  • Fix a Maxout serialization issue
  • Fix serialization issues in some customized Faster R-CNN models
  • Fix and refine some example documentation instructions
  • Fix a bug in the export_tf_checkpoint.py script
  • Fix a bug in setting up the Python package
  • Fix pickler initialization issues
  • Fix a race condition in Spark 1.6 when broadcasting a model
  • Fix Model.load in Python returning the wrong type
  • Fix a bug when using pyspark-with-bigdl.sh to run jobs on YARN
  • Fix size and stride calls on an empty tensor not throwing an exception