
Releases: intelligent-machine-learning/dlrover

Release 0.2.1

11 Oct 09:38
48fa032

DLRover:

ElasticJob:

  • Auto-tune the batch size without restarting the job.
  • Automatically detect stragglers (slow workers).

TFPlus:

TFPlus 0.1.0 has been released; see details at https://github.com/intelligent-machine-learning/dlrover/tree/master/tfplus

Kv Variable (Core Embedding Capability)

  • High-performance Embedding Ops
  • Kv Variable low-level APIs (4 in total)
    • tfplus.get_kv_variable
    • embedding_lookup
    • embedding_lookup_sparse
    • safe_embedding_lookup_sparse
  • Dynamic expansion and partitioning of Embedding weights
  • Support for both single-machine training and PS/Worker cluster training
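
To illustrate what "dynamic expansion and partitioning" of embedding weights can mean, here is a minimal pure-Python sketch. It is not the TFPlus API: `KvEmbeddingTable` and `partition_ids` are hypothetical names, and the modulo partitioning rule is an assumption made for illustration. The idea is that a key-value table allocates an embedding row the first time a feature id is seen, so no vocabulary size has to be fixed up front.

```python
import random


class KvEmbeddingTable:
    """Toy hash-map embedding table (hypothetical): rows are created on first lookup."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self._rows = {}  # feature id -> embedding vector (list of floats)
        self._rng = random.Random(seed)

    def lookup(self, ids):
        """Return one vector per id, allocating a new row for unseen ids."""
        out = []
        for fid in ids:
            if fid not in self._rows:
                # Dynamic expansion: the table grows as new ids arrive.
                self._rows[fid] = [self._rng.gauss(0.0, 0.01) for _ in range(self.dim)]
            out.append(self._rows[fid])
        return out

    def __len__(self):
        return len(self._rows)


def partition_ids(ids, num_shards):
    """Assign each id to a shard by modulo (an assumed rule, for illustration only)."""
    shards = [[] for _ in range(num_shards)]
    for fid in ids:
        shards[fid % num_shards].append(fid)
    return shards


table = KvEmbeddingTable(dim=4)
# Arbitrary 64-bit ids work; the repeated id 42 reuses the same row.
vectors = table.lookup([10**12 + 7, 42, 42])
```

In a PS/Worker setup, each shard produced by something like `partition_ids` would live on a different parameter server; on a single machine there is just one shard.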

High-performance Optimizers

  • Common optimizers compatible with Kv Variable
    • Adam
    • Adagrad
  • In-house deep learning optimizers based on Sparse Group Lasso
    • Group Adam
    • Group Adagrad
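
For readers unfamiliar with Sparse Group Lasso, the following is a minimal sketch of its proximal step applied to a single embedding row treated as one group. This is the textbook operator, not the TFPlus Group Adam or Group Adagrad code; the function name and the parameters `l1` and `l21` are hypothetical, and learning-rate scaling of the thresholds is omitted for brevity.

```python
import math


def sparse_group_lasso_prox(row, l1, l21):
    """Proximal step for l1*||w||_1 + l21*||w||_2 on one row (one group)."""
    # Step 1: element-wise soft-thresholding (the L1 part).
    shrunk = [math.copysign(max(abs(w) - l1, 0.0), w) for w in row]
    # Step 2: group soft-thresholding (the L2 part) shrinks the whole row.
    norm = math.sqrt(sum(w * w for w in shrunk))
    if norm <= l21:
        return [0.0] * len(row)  # the entire row is zeroed out
    scale = 1.0 - l21 / norm
    return [scale * w for w in shrunk]


# A row with small weights is removed entirely; a larger row is only shrunk.
pruned = sparse_group_lasso_prox([0.05, -0.02], l1=0.1, l21=0.1)
kept = sparse_group_lasso_prox([3.0, -4.0], l1=0.0, l21=2.5)
```

Rows driven exactly to zero can be dropped from a key-value embedding table, which is why this kind of regularizer pairs naturally with sparse embeddings.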