Release r1.15.5-deeprec2302 · DeepRec-AI/DeepRec

Major Features and Improvements

Support the same saver graph for EmbeddingVariable on GPU and CPU devices.
Support saving and restoring parameters in HBM storage of EmbeddingVariable.
Add GPU apply ops of Adam, AdamAsync, AdamW for multi-tier storage of EmbeddingVariable.
Place output of KvResourceIsInitializedOp on CPU.
Support GroupEmbedding to pack the lookup/apply of multiple feature columns.
Optimize HBM-DRAM storage of EmbeddingVariable with intra parallelism and fine-grained synchronization.
Support not saving filtered features when saving checkpoint.
Support localized mode fusion in GroupEmbedding.
Support preventing preloaded IDs from being eliminated in the multi-tier embedding cache.
Support COMPACT layout to reduce memory cost in EmbeddingVariable.
Support ignoring the version when restoring EmbeddingVariable with TF_EV_RESET_VERSION.
Support restoring a custom dimension of EmbeddingVariable.
Support merging and deleting checkpoint files of SSDHash storage.

Add a list of GPU ops for forward-backward joint optimization.
Optimize FusedBatchNormGrad on CPU device.
Support NCHW format input for FusedBatchNormOp.
Use the new asynchronous evaluation in Eigen for FusedBatchNorm.
Add exponential_avg_factor attribute to FusedBatchNorm* kernels.
Change AliUniqueGPU kernel implementation to AsyncOpKernel.
Support computing exponential running mean and variance in fused_batch_norm.
Upgrade oneDNN to 2.7 and ACL to 22.08.
Use global cache for MKL primitives for ARM.
Disable optimizing batch norm as sequence of post ops on AArch64.
Restore re-mapper and fix BatchMatmul and FactoryKeyCreator under AArch64 + ACL.
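
The exponential running mean and variance mentioned above for fused_batch_norm can be illustrated with a small sketch. It assumes the usual blending rule, in which `exponential_avg_factor` weights the current batch statistic against the previous running value, so a factor of 1.0 simply returns the batch statistic. The helper names `batch_moments` and `update_running` are hypothetical and are not part of the DeepRec or TensorFlow API. The sketch uses the population variance for simplicity, and the variance estimator used by the actual kernels may differ.

```python
def batch_moments(xs):
    """Return (mean, population variance) of a list of floats."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var


def update_running(running, batch_value, exponential_avg_factor):
    """Blend a running statistic with the current batch statistic.

    Assumed rule: new = (1 - factor) * running + factor * batch_value.
    """
    return (1.0 - exponential_avg_factor) * running + exponential_avg_factor * batch_value


# Hypothetical illustration values, not taken from the release notes.
batch = [1.0, 2.0, 3.0, 4.0]
batch_mean, batch_var = batch_moments(batch)

# Start from running_mean = 0.0 and running_var = 1.0, with factor 0.5.
running_mean = update_running(0.0, batch_mean, 0.5)
running_var = update_running(1.0, batch_var, 0.5)
```

With a factor of 0.5 the running values move halfway toward the batch statistics; smaller factors make the running average change more slowly.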

Do not call cudaSetDevice on an invisible GPU in CreateDevices.
Fix a concurrency issue caused by not referencing the same lock in multi-tier storage.
Fix a bug in parsing input requests.
Fix a bug when saving an empty GPU EmbeddingVariable.
Fix the concurrency issue between feature eviction and embedding lookup in asynchronous training.

alideeprec/deeprec-release:deeprec2302-cpu-py38-ubuntu20.04

alideeprec/deeprec-release:deeprec2302-gpu-py38-cu116-ubuntu20.04