v0.8.0
New Features
In-Training Embedding Pruning (ITEP) for more efficient RecSys training
Provides a representation of In-Training Embedding Pruning, which is used internally at Meta for more efficient RecSys training by decreasing the memory footprint of embedding tables. Pull request #2074 introduces these modules into TorchRec, with tests showing how to use them.
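As a rough conceptual sketch of the idea (this is not the TorchRec ITEP API; the class and method names below are hypothetical), an in-training pruning scheme tracks how often each embedding row is looked up and periodically shrinks the table to its hot rows:

```python
# Hypothetical illustration of in-training embedding pruning, NOT the
# TorchRec ITEP modules: count row accesses during training, then keep
# only the most frequently used rows to cut memory.
import torch

class PrunableEmbedding(torch.nn.Module):
    def __init__(self, num_embeddings: int, embedding_dim: int) -> None:
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(num_embeddings, embedding_dim))
        # Running count of how often each row is looked up.
        self.register_buffer("row_counts", torch.zeros(num_embeddings))

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        if self.training:
            self.row_counts += torch.bincount(ids, minlength=self.weight.shape[0])
        return self.weight[ids]

    @torch.no_grad()
    def prune(self, keep_ratio: float) -> None:
        # Keep only the most frequently accessed rows; a real implementation
        # would also remap incoming ids onto the surviving rows.
        k = max(1, int(self.weight.shape[0] * keep_ratio))
        keep = torch.topk(self.row_counts, k).indices
        self.weight = torch.nn.Parameter(self.weight[keep].clone())
        self.row_counts = self.row_counts[keep].clone()
```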
Mean Pooling
Mean pooling is now enabled on embeddings for row-wise and table-row-wise sharding types in TorchRec. Mean pooling done through TBE (table-batched embeddings) is not accurate for row-wise and table-row-wise sharding types, because sharding modifies the input. This feature efficiently calculates the divisor, using caching and overlapping in the input dist, to implement mean pooling, which has proved much more performant than out-of-library implementations. PR: #1772
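To make the divisor point concrete, here is a minimal sketch in plain PyTorch (not the TorchRec internals): under row-wise sharding each rank only sees the ids whose rows live on its shard, so a per-shard mean would divide by the wrong count; sum-pooling the partial results and dividing once by the per-sample id total recovers the correct mean:

```python
# Minimal sketch of mean pooling under row-wise sharding. Each "rank"
# sum-pools only the ids that fall in its shard's row range; dividing the
# combined partial sums by the sample's total id count (the cached divisor)
# matches mean pooling over the unsharded table.
import torch

table = torch.randn(8, 4)        # full embedding table: 8 rows, dim 4
shards = [table[:4], table[4:]]  # row-wise sharding across 2 "ranks"
ids = torch.tensor([1, 2, 6])    # one sample's lookup ids

partial = torch.zeros(4)
for rank, shard in enumerate(shards):
    lo, hi = rank * 4, (rank + 1) * 4
    local = ids[(ids >= lo) & (ids < hi)] - lo  # ids owned by this shard
    partial += shard[local].sum(dim=0)          # per-shard sum pooling

mean_pooled = partial / ids.numel()             # divide by global count once
assert torch.allclose(mean_pooled, table[ids].mean(dim=0))
```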
Changelog
Torch.export (non-strict) compatibility with KJT/JT/KT, EBC/Quantized EBC, and their sharded variants #1815 #1816 #1788 #1850 #1976, and dynamic shapes #2058 (see the sketch after this list)
torch.compile support with TorchRec #2045 #2018 #1979
TorchRec serialization with non-strict torch.export for regenerating eager sparse modules (EBC) from IR for sharding #1860 #1848, with meta functionalization during torch.export #1974
More benchmarking for TorchRec modules/data types #2094 #2033 #2001 #1855
More VBE support: data parallel sharding #2093, EmbeddingCollection #2047 #1849
RegroupAsDict module for performance improvements with caching #2007
Train Pipeline improvements #1967 #1969 #1971
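As a hedged illustration of the non-strict export path referenced above (exact KJT pytree handling and supported input forms have varied across TorchRec versions, so treat this as a sketch rather than the canonical recipe):

```python
# Sketch: non-strict torch.export over an unsharded EmbeddingBagCollection
# fed a KeyedJaggedTensor. Assumes a TorchRec version where KJT inputs are
# registered for export; details may differ across releases.
import torch
from torchrec import EmbeddingBagCollection, EmbeddingBagConfig, KeyedJaggedTensor

# A small unsharded EBC with one table serving one feature.
ebc = EmbeddingBagCollection(
    tables=[
        EmbeddingBagConfig(
            name="t1",
            num_embeddings=100,
            embedding_dim=8,
            feature_names=["f1"],
        )
    ]
)

# A batch of 2 samples: sample 0 has 1 id, sample 1 has 2 ids.
kjt = KeyedJaggedTensor.from_lengths_sync(
    keys=["f1"],
    values=torch.tensor([0, 1, 2]),
    lengths=torch.tensor([1, 2]),
)

# Non-strict export traces with normal Python semantics (no TorchDynamo).
ep = torch.export.export(ebc, (kjt,), strict=False)
print(ep)

# torch.compile over the same module is exercised similarly:
# compiled_ebc = torch.compile(ebc)
# out = compiled_ebc(kjt)
```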
Bug Fixes and Library Improvements