Commit
Summary: Pull Request resolved: pytorch#1868

## Background

Using [Efficient Memory Planning for Deep Neural Networks](https://arxiv.org/pdf/2001.03288.pdf) as a reference, there are generally two approaches to solving the memory planning problem for a neural network:

* **Shared Objects Approach**
  * Match tensors to "shared objects" (i.e. shared memory allocations), based on tensor sizes and lifetimes
  * The goal is then to minimize the total size of each shared object
* **Memory Offset Calculation Approach**
  * Assign each tensor a specific offset in a large memory arena shared by all tensors, based on tensor sizes and lifetimes
  * The goal is then to minimize the total size of the overall memory arena

Though the **Memory Offset Calculation Approach** can produce more optimal solutions, it cannot be used for GPU textures because texture memory cannot be subdivided. **To plan memory for GPU textures, memory planning must be solved using the Shared Objects Approach.** Note that a solution to the Shared Objects problem can be converted into a solution to the Memory Offsets problem by materializing the shared objects as buffers within a memory arena.

## Context

Currently, the memory planning algorithms implemented for `exir`'s `MemoryPlanningPass` output solutions to the memory planning problem in the **Memory Offsets** format. The `greedy` algorithm solves memory planning as a Shared Objects problem, but then converts its solution to the Memory Offsets format. This changeset introduces a `mem_obj_id` field to `TensorSpec`, which memory planning algorithms can use to record shared object ids.

## Review Guide

* The `greedy` memory planning algorithm now records `mem_obj_id` in addition to `mem_offset`
* `verify_storage_reuse()` of the `Verifier` class now checks that `mem_obj_id` is valid, if it is set

Reviewed By: ydwu4

Differential Revision: D53496787

fbshipit-source-id: 0c2f81a00c9254b5af3d94e42a2dc3d34da1da6e
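To make the two formats concrete, here is a minimal sketch of a greedy Shared Objects planner and its conversion into the Memory Offsets format by materializing shared objects as buffers in one arena. The `TensorSpec`, `mem_obj_id`, and planner below are simplified stand-ins for illustration, not the actual `exir` implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TensorSpec:
    """Simplified stand-in for exir's TensorSpec."""
    size: int
    start: int  # first node index at which the tensor is live
    end: int    # last node index at which the tensor is live
    mem_obj_id: Optional[int] = None  # shared object id (Shared Objects format)
    mem_offset: Optional[int] = None  # arena offset (Memory Offsets format)


def greedy_shared_objects(specs: List[TensorSpec]) -> List[int]:
    """Assign each tensor to a shared object none of whose current users'
    lifetimes overlap it; return the size of each shared object."""
    obj_sizes: List[int] = []                        # max tensor size per object
    obj_lifetimes: List[List[Tuple[int, int]]] = []  # lifetimes using each object
    for spec in sorted(specs, key=lambda s: -s.size):
        for obj_id, lifetimes in enumerate(obj_lifetimes):
            # Reuse this shared object if no existing user's lifetime overlaps.
            if all(spec.end < s or e < spec.start for s, e in lifetimes):
                spec.mem_obj_id = obj_id
                obj_sizes[obj_id] = max(obj_sizes[obj_id], spec.size)
                lifetimes.append((spec.start, spec.end))
                break
        else:
            # No compatible object: create a new one.
            spec.mem_obj_id = len(obj_sizes)
            obj_sizes.append(spec.size)
            obj_lifetimes.append([(spec.start, spec.end)])
    return obj_sizes


def materialize_offsets(specs: List[TensorSpec], obj_sizes: List[int]) -> int:
    """Convert a Shared Objects solution to Memory Offsets form by laying
    the shared objects out back-to-back in a single arena."""
    offsets, total = [], 0
    for size in obj_sizes:
        offsets.append(total)
        total += size
    for spec in specs:
        spec.mem_offset = offsets[spec.mem_obj_id]
    return total  # total arena size
```

For example, two tensors with disjoint lifetimes can share one object (whose size is the larger of the two), while a tensor overlapping both forces a second object; the arena size is then the sum of the shared object sizes.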