From 94ff2eea3ea20f0b46f8ebf7a01bf25ccd6b8dbd Mon Sep 17 00:00:00 2001
From: Alireza Furutanpey <61883626+rezafuru@users.noreply.github.com>
Date: Fri, 29 Mar 2024 17:42:23 +0100
Subject: [PATCH 1/2] Update projects.rst

Extend list of projects with FrankenSplit
---
 docs/source/projects.rst | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/source/projects.rst b/docs/source/projects.rst
index 81ca2a77..de08f600 100644
--- a/docs/source/projects.rst
+++ b/docs/source/projects.rst
@@ -27,6 +27,16 @@ It is pip-installable and published as a PyPI package i.e., you can install it b
 Papers
 *****
 
+FrankenSplit: Efficient Neural Feature Compression With Shallow Variational Bottleneck Injection for Mobile Edge Computing
+----
+* Author(s): Alireza Furutanpey, Philipp Raith, Schahram Dustdar
+* Venue: IEEE Transactions on Mobile Computing
+* PDF: `Paper `_
+* Code: `GitHub `_
+
+**Abstract**: The rise of mobile AI accelerators allows latency-sensitive applications to execute lightweight Deep Neural Networks (DNNs) on the client side. However, critical applications require powerful models that edge devices cannot host and must therefore offload requests, where the high-dimensional data will compete for limited bandwidth. Split Computing (SC) alleviates resource inefficiency by partitioning DNN layers across devices, but current methods are overly specific and only marginally reduce bandwidth consumption. This work proposes shifting away from focusing on executing shallow layers of partitioned DNNs. Instead, it advocates concentrating the local resources on variational compression optimized for machine interpretability. We introduce a novel framework for resource-conscious compression models and extensively evaluate our method in an environment reflecting the asymmetric resource distribution between edge devices and servers. Our method achieves 60% lower bitrate than a state-of-the-art SC method without decreasing accuracy and is up to 16x faster than offloading with existing codec standards.
+
+
 torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP
 ----
 * Author(s): Yoshitomo Matsubara

From 8f360abce3a9c1862e1b4ca6a552f906856d3d2a Mon Sep 17 00:00:00 2001
From: Alireza Furutanpey <61883626+rezafuru@users.noreply.github.com>
Date: Fri, 29 Mar 2024 22:26:22 +0100
Subject: [PATCH 2/2] Update projects.rst

Fix link
---
 docs/source/projects.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/source/projects.rst b/docs/source/projects.rst
index de08f600..c1ab3fc8 100644
--- a/docs/source/projects.rst
+++ b/docs/source/projects.rst
@@ -32,7 +32,7 @@ FrankenSplit: Efficient Neural Feature Compression With Shallow Variational Bott
 * Author(s): Alireza Furutanpey, Philipp Raith, Schahram Dustdar
 * Venue: IEEE Transactions on Mobile Computing
 * PDF: `Paper `_
-* Code: `GitHub `_
+* Code: `GitHub `_
 
 **Abstract**: The rise of mobile AI accelerators allows latency-sensitive applications to execute lightweight Deep Neural Networks (DNNs) on the client side. However, critical applications require powerful models that edge devices cannot host and must therefore offload requests, where the high-dimensional data will compete for limited bandwidth. Split Computing (SC) alleviates resource inefficiency by partitioning DNN layers across devices, but current methods are overly specific and only marginally reduce bandwidth consumption. This work proposes shifting away from focusing on executing shallow layers of partitioned DNNs. Instead, it advocates concentrating the local resources on variational compression optimized for machine interpretability. We introduce a novel framework for resource-conscious compression models and extensively evaluate our method in an environment reflecting the asymmetric resource distribution between edge devices and servers. Our method achieves 60% lower bitrate than a state-of-the-art SC method without decreasing accuracy and is up to 16x faster than offloading with existing codec standards.