Interesting ETH paper on the shortcomings of serverless computing for data analytics #19
-
And a companion paper:
Ephemeral Per-query Engines for Serverless Analytics
Michael Wawrzoniak, Rodrigo Bruno, Ana Klimovic and Gustavo Alonso
We challenge the common assumption that queries are submitted to a pre-configured, already running engine and put forward the idea of dynamically instantiating a chosen data processing engine upon query submission by leveraging Function-as-a-Service (FaaS) platforms. We demonstrate the idea by running unmodified data processing engines (we use Apache Drill as an initial example) on real-world serverless FaaS platforms and show that such engines can be instantiated on demand when a query arrives. We aim to eventually support a wide range of queries and workloads. Wide access to such functionality would be a game changer in data processing. First, it would enable pay-per-query models supporting sporadic, interactive data analysis on arbitrary engines. Second, it would significantly increase the flexibility for data processing by enabling the possibility of dynamically choosing the actual engine, its configuration, and the resource allocation on a per-query basis. Logically, this amounts to dynamically attaching a query engine to the query rather than sending the query to a pre-configured and already deployed engine. In this paper we elaborate on this vision, outline the design of the MetaQ prototype that we are building to explore the idea, demonstrate that it is realistic through initial experiments, and discuss its many exciting practical implications.
https://anakli.inf.ethz.ch/papers/ephemeral_per_query_engines_serverless_SDA_VLDB23.pdf
-
Rethinking Serverless Computing: from the Programming Model to the Platform Design
Gustavo Alonso, Ana Klimovic, Tom Kuchler and Michael Wawrzoniak
Serverless computing offers a number of advantages over conventional, Virtual Machine (VM) based deployments on the cloud, e.g., greater elasticity, simplicity of use and management, finer granularity billing, and rapid deployment and start up times. Naturally, there is a growing interest in exploring how to run applications in this new environment and data analytics is not an exception. Unfortunately, current serverless platforms are limited along several dimensions, which makes things quite difficult from the perspective of data analytics. In this paper we explore what serverless has to offer today, what is missing, and what can be done to make serverless a better computing platform in general and for data analytics in particular.
https://anakli.inf.ethz.ch/papers/rethinking_serverless_SDA_VLDB23.pdf