Interesting ETH paper on the shortcomings of serverless computing for data analytics #19

carlosmalt (Maintainer) · 2023-08-25T23:42:23Z

Rethinking Serverless Computing: from the Programming Model to the Platform Design

Gustavo Alonso, Ana Klimovic, Tom Kuchler and Michael Wawrzoniak

Serverless computing offers a number of advantages over conventional, Virtual Machine (VM) based deployments on the cloud, e.g., greater elasticity, simplicity of use and management, finer granularity billing, and rapid deployment and start up times. Naturally, there is a growing interest in exploring how to run applications in this new environment and data analytics is not an exception. Unfortunately, current serverless platforms are limited along several dimensions, which makes things quite difficult from the perspective of data analytics. In this paper we explore what serverless has to offer today, what is missing, and what can be done to make serverless a better computing platform in general and for data analytics in particular.

https://anakli.inf.ethz.ch/papers/rethinking_serverless_SDA_VLDB23.pdf

carlosmalt (Maintainer, Author) · 2023-08-25T23:49:58Z

And a companion paper:

Ephemeral Per-query Engines for Serverless Analytics

Michael Wawrzoniak, Rodrigo Bruno, Ana Klimovic and Gustavo Alonso

We challenge the common assumption that queries are submitted to a pre-configured, already running engine and put forward the idea of dynamically instantiating a chosen data processing engine upon query submission by leveraging Function-as-a-Service (FaaS) platforms. We demonstrate the idea by running unmodified data processing engines (we use Apache Drill as an initial example) on real-world serverless FaaS platforms and show that such engines can be instantiated on demand when a query arrives. We aim to eventually support a wide range of queries and workloads. Wide access to such functionality would be a game changer in data processing. First, it would enable pay-per-query models supporting sporadic, interactive data analysis on arbitrary engines. Second, it would significantly increase the flexibility for data processing by enabling the possibility of dynamically choosing the actual engine, its configuration, and the resource allocation on a per-query basis. Logically, this amounts to dynamically attaching a query engine to the query rather than sending the query to a pre-configured and already deployed engine. In this paper we elaborate on this vision, outline the design of the MetaQ prototype that we are building to explore the idea, demonstrate that it is realistic through initial experiments, and discuss its many exciting practical implications.

https://anakli.inf.ethz.ch/papers/ephemeral_per_query_engines_serverless_SDA_VLDB23.pdf
