KOMPASS (Kubernetes Orchestrated Multitenant Parallel Autoscaling Spark Server)

KOMPASS provides one-click deployment of your team's Apache Spark jobs, with automated and resilient scaling of a multi-user, cloud-based Spark server built on AWS, Kubernetes, and Prometheus.

Link

Usage

Prerequisites

The following packages must be installed and configured locally:

terraform
docker
kubectl
aws-cli

For more information, see the links to the specific project sites below.
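As a quick sanity check before running the setup, the sketch below reports which of the listed tools are not on your PATH. The function name check_prereqs is illustrative and not part of KOMPASS.

```shell
# Print each required command that is not on PATH.
# Returns non-zero if at least one is missing.
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The aws-cli package installs a binary named "aws".
if check_prereqs terraform docker kubectl aws; then
    echo "all prerequisites found"
fi
```

This only confirms that the tools are installed. Whether the AWS CLI has working credentials can be checked separately with aws sts get-caller-identity.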

A working Apache Spark 2.4.0 Docker image must be published to your Docker Hub account and be accessible from the deployed EKS cluster. The image build is automated in the relevant Apache Spark source distributions, as detailed in the link below. Copy the following folders from the distribution into the frontend/Docker-Image folder: bin, sbin, jars, and examples. Alternatively, place the whole distribution folder there. Future versions will automate this step.
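The copy step can be scripted until it is automated. The following is a minimal sketch, assuming the Spark distribution is unpacked locally; the function name stage_spark_dist and the SPARK_HOME variable in the usage comment are illustrative and do not come from the KOMPASS scripts.

```shell
# Copy the four Spark distribution folders into the front-end image folder.
# Usage: stage_spark_dist <spark-distribution-dir> <destination-dir>
stage_spark_dist() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for dir in bin sbin jars examples; do
        if [ ! -d "$src/$dir" ]; then
            echo "missing folder: $src/$dir" >&2
            return 1
        fi
        cp -R "$src/$dir" "$dest/" || return 1
    done
}

# Example, run from the repository root (SPARK_HOME is an assumption):
#   stage_spark_dist "$SPARK_HOME" frontend/Docker-Image
#
# The Spark image itself can be built and pushed with the tool shipped in the
# Spark distribution:
#   ./bin/docker-image-tool.sh -r <your-dockerhub-user> -t 2.4.0 build
#   ./bin/docker-image-tool.sh -r <your-dockerhub-user> -t 2.4.0 push
```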

Set-up

Edit the Docker Hub user name to reflect your public repository. To use the automated deployment, run:

./setup-kompass.sh

This runs Terraform to build out the infrastructure, builds and publishes the front-end Docker image, deploys the associated services, and deploys the autoscaling features.
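The stages can be pictured with the dry-run outline below, which only prints commands and executes nothing. It is a hypothetical sketch and not the contents of the setup script: the folder names come from the repository layout, while the image name kompass-frontend, the DOCKER_USER and CLUSTER_NAME variables, and the assumption that each folder holds manifests suitable for kubectl apply are illustrative.

```shell
# Hypothetical outline of the deployment stages; prints commands, runs nothing.
DOCKER_USER=${DOCKER_USER:-your-dockerhub-user}    # assumption: your Docker Hub account
CLUSTER_NAME=${CLUSTER_NAME:-your-eks-cluster}     # assumption: name of the EKS cluster
IMAGE="$DOCKER_USER/kompass-frontend:latest"       # assumption: illustrative image name

print_plan() {
    cat <<EOF
(cd iac && terraform init && terraform apply)
aws eks update-kubeconfig --name $CLUSTER_NAME
docker build -t $IMAGE frontend/Docker-Image
docker push $IMAGE
kubectl apply -f frontend/
kubectl apply -f monitoring/
kubectl apply -f autoscaling/
EOF
}

print_plan
```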

Run

Run kubectl get svc kompass-service to obtain the IP address of the front end, and enter it into a web browser. In the input fields, you can select the number of nodes, the Spark example Java class, and a modifier to add to the call to the examples.jar file. Submitting the form runs a Spark application and writes the stdout from spark-submit to the web browser.
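To extract only the address instead of reading it from the table output, kubectl's jsonpath output can be used. The sketch below assumes the service is exposed through a load balancer and that the front end answers over plain HTTP on the default port; neither is stated here, and the function name kompass_url is illustrative.

```shell
# Print a URL for the front-end service, or fail if no address is assigned yet.
# Usage: kompass_url [service-name]
kompass_url() {
    svc=${1:-kompass-service}
    # AWS load balancers usually report a hostname; other setups report an IP.
    host=$(kubectl get svc "$svc" -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
    if [ -z "$host" ]; then
        host=$(kubectl get svc "$svc" -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
    fi
    if [ -z "$host" ]; then
        echo "no external address yet for $svc" >&2
        return 1
    fi
    # Assumption: plain HTTP on the default port.
    printf 'http://%s\n' "$host"
}

# Example:
#   kompass_url
```

A load balancer can take a few minutes to receive an address after deployment, so an empty result shortly after setup is not necessarily an error.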

Introduction

KOMPASS allows multiple users to run many different Spark applications efficiently on the same EKS resource. It autoscales the number of EC2 instances and Spark clusters to meet demand at any given time. Custom metrics from Prometheus are used to predict upcoming resource needs, so that enough instances are available when Spark applications need to run; KOMPASS then scales the number of instances back down to reduce AWS costs. It is automatically deployable with infrastructure as code and fully containerized, so developers can spend more time on their Spark applications and less time managing the infrastructure and dependencies needed to run them.

josiahbjorgaard/KOMPASS
