Apache Spark with native support for Nomad as a scheduler

aashishs101/nomad-spark

Apache Spark on Nomad

This repository is a fork of Apache Spark that natively supports using HashiCorp's Nomad as Spark's cluster manager (as an alternative to Hadoop YARN and Mesos). When running on Nomad, the Spark executors that run tasks for your Spark application, and optionally the application driver itself, run as Nomad tasks in a Nomad job.

Sample spark-submit command when using Nomad:

spark-submit \
  --class org.apache.spark.examples.JavaSparkPi \
  --master nomad \
  --deploy-mode cluster \
  --conf spark.executor.instances=4 \
  --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/nomad-spark/spark-2.1.0-bin-nomad.tgz \
  https://s3.amazonaws.com/nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar 100
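Since the driver only optionally runs as a Nomad task, the deploy mode is the part of this command most likely to vary. The sketch below assembles the same argument list with the deploy mode and executor count as parameters. `build_submit_args` is a hypothetical helper written for illustration, not something this repository ships, and it assumes that Spark's standard `--deploy-mode client` value (driver stays on the submitting machine) is supported by the build you are using; check the integration guide before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): prints the spark-submit
# arguments from the sample above, one per line, with the deploy mode and
# executor count supplied by the caller.
build_submit_args() {
  local deploy_mode="$1"   # "cluster" (driver runs as a Nomad task) or "client" (assumed supported)
  local executors="$2"     # value for spark.executor.instances
  printf '%s\n' \
    --class org.apache.spark.examples.JavaSparkPi \
    --master nomad \
    --deploy-mode "$deploy_mode" \
    --conf "spark.executor.instances=${executors}" \
    --conf "spark.nomad.sparkDistribution=https://s3.amazonaws.com/nomad-spark/spark-2.1.0-bin-nomad.tgz" \
    https://s3.amazonaws.com/nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar \
    100
}

# Dry run: show the arguments that would be passed for a client-mode submission.
build_submit_args client 4
```

Printing the arguments instead of invoking `spark-submit` keeps the sketch safe to run on a machine without Spark installed. Because none of the arguments contain whitespace, the output can be piped to `xargs spark-submit` once the values look right.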

The ultimate goal is to integrate Nomad support into Apache Spark itself, either natively or via a backend/scheduler plugin interface.

Benefits of Spark on Nomad

Nomad's design is heavily inspired by Google's work on both Borg and Omega. This has enabled a set of features that make Nomad well suited to running analytical applications. Particularly relevant are its native support for batch workloads and its parallelized, high-throughput scheduling (see Nomad's documentation on scheduler internals for details).

Nomad is easy to set up and use. It consists of a single binary/process, has a simple and intuitive data model, uses a declarative job specification, and supports high availability and multi-datacenter federation out of the box. Nomad also integrates with HashiCorp's other runtime tools, Consul and Vault.

Getting Started

To get started, see Nomad's official Apache Spark Integration Guide. You can also use Nomad's example Terraform configuration and embedded Spark quickstart to give the integration a test drive on AWS. Builds are currently available for Spark 2.1.0 and 2.1.1.
