Apache Spark on Nomad

This repository is a fork of Apache Spark that natively supports using HashiCorp's Nomad as Spark's cluster manager (as an alternative to Hadoop YARN and Mesos). When running on Nomad, the Spark executors that run tasks for your Spark application, and optionally the application driver itself, run as Nomad tasks in a Nomad job.

Sample spark-submit command when using Nomad:

spark-submit \
  --class org.apache.spark.examples.JavaSparkPi \
  --master nomad \
  --deploy-mode cluster \
  --conf spark.executor.instances=4 \
  --conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/nomad-spark/spark-2.1.0-bin-nomad.tgz \
  https://s3.amazonaws.com/nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar 100
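
If you submit the same example repeatedly with different sizes, it can help to generate the command rather than retype it. The following is a minimal sketch, not part of this repository: the function name `nomad_spark_submit_cmd` and its two parameters are hypothetical. It only prints the command, reusing the flags and URLs from the sample above, so you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not shipped with this fork):
# prints a cluster-mode spark-submit command for Nomad without executing it.
#   $1 = number of executors (defaults to 4, as in the sample above)
#   $2 = trailing argument passed to the example application (defaults to 100)
nomad_spark_submit_cmd() {
  executors="${1:-4}"
  app_arg="${2:-100}"

  # echo joins its arguments with single spaces, producing a one-line command.
  echo "spark-submit" \
    "--class org.apache.spark.examples.JavaSparkPi" \
    "--master nomad" \
    "--deploy-mode cluster" \
    "--conf spark.executor.instances=${executors}" \
    "--conf spark.nomad.sparkDistribution=https://s3.amazonaws.com/nomad-spark/spark-2.1.0-bin-nomad.tgz" \
    "https://s3.amazonaws.com/nomad-spark/spark-examples_2.11-2.1.0-SNAPSHOT.jar ${app_arg}"
}

# Print a command that requests 8 executors and passes 200 to the application.
nomad_spark_submit_cmd 8 200
```

Printing instead of executing keeps the sketch safe to run on a machine without Spark or Nomad installed. Once the output looks right, paste it into a shell that has this fork's `spark-submit` on its path.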

The ultimate goal is to integrate Nomad into Spark directly, either natively or via a backend/scheduler plugin interface.

Benefits of Spark on Nomad

Nomad's design is heavily inspired by Google's work on both Borg and Omega. This has enabled a set of features that make Nomad well suited to running analytical applications. Particularly relevant are its native support for batch workloads and its parallelized, high-throughput scheduling (see Nomad's documentation on scheduler internals for details).

Nomad is easy to set up and use. It consists of a single binary/process, has a simple and intuitive data model, uses a declarative job specification, and supports high availability and multi-datacenter federation out of the box. Nomad also integrates seamlessly with HashiCorp's other runtime tools: Consul and Vault.

Getting Started

To get started, see Nomad's official Apache Spark Integration Guide. You can also use Nomad's example Terraform configuration and embedded Spark quickstart to give the integration a test drive on AWS. Builds are currently available for Spark 2.1.0 and 2.1.1.
