The Scala examples require a recent JVM and `sbt` to run. Typing `sbt run` from the `lr` subdirectory (not this directory) should download missing dependencies and run the examples. Note that having sane package dependency management and resolution is one of the many advantages of working in Scala (and on the JVM, more generally). The code for these examples should continue to build and run without problems for many years, irrespective of any developments in the libraries that the code depends on.
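For example, from the root of this repository, a first run might look something like the following (illustrative; the first invocation will also fetch sbt plugins and library dependencies, so it may take a while):

```
cd lr
sbt run
```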
See the `sbt` web page for installation instructions. But also note that coursier can be used to install a complete Scala development environment (including JVMs, `sbt`, the Scala compiler, etc.), so that is also worth considering.
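As a sketch, coursier's `setup` command will bootstrap a complete environment in one step; check the coursier documentation for the recommended way to install the `cs` launcher itself on your platform:

```
cs setup
```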
To run a particular example, do, e.g.

```
sbt "runMain rwmh"
```
Note that `sbt` is also designed to be used interactively, e.g. type `sbt` to get an sbt prompt, and then type `run` at the sbt prompt.
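A session might look roughly like this (the prompt label reflects the project name, so treat the details as illustrative):

```
$ sbt
sbt:lr> run
sbt:lr> runMain rwmh
sbt:lr> exit
```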
The Apache Spark example requires a Spark installation in addition to `sbt`. Running `sbt assembly` from the `spark` subdirectory will produce a jar that can be submitted to a Spark cluster using `spark-submit`. See the Spark docs for more information on installing and using Spark clusters.
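A submission might look roughly like the following; the jar path and main class name here are hypothetical placeholders, so substitute whatever `sbt assembly` actually prints and whatever the example's main object is really called:

```
cd spark
sbt assembly
# hypothetical jar path and class name - use the ones assembly reports
spark-submit --class "SparkExample" --master "local[4]" \
  target/scala-2.12/spark-example-assembly-0.1.jar
```

Here `--master "local[4]"` runs Spark locally with four threads; for a real cluster, point `--master` at the cluster instead.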
Note that Spark is intended for working with very large datasets. On small datasets it will be much slower than non-Spark Scala code.
If you want to learn more about Scala, the (free) on-line video series *Scala at light speed* from Rock the JVM is quite a good place to start. For more on scientific and statistical computing, follow up with my on-line course, *Scala for statistical computing and data science*.