
Scala API

Magellan is hosted on Spark Packages.

When launching the Spark shell, Magellan can be included like any other Spark package via the --packages option:

> $SPARK_HOME/bin/spark-shell --packages harsha2010:magellan:1.0.4-s_2.11
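If you are building a standalone application rather than using the shell, Magellan can also be added as a library dependency. The snippet below is a minimal sketch; the resolver URL and Maven coordinates are assumptions based on how Spark Packages artifacts are usually published, so check the package page for the exact values for your Spark and Scala version:

// build.sbt (sketch; resolver and coordinates are assumptions, verify on Spark Packages)
resolvers += "Spark Packages" at "https://repos.spark-packages.org/"
libraryDependencies += "harsha2010" % "magellan" % "1.0.4-s_2.11"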

For the examples below, you will want the following imports:

import magellan.{Point, Polygon}
import org.apache.spark.sql.magellan.dsl.expressions._
import org.apache.spark.sql.types._

Data Structures

Point

val points = sc.parallelize(Seq(
    (-1.0, -1.0),
    (-1.0, 1.0),
    (1.0, -1.0))).
    toDF("x", "y").
    select(point($"x", $"y").as("point"))

points.show()

+-----------------+
|            point|
+-----------------+
|Point(-1.0, -1.0)|
| Point(-1.0, 1.0)|
| Point(1.0, -1.0)|
+-----------------+
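The point expression also works on columns of an existing DataFrame. A minimal sketch, assuming a hypothetical DataFrame df that already has numeric lon and lat columns:

// Hypothetical: df has numeric "lon" and "lat" columns
val withPoints = df.withColumn("point", point($"lon", $"lat"))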

Polygon

case class PolygonRecord(polygon: magellan.Polygon)

val square = Array(
  Point(1.0, 1.0),
  Point(1.0, -1.0),
  Point(-1.0, -1.0),
  Point(-1.0, 1.0),
  Point(1.0, 1.0))

val rect = Array(
  Point(0.0, 0.0),
  Point(2.0, 0.0),
  Point(2.0, 1.0),
  Point(0.0, 1.0),
  Point(0.0, 0.0))

// The first argument lists the start offset of each ring; Array(0) means a single outer ring
val polygons = sc.parallelize(Seq(
  PolygonRecord(Polygon(Array(0), square)),
  PolygonRecord(Polygon(Array(0), rect))
)).toDF()

polygons.show()

+--------------------+
|             polygon|
+--------------------+
|Polygon(5, Vector...|
|Polygon(5, Vector...|
+--------------------+

Predicates

within

polygons.select(point(0.5, 0.5).within($"polygon")).collect() // true, true
polygons.select(point(0.5, 0.0).within($"polygon")).collect() // true, false
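Since within returns a boolean column, it can also be used as a join or filter condition, just like intersects below. A minimal sketch using the same points and polygons DataFrames defined above:

// Keep only the (point, polygon) pairs where the point lies within the polygon
points.join(polygons).where($"point".within($"polygon")).show()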

intersects

points.join(polygons).where($"point".intersects($"polygon")).show()

+-----------------+--------------------+
|            point|             polygon|
+-----------------+--------------------+
|Point(-1.0, -1.0)|Polygon(5, Vector...|
| Point(-1.0, 1.0)|Polygon(5, Vector...|
| Point(1.0, -1.0)|Polygon(5, Vector...|
+-----------------+--------------------+