Documentation: User Guide
Community: Slack Channel, StackOverflow tag, Email
Contributing: Contribution Guide
Issue Tracker: Jira
License: Apache 2.0
InsightEdge is a Spark distribution on top of in-memory Data Grid. A single platform for analytical and transactional workloads.
- Exposes Data Grid as Spark RDDs
- Saves Spark RDDs to Data Grid
- Full DataFrames and Dataset API support with persistence
- Geospatial API for RDD and DataFrames. Geospatial indexes.
- Transparent integration with SparkContext using Scala implicits
- Data Grid side filtering with ability apply indexes
- Running SQL queries in Spark over Data Grid
- Data locality between Spark and Data Grid nodes
- Storing MLlib models in Data Grid
- Continuously saving Spark Streaming computation to Data Grid
- Off-Heap persistence
- Interactive Web Notebook
- Python support
InsightEdge is built using Apache Maven.
First, compile and install InsightEdge Core libraries:
# without unit tests
mvn clean install -DskipTests=true
# with unit tests
mvn clean install
To build InsightEdge zip distribution you need the following binary dependencies:
- insightedge-datagrid 12.3.0: download a copy of the XAP 12.x Open Source Edition
- insightedge-examples: use the same branch as in this repo, find build instructions in repository readme
- insightedge-zeppelin: use the same branch as in this repo, run
./dev/change_scala_version.sh 2.11
, then build withmvn clean install -DskipTests -P spark-2.1 -P scala-2.11 -P build-distr -Dspark.version=2.1.1
- Apache Spark 2.3.0: download zip
Package InsightEdge distribution:
mvn clean package -P package-open -DskipTests=true -Ddist.spark=<path to spark.tgz> -Ddist.xap=file:///<path to xap.zip> -Ddist.zeppelin=<path to zeppelin.tar.gz> -Ddist.examples.target=<path to examples target>
The archive is generated under insightedge-packager/target/open
directory. The archive content is under insightedge-packager/target/contents-community
.
To run integration tests refer to the wiki page
Build the project and start InsightEdge demo mode with
cd insightedge-packager/target/contents-community
./bin/insightedge -demo
It starts Zeppelin at http://127.0.0.1:9090 with InsightEdge tutorial and example notebooks you can play with. The full documentation is available at website.