Documentation: User Guide
Community: Slack Channel, StackOverflow tag, Email
Contributing: Contribution Guide
Issue Tracker: Jira
License: Apache 2.0
InsightEdge is a Spark distribution on top of an in-memory Data Grid: a single platform for analytical and transactional workloads.
- Exposes Data Grid as Spark RDDs
- Saves Spark RDDs to Data Grid
- Full DataFrames API support with persistence
- Geospatial API for RDDs and DataFrames, with geospatial indexes
- Transparent integration with SparkContext using Scala implicits
- Data Grid side filtering with the ability to apply indexes
- Running SQL queries in Spark over Data Grid
- Data locality between Spark and Data Grid nodes
- Storing MLlib models in Data Grid
- Continuously saving Spark Streaming computation to Data Grid
- Off-Heap persistence
- Interactive Web Notebook
- Python support
InsightEdge is built using Apache Maven.
First, compile and install InsightEdge Core libraries:
# without unit tests
mvn clean install -DskipTests=true
# with unit tests
mvn clean install
To build the InsightEdge zip distribution, you need the following binary dependencies:
- insightedge-datagrid 12.0.0: find build instructions in repository readme or download release from the website
- insightedge-examples: use the same branch as in this repo, find build instructions in repository readme
- insightedge-zeppelin: use the same branch as in this repo, build with
mvn clean install -DskipTests -P spark-1.6 -P build-distr
- Apache Spark 1.6.1: download zip
Package the InsightEdge distribution:
mvn clean package -P package-community -DskipTests=true -Ddist.spark=<path to spark.tgz> -Ddist.xap=<path to xap.zip> -Ddist.zeppelin=<path to zeppelin.tar.gz> -Ddist.examples=<path to examples.zip>
The archive is generated under the insightedge-packager/target/community directory. The archive content is under insightedge-packager/target/contents-community.
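The packaging command above takes four artifact paths, and a mistyped path may only surface late in the Maven run. As a minimal sketch, the helper below checks that each file exists and then prints the command for review. The `./deps/...` locations and the function names `package_cmd` and `missing_artifacts` are hypothetical and not part of InsightEdge; only the Maven profile and the `-Ddist.*` properties come from the command above.

```shell
#!/bin/sh
# Hypothetical artifact locations: adjust to wherever you built or downloaded them.
SPARK_TGZ="${SPARK_TGZ:-./deps/spark.tgz}"
XAP_ZIP="${XAP_ZIP:-./deps/xap.zip}"
ZEPPELIN_TGZ="${ZEPPELIN_TGZ:-./deps/zeppelin.tar.gz}"
EXAMPLES_ZIP="${EXAMPLES_ZIP:-./deps/examples.zip}"

# Print the packaging command for the four artifact paths (spark, xap, zeppelin, examples).
package_cmd() {
  printf 'mvn clean package -P package-community -DskipTests=true -Ddist.spark=%s -Ddist.xap=%s -Ddist.zeppelin=%s -Ddist.examples=%s\n' \
    "$1" "$2" "$3" "$4"
}

# Print every argument that is not an existing regular file, one per line.
missing_artifacts() {
  for f in "$@"; do
    [ -f "$f" ] || printf '%s\n' "$f"
  done
}

# Dry run: report missing files, otherwise show the command that would be executed.
missing=$(missing_artifacts "$SPARK_TGZ" "$XAP_ZIP" "$ZEPPELIN_TGZ" "$EXAMPLES_ZIP")
if [ -n "$missing" ]; then
  echo "Missing artifacts:" >&2
  echo "$missing" >&2
else
  echo "Would run: $(package_cmd "$SPARK_TGZ" "$XAP_ZIP" "$ZEPPELIN_TGZ" "$EXAMPLES_ZIP")"
fi
```

The script only prints the command; copy the printed line into your shell once the paths look right.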
To run integration tests, refer to the wiki page.
Build the project, then start InsightEdge in demo mode:
cd insightedge-packager/target/contents-community
./sbin/insightedge.sh --mode demo
This starts Zeppelin at http://127.0.0.1:8090 with the InsightEdge tutorial and example notebooks that you can experiment with. The full documentation is available on the website.
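Demo mode may take a little while before the notebook answers requests. A minimal sketch for waiting on it from a script is shown below; it assumes `curl` is installed and that the Zeppelin root URL returns a successful HTTP status once it is ready. The function name `wait_for_url` and its default attempt count and delay are illustrative choices, not part of InsightEdge.

```shell
#!/bin/sh
# Poll a URL until it responds successfully.
#   $1: URL to poll
#   $2: maximum number of attempts (default 30)
#   $3: seconds to sleep between attempts (default 2)
# Returns 0 as soon as one request succeeds, 1 if every attempt fails.
wait_for_url() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failures, -o /dev/null: discard the body
    if curl -s -f -o /dev/null --max-time 2 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, after launching demo mode you could run `wait_for_url http://127.0.0.1:8090 && echo "Zeppelin is up"` before opening the notebooks in a browser.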