A thin layer built to support the R language with TiSpark
- Download the TiSparkR source code and build the package by running
R CMD build R
in the TiSpark root directory. Then install it into your local R library, e.g. via
R CMD INSTALL TiSparkR_1.0.0.tar.gz
- Build or download the TiSpark dependency jar
tispark-core-1.0-RC1-jar-with-dependencies.jar
Then cd to your Spark home directory and run
./bin/sparkR --jars /where-ever-it-is/tispark-core-${version}-jar-with-dependencies.jar
replacing the jar path with the actual location of your TiSpark jar.
- Use it as below in your R console:
# Import the TiSparkR library
> library(TiSparkR)
# create a TiContext instance
> ti <- createTiContext(spark)
# Map the TiContext to the database tpch_test
> tidbMapDatabase(ti, "tpch_test")
# Run a SQL query
> customers <- sql("select * from customer")
# Print schema
> printSchema(customers)
root
|-- c_custkey: long (nullable = true)
|-- c_name: string (nullable = true)
|-- c_address: string (nullable = true)
|-- c_nationkey: long (nullable = true)
|-- c_phone: string (nullable = true)
|-- c_acctbal: decimal(15,2) (nullable = true)
|-- c_mktsegment: string (nullable = true)
|-- c_comment: string (nullable = true)
# Run a count query
> count <- sql("select count(*) from customer")
# Print count result
> head(count)
  count(1)
1      150
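
The launch step above can be wrapped in a small shell helper so that a wrong jar path fails early instead of starting SparkR without TiSpark. This is only a sketch: the function names `tispark_jar_path` and `launch_sparkr` are hypothetical and not part of TiSpark, and it assumes that the `SPARK_HOME` environment variable points at your Spark home directory.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of TiSpark.

# Compose the dependency jar path from a directory and a TiSpark version.
tispark_jar_path() {
  printf '%s/tispark-core-%s-jar-with-dependencies.jar\n' "$1" "$2"
}

# Start SparkR with the TiSpark jar. Return 1 early if the jar file or
# SPARK_HOME (assumed to be your Spark home directory) is missing.
launch_sparkr() {
  jar="$1"
  if [ ! -f "$jar" ]; then
    echo "TiSpark jar not found: $jar" >&2
    return 1
  fi
  if [ -z "${SPARK_HOME:-}" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  "$SPARK_HOME/bin/sparkR" --jars "$jar"
}
```

After sourcing these functions, a call such as `launch_sparkr "$(tispark_jar_path /where-ever-it-is 1.0-RC1)"` would start the R console with the jar attached, with the directory replaced by your own.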