Distributed Statistical Computing (Big Data Distributed Computing: Lecture Notes and Case Studies)

developed by

Feng Li
School of Statistics and Mathematics
Central University of Finance and Economics

Course Homepage

https://feng.li/distcomp/

Books (companion textbook in Chinese)

Distributed Statistical Computing for Big Data and Case Studies (大数据分布式计算与案例), ISBN: 9787300230276
- Available at JD.COM
New version (In Preparation)
- HTML format
- Source

Contents

Teaching slides and demo code (in Jupyter Notebook format)

Quick View

You can view all the notebooks in this repository via the Jupyter Notebook Viewer.

Run the demo locally

Requirements for running the notebooks interactively

Python (>= 3.6.0)
- findspark (to invoke Spark from a Python session)
- numpy, scipy, pandas
Hadoop (>= 2.7.0)
Hive (>= 2.3.3)
Spark (>= 2.3.1)
Jupyter Notebook (>= 5.0)
- RISE (for Jupyter slides)
  
  Use Alt+R to enter slideshow mode
- Bash Kernel (for Linux and Hadoop, Hive, Spark batch mode)
- IPython kernel for Python 3 (for Interactive PySpark Sessions)
- HiveQL Kernel (for Interactive Hive Sessions)
- Spark Toree (for Interactive Spark Scala Sessions)
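
The requirements above can be checked before opening the notebooks. The following is a minimal sketch, not part of the repository: the helper names (`parse_version`, `meets_minimum`, `missing_modules`, `start_spark_session`) and the application name `distcomp-demo` are hypothetical, and the session helper assumes that findspark and PySpark are installed and that Spark can be located (for example through `SPARK_HOME`).

```python
import importlib.util
import sys

# Minimum Python version and Python packages taken from the list above.
MIN_PYTHON = (3, 6, 0)
REQUIRED_MODULES = ["findspark", "numpy", "scipy", "pandas"]


def parse_version(text, width=3):
    """Turn a dotted version string such as '2.3.1' into a tuple of ints,
    padded with zeros so that '2.3' and '2.3.0' compare as equal."""
    parts = [int(part) for part in text.split(".") if part.isdigit()]
    return tuple(parts + [0] * (width - len(parts)))


def meets_minimum(found, minimum):
    """Return True when the `found` version is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def missing_modules(names):
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def start_spark_session(app_name="distcomp-demo"):
    """Start a PySpark session via findspark (hypothetical helper).

    Assumes findspark and PySpark are installed and Spark can be found.
    """
    import findspark

    findspark.init()  # adds the local Spark installation to sys.path
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).getOrCreate()


if __name__ == "__main__":
    print("Python OK:", sys.version_info[:3] >= MIN_PYTHON)
    print("Missing Python packages:", missing_modules(REQUIRED_MODULES))
```

The version strings reported by `hadoop version`, `hive --version`, or `spark-submit --version` could be passed to `meets_minimum` in the same way, for example `meets_minimum("2.4.0", "2.3.1")` for Spark.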

Folders and files

- L00-Linux-Basics
- L01-Introduction-to-Distributed-Computing
- L02-Understanding-MapReduce
- L03-Statistical-Modeling-with-MapReduce
- L04-Hive
- L05-Introduction-to-Spark
- L06-Data-Processing-with-Spark
- L07-Machine-Learning-with-Spark
- L08-Modelling-Streaming-Data-with-Spark
- L09-Spark-with-Scala
- L10-Spark-Visualization
- L11-Advanced-Statistical-Modelling-with-Spark
- book-examples
- data
- distcomp-tutorial
- .gitattributes
- .gitignore
- Makefile
- README.md
- setup_rise.py
- setup_spark.sh

Distributed Statistical Computing Cases in Markdown and TeX format (outdated).

About

Releases 1

Packages

Languages

feng-li/Distributed-Statistical-Computing

Folders and files

Latest commit

History

Repository files navigation

Distributed Statistical Computing (大数据分布式计算——教学讲义以及案例)

Course Homepage （课程主页）

Books (中文配套教材)

Contents (目录)

Teaching slides and demo code (with Jupyter Notebook format)

Quick View

Run the demo locally

Distributed Statistical Computing Cases in markdown and tex format (out dated).

About

Topics

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages