Teaching Materials for Distributed Statistical Computing

feng-li/Distributed-Statistical-Computing

Distributed Statistical Computing: Lecture Notes and Case Studies

Developed by

Feng Li
School of Statistics and Mathematics
Central University of Finance and Economics
[email protected]

Course Homepage

https://feng.li/distcomp/

Books (companion textbook, in Chinese)

  • Distributed Statistical Computing for Big Data and Case Studies (大数据分布式计算与案例) ISBN:9787300230276

  • New version (In Preparation)

Contents

Teaching slides and demo code (in Jupyter Notebook format)

Quick View

You can view all the notebooks in this repository via the Jupyter Notebook Viewer.

Run the demo locally

Requirements for running the notebooks interactively:

  • Python (>= 3.6.0)

    • findspark (to invoke Spark from a Python session)
    • numpy, scipy, pandas
  • Hadoop (>= 2.7.0)

  • Hive (>= 2.3.3)

  • Spark (>= 2.3.1)

  • Jupyter Notebook (>= 5.0)
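
The Python side of the requirements above can be checked before opening a notebook. The following is a minimal sketch that uses only the standard library. The helper name `check_python_requirements` is hypothetical and not part of this repository, and the script does not verify the Hadoop, Hive, Spark, or Jupyter versions.

```python
import importlib.util
import sys

# Package names and minimum Python version taken from the requirements list above.
REQUIRED_PACKAGES = ["findspark", "numpy", "scipy", "pandas"]
MIN_PYTHON = (3, 6, 0)


def check_python_requirements(packages=REQUIRED_PACKAGES, min_python=MIN_PYTHON):
    """Return (python_ok, missing).

    python_ok is True when the running interpreter is at least min_python.
    missing lists the top-level packages that cannot be located for import.
    This helper is an illustrative assumption, not part of the course code.
    """
    python_ok = sys.version_info[:3] >= min_python
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_python_requirements()
    print("Python >= 3.6.0:", python_ok)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

If nothing is missing, a notebook would typically call `findspark.init()` before `import pyspark` so that the local Spark installation can be found from the Python session.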

Distributed Statistical Computing Cases in Markdown and TeX format (outdated).