Commit

Docker image in doc site (#1117)
* update

* Update index.md

* Update index.md

* Update README.md
cynthia-liu authored and glorysdj committed Jan 24, 2019
1 parent 212a894 commit 59006b7
Showing 2 changed files with 299 additions and 6 deletions.
151 changes: 151 additions & 0 deletions README.md
@@ -56,6 +56,12 @@ In addition, Analytics Zoo also provides a rich set of analytics and AI support

- [Reference use cases](#reference-use-cases): a collection of end-to-end *reference use cases* (e.g., anomaly detection, sentiment analysis, fraud detection, image augmentation, object detection, variational autoencoder, etc.)

- [Docker images and builders](#docker-images-and-builders)
  - [Analytics-Zoo in Docker](#analytics-zoo-in-docker)
  - [How to build it](#how-to-build-it)
  - [How to use the image](#how-to-use-the-image)
  - [Notice](#notice)

## _Distributed TensorFlow and Keras on Spark/BigDL_
To make it easy to build and productionize deep learning applications for Big Data, Analytics Zoo provides a unified analytics + AI platform that seamlessly unites Spark, TensorFlow, Keras and BigDL programs into an integrated pipeline (as illustrated below), which can then transparently run on large-scale Hadoop/Spark clusters for distributed training and inference. (Please see more details [here](https://analytics-zoo.github.io/master/#ProgrammingGuide/tensorflow/)).

@@ -301,3 +307,148 @@ Using *Analytics Zoo Image Classification API* (including a set of pretrained de

## _Reference use cases_
Analytics Zoo provides a collection of end-to-end reference use cases, including *time series anomaly detection*, *sentiment analysis*, *fraud detection*, *image similarity*, etc. (See more details [here](https://analytics-zoo.github.io/master/#ProgrammingGuide/usercases-overview/))

## _Docker images and builders_

### _Analytics-Zoo in Docker_

**By default, the Analytics-Zoo image comes with the following packages installed:**
- git
- maven
- Oracle jdk 1.8.0_152 (in /opt/jdk1.8.0_152)
- python 2.7.6
- pip
- numpy
- scipy
- pandas
- scikit-learn
- matplotlib
- seaborn
- jupyter
- wordcloud
- moviepy
- requests
- tensorflow
- spark-${SPARK_VERSION} (in /opt/work/spark-${SPARK_VERSION})
- Analytics-Zoo distribution (in /opt/work/analytics-zoo-${ANALYTICS_ZOO_VERSION})
- Analytics-Zoo source code (in /opt/work/analytics-zoo)

**The working directory for Analytics-Zoo is /opt/work. It contains:**
- download-analytics-zoo.sh downloads the Analytics-Zoo distributions.
- start-notebook.sh starts the Jupyter notebook. You can specify environment settings and Spark settings to start a notebook with a particular configuration.
- analytics-zoo-${ANALYTICS_ZOO_VERSION} is the home directory of the Analytics-Zoo distribution.
- analytics-zoo-SPARK_x.x-x.x.x-dist.zip is the zip file of the Analytics-Zoo distribution.
- spark-${SPARK_VERSION} is the Spark home.
- analytics-zoo is cloned from https://github.com/intel-analytics/analytics-zoo and contains apps and examples that use Analytics-Zoo.

### _How to build it_

**By default, you can build an analytics-zoo:default image with the latest nightly-build Analytics-Zoo distribution:**

```bash
sudo docker build --rm -t intelanalytics/analytics-zoo:default .
```

**If you need HTTP and HTTPS proxies to build the image:**
```bash
sudo docker build \
--build-arg http_proxy=http://your-proxy-host:your-proxy-port \
--build-arg https_proxy=https://your-proxy-host:your-proxy-port \
--rm -t intelanalytics/analytics-zoo:default .
```

**You can also specify ANALYTICS_ZOO_VERSION, BIGDL_VERSION and SPARK_VERSION to build a specific Analytics-Zoo image:**
```bash
sudo docker build \
--build-arg http_proxy=http://your-proxy-host:your-proxy-port \
--build-arg https_proxy=https://your-proxy-host:your-proxy-port \
--build-arg ANALYTICS_ZOO_VERSION=0.3.0 \
--build-arg BIGDL_VERSION=0.6.0 \
--build-arg SPARK_VERSION=2.3.1 \
--rm -t intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1 .
```
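
The tag in the example above appears to follow the pattern `<zoo-version>-bigdl_<bigdl-version>-spark_<spark-version>`. As a minimal sketch, assuming that pattern holds, the tag can be derived from the same three versions so that the build arguments and the tag stay in sync. The helper name `az_image_tag` is hypothetical and not part of Analytics-Zoo, and the script only prints the build command rather than running it.

```bash
# Hypothetical helper (not part of Analytics-Zoo): derive the image tag
# from the three versions that are passed as build arguments.
# $1 = ANALYTICS_ZOO_VERSION, $2 = BIGDL_VERSION, $3 = SPARK_VERSION
az_image_tag() {
  printf 'intelanalytics/analytics-zoo:%s-bigdl_%s-spark_%s\n' "$1" "$2" "$3"
}

ANALYTICS_ZOO_VERSION=0.3.0
BIGDL_VERSION=0.6.0
SPARK_VERSION=2.3.1
IMAGE_TAG=$(az_image_tag "$ANALYTICS_ZOO_VERSION" "$BIGDL_VERSION" "$SPARK_VERSION")

# Print the build command for review instead of running it.
echo "sudo docker build" \
  "--build-arg ANALYTICS_ZOO_VERSION=$ANALYTICS_ZOO_VERSION" \
  "--build-arg BIGDL_VERSION=$BIGDL_VERSION" \
  "--build-arg SPARK_VERSION=$SPARK_VERSION" \
  "--rm -t $IMAGE_TAG ."
```

Keeping the versions in variables means a version bump only has to be made in one place.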

### _How to use the image_
**To start a notebook directly on a specified port (e.g. 12345), use one of the commands below. You can then view the notebook at http://[host-ip]:12345**
```bash
sudo docker run -it --rm -p 12345:12345 \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:default

sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:default

sudo docker run -it --rm -p 12345:12345 \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1

sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1
```

**If you need HTTP and HTTPS proxies in your environment:**
```bash
sudo docker run -it --rm -p 12345:12345 \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
intelanalytics/analytics-zoo:default

sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
intelanalytics/analytics-zoo:default

sudo docker run -it --rm -p 12345:12345 \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1

sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1
```
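
If the same launch script is shared between machines with and without a proxy, the proxy flags could be added only when they are needed. The following is a minimal sketch under that assumption; the helper name `proxy_env_flags` is hypothetical, and the script prints the resulting command rather than running it.

```bash
# Hypothetical helper (not part of Analytics-Zoo): emit "-e" flags only for
# the proxy variables that are set and non-empty in the calling shell.
proxy_env_flags() {
  flags=""
  if [ -n "${http_proxy:-}" ]; then
    flags="$flags -e http_proxy=$http_proxy"
  fi
  if [ -n "${https_proxy:-}" ]; then
    flags="$flags -e https_proxy=$https_proxy"
  fi
  # Strip the leading space, if any.
  printf '%s\n' "${flags# }"
}

# Print the resulting command for review instead of running it.
echo "sudo docker run -it --rm -p 12345:12345" \
  "-e NotebookPort=12345 -e NotebookToken=your-token" \
  "$(proxy_env_flags)" \
  "intelanalytics/analytics-zoo:default"
```

When neither variable is set, the helper prints an empty line and the command matches the proxy-free form shown earlier.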

**You can also start the container first:**
```bash
sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:default bash
```

**In the container, after setting the proxy and ports, you can start the notebook with:**
```bash
/opt/work/start-notebook.sh
```

### _Notice_
**If you need the nightly build version of Analytics-Zoo, please pull the image from Docker Hub with:**
```bash
sudo docker pull intelanalytics/analytics-zoo:latest
```

**Please follow the README in each app folder to test the Jupyter notebooks.**

**With version 0.3+ of the Analytics-Zoo Docker image, you can specify the Spark runtime configuration:**
```bash
sudo docker run -itd --net=host \
-e NotebookPort=12345 \
-e NotebookToken="1234qwer" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
-e RUNTIME_DRIVER_CORES_ENV=4 \
-e RUNTIME_DRIVER_MEMORY=20g \
-e RUNTIME_EXECUTOR_CORES=4 \
-e RUNTIME_EXECUTOR_MEMORY=20g \
-e RUNTIME_TOTAL_EXECUTOR_CORES=4 \
intelanalytics/analytics-zoo:latest
```
154 changes: 148 additions & 6 deletions docs/docs/index.md
@@ -56,6 +56,12 @@ In addition, Analytics Zoo also provides a rich set of analytics and AI support

- [Reference use cases](#reference-use-cases): a collection of end-to-end *reference use cases* (e.g., anomaly detection, sentiment analysis, fraud detection, image augmentation, object detection, variational autoencoder, etc.)

- [Docker images and builders](#docker-images-and-builders)
  - [Analytics-Zoo in Docker](#analytics-zoo-in-docker)
  - [How to build it](#how-to-build-it)
  - [How to use the image](#how-to-use-the-image)
  - [Notice](#notice)

## _Distributed TensorFlow and Keras on Spark/BigDL_
To make it easy to build and productionize deep learning applications for Big Data, Analytics Zoo provides a unified analytics + AI platform that seamlessly unites Spark, TensorFlow, Keras and BigDL programs into an integrated pipeline (as illustrated below), which can then transparently run on large-scale Hadoop/Spark clusters for distributed training and inference. (Please see more details [here](https://analytics-zoo.github.io/master/#ProgrammingGuide/tensorflow/)).

@@ -303,20 +309,156 @@ Using *Analytics Zoo Image Classification API* (including a set of pretrained de
## _Reference use cases_
Analytics Zoo provides a collection of end-to-end reference use cases, including *time series anomaly detection*, *sentiment analysis*, *fraud detection*, *image similarity*, etc. (See more details [here](https://analytics-zoo.github.io/master/#ProgrammingGuide/usercases-overview/))

## _Docker images and builders_

### _Analytics-Zoo in Docker_

**By default, the Analytics-Zoo image comes with the following packages installed:**
- git
- maven
- Oracle jdk 1.8.0_152 (in /opt/jdk1.8.0_152)
- python 2.7.6
- pip
- numpy
- scipy
- pandas
- scikit-learn
- matplotlib
- seaborn
- jupyter
- wordcloud
- moviepy
- requests
- tensorflow
- spark-${SPARK_VERSION} (in /opt/work/spark-${SPARK_VERSION})
- Analytics-Zoo distribution (in /opt/work/analytics-zoo-${ANALYTICS_ZOO_VERSION})
- Analytics-Zoo source code (in /opt/work/analytics-zoo)

**The working directory for Analytics-Zoo is /opt/work. It contains:**
- download-analytics-zoo.sh downloads the Analytics-Zoo distributions.
- start-notebook.sh starts the Jupyter notebook. You can specify environment settings and Spark settings to start a notebook with a particular configuration.
- analytics-zoo-${ANALYTICS_ZOO_VERSION} is the home directory of the Analytics-Zoo distribution.
- analytics-zoo-SPARK_x.x-x.x.x-dist.zip is the zip file of the Analytics-Zoo distribution.
- spark-${SPARK_VERSION} is the Spark home.
- analytics-zoo is cloned from https://github.com/intel-analytics/analytics-zoo and contains apps and examples that use Analytics-Zoo.

### _How to build it_

**By default, you can build an analytics-zoo:default image with the latest nightly-build Analytics-Zoo distribution:**

```bash
sudo docker build --rm -t intelanalytics/analytics-zoo:default .
```

**If you need HTTP and HTTPS proxies to build the image:**
```bash
sudo docker build \
--build-arg http_proxy=http://your-proxy-host:your-proxy-port \
--build-arg https_proxy=https://your-proxy-host:your-proxy-port \
--rm -t intelanalytics/analytics-zoo:default .
```

**You can also specify ANALYTICS_ZOO_VERSION, BIGDL_VERSION and SPARK_VERSION to build a specific Analytics-Zoo image:**
```bash
sudo docker build \
--build-arg http_proxy=http://your-proxy-host:your-proxy-port \
--build-arg https_proxy=https://your-proxy-host:your-proxy-port \
--build-arg ANALYTICS_ZOO_VERSION=0.3.0 \
--build-arg BIGDL_VERSION=0.6.0 \
--build-arg SPARK_VERSION=2.3.1 \
--rm -t intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1 .
```

### _How to use the image_
**To start a notebook directly on a specified port (e.g. 12345), use one of the commands below. You can then view the notebook at http://[host-ip]:12345**
```bash
sudo docker run -it --rm -p 12345:12345 \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:default

sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:default

sudo docker run -it --rm -p 12345:12345 \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1

sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1
```
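
The four commands above differ only in the image tag and in how the notebook port is exposed: `-p 12345:12345` publishes the container port on the host, while `--net=host` shares the host's network stack. In both cases the port should match `NotebookPort`. As a minimal sketch, a hypothetical helper (`notebook_run_cmd`, not part of Analytics-Zoo) could assemble the command from one port value so the two never drift apart; it only prints the command.

```bash
# Hypothetical helper (not part of Analytics-Zoo): print a "docker run"
# command for the notebook.
# $1 = network mode ("publish" or "host"), $2 = port, $3 = token, $4 = image
notebook_run_cmd() {
  case "$1" in
    publish) net="-p $2:$2" ;;    # map the container port to the same host port
    host)    net="--net=host" ;;  # share the host's network stack
    *) echo "unknown mode: $1" >&2; return 1 ;;
  esac
  printf 'sudo docker run -it --rm %s -e NotebookPort=%s -e NotebookToken=%s %s\n' \
    "$net" "$2" "$3" "$4"
}

notebook_run_cmd publish 12345 your-token intelanalytics/analytics-zoo:default
notebook_run_cmd host 12345 your-token intelanalytics/analytics-zoo:default
```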

**If you need HTTP and HTTPS proxies in your environment:**
```bash
sudo docker run -it --rm -p 12345:12345 \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
intelanalytics/analytics-zoo:default

sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
intelanalytics/analytics-zoo:default

sudo docker run -it --rm -p 12345:12345 \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1

sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
intelanalytics/analytics-zoo:0.3.0-bigdl_0.6.0-spark_2.3.1
```

**You can also start the container first:**
```bash
sudo docker run -it --rm --net=host \
-e NotebookPort=12345 \
-e NotebookToken="your-token" \
intelanalytics/analytics-zoo:default bash
```

**In the container, after setting the proxy and ports, you can start the notebook with:**
```bash
/opt/work/start-notebook.sh
```

### _Notice_
**If you need the nightly build version of Analytics-Zoo, please pull the image from Docker Hub with:**
```bash
sudo docker pull intelanalytics/analytics-zoo:latest
```







**Please follow the README in each app folder to test the Jupyter notebooks.**

**With version 0.3+ of the Analytics-Zoo Docker image, you can specify the Spark runtime configuration:**
```bash
sudo docker run -itd --net=host \
-e NotebookPort=12345 \
-e NotebookToken="1234qwer" \
-e http_proxy=http://your-proxy-host:your-proxy-port \
-e https_proxy=https://your-proxy-host:your-proxy-port \
-e RUNTIME_DRIVER_CORES_ENV=4 \
-e RUNTIME_DRIVER_MEMORY=20g \
-e RUNTIME_EXECUTOR_CORES=4 \
-e RUNTIME_EXECUTOR_MEMORY=20g \
-e RUNTIME_TOTAL_EXECUTOR_CORES=4 \
intelanalytics/analytics-zoo:latest
```
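
The document does not show how start-notebook.sh consumes these variables. As an illustration only, and assuming they correspond to the similarly named `spark-submit` options, the sketch below prints the options that such a mapping would produce. The helper names `add_opt` and `spark_runtime_opts` are hypothetical, and the real script may behave differently.

```bash
# Illustration only: this is NOT the actual start-notebook.sh logic.
# Append "<flag> <value>" to $opts when the value is non-empty.
add_opt() {
  if [ -n "$2" ]; then
    opts="$opts $1 $2"
  fi
}

# Print spark-submit style options for whichever RUNTIME_* variables are set.
spark_runtime_opts() {
  opts=""
  add_opt --driver-cores "${RUNTIME_DRIVER_CORES_ENV:-}"
  add_opt --driver-memory "${RUNTIME_DRIVER_MEMORY:-}"
  add_opt --executor-cores "${RUNTIME_EXECUTOR_CORES:-}"
  add_opt --executor-memory "${RUNTIME_EXECUTOR_MEMORY:-}"
  add_opt --total-executor-cores "${RUNTIME_TOTAL_EXECUTOR_CORES:-}"
  # Strip the leading space, if any.
  printf '%s\n' "${opts# }"
}

# Same values as in the docker run example above.
RUNTIME_DRIVER_CORES_ENV=4
RUNTIME_DRIVER_MEMORY=20g
RUNTIME_EXECUTOR_CORES=4
RUNTIME_EXECUTOR_MEMORY=20g
RUNTIME_TOTAL_EXECUTOR_CORES=4
spark_runtime_opts
```

Variables that are left unset contribute nothing, so Spark's own defaults would apply for them under this assumed mapping.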


