add model-visual page #1039

Open: wants to merge 10 commits into base `develop`
107 changes: 104 additions & 3 deletions README.md
@@ -24,7 +24,7 @@

VisualDL, a visualization analysis tool of PaddlePaddle, provides a variety of charts to show the trends of parameters, and visualizes model structures, data samples, histograms of tensors, PR curves , ROC curves and high-dimensional data distributions. It enables users to understand the training process and the model structure more clearly and intuitively so as to optimize models efficiently.

VisualDL provides various visualization functions, including **tracking metrics in real-time, visualizing the model structure, displaying the data sample, visualizing the relationship between hyperparameters and model metrics, presenting the changes of distributions of tensors, showing the pr curves, projecting high-dimensional data to a lower dimensional space and more.** Additionally, VisualDL provides VDL.service, which enables developers easily to save, track and share visualization results of experiments. For specific guidelines of each function, please refer to [**VisualDL User Guide**](./docs/components/README.md). For up-to-date experience, please feel free to try our [**Online Demo**](https://www.paddlepaddle.org.cn/paddle/visualdl/demo). Currently, VisualDL iterates rapidly and new functions will be continuously added.
VisualDL provides various visualization functions, including **tracking metrics in real time, visualizing the model structure, displaying data samples, visualizing the relationship between hyperparameters and model metrics, presenting changes in tensor distributions, showing PR curves, projecting high-dimensional data to a lower-dimensional space, visualizing per-layer model data, and more.** Additionally, VisualDL provides VDL.service, which enables developers to easily save, track and share visualization results of experiments. For detailed guidelines on each function, please refer to the [**VisualDL User Guide**](./docs/components/README.md). To try the latest features, please use our [**Online Demo**](https://www.paddlepaddle.org.cn/paddle/visualdl/demo). VisualDL is iterating rapidly, and new functions will be added continuously.

Browsers supported by VisualDL are:

@@ -392,6 +392,108 @@ Developers can compare multiple experiments by specifying and uploading the path
<img src="https://user-images.githubusercontent.com/28444161/119247155-e9c0c280-bbb9-11eb-8175-58a9c7657a9c.gif" width="85%"/>
</p>

### Model Visual
Model Visual presents the data distribution and key statistics of each layer of the model network through several complementary views. This makes it easier to judge whether the current network design is reasonable and to locate model anomalies quickly.

<p align="center">
<img src="https://user-images.githubusercontent.com/95737959/147730782-5e659c39-26f4-4766-b657-6e530f33cf47.gif" width="85%"/>
</p>

The steps to use this function are as follows:

#### 1. Random sampling of the network node data
Use a Paddle 1.8.5 build that supports random sampling of the network node data: http://gitlab.baidu.com/paddle-distributed/wheel/blob/master/release_1.8/paddle_whl_release_1.8.5_20210902.whl


```python
# Collect the names of the variables whose values should be dumped.
join_save_params = []
for param in join_model._train_program.list_vars():
    if param.persistable:
        # Persistable variables: skip names containing "_generat".
        if "_generat" not in param.name:
            join_save_params.append(param.name)
            # Also dump the gradients of fc and conv parameters.
            if "fc_" in param.name or "conv_" in param.name:
                join_save_params.append(param.name + "@GRAD")
    elif "RENAME" not in param.name:
        # Non-persistable variables: dump selected layer outputs.
        if "fc_" in param.name or "dropout_4.tmp_0" in param.name or "concat_" in param.name:
            join_save_params.append(param.name)
        if "sequence_pool_" in param.name and "tmp_1" not in param.name:
            join_save_params.append(param.name)

join_program._fleet_opt["dump_prob"] = 0.2  # sampling probability of the random dump
join_program._fleet_opt["dump_fields"] = ["slot1", "slot2"]
join_program._fleet_opt["dump_param"] = join_save_params
join_program._fleet_opt["dump_fields_path"] = config.output_path + "/random_dump/join/" + config.start_day + "/" + "delta-%s" % pass_index

# If there are multiple stages (such as join and update) and each stage needs to dump,
# set a separate dump_fields_path for each stage.
update_model._train_program._fleet_opt["dump_fields_path"] = "%s/random_dump/update/%s/%s" % (config.output_path, day.data_day, '_'.join(datas))
```
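
When several stages dump data, a small helper can keep the per-stage paths consistent. The following is a minimal sketch, not part of Paddle or VisualDL: `build_dump_path` and its arguments are hypothetical names, and the layout `<output_path>/random_dump/<stage>/<day>/<leaf>` is inferred from the two `dump_fields_path` values in the snippet above.

```python
import posixpath


def build_dump_path(output_path, stage, day, leaf):
    """Return <output_path>/random_dump/<stage>/<day>/<leaf>.

    Hypothetical helper: the layout mirrors the dump_fields_path values
    shown above and is not an API of Paddle or VisualDL.
    """
    if stage not in ("join", "update"):
        raise ValueError("unknown stage: %r" % stage)
    # posixpath always joins with forward slashes, regardless of the local OS.
    return posixpath.join(output_path, "random_dump", stage, str(day), str(leaf))


# Example values are placeholders, not paths from a real job.
join_path = build_dump_path("/data/output", "join", "20211125", "delta-1")
update_path = build_dump_path("/data/output", "update", "20211125", "delta-1")
```

Each stage then gets its own directory under `random_dump`, so dumps from different stages do not overwrite one another.
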
For a complete runnable demo, see https://github.com/TsLu/PaddleDemo/blob/main/randump/random_dump.py

The random dump data has the following format (sample file: https://github.com/TsLu/PaddleDemo/blob/main/randump/random_dump/join/20211125/delta-1/part-000-00009):

```text
#sample key \t neuron name:number of neuron nodes:output value of each neuron \t ...
dcefve concat_0.tmp_0:2:1:1 sequence_pool_6.tmp_0@GRAD:16:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0 sequence_pool_3.tmp_0@GRAD:12:0:0:0:0:0:0:0:0:0:0:0:0 sequence_pool_1.tmp_0@GRAD:12:0:0:0:0:0:0:0:0:0:0:0:0 concat_0.tmp_0:2:1:1 user_emb:140:0.0983945:-0.402422:-0.304479:0.48722:-0.423722:0.49905:-0.36198:-0.141344:0.164492:0.203659:-0.166241:0.371955:-0.338783:-0.39251:0.158664:-0.133492:0.200509:-0.23503:-0.149515:-0.247849:0.0900903:-0.250218:-0.29327:-0.449013:-0.289186:-0.296609:0.36998:0.309947:0.468418:0.0150231:0.178822:-0.0795117:-0.108979:0.494221:-0.442487:-0.286313:0.391469:-0.39494:-0.162585:-0.158422:-0.182274:0.431848:-0.268552:-0.28416:0.333334:0.360513:0.318403:-0.364475:0.439969:-0.246897:0.0332158:0.358267:-0.0748573:-0.435962:-0.302861:-0.388489:0.271488:0.0127385:-0.0989884:-0.271535:-0.254238:-0.33684:0.389732:-0.222312:-0.20576:-0.253779:-0.166874:-0.19071:0.25096:0.105208:0.487118:-0.334612:-0.0503092:-0.473779:0.193285:0.0745487:-0.45893:-0.024402:0.0913379:-0.0261859:-0.0188701:0.120137:0.116529:-0.0141518:-0.119165:-0.198176:-0.159524:-0.378288:-0.341906:-0.128065:0.166849:-0.0154788:-0.177214:-0.287362:-0.239857:-0.136312:0.107463:0.356079:0.278596:0.117707:-0.162731:-0.198466:-0.175281:-0.00143227:-0.13731:-0.074105:-0.123823:-0.0376647:-0.11276:-0.0496815:-0.172825:-0.429263:0.0284473:0.182517:0.26848:-0.215857:0.349042:-0.373334:-0.218745:-0.0499232:0.155349:-0.123708:0.478668:-0.214383:0.494542:0.0422934:-0.452487:-0.014959:-0.0854984:-0.094967:-0.150888:0.483285:-0.365631:-0.366048:-0.47845:-0.282711:0.25745:0.367952:0.388146:0.188527
```
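
To make the record layout concrete, here is a minimal parsing sketch. It assumes that fields are separated by tabs, as the format comment states, and that the number after each name equals the count of values that follow. `parse_dump_line` is a hypothetical helper, not part of VisualDL, and the record below is a shortened illustrative example rather than real dump data.

```python
def parse_dump_line(line):
    """Parse one random dump record into (sample_key, fields).

    fields is a list of (name, values) tuples. A list is used instead of
    a dict because the same name can appear more than once in a record.
    """
    columns = line.rstrip("\n").split("\t")
    sample_key, fields = columns[0], []
    for column in columns[1:]:
        if not column:
            continue  # tolerate a trailing tab
        name, count, *raw_values = column.split(":")
        values = [float(v) for v in raw_values]
        if len(values) != int(count):
            raise ValueError("%s: expected %s values, got %d" % (name, count, len(values)))
        fields.append((name, values))
    return sample_key, fields


# Shortened illustrative record: key, then two name:count:values fields.
record = "dcefve\tconcat_0.tmp_0:2:1:1\tsequence_pool_3.tmp_0@GRAD:3:0:0:0.5"
key, fields = parse_dump_line(record)
```

Checking the declared count against the number of parsed values catches truncated lines early, before they reach the analysis step.
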

#### 2. Process the sampled neuron data with Model Visual
Use the data processing interface to process the sampled neuron data:
```python
from visualdl.thirdparty.process_data import ModelAnalysis
params = {
    "hadoop_bin": "/home/work/hadoop/bin/hadoop",
    "ugi": "**",
    "debug_input": "afs://***/random_dump/join/20211015",
    # "debug_input": "/home/work/testuser/visualdl/data/random_dump/join/20211028",  # local dump data
    "delta_num": "8",
    "join_pbtxt": "/home/work/test_download/train/join_main_program.pbtxt",
    "update_pbtxt": "/home/work/test_download/train/update_main_program.pbtxt"
}
model_analysis = ModelAnalysis(params)
model_analysis()
```
Parameter details:

| parameter | meaning |
| --------------- | ------------------------------------------------------------ |
| hadoop_bin | Path to the hadoop executable. Required if the sampled data is stored on AFS; not needed for a local path. |
| ugi | The hadoop ugi. Required if the sampled data is stored on AFS; not needed for a local path. |
| debug_input | Storage path of the sampled network data; either an AFS path or a local path. |
| delta_num | Number of trained batches. |
| join_pbtxt | Network description of the trained join stage; local path only. |
| update_pbtxt | Network description of the trained update stage; local path only. Omit it if the model has no update stage. |
| data_dir | Local folder used to store the processed intermediate data. |
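
The rules in the table can be checked before the processing job is started. The sketch below is a hypothetical pre-flight check based only on the table above: `check_params` is not part of VisualDL, and treating `debug_input`, `delta_num` and `join_pbtxt` as required is an assumption. `ModelAnalysis` itself may validate its input differently.

```python
def check_params(params):
    """Return a list of problems found in a ModelAnalysis params dict.

    Hypothetical helper: the rules come from the parameter table, and the
    set of required keys is an assumption.
    """
    problems = []
    for key in ("debug_input", "delta_num", "join_pbtxt"):
        if not params.get(key):
            problems.append("missing required parameter: %s" % key)
    # hadoop_bin and ugi are only needed when the dump is stored on AFS.
    if str(params.get("debug_input", "")).startswith("afs://"):
        for key in ("hadoop_bin", "ugi"):
            if not params.get(key):
                problems.append("%s is required for AFS input" % key)
    # The network description files must be local paths.
    for key in ("join_pbtxt", "update_pbtxt"):
        if "://" in str(params.get(key, "")):
            problems.append("%s must be a local path" % key)
    return problems


# Local dump data: no hadoop settings are needed.
local_problems = check_params({
    "debug_input": "/home/work/testuser/visualdl/data/random_dump/join/20211028",
    "delta_num": "8",
    "join_pbtxt": "/home/work/test_download/train/join_main_program.pbtxt",
})
```

An empty list means the dict satisfies the rules listed in the table; any other result names the parameter that needs attention.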


#### 3. View the network node data in VisualDL
##### Launch the VisualDL panel from the command line
```shell
visualdl --logdir <dir_1, dir_2, ... , dir_n> --data_dir <data_dir> --host <host> --port <port> --cache-timeout <cache_timeout> --language <language> --public-path <public_path> --api-only
```
Parameter details:

| parameter | meaning |
| --------------- | ------------------------------------------------------------ |
| --logdir | Set one or more log directories. All logs in these paths and their subdirectories are displayed independently on the VisualDL Board. |
| --data_dir | Local folder used to store the processed intermediate data; the same as in step 2. |
| --host | Specify the IP address. The default value is 127.0.0.1. Set it to 0.0.0.0 or a public IP address so that other machines can visit the VisualDL Board. |
| --port | Set the port. The default value is 8040. |
| --cache-timeout | Cache time of the backend. If the front end requests the same URL several times within the cache time, the returned data comes from the cache. The default cache time is 20 seconds. |
| --language | The language of the VisualDL panel. It can be set to 'en' or 'zh'; the default is the language used by the browser. |
| --public-path | The URL path of the VisualDL panel. The default path is '/app', meaning that the access address is 'http://<host>:<port>/app'. |
| --api-only | Decide whether to provide only the API. If this parameter is set, VisualDL provides only the API service without displaying the web page, and the API address is 'http://<host>:<port>/<public_path>/api'. If the public_path parameter is not specified, the default address is 'http://<host>:<port>/api'. |
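
The address rules for `--public-path` and `--api-only` can be summarized in a few lines. This is an illustrative sketch derived only from the table above: `visualdl_url` is a hypothetical helper, not a VisualDL function, and it does not start or query a server.

```python
def visualdl_url(host="127.0.0.1", port=8040, public_path=None, api_only=False):
    """Build the access address described in the parameter table."""
    if api_only:
        # API-only mode: <public_path>/api, or just /api when no path is set.
        prefix = (public_path or "").strip("/")
        path = "/%s/api" % prefix if prefix else "/api"
    else:
        # Web page mode: the public path defaults to /app.
        path = "/" + (public_path or "/app").strip("/")
    return "http://%s:%d%s" % (host, port, path)


panel_url = visualdl_url()            # default web page address
api_url = visualdl_url(api_only=True)  # default API-only address
```

With the defaults, the panel is served under `/app` and the API-only service under `/api`, which matches the two default addresses given in the table.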

##### Launch in Python Script
Developers can start the VisualDL panel from a Python script as follows:

```python
import visualdl.server.app

logdir = "./log"  # example log directory; replace with your own

visualdl.server.app.run(logdir,
                        data_dir="datapath",
                        host="127.0.0.1",
                        port=8080,
                        cache_timeout=20,
                        language=None,
                        public_path=None,
                        api_only=False,
                        open_browser=False)
```

### VDL.service

@@ -401,14 +503,13 @@ Developers can compare multiple experiments by specifying and uploading the path
<img src=https://user-images.githubusercontent.com/48054808/93731055-fbeafb00-fbfd-11ea-80f4-bbfd08a0fc35.png width="85%"/>
</p>


## Frequently Asked Questions

If you are confronted with some problems when using VisualDL, please refer to [our FAQs](./docs/faq.md).

## Contribution

VisualDL, in which Graph is powered by [Netron](https://github.com/lutzroeder/netron), is an open source project supported by [PaddlePaddle](https://www.paddlepaddle.org/) and [ECharts](https://echarts.apache.org/).
VisualDL, in which Graph is powered by [Netron](https://github.com/lutzroeder/netron) and [AntV G6](https://github.com/antvis/G6), is an open source project supported by [PaddlePaddle](https://www.paddlepaddle.org/) and [ECharts](https://echarts.apache.org/).

Developers are warmly welcomed to use, comment and contribute.
