add accelerate (#36)
add accelerate
wangshuai09 authored Jul 16, 2024
1 parent c2fa3e4 commit e93616c
Showing 4 changed files with 111 additions and 5 deletions.
11 changes: 6 additions & 5 deletions index.rst
@@ -20,6 +20,7 @@

sources/pytorch/index.rst
sources/llamafactory/index.rst
+ sources/accelerate/index.rst
sources/transformers/index.rst

.. warning::
@@ -136,7 +137,7 @@
</div>
<div class="flex-grow"></div>
<div class="flex space-x-4 text-blue-600">
- <a href="#">Official link</a>
+ <a href="https://github.com/AUTOMATIC1111/stable-diffusion-webui">Official link</a>
<span class="split">|</span>
<a href="#">Installation guide</a>
<span class="split">|</span>
@@ -186,16 +187,16 @@
<div class="img w-16 h-16 rounded-md mr-4" style="background-image: url('_static/images/huggingface.png')"></div>
<div>
<h2 class="text-lg font-semibold">Accelerate</h2>
- <p class="text-gray-600 desc">Toolchain for diffusion models such as image and audio generation</p>
+ <p class="text-gray-600 desc">Multi-GPU training toolchain for PyTorch</p>
</div>
</div>
<div class="flex-grow"></div>
<div class="flex space-x-4 text-blue-600">
- <a href="#">Official link</a>
+ <a href="https://github.com/huggingface/accelerate">Official link</a>
<span class="split">|</span>
- <a href="#">Installation guide</a>
+ <a href="sources/accelerate/install.html">Installation guide</a>
<span class="split">|</span>
- <a href="#">Quick start</a>
+ <a href="sources/accelerate/quick_start.html">Quick start</a>
</div>
</div>
</div>
8 changes: 8 additions & 0 deletions sources/accelerate/index.rst
@@ -0,0 +1,8 @@
Accelerate
==============

.. toctree::
    :maxdepth: 2

    install.rst
    quick_start.rst
28 changes: 28 additions & 0 deletions sources/accelerate/install.rst
@@ -0,0 +1,28 @@
Installation Guide
==================

This tutorial is for developers using Accelerate on Ascend and walks through installing Accelerate in an Ascend environment.

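Downloading and Installing Accelerate
-------------------------------------

.. note::

    Before reading this page, make sure the Ascend environment is ready by following the :doc:`installation tutorial <./install>`.
    Alternatively, use an image that already contains the Ascend environment, `cosdt/cann:8.0.rc1-910b-ubuntu22.04 <https://hub.docker.com/layers/cosdt/cann/8.0.rc1-910b-ubuntu22.04/images/sha256-29ef8aacf6b2babd292f06f00b9190c212e7c79a947411e213135e4d41a178a9?context=explore>`_;
    other versions are available at `cosdt/cann <https://hub.docker.com/r/cosdt/cann/tags>`_.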

Start the Container
:::::::::::::::::::

.. code-block:: shell

    docker run -itd --network host -v /usr/local/dcmi:/usr/local/dcmi -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi -v /usr/local/Ascend/driver:/usr/local/Ascend/driver -v /etc/ascend_install.info:/etc/ascend_install.info --device /dev/davinci7 --device /dev/davinci_manager --device /dev/devmm_svm --device /dev/hisi_hdc --shm-size 16G --name accelerate cosdt/cann:8.0.rc1-910b-ubuntu22.04 bash
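
The command above exposes a single NPU, ``/dev/davinci7``, to the container. Which device nodes exist depends on the host, so it can help to list them before choosing the ``--device`` flags. The following is a minimal sketch, assuming NPU device nodes follow the ``/dev/davinciN`` naming seen in that command; ``list_npu_devices`` is a hypothetical helper, not part of any Ascend tool.

```python
import glob
import re


def list_npu_devices(dev_dir="/dev"):
    """Return device nodes named davinci<N> under dev_dir, sorted by index.

    Assumption: NPU nodes are named davinci0, davinci1, ... as in the
    docker command above.
    """
    pattern = re.compile(r"davinci(\d+)$")
    found = []
    for path in glob.glob(f"{dev_dir}/davinci*"):
        match = pattern.search(path)
        if match:  # skips davinci_manager, which has no numeric suffix
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


if __name__ == "__main__":
    devices = list_npu_devices()
    if devices:
        # Print ready-to-paste --device flags for docker run
        print(" ".join(f"--device {path}" for path in devices))
    else:
        print("No /dev/davinciN device nodes found on this host.")
```

Sorting by the numeric suffix rather than by string keeps ``davinci10`` after ``davinci2``.
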

Install Accelerate and Its Dependencies
:::::::::::::::::::::::::::::::::::::::

.. code-block:: shell

    pip install torch==2.2.0 torch_npu==2.2.0 accelerate -i https://pypi.tuna.tsinghua.edu.cn/simple
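
After the install finishes, it can be worth confirming that the interpreter can locate the three packages. The following is a minimal sketch using only the Python standard library; ``check_packages`` is a hypothetical helper name, and it only checks that each package can be found, not that an NPU is actually usable.

```python
import importlib.util


def check_packages(names):
    """Map each top-level package name to True if the interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Package names taken from the pip command above
    status = check_packages(["torch", "torch_npu", "accelerate"])
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

``find_spec`` locates a top-level package without importing it, so the check stays fast and has no side effects.
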
69 changes: 69 additions & 0 deletions sources/accelerate/quick_start.rst
@@ -0,0 +1,69 @@
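Quick Start
============

.. note::

    Before reading this page, make sure the Ascend environment and Accelerate are ready by following the :doc:`installation guide <./install>`.

This tutorial uses a simple NLP model as an example to show how to train a model on Ascend NPUs with Accelerate.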

Prerequisites
-------------

This page uses other libraries from the Hugging Face toolchain as well as scikit-learn. Install them with the following command:

.. code-block:: shell

    pip install datasets evaluate transformers scikit-learn -i https://pypi.tuna.tsinghua.edu.cn/simple

The sample code used here is the official Accelerate example and must be downloaded beforehand:

.. code-block:: shell

    git clone https://github.com/huggingface/accelerate.git
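
Model Training
--------------

.. code-block:: shell
    :linenos:

    # Point HF_ENDPOINT at a mirror so that users in mainland China can download datasets and models more easily
    export HF_ENDPOINT=https://hf-mirror.com
    # Enter the examples directory
    cd accelerate/examples
    # Train the model
    python nlp_example.py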

Logs like the following indicate that training succeeded:

::

    Downloading builder script: 5.75kB [00:01, 3.69kB/s]
    tokenizer_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████| 49.0/49.0 [00:00<00:00, 237kB/s]
    config.json: 570B [00:00, 2.23MB/s]
    vocab.txt: 79.5kB [00:12, 3.45kB/s]Error while downloading from https://hf-mirror.com/bert-base-cased/resolve/main/vocab.txt: HTTPSConnectionPool(host='hf-mirror.com', port=443): Read timed out.
    Trying to resume download...
    vocab.txt: 213kB [00:07, 15.5kB/s]]
    vocab.txt: 91.4kB [00:32, 2.81kB/s]
    tokenizer.json: 436kB [00:19, 22.8kB/s]
    Downloading readme: 35.3kB [00:01, 26.4kB/s]
    Downloading data: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 649k/649k [00:02<00:00, 288kB/s]
    Downloading data: 100%|██████████████████████████████████████████████████████████████████████████████████████████| 75.7k/75.7k [00:00<00:00, 77.8kB/s]
    Downloading data: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 308k/308k [00:01<00:00, 204kB/s]
    Generating train split: 100%|███████████████████████████████████████████████████████████████████████████| 3668/3668 [00:00<00:00, 27701.23 examples/s]
    Generating validation split: 100%|████████████████████████████████████████████████████████████████████████| 408/408 [00:00<00:00, 73426.42 examples/s]
    Generating test split: 100%|███████████████████████████████████████████████████████████████████████████| 1725/1725 [00:00<00:00, 246370.91 examples/s]
    Map: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 3668/3668 [00:01<00:00, 3378.05 examples/s]
    Map: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 408/408 [00:00<00:00, 3553.72 examples/s]
    Map: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 1725/1725 [00:00<00:00, 5109.03 examples/s]
    model.safetensors: 100%|███████████████████████████████████████████████████████████████████████████████████████████| 436M/436M [02:42<00:00, 2.68MB/s]
    Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.bias', 'classifier.weight']
    You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
    huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
    To disable this warning, you can either:
    - Avoid using `tokenizers` before the fork if possible
    - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
    You're using a BertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
    epoch 0: {'accuracy': 0.8014705882352942, 'f1': 0.8439306358381503}
    epoch 1: {'accuracy': 0.8578431372549019, 'f1': 0.8975265017667845}
    epoch 2: {'accuracy': 0.8700980392156863, 'f1': 0.9087779690189329}
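
The final ``epoch N: {...}`` lines carry the per-epoch accuracy and F1 score, and they are the part of the log worth checking once the download noise has scrolled past. The following is a minimal sketch for pulling those metrics out of a saved log, assuming the lines keep the exact format shown above; ``parse_epoch_metrics`` is a hypothetical helper and not part of Accelerate.

```python
import ast
import re

# Matches lines such as: epoch 0: {'accuracy': 0.80, 'f1': 0.84}
EPOCH_LINE = re.compile(r"^epoch (\d+): (\{.*\})$")


def parse_epoch_metrics(log_text):
    """Map epoch index to its metrics dict for every 'epoch N: {...}' line."""
    metrics = {}
    for line in log_text.splitlines():
        match = EPOCH_LINE.match(line.strip())
        if match:
            # literal_eval safely parses the printed Python dict
            metrics[int(match.group(1))] = ast.literal_eval(match.group(2))
    return metrics


if __name__ == "__main__":
    sample = (
        "epoch 0: {'accuracy': 0.8014705882352942, 'f1': 0.8439306358381503}\n"
        "epoch 1: {'accuracy': 0.8578431372549019, 'f1': 0.8975265017667845}\n"
        "epoch 2: {'accuracy': 0.8700980392156863, 'f1': 0.9087779690189329}\n"
    )
    for epoch, values in sorted(parse_epoch_metrics(sample).items()):
        print(epoch, values["accuracy"], values["f1"])
```

Lines that do not match the pattern, such as progress bars and warnings, are ignored.
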
