Docs - Cherry Pick #116

Merged
merged 1 commit into from
Mar 18, 2024
2 changes: 1 addition & 1 deletion LICENSE.txt
@@ -1,6 +1,6 @@
MIT License

-Copyright (c) 2022 - 2023 Advanced Micro Devices, Inc. All rights reserved.
+Copyright (c) 2022 - 2024 Advanced Micro Devices, Inc. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
15 changes: 15 additions & 0 deletions docs/examples.rst
@@ -0,0 +1,15 @@
.. meta::
:description: rocAL documentation and API reference library
:keywords: rocAL, ROCm, API, documentation

.. _examples:

********************************************************************
Examples
********************************************************************

Use the links below to see more examples:

* `Image Processing <https://github.com/ROCm/rocAL/tree/master/docs/examples/image_processing>`_
* `PyTorch <https://github.com/ROCm/rocAL/tree/master/docs/examples/pytorch>`_

31 changes: 31 additions & 0 deletions docs/how-to/architecture.rst
@@ -0,0 +1,31 @@
.. meta::
:description: rocAL documentation and API reference library
:keywords: rocAL, ROCm, API, documentation

.. _architecture:

********************************************************************
Architecture Components
********************************************************************

The rocAL architecture has two major components: the rocAL Master-Graph and the ROCm Performance Primitives (RPP) library.

rocAL Master-Graph
===================

The rocAL pipeline is built on top of rocAL Master-Graph. The architectural components of rocAL Master-Graph are described below:

**Loader and Processing Modules:** The rocAL Master-Graph consists of two main architectural components: a loader module that loads data and a processing module that processes it. The two modules are decoupled so that loading does not block processing. A prefetch queue loads data ahead of time and can be configured with user-defined parameters. The output routine runs in parallel with the load routine because each has its own queue for storing results.

.. figure:: ../data/ch2_arch.png

   rocAL Master-Graph Architecture
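
The decoupling of loading and processing through a bounded prefetch queue can be illustrated with a minimal plain-Python sketch. This is not rocAL code: ``PREFETCH_DEPTH``, ``loader``, ``processor``, and ``run`` are hypothetical names chosen for illustration, and doubling each sample stands in for a real augmentation.

```python
import queue
import threading

PREFETCH_DEPTH = 2  # hypothetical stand-in for a user-configured prefetch depth
_DONE = object()    # sentinel marking the end of the data stream


def loader(samples, prefetch):
    """Load stage: push raw samples into the bounded prefetch queue."""
    for sample in samples:
        prefetch.put(sample)  # blocks only while the queue is full
    prefetch.put(_DONE)


def processor(prefetch, results):
    """Processing stage: consume loaded samples and store the processed output."""
    while True:
        sample = prefetch.get()
        if sample is _DONE:
            break
        results.append(sample * 2)  # placeholder for a real augmentation


def run(samples):
    """Run the load and processing stages concurrently and return the results."""
    prefetch = queue.Queue(maxsize=PREFETCH_DEPTH)
    results = []
    load_thread = threading.Thread(target=loader, args=(samples, prefetch))
    proc_thread = threading.Thread(target=processor, args=(prefetch, results))
    load_thread.start()
    proc_thread.start()
    load_thread.join()
    proc_thread.join()
    return results


processed = run([1, 2, 3, 4])
print(processed)
```

Because the queue is bounded, the loader can run at most ``PREFETCH_DEPTH`` samples ahead of the processor, which keeps memory use predictable while still overlapping the two stages.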

**rocAL Pipeline:** The rocAL pipeline contains all the information required to create a rocAL graph: the data loader, the augmentation nodes, and the output format. Once a rocAL pipeline is created, the user can build it, run it, and call an iterator to get the next batch of data. The rocAL pipeline is available through the rocAL Python package and supports many operators for data loading and data augmentation.
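
The create, build, and iterate lifecycle described above can be sketched with a toy class. ``ToyPipeline`` and its methods are hypothetical and exist only to show the pattern; they are not the rocAL API, which is shown in the framework integration section.

```python
class ToyPipeline:
    """Illustrative stand-in for a data pipeline; not the rocAL API."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.source = None   # the "data loader"
        self.ops = []        # the "augmentation nodes"
        self.built = False

    def set_source(self, items):
        self.source = list(items)
        return self

    def add_op(self, fn):
        self.ops.append(fn)
        return self

    def build(self):
        """Validate the description before any data is produced."""
        if self.source is None:
            raise ValueError("a data source is required before build()")
        self.built = True
        return self

    def __iter__(self):
        """Yield one processed batch at a time."""
        if not self.built:
            raise RuntimeError("call build() before iterating")
        for start in range(0, len(self.source), self.batch_size):
            batch = self.source[start:start + self.batch_size]
            for op in self.ops:
                batch = [op(x) for x in batch]
            yield batch


pipe = ToyPipeline(batch_size=2).set_source(range(5)).add_op(lambda x: x + 1)
pipe.build()
batches = list(pipe)
print(batches)
```

Separating the description (source and operators) from ``build()`` lets configuration errors surface once, before the training loop starts pulling batches.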

ROCm Performance Primitives (RPP) Library
==========================================

RPP is a comprehensive, high-performance computer vision library optimized for AMD CPUs and GPUs, with HIP and OpenCL backends. It is available under the AMD ROCm software platform. It provides low-level functionality for all rocAL operators for single, image, and tensor datatypes. RPP provides an extensive library for vision augmentations that includes vision functions, color augmentations, filter augmentations, geometric distortions, and a few more features.

For more information on RPP along with the list of supported kernels, see `ROCm Performance Primitives <https://github.com/ROCm/rpp>`_.
234 changes: 234 additions & 0 deletions docs/how-to/framework.rst
@@ -0,0 +1,234 @@
.. meta::
:description: rocAL documentation and API reference library
:keywords: rocAL, ROCm, API, documentation

.. _framework:

********************************************************************
ML Framework Integration
********************************************************************

rocAL improves pipeline efficiency by preprocessing the data and parallelizing data loading on the CPU while training runs on the GPU. To separate data loading from training, rocAL provides TensorFlow and PyTorch iterators and readers as plugins. The integration process with PyTorch and TensorFlow is described in the sections below.

.. _pytorch:

PyTorch Integration
===========================

This section demonstrates how to use rocAL with PyTorch for training. Follow the steps below to get started.

Build PyTorch Docker
--------------------------------

Build a rocAL PyTorch docker by following the steps here.

Create Data-loading Pipeline
----------------------------------------

Follow these steps:

1. Import libraries for `rocAL <https://github.com/ROCm/rocAL/blob/master/docs/examples/pytorch/test_training.py#L28>`_.

   .. code-block:: python
      :caption: Import libraries

      from amd.rocal.plugin.pytorch import ROCALClassificationIterator
      from amd.rocal.pipeline import Pipeline
      import amd.rocal.fn as fn
      import amd.rocal.types as types


2. The rocAL pipeline for PyTorch is shown below. It reads data from the dataset using the file reader (``fn.readers.file``) and uses ``fn.decoders.image_slice`` to decode the raw images. The other required augmentation operations are also defined in the `pipeline <https://github.com/ROCm/rocAL/blob/master/docs/examples/pytorch/test_training.py#L38>`_.

   .. code-block:: python
      :caption: Pipeline for PyTorch

      def trainPipeline(data_path, batch_size, num_classes, one_hot, local_rank, world_size, num_thread, crop, rocal_cpu, fp16):
          pipe = Pipeline(batch_size=batch_size, num_threads=num_thread, device_id=local_rank, seed=local_rank+10,
                          rocal_cpu=rocal_cpu, tensor_dtype=types.FLOAT16 if fp16 else types.FLOAT, tensor_layout=types.NCHW,
                          prefetch_queue_depth=7)
          with pipe:
              jpegs, labels = fn.readers.file(file_root=data_path, shard_id=local_rank, num_shards=world_size, random_shuffle=True)
              rocal_device = 'cpu' if rocal_cpu else 'gpu'
              # decode = fn.decoders.image(jpegs, output_type=types.RGB,file_root=data_path, shard_id=local_rank, num_shards=world_size, random_shuffle=True)
              decode = fn.decoders.image_slice(jpegs, output_type=types.RGB,
                                               file_root=data_path, shard_id=local_rank, num_shards=world_size, random_shuffle=True)
              res = fn.resize(decode, resize_x=224, resize_y=224)
              flip_coin = fn.random.coin_flip(probability=0.5)
              cmnp = fn.crop_mirror_normalize(res, device="gpu",
                                              output_dtype=types.FLOAT,
                                              output_layout=types.NCHW,
                                              crop=(crop, crop),
                                              mirror=flip_coin,
                                              image_type=types.RGB,
                                              mean=[0.485,0.456,0.406],
                                              std=[0.229,0.224,0.225])
              if one_hot:
                  _ = fn.one_hot(labels, num_classes)
              pipe.set_outputs(cmnp)
          print('rocal "{0}" variant'.format(rocal_device))
          return pipe


3. Import libraries for PyTorch.

   .. code-block:: python
      :caption: Import libraries for PyTorch

      import torch.nn as nn
      import torch.nn.functional as F
      import torch.optim as optim


4. Call the training pipeline with rocAL classification data `loader <https://github.com/ROCm/rocAL/blob/master/docs/examples/pytorch/test_training.py#L78>`_.

   .. code-block:: python
      :caption: Call the training pipeline

      def get_pytorch_train_loader(self):
          print("in get_pytorch_train_loader function")
          pipe_train = trainPipeline(self.data_path, self.batch_size, self.num_classes, self.one_hot, self.local_rank,
                                     self.world_size, self.num_thread, self.crop, self.rocal_cpu, self.fp16)
          pipe_train.build()
          train_loader = ROCALClassificationIterator(pipe_train, device="cpu" if self.rocal_cpu else "cuda", device_id=self.local_rank)
          # Return the iterator so the training loop can consume it
          return train_loader


5. Run the `training script <https://github.com/ROCm/rocAL/blob/master/docs/examples/pytorch/test_training.py#L179>`_.

   .. code-block:: python
      :caption: Run the training pipeline

      import sys

      # Training loop
      # `train_loader` comes from step 4; `device` is assumed to be the
      # torch device selected earlier in the full script.
      for epoch in range(10):  # loop over the dataset multiple times
          print("\n epoch:: ", epoch)
          running_loss = 0.0

          for i, (inputs, labels) in enumerate(train_loader, 0):

              sys.stdout.write("\r Mini-batch " + str(i))
              # print("Images", inputs)
              # print("Labels", labels)
              inputs, labels = inputs.to(device), labels.to(device)


6. To see and run a sample training script, refer to `rocAL PyTorch example <https://github.com/ROCm/rocAL/tree/master/docs/examples/pytorch>`_.
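
Step 2 passes ``shard_id=local_rank`` and ``num_shards=world_size`` to the reader so that each process loads a different part of the dataset. The plain-Python sketch below illustrates the idea under the assumption of a simple strided split. ``shard_files`` is a hypothetical helper, and the partitioning that rocAL actually applies may differ.

```python
def shard_files(files, shard_id, num_shards):
    """Return the files assigned to one shard, using a strided split (assumed scheme)."""
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    return files[shard_id::num_shards]


files = [f"img_{i}.jpg" for i in range(7)]
world_size = 3

# One shard per rank: together the shards cover every file exactly once.
shards = [shard_files(files, rank, world_size) for rank in range(world_size)]
for rank, shard in enumerate(shards):
    print(rank, shard)
```

Whatever the exact scheme, the property that matters for distributed training is that the shards are disjoint and jointly cover the dataset, so no sample is seen twice within an epoch.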

.. _tensorflow:

TensorFlow Integration
===============================

This section demonstrates how to use rocAL with TensorFlow for training. Follow the steps below to get started.

Build TensorFlow Docker
--------------------------------------

Build a rocAL TensorFlow docker by following the steps here.

Create Data-loading Pipeline
----------------------------------------

Follow these steps:

1. Import libraries for `rocAL_pybind <https://github.com/ROCm/rocAL/blob/master/rocAL_pybind/examples/tf_petsTrainingExample/train_withROCAL_withTFRecordReader.py#L22>`_.

   .. code-block:: python
      :caption: Import libraries

      from amd.rocal.plugin.tf import ROCALIterator
      from amd.rocal.pipeline import Pipeline
      import amd.rocal.fn as fn
      import amd.rocal.types as types


2. See a rocAL pipeline for TensorFlow below. It reads data from the TFRecords using TFRecord Reader and uses ``fn.decoders.image`` to decode the raw `images <https://github.com/ROCm/rocAL/blob/master/rocAL_pybind/examples/tf_petsTrainingExample/train_withROCAL_withTFRecordReader.py#L128>`_.

   .. code-block:: python
      :caption: Pipeline for TensorFlow

      trainPipe = Pipeline(batch_size=TRAIN_BATCH_SIZE, num_threads=1, rocal_cpu=RUN_ON_HOST, tensor_layout=types.NHWC)
      with trainPipe:
          inputs = fn.readers.tfrecord(path=TRAIN_RECORDS_DIR, index_path="", reader_type=TFRecordReaderType, user_feature_key_map=featureKeyMap,
                                       features={
                                           'image/encoded': tf.io.FixedLenFeature((), tf.string, ""),
                                           'image/class/label': tf.io.FixedLenFeature([1], tf.int64, -1),
                                           'image/filename': tf.io.FixedLenFeature((), tf.string, "")
                                       }
                                       )
          jpegs = inputs["image/encoded"]
          images = fn.decoders.image(jpegs, user_feature_key_map=featureKeyMap, output_type=types.RGB, path=TRAIN_RECORDS_DIR)
          resized = fn.resize(images, resize_x=crop_size[0], resize_y=crop_size[1])
          flip_coin = fn.random.coin_flip(probability=0.5)
          cmn_images = fn.crop_mirror_normalize(resized, crop=(crop_size[1], crop_size[0]),
                                                mean=[0,0,0],
                                                std=[255,255,255],
                                                mirror=flip_coin,
                                                output_dtype=types.FLOAT,
                                                output_layout=types.NHWC,
                                                pad_output=False)
          trainPipe.set_outputs(cmn_images)
      trainPipe.build()


3. Import libraries for `TensorFlow <https://github.com/ROCm/rocAL/blob/master/rocAL_pybind/examples/tf_petsTrainingExample/train_withROCAL_withTFRecordReader.py#L174>`_.

   .. code-block:: python
      :caption: Import libraries for TensorFlow

      import tensorflow.compat.v1 as tf
      tf.compat.v1.disable_v2_behavior()
      import tensorflow_hub as hub

      # Call the train pipeline
      trainIterator = ROCALIterator(trainPipe)

      # Run the training session
      i = 0
      with tf.compat.v1.Session(graph=train_graph) as sess:
          sess.run(tf.compat.v1.global_variables_initializer())
          while i < NUM_TRAIN_STEPS:
              for t, (train_image_ndArray, train_label_ndArray) in enumerate(trainIterator, 0):
                  train_label_one_hot_list = get_label_one_hot(train_label_ndArray)


4. To see and run a sample training script, refer to `rocAL TensorFlow example <https://github.com/ROCm/MIVisionX/tree/master/rocAL/rocAL_pybind/examples/tf_petsTrainingExample>`_.
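
The TensorFlow pipeline in step 2 passes ``mean=[0,0,0]`` and ``std=[255,255,255]`` to ``fn.crop_mirror_normalize``. Assuming the operator applies the usual per-channel normalization ``(value - mean) / std``, these settings rescale 8-bit pixel values to the range 0 to 1. The sketch below works this out in plain Python; ``normalize_pixel`` is a hypothetical helper, not part of rocAL.

```python
def normalize_pixel(pixel, mean, std):
    """Apply per-channel (value - mean) / std to one RGB pixel (assumed formula)."""
    return [(c - m) / s for c, m, s in zip(pixel, mean, std)]


pixel = [0, 51, 255]  # one RGB pixel with 8-bit channel values
scaled = normalize_pixel(pixel, mean=[0, 0, 0], std=[255, 255, 255])
print(scaled)
```

With a zero mean, the operation reduces to dividing each channel by 255, so the minimum and maximum 8-bit values map to 0 and 1 respectively.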


.. _ml-perf:

Run MLPerf ResNet50 classification training with rocAL
=======================================================

#. Ensure you have downloaded the ``ILSVRC2012_img_val.tar`` (6.3 GB) and ``ILSVRC2012_img_train.tar`` (138 GB) files and extracted them into the ``val`` and ``train`` folders, respectively.
#. Build the `MIVisionX PyTorch docker <https://github.com/ROCm/rocAL/blob/master/docker/README.md>`_.

   * Run the docker image:

     .. code-block:: shell

        sudo docker run -it -v <Path-To-Data-HostSystem>:/data -v /<Path-to-GitRepo>:/dockerx -w /dockerx --privileged --device=/dev/kfd --device=/dev/dri --group-add video --shm-size=4g --ipc="host" --network=host <docker-name>

     .. note::
        Refer to the `docker <https://github.com/ROCm/MIVisionX#docker>`_ page for prerequisites and information on building the docker image.

   Optional: Map a localhost directory on the docker image

   * Use this option to make the localhost directory that holds the ImageNet dataset folder accessible on the docker image.
   * Usage: ``-v {LOCAL_HOST_DIRECTORY_PATH}:{DOCKER_DIRECTORY_PATH}``

#. Install the rocAL ``python_pybind`` plugin as described above.
#. Clone the `MLPerf <https://github.com/rrawther/MLPerf-mGPU>`_ repo and check out the ``mlperf-v1.1-rocal`` branch:

   .. code-block:: shell

      git clone -b mlperf-v1.1-rocal https://github.com/rrawther/MLPerf-mGPU

#. Modify ``RN50_AMP_LARS_8GPUS_NCHW.sh`` or ``RN50_AMP_LARS_8GPUS_NHWC.sh`` to reflect the correct path to the ImageNet directory.
#. Run the appropriate script:

   .. code-block:: shell

      ./RN50_AMP_LARS_8GPUS_NCHW.sh
      # or
      ./RN50_AMP_LARS_8GPUS_NHWC.sh
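
Before launching a long training run, it can help to confirm that the dataset directory mapped into the container has the expected ``train`` and ``val`` subfolders. The following is a minimal pre-flight sketch in plain Python. ``missing_dataset_dirs`` is a hypothetical helper that is not part of rocAL or the MLPerf scripts, and the demonstration uses a temporary directory rather than a real ImageNet path.

```python
import tempfile
from pathlib import Path


def missing_dataset_dirs(root, required=("train", "val")):
    """Return the names of required subfolders that are absent under `root`."""
    root = Path(root)
    return [name for name in required if not (root / name).is_dir()]


# Demonstration on a temporary directory that only contains `train`.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "train").mkdir()
    missing = missing_dataset_dirs(tmp)

print(missing)
```

In practice the function would be called with the dataset path configured in the training script, and the run would be aborted if the returned list is not empty.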

18 changes: 18 additions & 0 deletions docs/how-to/index.rst
@@ -0,0 +1,18 @@
.. meta::
:description: rocAL documentation and API reference library
:keywords: rocAL, ROCm, API, documentation

.. _how-to:

********************************************************************
How to
********************************************************************

This section provides guides on how to use the rocAL library and its
different utilities.

* :ref:`overview`
* :ref:`architecture`
* :ref:`using-with-cpp`
* :ref:`using-with-python`
* :ref:`framework`