Chapter 5: Framework Integration

rocAL improves pipeline efficiency by preprocessing data and parallelizing data loading on the CPU while training runs on the GPU. To separate data loading from training, rocAL provides TensorFlow and PyTorch iterators and readers as plugins. The sections below describe the integration process for PyTorch and TensorFlow.
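
The idea of decoupling data loading from training can be illustrated without rocAL. The following is a minimal sketch, not rocAL's implementation: a background thread fills a bounded queue with preprocessed batches while the consumer, standing in for the training step, drains it. The names prefetching_loader, load_batch, and queue_depth are hypothetical.

```python
import queue
import threading


def prefetching_loader(load_batch, num_batches, queue_depth=2):
    """Yield batches produced by a background thread.

    load_batch(i) is a hypothetical callable that reads and preprocesses
    batch i; queue_depth bounds how many batches are held ready in advance.
    """
    batches = queue.Queue(maxsize=queue_depth)
    done = object()  # sentinel marking the end of the stream

    def producer():
        for i in range(num_batches):
            batches.put(load_batch(i))  # blocks while the queue is full
        batches.put(done)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        batch = batches.get()  # blocks until the producer has a batch ready
        if batch is done:
            return
        yield batch


# Stand-in for decoding and augmenting one batch of four samples.
def fake_load(i):
    return [i * 4 + j for j in range(4)]


loaded = list(prefetching_loader(fake_load, num_batches=3))
```

Because the queue is bounded, the producer stays at most queue_depth batches ahead of the consumer, which keeps memory use predictable.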

5.1 PyTorch Integration

This section demonstrates how to use rocAL with PyTorch for training. Follow the steps below to get started.

5.1.1 Build PyTorch Docker

Build a rocAL PyTorch Docker image by following the steps here.

5.1.2 Create Data-loading Pipeline

Follow these steps:

  1. Import libraries for rocAL.
from amd.rocal.plugin.pytorch import ROCALClassificationIterator
from amd.rocal.pipeline import Pipeline
import amd.rocal.fn as fn
import amd.rocal.types as types
  1. Define the rocAL pipeline for PyTorch. The pipeline below reads data from the dataset with the file reader (fn.readers.file) and decodes the raw images with fn.decoders.image_slice. The other required augmentation operations are also defined in the pipeline.
def trainPipeline(data_path, batch_size, num_classes, one_hot, local_rank, world_size, num_thread, crop, rocal_cpu, fp16):
    pipe = Pipeline(batch_size=batch_size, num_threads=num_thread, device_id=local_rank, seed=local_rank+10, 
                rocal_cpu=rocal_cpu, tensor_dtype = types.FLOAT16 if fp16 else types.FLOAT, tensor_layout=types.NCHW, 
                prefetch_queue_depth = 7)
    with pipe:
        jpegs, labels = fn.readers.file(file_root=data_path, shard_id=local_rank, num_shards=world_size, random_shuffle=True)
        rocal_device = 'cpu' if rocal_cpu else 'gpu'
        # decode = fn.decoders.image(jpegs, output_type=types.RGB,file_root=data_path, shard_id=local_rank, num_shards=world_size, random_shuffle=True)
        decode = fn.decoders.image_slice(jpegs, output_type=types.RGB,
                                                    file_root=data_path, shard_id=local_rank, num_shards=world_size, random_shuffle=True)
        res = fn.resize(decode, resize_x=224, resize_y=224)
        flip_coin = fn.random.coin_flip(probability=0.5)
        cmnp = fn.crop_mirror_normalize(res, device="gpu",
                                         crop=(crop, crop),
                                         mirror=flip_coin)
        if one_hot:
            _ = fn.one_hot(labels, num_classes)
    print('rocal "{0}" variant'.format(rocal_device))
    return pipe
  1. Import libraries for PyTorch.
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
  1. Call the training pipeline with rocAL classification data loader.
def get_pytorch_train_loader(self):
    print("in get_pytorch_train_loader function")
    pipe_train = trainPipeline(self.data_path, self.batch_size, self.num_classes, self.one_hot, self.local_rank,
                               self.world_size, self.num_thread, self.crop, self.rocal_cpu, self.fp16)
    train_loader = ROCALClassificationIterator(pipe_train, device="cpu" if self.rocal_cpu else "cuda", device_id=self.local_rank)
    return train_loader
  1. Run the training.
    # Training loop
    for epoch in range(10):  # loop over the dataset multiple times
        print("\n epoch:: ", epoch)
        running_loss = 0.0

        for i, (inputs, labels) in enumerate(train_loader, 0):

            sys.stdout.write("\r Mini-batch " + str(i))
            # print("Images", inputs)
            # print("Labels", labels)
            inputs, labels =,
  1. Run the training as shown here.

To see a sample training script, click here.
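
The shard_id and num_shards arguments passed to the reader in the pipeline above split the dataset across training processes so that each rank reads a different part. As a rough illustration only, the sketch below partitions a file list into contiguous shards. The helper shard_files is hypothetical, and rocAL's internal sharding scheme may differ.

```python
def shard_files(files, shard_id, num_shards):
    """Return the contiguous slice of files assigned to one shard.

    Hypothetical helper for illustration; earlier shards receive one
    extra file when the list does not divide evenly.
    """
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    base, extra = divmod(len(files), num_shards)
    start = shard_id * base + min(shard_id, extra)
    stop = start + base + (1 if shard_id < extra else 0)
    return files[start:stop]


files = ["img_{}.jpg".format(i) for i in range(10)]
# One shard per rank, as with shard_id=local_rank and num_shards=world_size.
shards = [shard_files(files, rank, 3) for rank in range(3)]
```

Every file lands in exactly one shard, so the ranks together cover the dataset once per epoch without overlap.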

5.2 TensorFlow Integration

This section demonstrates how to use rocAL with TensorFlow for training. Follow the steps below to get started.

5.2.1 Build TensorFlow Docker

Build a rocAL TensorFlow Docker image by following the steps here.

5.2.2 Create Data-loading Pipeline

Follow these steps:

  1. Import libraries for rocAL.
from import ROCALIterator
from amd.rocal.pipeline import Pipeline
import amd.rocal.fn as fn
import amd.rocal.types as types
  1. Define the rocAL pipeline for TensorFlow. The pipeline below reads data from TFRecords with the TFRecord reader (fn.readers.tfrecord) and decodes the raw images with fn.decoders.image.
trainPipe = Pipeline(batch_size=TRAIN_BATCH_SIZE, num_threads=1, rocal_cpu=RUN_ON_HOST, tensor_layout=types.NHWC)
with trainPipe:
    inputs = fn.readers.tfrecord(path=TRAIN_RECORDS_DIR, index_path="", reader_type=TFRecordReaderType, user_feature_key_map=featureKeyMap,
        features={
            'image/encoded': tf.FixedLenFeature((), tf.string, ""),
            'image/class/label': tf.FixedLenFeature([1], tf.int64, -1),
            'image/filename': tf.FixedLenFeature((), tf.string, "")
        })
    jpegs = inputs["image/encoded"]
    images = fn.decoders.image(jpegs, user_feature_key_map=featureKeyMap, output_type=types.RGB, path=TRAIN_RECORDS_DIR)
    resized = fn.resize(images, resize_x=crop_size[0], resize_y=crop_size[1])
    flip_coin = fn.random.coin_flip(probability=0.5)
    cmn_images = fn.crop_mirror_normalize(resized, crop=(crop_size[1], crop_size[0]),
                                          mirror=flip_coin)
  1. Import libraries for TensorFlow.
import tensorflow.compat.v1 as tf
import tensorflow_hub as hub
  1. Call the train pipeline.
trainIterator = ROCALIterator(trainPipe)
  1. Run the training session.
    i = 0
    with tf.compat.v1.Session(graph = train_graph) as sess:
        while i < NUM_TRAIN_STEPS:

            for t, (train_image_ndArray, train_label_ndArray) in enumerate(trainIterator, 0):
                train_label_one_hot_list = get_label_one_hot(train_label_ndArray)
  1. Run the training as shown here.
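
The session loop above iterates over the rocAL iterator repeatedly until a step budget is reached. The sketch below shows that control flow in plain Python, under the assumption that the step counter advances once per batch (the excerpt above does not show where the counter is incremented). The names run_steps and train_step, and the list standing in for the iterator, are hypothetical.

```python
def run_steps(batches, num_steps, train_step):
    """Run train_step on successive batches until num_steps are done.

    batches is any re-iterable source of (images, labels) pairs; it is
    restarted from the beginning each time it is exhausted.
    """
    step = 0
    while step < num_steps:
        progressed = False
        for images, labels in batches:
            train_step(images, labels)
            progressed = True
            step += 1
            if step >= num_steps:
                break
        if not progressed:
            break  # empty source: avoid looping forever
    return step


seen = []
epoch_data = [("img_a", 0), ("img_b", 1), ("img_c", 2)]
steps_run = run_steps(epoch_data, 5, lambda images, labels: seen.append(labels))
```

With three batches per pass and a budget of five steps, the second pass stops partway through, which is why the inner loop checks the budget after every batch.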

To see a sample training script, click here.
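
The session loop calls get_label_one_hot, whose definition is not shown in this section. A minimal sketch of what such a helper could look like follows, assuming integer class labels in the range [0, num_classes). The function name label_one_hot and its signature are hypothetical and are not taken from the sample script.

```python
def label_one_hot(labels, num_classes):
    """Convert integer class labels to one-hot rows (lists of 0/1).

    Assumes labels are zero-based; a dataset with one-based labels
    would need an offset applied first.
    """
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError("label {} out of range".format(label))
        row = [0] * num_classes
        row[label] = 1
        rows.append(row)
    return rows


encoded = label_one_hot([2, 0, 1], num_classes=3)
```

In a real training script the rows would typically be converted to an array of the dtype that the loss function expects.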