Edge AI Tutorials

DPU Integration Tutorial

Introduction

This tutorial demonstrates how to build a custom system that uses version 1.3.0 of the Xilinx® Deep Learning Processor Unit (DPU) IP to accelerate machine learning algorithms, using the following development flow:

  1. Build the hardware platform in the Vivado® Design Suite.

  2. Generate the Linux platform in PetaLinux.

  3. Use Xilinx SDK to build two machine learning applications that take advantage of the DPU.

Note: The target hardware platform is the Ultra96. The DPU IP and Yocto recipes are based on the ZCU102 DPU v1.3.0 TRD, which can be downloaded here.

Requirements for Using the Xilinx DPU

This section lists the software and hardware required to use the Xilinx® Deep Learning Processor Unit (DPU) IP to accelerate machine learning algorithms.

Software Requirements

  • Vivado® Design Suite 2018.2

  • Board files for Ultra96 v1 (not part of the standard Vivado installation; install them separately)

  • Xilinx SDK 2018.2

  • PetaLinux 2018.2

Note: This tutorial is known to work with Vivado/PetaLinux/SDK 2018.3, but 2018.2 provides the best experience at this time. To use 2018.3, make the following three changes:

  1. Edit the u96_dpu_bd.tcl script to specify 2018.3.

  2. Use petalinux-image-full.bbappend in place of petalinux-image.bbappend.

  3. Omit the protobuf package; it isn’t needed.

Hardware Requirements

  • The Ultra96 board

  • 12V power supply for Ultra96

  • MicroUSB to USB-A cable

  • AES-ACC-USB-JTAG board

  • DisplayPort monitor

  • Mini-display port cable suitable for the chosen monitor

  • A blank, FAT32 formatted microSD card

  • USB webcam

Project Archive

Download and extract the full tutorial archive from this repository and move the DPU Integration/reference-files sub-directory to your working area. Rename this directory to "dpu_integration_lab". You should end up with a directory structure as shown in the following figure:

[Figure: Directory Structure]

The folders are:

  • files: Petalinux/Yocto recipes, source code for SDK, etc.

  • hsi: Directory for handing off .hdf files from the Vivado Design Suite to PetaLinux

  • ip_repo: Repository for the DPU IP

  • prebuilts: Includes a pre-built .hdf file exported from the Vivado Design Suite, and a complete set of files to boot from the SD card and run applications

  • sdk_workspace: Empty Eclipse workspace to be used for Xilinx SDK application development

  • vivado: The Vivado Design Suite working directory includes an archived project for Ultra96 as well as a .tcl script to create a working .bd

  • sdcard: Staging area for creating the SD Card image

From here, the location of the root lab directory will be referred to as <PROJ ROOT>.
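The layout above can be sanity-checked from a shell before starting. The following is a minimal sketch and is not part of the tutorial files: the function name check_lab_dirs is hypothetical, and the folder names are taken from the list above.

```shell
# Hypothetical helper: report which expected lab folders are missing under a root.
# Returns 0 when every folder is present, 1 otherwise.
check_lab_dirs() {
    local root="$1" missing=0 d
    for d in files hsi ip_repo prebuilts sdk_workspace vivado sdcard; do
        if [ ! -d "$root/$d" ]; then
            echo "missing: $d"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (the path is an assumption; substitute your own <PROJ ROOT>):
#   check_lab_dirs "$HOME/dpu_integration_lab"
```

An empty output with a zero exit status means all seven folders were found.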

TIP: The files directory contains commands.txt, which lists most of the commands required for the lab. Copy and paste commands from this file to save time.

Project Overview

The high-level tool flow is shown in the following figure:

[Figure: Project Overview]

Vivado Design Suite

  1. Create a new project for the Ultra96.

  2. Add the DPU IP to the project.

  3. Use a .tcl script to hook up the block design in the IP integrator.

  4. Examine the DPU configuration and connections.

  5. Generate the bitstream.

  6. Export the .hdf file.

PetaLinux

  1. Create a new PetaLinux project with the "Template Flow."

  2. Add some new Yocto Recipes and recipe modifications.

  3. Import the .hdf file from the Vivado Design Suite.

  4. Configure some Ultra96-specific hardware options.

  5. Add some necessary packages to the root filesystem.

  6. Update the device-tree to add the DPU.

  7. Build the project.

  8. Create a boot image.

Xilinx SDK

  1. Create new application projects for resnet50 and face detection.

  2. Import the application source code and model .elfs generated by dnnc.

  3. Update the application settings to point to sysroot, include needed libraries, etc.

  4. Build the applications.

Building the Hardware Platform in the Vivado Design Suite

Step 1: Create a project in the Vivado® Design Suite

  1. cd into the vivado directory and launch Vivado.
cd <PROJ ROOT>/vivado/
vivado
  2. Create a new project based on the Ultra96 board files:

    • Project Name: project_1

    • Project Location: <PROJ ROOT>/vivado

    • Do not specify sources

    • Select Ultra96v1 Evaluation Platform

    Note: Make sure you select the v1 option. The U96v1 board files are not part of the standard Vivado installation and must be installed separately. This tutorial assumes that step is already complete.

  3. Click Finish.

Step 2: Add the IP repository (containing the DPU IP) to the IP catalog

  1. Click IP Catalog in the Project Manager.

  2. Right-click Vivado Repository and select Add Repository.

  3. Select /ip_repo.

    Note: You should see a message indicating that one repository and one IP were added.

Step 3: Create the Block Design

  1. Open the TCL Console tab, cd to the <PROJ ROOT>/vivado directory, and source the .tcl script that has been provided to create the IP integrator block design for you:

    source u96_dpu_bd.tcl
    
  2. When the block design is complete, right-click on the design_1 in the Sources tab and select Create HDL Wrapper.

  3. Accept the default options.

[Figure: Block Design]

  4. Analyze the components and connections in the block design before continuing.

Step 4: Copy the pre-built .hdf to the hsi directory (Optional)

To use the pre-built option, execute the following commands to copy the pre-built .hdf into the hsi directory:

cd <PROJ ROOT>
cp prebuilts/design_1_wrapper.hdf hsi

Note: To save time, you can skip building the Vivado project and instead copy the pre-built .hdf file to the directory where the PetaLinux flow expects it.

Step 5: Generate the bitstream

  1. Click Generate Bitstream.

  2. Accept the defaults.

Note: This step will take about 45 minutes, depending on the machine.

Step 6: Export hardware

When the bitstream generation process is complete, do the following steps to export the .hdf for use by PetaLinux:

  1. Click File > Export > Export Hardware.

  2. Make sure to include the bitstream.

  3. Export the hardware platform to <PROJ ROOT>/hsi.

  4. Click OK.

    [Figure: Export Hardware]

Generating the Linux Platform in PetaLinux

Once the hardware definition file (.hdf) has been exported from the Vivado® Design Suite, you can begin the PetaLinux flow. At this point, the .hdf should be in the <PROJ ROOT>/hsi directory.

Tip: To speed up text entry, copy and paste commands from the commands.txt file in <PROJ ROOT>/files. Copying and pasting is highly recommended to avoid command-line errors.

Note: Not all of the commands are in commands.txt. Follow this lab document for the proper sequence.

Step 1: Create a PetaLinux project

Use the following command to create a new PetaLinux project based on the Zynq® UltraScale+ template in a new directory named petalinux. This project is not based on an existing BSP.

source /opt/xilinx/petalinux/2018.2/settings.sh
cd  <PROJ ROOT>
petalinux-create -t project -n petalinux --template zynqMP
cd petalinux

Step 2: Copy recipes to the PetaLinux project

In this step, you will add or edit some Yocto recipes to customize the kernel and rootfs and add the dnndk files.

Note: Make sure to cd into the petalinux directory first.

  1. Add a recipe for OpenCV v3.1. This is the version required by the DPU libraries, but PetaLinux builds v3.3 by default.
cp -rp ../files/recipes-support project-spec/meta-user
  2. Add a bbappend for the protobuf package to change the branch that its source is pulled from. This is needed due to the OpenCV v3.1 change.
cp -rp ../files/recipes-devtools project-spec/meta-user
  3. Add a bbappend to modify the LINUX_VERSION_EXTENSION of the kernel. This is required to make the “version magic” of the pre-built DPU kernel module (dpu.ko) match the kernel that we build. Without this change, dpu.ko will fail to be inserted at boot. This step will not be necessary once the DPU kernel sources are integrated into the kernel build.
cp -rp ../files/recipes-kernel project-spec/meta-user
  4. Add a recipe to add the DPU utilities, libraries, and header files to the root file system.
cp -rp ../files/recipes-apps/dnndk/ project-spec/meta-user/recipes-apps/
  5. Add a recipe to build the DPU driver kernel module.
cp -rp ../files/recipes-modules project-spec/meta-user
  6. Add a recipe to create hooks for adding an “autostart” script that runs automatically during Linux init.
cp -rp ../files/recipes-apps/autostart project-spec/meta-user/recipes-apps/
  7. Add a bbappend for the base-files recipe to auto-insert the DPU driver, auto-mount the SD card, modify the PATH, and so on.
cp -rp ../files/recipes-core/base-files/ project-spec/meta-user/recipes-core/
  8. Modify the PetaLinux Yocto configuration to use OpenCV v3.1 instead of v3.3.
cp ../files/petalinuxbsp.conf project-spec/meta-user/conf/

Step 3: Configure PetaLinux to install the dnndk files

vi project-spec/meta-user/recipes-core/images/petalinux-image.bbappend

Add the following lines:

  IMAGE_INSTALL_append = " dnndk"
  IMAGE_INSTALL_append = " autostart"
  IMAGE_INSTALL_append = " dpu"
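If you prefer not to edit the file by hand, the same three lines could be appended from the shell. This is a minimal sketch: the helper names append_once and add_dpu_packages are hypothetical, and the path of the .bbappend file is passed in as an argument rather than assumed.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not already present.
append_once() {
    local file="$1" line="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Hypothetical helper: add the three IMAGE_INSTALL_append lines from this step.
add_dpu_packages() {
    local bbappend="$1" pkg
    for pkg in dnndk autostart dpu; do
        append_once "$bbappend" "IMAGE_INSTALL_append = \" $pkg\""
    done
}

# Example call, run from the petalinux directory:
#   add_dpu_packages project-spec/meta-user/recipes-core/images/petalinux-image.bbappend
```

Because each line is only added when missing, running the helper twice leaves the file unchanged the second time.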

Step 4: Point the PetaLinux build system to the .hdf file exported from the Vivado Design Suite

  1. Use the following command to open the top-level PetaLinux project configuration GUI:
petalinux-config --get-hw-description=../hsi
  2. Change the serial port to psu_uart1.
Subsystem AUTO Hardware Settings->Serial Settings->Primary stdin/stdout = psu_uart1

Note: The UART that connects to the USB JTAG/UART board is psu_uart1.

[Figure: Subsystem AUTO Hardware Settings]

  3. Select the Ultra96 machine.

    DTG Settings -> MACHINE_NAME = zcu100-revc

    Note: The Ultra96 was originally called zcu100.

    Tip: Use backspace to delete the default text, then enter zcu100-revc.

    This makes the build system use the Ultra96-specific device-tree files.

    [Figure: DTG Settings]

  4. Exit and save the changes. This step takes about 5-7 minutes.

Step 5: Configure the rootfs

Use the following command to open the PetaLinux root file system configuration GUI:

petalinux-config -c rootfs
  1. Enable each item listed below:

    Petalinux Package Groups ->

    • opencv
    • x11
    • v4lutils
    • matchbox

    Note: Do not enable the dev or dbg packages.

    Apps ->

    • dnndk
    • autostart

    Filesystem Packages ->

    • console->tools->protobuf (Note: this is related to the OpenCV modification)
    • libs->libmali-xlnx->libmali-xlnx

    Modules ->

    • dpu
  2. Exit and save the changes.

Step 6: Add DPU to the device tree

At this time, the DPU is not supported by the device-tree generator. Therefore, we need to manually add a device-tree node for the DPU, based on our hardware settings.

At the bottom of project-spec/meta-user/recipes-bsp/device-tree/files/system-user.dtsi, add the text shown in the following figure:

[Figure: DPU Integration]

Tip: You can copy and paste the amba node from <PROJ ROOT>/files/dpu.dtsi.

Note: In this version of the DPU driver, only the interrupts and core-num parameters are being parsed. The DPU must be located at address 0x8F000000. The reg and memory parameters are ignored.

Interrupt Values

PS Interface    | GIC IRQ # | Linux IRQ #
PL_PS_IRQ1[7:0] | 143:136   | 111:104
PL_PS_IRQ0[7:0] | 128:121   | 96:89

To calculate the interrupt number (that is, the Linux IRQ), subtract 32 from the GIC IRQ number. For example, in the Vivado project, we connected to PL_PS_IRQ0[0], whose GIC IRQ number is 121 (as per the TRM). Therefore, the Linux IRQ number is 121 - 32 = 89 (0x59).
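The subtraction can be done in the shell when filling in the device-tree values. A minimal sketch, in which the function name gic_to_linux_irq is hypothetical:

```shell
# Hypothetical helper: convert a GIC IRQ number to the Linux IRQ number (GIC - 32),
# printed in both decimal and hexadecimal.
gic_to_linux_irq() {
    printf '%d (0x%x)\n' "$(( $1 - 32 ))" "$(( $1 - 32 ))"
}

gic_to_linux_irq 121   # prints "89 (0x59)"
```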

In the device tree, each interrupt 3-tuple is defined as follows:

Cell     | Description
1st cell | 0 = Shared Peripheral Interrupt (SPI)
         | 1 = Private Peripheral Interrupt (PPI)
2nd cell | Linux interrupt number
3rd cell | 1 = rising edge
         | 2 = falling edge
         | 4 = level high
         | 8 = level low
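To check an existing interrupts entry against this table, a 3-tuple can be decoded in the shell. This is only an illustrative sketch: the function name describe_irq_cells is hypothetical, and it covers just the values listed in the table.

```shell
# Hypothetical helper: describe one interrupt 3-tuple using the table above.
# Arguments are the three cells, in decimal or 0x-prefixed hexadecimal.
describe_irq_cells() {
    local kind trigger
    case "$(( $1 ))" in
        0) kind="SPI" ;;
        1) kind="PPI" ;;
        *) kind="unknown type" ;;
    esac
    case "$(( $3 ))" in
        1) trigger="rising edge" ;;
        2) trigger="falling edge" ;;
        4) trigger="level high" ;;
        8) trigger="level low" ;;
        *) trigger="unknown trigger" ;;
    esac
    printf '%s, Linux IRQ %d, %s\n' "$kind" "$(( $2 ))" "$trigger"
}

describe_irq_cells 0x0 0x59 0x1   # prints "SPI, Linux IRQ 89, rising edge"
```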

Adding more DPU Cores

If the DPU IP is configured to use more than one core, you will need multiple sets of interrupts, and the core-num parameter should be updated accordingly. For example, if you have three cores, interrupts and core-num should be set to the following values, assuming the interrupts are connected to PL_PS_IRQ0[2:0]:

interrupts = <0x0 0x59 0x1 0x0 0x5a 0x1 0x0 0x5b 0x1 >;
core-num = <0x3>;
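For other core counts, the two properties can be generated rather than typed. The sketch below assumes the cores are wired to consecutive interrupt lines starting at a given GIC IRQ (121, that is PL_PS_IRQ0[0], by default) and that every interrupt uses trigger value 0x1 as in the example above. The function name dpu_irq_props is hypothetical.

```shell
# Hypothetical helper: print the interrupts and core-num properties for N DPU cores.
#   $1 = number of cores
#   $2 = GIC IRQ of the first core (assumed default: 121, PL_PS_IRQ0[0])
dpu_irq_props() {
    local cores="$1" first_gic="${2:-121}" i=0 cells=""
    while [ "$i" -lt "$cores" ]; do
        # One 3-tuple per core: SPI (0x0), Linux IRQ (GIC - 32), rising edge (0x1)
        cells="$cells$(printf '0x0 0x%x 0x1 ' "$(( first_gic - 32 + i ))")"
        i=$(( i + 1 ))
    done
    printf 'interrupts = <%s>;\n' "${cells% }"
    printf 'core-num = <0x%x>;\n' "$cores"
}

dpu_irq_props 3
```

With three cores and the default starting IRQ, this reproduces the values shown above.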

Step 7: Build the kernel and root file system

petalinux-build

Step 8: Create the boot image

cd images/linux

petalinux-package --boot --fsbl zynqmp_fsbl.elf --u-boot u-boot.elf \
--pmufw pmufw.elf --fpga system.bit --force

Step 9: Create sysroot

The sysroot is required to build applications against the libraries/header files that are provided by some of the packages that are built into the root file system.

Installing the Pre-Built SDK

Running through the full process to rebuild the SDK can take over an hour to complete. Therefore, a pre-built SDK has been provided with the tutorial files.

To get the pre-built SDK, download and extract the zip file from this link, then copy the sdk.sh file to <PROJ ROOT>/files.

To install the pre-built SDK, use the following command:

cd <PROJ ROOT>/petalinux
petalinux-package --sysroot -s  ../files/sdk.sh

Rebuilding the SDK

If you want to go through the full process to rebuild the SDK, use the following steps:

  1. Run the following command to build a Yocto SDK and copy it to <PROJ ROOT>/petalinux/images/linux/sdk.sh:
petalinux-build --sdk
  2. Run the following command to extract and install the generated SDK and sysroot into the specified directory:
petalinux-package --sysroot -d <directory>

Note: If you do not specify the directory (-d), the SDK will be installed at images/linux/sdk.

Build Machine Learning Applications Using Xilinx SDK

Use the following steps to build two machine learning applications that take advantage of the DPU, using the Xilinx® SDK:

Step 1: Launch Xilinx SDK

Run the following command to launch the Xilinx SDK GUI:

xsdk

When the GUI opens, browse to the empty workspace at <PROJ ROOT>/sdk_workspace.

Step 2: Create a New Application Project

Use the following steps to create a new application project:

  1. Click File and select New Application Project

  2. Enter the parameters as follows:

    • Name: resnet50
    • OS Platform: Linux
    • Processor Type: psu_cortexa53
    • Language: C++
  3. Click Next

  4. Select Empty Application

  5. Click Finish.

[Figure: New Project]

Step 3: Import Source Files and Model .elf Files

Use the following steps to import the source files and model .elf files:

  1. Click File and select Import -> General -> Filesystem.

  2. Browse to <PROJ ROOT>/files/resnet50.

  3. Click OK.

  4. Select main.cc.

  5. Check that Into Folder is set to resnet50/src.

  6. Click Finish, and allow it to overwrite main.cc.

  7. Follow the same steps to import the DPU model .elf files, dpu_resnet50_0.elf and dpu_resnet50_2.elf.

Note: You can use the pre-built models from <PROJ ROOT>/files/resnet50/B1152_1.3.0, if you do not have your own.

Step 4: Update the Application Build Settings

Use the following steps to update the application build settings:

  1. Right-click on resnet50 application and select C/C++ Build Settings.

  2. In C/C++ Build -> Environment, add SYSROOT and point to the following:

    ${workspace_loc}/../petalinux/images/linux/sdk/sysroots/aarch64-xilinx-linux
    

[Figure: Environment Variables]

  3. Point the compiler and the linker to SYSROOT:

    • g++ linker settings:

      Miscellaneous -> Linker Flags: --sysroot=${SYSROOT}

      [Figure: Linker Flags]

    • g++ compiler settings:

      Miscellaneous -> Other Flags: --sysroot=${SYSROOT}

      [Figure: Other Flags]

  4. In the g++ linker libraries tab, add the following libraries:

    • n2cube

    • dputils

    • opencv_core

    • opencv_imgcodecs

    • opencv_highgui

      [Figure: Linker libraries]

  5. In g++ linker -> Miscellaneous, add the model .elf files to Other Objects.

  6. Add dpu_resnet50_0.elf and dpu_resnet50_2.elf from the resnet50/src directory.

    Note: You can click Workspace to browse to the objects you want, as shown in the following figures:

    [Figure: File Selection]

    [Figure: Other Objects]

    Note: This causes the .elf files to be statically linked into the application. It is also possible to link these objects dynamically at runtime (not covered in this guide).

  7. Click OK.

  8. Right-click on the resnet50 application and select Build Project.
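For orientation, the build settings in this step correspond roughly to a single g++ command line. The sketch below only prints such a command and does not run it. The compiler name aarch64-linux-gnu-g++ is an assumption about the toolchain and is not stated in this tutorial, and the function name sdk_build_cmd is hypothetical. The sysroot flag, libraries, and model objects are the ones listed above.

```shell
# Hypothetical helper: print (not run) a g++ command line that mirrors the SDK settings.
#   $1 = output file, $2 = space-separated library names, remaining args = sources and objects
# Assumption: the cross compiler is named aarch64-linux-gnu-g++.
sdk_build_cmd() {
    local out="$1" libs="$2" flags="" lib
    shift 2
    for lib in $libs; do
        flags="$flags -l$lib"
    done
    # ${SYSROOT} is printed literally so it is expanded only when the command is run.
    printf 'aarch64-linux-gnu-g++ --sysroot=${SYSROOT} %s%s -o %s\n' "$*" "$flags" "$out"
}

sdk_build_cmd resnet50.elf \
    "n2cube dputils opencv_core opencv_imgcodecs opencv_highgui" \
    main.cc dpu_resnet50_0.elf dpu_resnet50_2.elf
```

Listing the model .elf files alongside main.cc is what links them statically into the application, which is the effect of the Other Objects setting.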

Step 5: Build the Face Detection Application

Use the following steps to build the face detection application:

  1. Repeat steps 2 through 5 above.

  2. Add the source file /files/face_detection/face_detection.cc.

  3. Delete main.cc from the project.

  4. Add dpu_densebox.elf from <PROJ ROOT>/files/face_detection/B1152_1.3.0, if you do not have your own.

  5. Set the SYSROOT Environment Variable to the proper value.

  6. Point to SYSROOT in compiler and linker miscellaneous settings.

  7. Add the following libraries:

    • n2cube
    • dputils
    • opencv_core
    • opencv_imgcodecs
    • opencv_highgui
    • opencv_imgproc
    • opencv_videoio
    • pthread
  8. For the g++ Linker Miscellaneous Other Objects, select face_detection/src/dpu_densebox.elf.

  9. Click OK.

  10. Right-click on the face_detection application and select Build Project.

Working with Ultra96

Setting up Ultra96

Use the following steps to set up Ultra96:

  1. Connect a proper 12V power supply.

  2. Connect the AES-ACC-USB-JTAG board.

  3. Connect a microUSB cable between the AES-ACC-USB-JTAG and your PC.

  4. Connect a DisplayPort Monitor using a miniDisplayPort cable.

  5. Connect a USB webcam to one of the host USB ports.

  6. Prepare a blank microSD card with a single FAT32 partition (this is done for you).

    [Figure: Ultra96]

Running Applications on Ultra96

Next, gather all of the files in an SD card staging area, then copy them to the SD card in one pass. The sdcard directory in <PROJ ROOT> already includes the directories for the applications and the test images for resnet50. The test images are located in the /sdcard/common/image500_640_480 directory.

Step 1: Copy files to the SD card

Use the following steps to copy the files to the SD card:

  1. Copy <PROJ ROOT>/petalinux/images/linux/image.ub and BOOT.BIN to the sdcard directory.

  2. Copy <PROJ ROOT>/sdk_workspace/resnet50/Debug/resnet50.elf to the sdcard/resnet50 folder.

  3. Copy <PROJ ROOT>/sdk_workspace/face_detection/Debug/face_detection.elf to the sdcard/face_detection folder.

Tip: To perform all of the copies above at once, copy and paste the following commands:
 cd <PROJ ROOT>
 cp petalinux/images/linux/image.ub sdcard
 cp petalinux/images/linux/BOOT.BIN sdcard
 cp sdk_workspace/resnet50/Debug/resnet50.elf sdcard/resnet50/
 cp sdk_workspace/face_detection/Debug/face_detection.elf sdcard/face_detection/
  4. Copy all the files in the sdcard directory to a blank microSD card on your PC. For subsequent updates, you can skip the common directory that contains the test images and copy only the updated boot images and/or applications.
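Before writing the card, it can help to confirm that the staging area holds everything the board needs. A minimal sketch, assuming the four files named in the steps above. The function name check_sdcard_staging is hypothetical.

```shell
# Hypothetical helper: report which boot images or applications are missing
# from the SD card staging directory. Returns 0 when all four files are present.
check_sdcard_staging() {
    local dir="$1" missing=0 f
    for f in image.ub BOOT.BIN resnet50/resnet50.elf face_detection/face_detection.elf; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, run from the root lab directory:
#   check_sdcard_staging sdcard
```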

Step 2: Boot the Ultra96

Place the microSD card into the Ultra96 and power on the board. Once the board has booted, log in using the following credentials:

  • username = root
  • password = root

Step 3: Initialize the display

Run the commands below to prepare the display:

export DISPLAY=:0.0
xrandr --output DP-1 --mode 800x600
xset -dpms

Note: Use xrandr to find a suitable mode for your monitor. When running at 1920x1080, the screen may flicker due to memory bandwidth issues.

If the display goes blank between runs, use xset -dpms to re-enable the display.

Step 4: Run Resnet50

Change to the directory with the resnet50 application and execute the program.

  • cd /media/card/resnet50
  • ./resnet50.elf

Step 5: Run face detection

Change to the directory with the face_detection application and execute the program.

  • cd /media/card/face_detection
  • ./face_detection.elf

Note: If you see “Open camera error!”, try unplugging the USB camera and inserting it again. If it still isn’t recognized, try rebooting with the camera unplugged, then plug in the camera before launching the application. If both of these efforts fail, try a different camera.