This tutorial demonstrates how to build a custom system that utilizes the 1.3.0 version of Xilinx® Deep Learning Processor (DPU) IP to accelerate machine learning algorithms using the following development flow:
- Build the hardware platform in the Vivado® Design Suite.
- Generate the Linux platform in PetaLinux.
- Use Xilinx SDK to build two machine learning applications that take advantage of the DPU.
Note: The Ultra96 is the targeted hardware platform. The DPU IP and Yocto recipes are based on the ZCU102 DPU v1.3.0 TRD, which can be downloaded here.
This section lists the software and hardware tools required to use the Xilinx® Deep Learning Processor (DPU) IP to accelerate machine learning algorithms.
- Vivado® Design Suite 2018.2
- Board files for Ultra96 v1 should be installed
- Xilinx SDK 2018.2
- PetaLinux 2018.2
Note: This tutorial is known to work with Vivado/PetaLinux/SDK v2018.3, but 2018.2 will provide the best experience at this time. To use it with 2018.3, you’ll need to make the following three changes:
- Edit the `u96_dpu_bd.tcl` script to specify 2018.3.
- Change `petalinux-image.bbappend` to `petalinux-image-full.bbappend`.
- The `protobuf` package isn’t needed.
- The Ultra96 board
- 12V power supply for Ultra96
- MicroUSB to USB-A cable
- AES-ACC-USB-JTAG board
- DisplayPort monitor
- Mini-display port cable suitable for the chosen monitor
- A blank, FAT32 formatted microSD card
- USB webcam
Download and extract the full tutorial archive from this repository and move the DPU Integration/reference-files sub-directory to your working area. Rename this directory to "dpu_integration_lab". You should end up with a directory structure as shown in the following figure:
The folders are:
- files: PetaLinux/Yocto recipes, source code for SDK, etc.
- hsi: Directory for handing off `.hdf` files from the Vivado Design Suite to PetaLinux
- ip_repo: Repository for the DPU IP
- prebuilts: Includes a pre-built `.hdf` file exported from the Vivado Design Suite, and a complete set of files to boot from the SD card and run applications
- sdk_workspace: Empty Eclipse workspace to be used for Xilinx SDK application development
- vivado: The Vivado Design Suite working directory, which includes an archived project for Ultra96 as well as a `.tcl` script to create a working `.bd`
- sdcard: Staging area for creating the SD card image
From here, the location of the root lab directory will be referred to as `<PROJ ROOT>`.
TIP: The files directory contains a file called commands.txt that has most of the commands required for the lab. Copy and paste commands from this file to save time.
The high-level tool flow is shown in the following figure:
- Create a new project for the Ultra96.
- Add the DPU IP to the project.
- Use a `.tcl` script to hook up the block design in the IP integrator.
- Examine the DPU configuration and connections.
- Generate the bitstream.
- Export the `.hdf` file.

- Create a new PetaLinux project with the "Template Flow."
- Add some new Yocto recipes and recipe modifications.
- Import the `.hdf` file from the Vivado Design Suite.
- Configure some Ultra96-specific hardware options.
- Add some necessary packages to the root filesystem.
- Update the device-tree to add the DPU.
- Build the project.
- Create a boot image.

- Create new application projects for resnet50 and face detection.
- Import the application source code and model `.elf` files generated by `dnnc`.
- Update the application settings to point to sysroot, include needed libraries, etc.
- Build the applications.
`cd` into the Vivado directory and launch Vivado.
cd <PROJ ROOT>/vivado/
vivado
- Create a new project based on the Ultra96 board files:
  - Project Name: project_1
  - Project Location: `<PROJ ROOT>/vivado`
  - Do not specify sources
  - Select Ultra96v1 Evaluation Platform

  Note: Make sure you select the v1 option. The U96v1 Board Files are not a part of the standard Vivado installation. They must be installed separately. It is assumed that this step is already completed.
- Click Finish.
- Click IP Catalog in the Project Manager.
- Right-click Vivado Repository and select Add Repository.
- Select /ip_repo
  Note: You should see a message indicating that one repository and one IP is added.
- Open the Tcl Console tab, `cd` to the `<PROJ ROOT>/vivado` directory, and source the `.tcl` script that has been provided to create the IP integrator block design for you:
  source u96_dpu_bd.tcl
- When the block design is complete, right-click on design_1 in the Sources tab and select Create HDL Wrapper.
- Accept the default options.
- Analyze the components and connections in the block design before continuing.
To use the pre-built option, execute the following commands to copy the pre-built `.hdf` into the `hsi` directory:
cd <PROJ ROOT>
cp prebuilts/design_1_wrapper.hdf hsi
Note: To save time, you can skip building the Vivado project during this lab session and instead copy the pre-built `.hdf` file to the directory where the PetaLinux flow expects it.
- Click Generate Bitstream.
- Accept the defaults.
Note: This step will take about 45 minutes, depending on the machine.
When the bitstream generation process is complete, do the following steps to export the `.hdf` for use by PetaLinux:
- Click File > Export > Export Hardware.
- Make sure to include the bitstream.
- Export the hardware platform to `<PROJ ROOT>/hsi`.
- Click OK.
Once the hardware definition file (`.hdf`) has been exported from the Vivado® Design Suite, you can begin the PetaLinux flow. At this point, you should have exported the `.hdf` to the `<PROJ ROOT>/hsi` directory.
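Before starting the PetaLinux flow, it can help to confirm that the hand-off file is actually in place. The following is a minimal sketch and not part of the tutorial files: the helper name `find_hdf` is hypothetical, and it assumes only that the export (or the pre-built copy) left a file ending in `.hdf` in the directory you pass in.

```shell
#!/bin/sh
# find_hdf DIR: print the first .hdf file found in DIR, or fail if none exists.
# Hypothetical helper for illustration; not part of the tutorial files.
find_hdf() {
    for f in "$1"/*.hdf; do
        # If the glob matched nothing, "$f" is the literal pattern and -f is false.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    echo "No .hdf file found in $1" >&2
    return 1
}

# Example usage (PROJ_ROOT is a placeholder for your <PROJ ROOT> path):
# find_hdf "$PROJ_ROOT/hsi" || exit 1
```

If the check fails, re-export the hardware from Vivado or copy the pre-built `.hdf` into `hsi` before continuing.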
Tip: To speed up text entry, copy and paste commands from the `commands.txt` file in `<PROJ ROOT>/files`. Copying and pasting is highly recommended to avoid command-line errors.
Note: Not all of the commands are in the `commands.txt` file. Refer to this lab document for the proper sequencing.
Use the following commands to create a new PetaLinux project based on the Zynq® UltraScale+ template in a new directory named `petalinux`. This project is not based on an existing BSP.
source /opt/xilinx/petalinux/2018.2/settings.sh
cd <PROJ ROOT>
petalinux-create -t project -n petalinux --template zynqMP
cd petalinux
In this step, you will add or edit some Yocto recipes to customize the kernel and rootfs and add the DNNDK files.
Note: Make sure to `cd` into the PetaLinux directory first.
- Add a recipe for OpenCV v3.1. This is the version that is required by the DPU libraries, but PetaLinux builds v3.3 by default.
cp -rp ../files/recipes-support project-spec/meta-user
- Add a `bbappend` for the `protobuf` package to change the branch that its source is pulled from. This is needed due to the OpenCV v3.1 change.
cp -rp ../files/recipes-devtools project-spec/meta-user
- Add a `bbappend` to modify the `LINUX_VERSION_EXTENSION` of the kernel. This is required to make the “version magic” of the pre-built `dpu` kernel module (`dpu.ko`) match the kernel that we built. This step will not be necessary once the DPU kernel sources are integrated into the kernel build. Without this change, `dpu.ko` will fail to be inserted at boot.
cp -rp ../files/recipes-kernel project-spec/meta-user
- Add a recipe to add the DPU utilities, libraries, and header files into the root file system.
cp -rp ../files/recipes-apps/dnndk/ project-spec/meta-user/recipes-apps/
- Add a recipe to build the DPU driver kernel module.
cp -rp ../files/recipes-modules project-spec/meta-user
- Add a recipe to create hooks for adding an “autostart” script to run automatically during Linux init.
cp -rp ../files/recipes-apps/autostart project-spec/meta-user/recipes-apps/
- Add a `bbappend` for the base-files recipe to do various things like auto insert the DPU driver, auto mount the SD card, modify the PATH, etc.
cp -rp ../files/recipes-core/base-files/ project-spec/meta-user/recipes-core/
- Modify the PetaLinux Yocto configuration to use OpenCV v3.1 instead of v3.3.
cp ../files/petalinuxbsp.conf project-spec/meta-user/conf/
- Open the image `bbappend` file and add the new packages to the root filesystem image:
vi project-spec/meta-user/recipes-core/images/petalinux-image.bbappend
Add the following lines:
IMAGE_INSTALL_append = " dnndk"
IMAGE_INSTALL_append = " autostart"
IMAGE_INSTALL_append = " dpu"
- Use the following command to open the top-level PetaLinux project configuration GUI:
petalinux-config --get-hw-description=../hsi
- Change the serial port to PSU_UART1.
Subsystem AUTO Hardware Settings -> Serial Settings -> Primary stdin/stdout = psu_uart1
Note: The UART that connects to the USB JTAG/UART board is psu_uart_1.
- Select Ultra96 Machine.
DTG Settings -> MACHINE_NAME = zcu100-revc
Note: The Ultra96 was originally called zcu100.
Tip: Use backspace to delete the default text, then add zcu100-revc.
By doing this, the build system uses the Ultra96-specific device-tree files.
- Exit and save the changes. This step will take about 5-7 minutes.
Use the following command to open the PetaLinux root filesystem configuration GUI:
petalinux-config -c rootfs
- Enable each item listed below:
Petalinux Package Groups ->
- opencv
- x11
- v4lutils
- matchbox
Note: Do not enable the dev or dbg packages.
Apps ->
- dnndk
- autostart
Filesystem Packages ->
- console->tools->protobuf (Note: this is related to the OpenCV modification)
- libs->libmali-xlnx->libmali-xlnx
Modules ->
- dpu
- Exit and save the changes.
At this time, the DPU is not supported by the device-tree generator. Therefore, we need to manually add a device-tree node for the DPU, based on our hardware settings.
At the bottom of `project-spec/meta-user/recipes-bsp/device-tree/files/system-user.dtsi`, add the following text:
Tip: You can copy and paste the amba node from `<PROJ ROOT>/files/dpu.dtsi`.
Note: In this version of the DPU driver, only the `interrupts` and `core-num` parameters are parsed. The DPU must be located at address `0x8F000000`. The `reg` and `memory` parameters are ignored.
| PS Interface | GIC IRQ # | Linux IRQ # |
| --- | --- | --- |
| PL_PS_IRQ1[7:0] | 143:136 | 111:104 |
| PL_PS_IRQ0[7:0] | 128:121 | 96:89 |
To calculate the interrupt number (that is, the Linux IRQ), subtract 32 from the GIC IRQ number. For example, in the Vivado project, we connected to `PL_PS_IRQ0[0]`, whose GIC IRQ number is 121 (as per the TRM).
Therefore, the Linux IRQ number is 121 - 32 = 89 (0x59).
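The subtraction above is easy to script when you need the value for a different PL_PS_IRQ bit. The snippet below is a small sketch; the function name `gic_to_linux_irq` is made up for illustration, and the only rule it encodes is the "GIC IRQ minus 32" relationship stated above.

```shell
#!/bin/sh
# gic_to_linux_irq GIC_IRQ: print the Linux IRQ number in decimal and hex.
# Hypothetical helper; it applies only the "subtract 32" rule from the text.
gic_to_linux_irq() {
    linux_irq=$(( $1 - 32 ))
    printf '%d 0x%x\n' "$linux_irq" "$linux_irq"
}

gic_to_linux_irq 121   # PL_PS_IRQ0[0] -> prints "89 0x59"
gic_to_linux_irq 136   # PL_PS_IRQ1[0] -> prints "104 0x68"
```

The hexadecimal form is the one used in the device-tree `interrupts` property.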
In the device tree, each interrupt 3-tuple is defined as follows:
| Interrupt | Description |
| --- | --- |
| 1st Cell | 0 = Shared Peripheral Interrupt (SPI); 1 = Private Peripheral Interrupt (PPI) |
| 2nd Cell | Linux interrupt number |
| 3rd Cell | 1 = rising edge; 2 = falling edge; 4 = level high; 8 = level low |
If the DPU IP is configured to use more than one core, you will need multiple sets of interrupts, and the `core-num` parameter should be updated accordingly. For example, if you have three cores, `interrupts` and `core-num` should be set to the following values, assuming the interrupts are connected to `PL_PS_IRQ0[2:0]`:
interrupts = <0x0 0x59 0x1 0x0 0x5a 0x1 0x0 0x5b 0x1 >;
core-num = <0x3>;
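For other core counts, the two properties can be generated rather than typed by hand. This is only a sketch under stated assumptions: the function name `dpu_interrupts` is hypothetical, the cores are assumed to be wired to consecutive `PL_PS_IRQ0` bits starting at bit 0 (GIC IRQ 121), and every interrupt is assumed to be a rising-edge SPI as in the example above.

```shell
#!/bin/sh
# dpu_interrupts NUM_CORES [FIRST_GIC_IRQ]
# Print device-tree "interrupts" and "core-num" lines for NUM_CORES DPU cores.
# Assumes consecutive interrupt lines, starting at GIC IRQ 121 (PL_PS_IRQ0[0])
# unless FIRST_GIC_IRQ is given. Hypothetical helper for illustration only.
dpu_interrupts() {
    cores=$1
    first_gic=${2:-121}
    cells=""
    i=0
    while [ "$i" -lt "$cores" ]; do
        irq=$(( first_gic - 32 + i ))
        # 0x0 = SPI, then the Linux IRQ number, then 0x1 = rising edge
        cells="$cells$(printf '0x0 0x%x 0x1 ' "$irq")"
        i=$(( i + 1 ))
    done
    printf 'interrupts = <%s>;\n' "$cells"
    printf 'core-num = <0x%x>;\n' "$cores"
}

dpu_interrupts 3
```

Running it with `3` reproduces the two lines shown above; adjust the second argument if your design uses a different starting interrupt.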
petalinux-build
cd images/linux
petalinux-package --boot --fsbl zynqmp_fsbl.elf --u-boot u-boot.elf \
--pmufw pmufw.elf --fpga system.bit --force
The `sysroot` is required to build applications against the libraries and header files that are provided by some of the packages that are built into the root file system.
Running through the full process to rebuild the SDK can take over an hour to complete. Therefore, a pre-built SDK has been provided with the tutorial files.
To get the pre-built SDK file, download and extract the zip file from this link, then copy the `sdk.sh` file to `../files`.
To install the pre-built SDK, use the following command:
cd <PROJ ROOT>/petalinux
petalinux-package --sysroot -s ../files/sdk.sh
If you want to go through the full process to rebuild the SDK, use the following steps:
- Run the following command to build a Yocto SDK and copy it to `<PROJ ROOT>/petalinux/images/linux/sdk.sh`:
petalinux-build --sdk
- Run the following command to extract and install the generated SDK and sysroot into the specified directory:
petalinux-package --sysroot -d <directory>
Note: If you do not specify the directory (`-d`), the SDK will be installed at `images/linux/sdk`.
Use the following steps to build two machine learning applications that take advantage of the DPU, using the Xilinx® SDK:
Run the following command to launch the Xilinx SDK GUI:
xsdk
When the GUI opens, browse to the empty workspace at `<PROJ ROOT>/sdk_workspace`.
Use the following steps to create a new application project:
- Click File and select New Application Project.
- Enter the parameters as follows:
  - Name: resnet50
  - OS Platform: Linux
  - Processor Type: psu_cortexa53
  - Language: C++
- Click Next.
- Select Empty Application.
- Click Finish.
Use the following steps to import the source files and model `.elf` files:
- Click File and select Import -> General -> Filesystem.
- Browse to `<PROJ ROOT>/files/resnet50`.
- Click OK.
- Select main.cc.
- Check that Into Folder is set to resnet50/src.
- Click Finish, and allow it to overwrite `main.cc`.
- Follow the same steps to import the DPU model `.elf` files, `dpu_resnet50_0.elf` and `dpu_resnet50_2.elf`.
Note: You can use the pre-built models from `<PROJ ROOT>/files/resnet50/B1152_1.3.0`, if you do not have your own.
Use the following steps to update the application build settings:
- Right-click on the resnet50 application and select C/C++ Build Settings.
- In C/C++ Build -> Environment, add SYSROOT and point to the following:
${workspace_loc}/../petalinux/images/linux/sdk/sysroots/aarch64-xilinx-linux
- Point the compiler and the linker to SYSROOT:
- In the g++ linker libraries tab, add the following libraries:
- In g++ linker -> Miscellaneous, add the model `.elf` files to Other Objects.
- Add `dpu_resnet50_0.elf` and `dpu_resnet50_2.elf` from the `resnet50/src` directory.
Note: You can click Workspace to browse to the objects you want, as shown in the following figure:
**Note:** This will cause the `.elf` files to be statically linked to the application. It is also possible to dynamically link these objects at runtime (not covered in this guide).
- Click OK.
- Right-click on the resnet50 application and select Build Project.
Use the following steps to build the face detection application:
- Repeat steps 2 through 5 above.
- Add the source file /files/face_detection/face_detection.cc.
- Delete `main.cc` from the project.
- Add `dpu_densebox.elf` from `<PROJ ROOT>/files/face_detection/B1152_1.3.0`, if you do not have your own.
- Set the SYSROOT environment variable to the proper value.
- Point to SYSROOT in compiler and linker miscellaneous settings.
- Add the following libraries:
  - n2cube
  - dputils
  - opencv_core
  - opencv_imgcodecs
  - opencv_highgui
  - opencv_imgproc
  - opencv_videoio
  - pthread
- For the g++ Linker Miscellaneous Other Objects, select `face_detection/src/dpu_densebox.elf`.
- Click OK.
- Right-click on the face_detection application and select Build Project.
Use the following steps to set up Ultra96:
- Connect a proper 12V power supply.
- Connect the AES-ACC-USB-JTAG board.
- Connect a microUSB cable between the AES-ACC-USB-JTAG and your PC.
- Connect a DisplayPort monitor using a miniDisplayPort cable.
- Connect a USB webcam to one of the host USB ports.
- Prepare a blank microSD card with a single FAT32 partition (this is done for you).
Next, we’ll gather all the images in an SD card staging area first, and then copy them all to the SD card at one time. There is a directory in `<PROJ ROOT>` called `sdcard` that already includes the directories for the applications and the test images for resnet50. The test images are located in the /sdcard/common/image500_640_480 directory.
Use the following steps to copy the files to the SD card:
- Copy `<PROJ ROOT>/petalinux/images/linux/image.ub` and `BOOT.BIN` to the `sdcard` directory.
- Copy `<PROJ ROOT>/sdk_workspace/resnet50/Debug/resnet50.elf` to the `sdcard/resnet50` folder.
- Copy `<PROJ ROOT>/sdk_workspace/face_detection/Debug/face_detection.elf` to the `sdcard/face_detection` folder.
Tip: The commands below perform all of the copy steps above at once.
- Copy and paste the following commands:
cd <PROJ ROOT>
cp petalinux/images/linux/image.ub sdcard
cp petalinux/images/linux/BOOT.BIN sdcard
cp sdk_workspace/resnet50/Debug/resnet50.elf sdcard/resnet50/
cp sdk_workspace/face_detection/Debug/face_detection.elf sdcard/face_detection/
- Copy all the files in the `sdcard` directory to a blank microSD card on your PC. For subsequent updates, you can skip the common directory that contains the test images and only copy over the updated boot images and/or applications.
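Before writing the card, a quick check of the staging area can catch a missed copy. The following is a minimal sketch: the function name `check_sdcard_staging` is hypothetical, and the file list simply mirrors the four files copied in the steps above (it does not check the test images under the common directory).

```shell
#!/bin/sh
# check_sdcard_staging DIR: report any expected boot or application file
# that is missing from the staging directory DIR.
# Hypothetical helper; the file list mirrors the copy steps in this section.
check_sdcard_staging() {
    missing=0
    for f in BOOT.BIN image.ub \
             resnet50/resnet50.elf \
             face_detection/face_detection.elf; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage (PROJ_ROOT is a placeholder for your <PROJ ROOT> path):
# check_sdcard_staging "$PROJ_ROOT/sdcard" && echo "staging area looks complete"
```

A non-zero return status means at least one file still needs to be copied into the staging directory.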
Place the microSD card into the Ultra96 and power on the board. Once the board has booted, log in using the following credentials:
- username = root
- password = root
Run the commands below to prepare the display:
export DISPLAY=:0.0
xrandr --output DP-1 --mode 800x600
xset -dpms
Note: Use `xrandr` to find a suitable mode for your monitor. When running at 1920x1080, the screen may flicker due to memory bandwidth issues.
If the display goes blank between runs, use `xset -dpms` to re-enable the display.
Change to the directory with the `resnet50` application and execute the program.
cd /media/card/resnet50
./resnet50.elf
Change to the directory with the `face_detection` application and execute the program.
cd /media/card/face_detection
./face_detection.elf
Note: If you see “Open camera error!”, try unplugging the USB camera and inserting it again. If it still isn’t recognized, try rebooting with the camera unplugged, then plug in the camera before launching the application. If both of these efforts fail, try a different camera.