LogicTronix has built and tested the DPU TRD for the ZCU104, whereas the DPU IP Product Guide PG338 (v1.2), March 26, 2019, only provides build steps for the ZCU102. This design uses DPU IP version DPU_v1.3.0 and was tested on the ZCU104 on May 5, 2019.

For the DPU IP version 3.0 [released on August 13, 2019] TRD for ZCU104 with Vivado/PetaLinux 2019.1, please go through this tutorial: [DPU (3.0) TRD for ZCU104-Hackster].

1. Background:

According to PG338, “Xilinx DPU is a configurable engine dedicated for convolutional neural network. The computing parallelism can be configured according to the selected device and application. It includes a set of efficiently optimized instructions. It can support most convolutional neural networks, such as VGG, ResNet, GoogLeNet, YOLO, SSD, MobileNet, FPN, etc.”

This DPU TRD for ZCU104 targets the ResNet implementation in the DNNDK package. For more information about the DNNDK package, refer to the DNNDK User Guide UG1327 (v1.4), April 29, 2019.

2. Let’s move to the main work:

There is also a thread on the Xilinx Forum about creating the DPU TRD for ZCU104, in which a Xilinx employee shared the Tcl file for creating the Vivado project for ZCU104 based on the sources provided for ZCU102. The forum thread can be accessed at: https://forums.xilinx.com/t5/Deephi-DNNDK/DPU-TRD-for-ZCU104/td-p/958695

Here is the block design of the system; it is the same as that of the ZCU102:

A snippet of the result obtained from the DPU TRD is shown here:

Load image : 2ILSVRC2012_test_00000035.JPEG, [gasmask, respirator, gas helmet] expected. 

Run ResNet50 CONV layers ...
  DPU CONV Execution time: 13648us
  DPU CONV Performance: 564.918GOPS
Run ResNet50 FC layers ...
  DPU FC Execution time: 281us
  DPU FC Performance: 14.2349GOPS
top[0] prob = 0.472463  name = gasmask, respirator, gas helmet
top[1] prob = 0.135363  name = spotlight, spot
top[2] prob = 0.063941  name = traffic light, traffic signal, stoplight
top[3] prob = 0.049797  name = loudspeaker, speaker, speaker unit, loudspeaker system, speaker system
top[4] prob = 0.030204  name = reflex camera

Load image : 2ILSVRC2012_test_00000140.JPEG, [sidewinder, horned rattlesnake, Crotalus cerastes] expected. 

Run ResNet50 CONV layers ...
  DPU CONV Execution time: 13634us
  DPU CONV Performance: 565.498GOPS
Run ResNet50 FC layers ...
  DPU FC Execution time: 285us
  DPU FC Performance: 14.0351GOPS
top[0] prob = 0.283723  name = sidewinder, horned rattlesnake, Crotalus cerastes
top[1] prob = 0.283723  name = night snake, Hypsiglena torquata
top[2] prob = 0.220963  name = horned viper, cerastes, sand viper, horned asp, Cerastes cornutus
top[3] prob = 0.081288  name = rock python, rock snake, Python sebae
top[4] prob = 0.049304  name = king snake, kingsnake
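The timing and throughput figures in the log can be cross-checked with a little arithmetic: multiplying the reported GOPS by the execution time gives the number of operations the DPU performed per stage. The following is a minimal sketch in Python, not part of the TRD itself; the function name `summarize` and the variable names are our own, and the images-per-second figure covers DPU execution time only (image loading, pre-processing and the final top-5 ranking are not included).

```python
import re

# Performance lines copied from the first image in the log above.
PERF_LOG = """\
Run ResNet50 CONV layers ...
  DPU CONV Execution time: 13648us
  DPU CONV Performance: 564.918GOPS
Run ResNet50 FC layers ...
  DPU FC Execution time: 281us
  DPU FC Performance: 14.2349GOPS
"""

TIME_RE = re.compile(r"DPU (\w+) Execution time: (\d+)us")
PERF_RE = re.compile(r"DPU (\w+) Performance: ([\d.]+)GOPS")


def summarize(log):
    """Return per-stage workload in GOP, total DPU time in us, and DPU-only images/s."""
    times = {stage: int(us) for stage, us in TIME_RE.findall(log)}
    perfs = {stage: float(gops) for stage, gops in PERF_RE.findall(log)}
    # GOPS x microseconds = (1e9 op/s) x (1e-6 s) = 1e3 op, so divide by 1e6 to get GOP.
    workload_gop = {stage: perfs[stage] * times[stage] / 1e6 for stage in times}
    total_us = sum(times.values())
    return workload_gop, total_us, 1e6 / total_us


workload_gop, total_us, dpu_fps = summarize(PERF_LOG)
for stage, gop in workload_gop.items():
    print(f"{stage}: {gop:.4f} GOP")
print(f"total DPU time: {total_us} us, about {dpu_fps:.1f} images/s (DPU only)")
```

For the first image this works out to roughly 7.71 GOP for the CONV layers and 0.004 GOP for the FC layer, and the second image gives nearly identical values, which suggests the reported GOPS is simply a fixed per-layer operation count divided by the measured time.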

You can get a more detailed [not complete] output of the DPU TRD run on ZCU104 at: Download Output of DPU TRD on ZCU104 [not complete output]- Link
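For a longer log, it can be handy to tally how often the expected label lands in the top-1 or top-5 predictions. The sketch below is our own illustration, assuming the log keeps exactly the "Load image : ..., [...] expected." and "top[i] prob = ... name = ..." line formats shown above; the function name `tally_topk` is hypothetical and not part of DNNDK.

```python
import re

# Classification lines copied from the log snippet above (shortened to top[0] and top[1]).
CLASS_LOG = """\
Load image : 2ILSVRC2012_test_00000035.JPEG, [gasmask, respirator, gas helmet] expected.
top[0] prob = 0.472463  name = gasmask, respirator, gas helmet
top[1] prob = 0.135363  name = spotlight, spot
Load image : 2ILSVRC2012_test_00000140.JPEG, [sidewinder, horned rattlesnake, Crotalus cerastes] expected.
top[0] prob = 0.283723  name = sidewinder, horned rattlesnake, Crotalus cerastes
top[1] prob = 0.283723  name = night snake, Hypsiglena torquata
"""

LOAD_RE = re.compile(r"Load image : (\S+), \[(.+)\] expected\.")
TOP_RE = re.compile(r"top\[(\d+)\] prob = ([\d.]+)\s+name = (.+)")


def tally_topk(log):
    """Return (number of images, top-1 hits, top-5 hits) found in the log text."""
    results = []
    current = None
    for line in log.splitlines():
        load = LOAD_RE.search(line)
        if load:
            # A new image starts; collect its predictions in log order.
            current = {"expected": load.group(2), "top": []}
            results.append(current)
            continue
        top = TOP_RE.search(line)
        if top and current is not None:
            current["top"].append(top.group(3).strip())
    top1 = sum(r["top"][:1] == [r["expected"]] for r in results)
    top5 = sum(r["expected"] in r["top"][:5] for r in results)
    return len(results), top1, top5


images, top1_hits, top5_hits = tally_topk(CLASS_LOG)
print(f"{images} images, top-1 hits: {top1_hits}, top-5 hits: {top5_hits}")
```

To use it on a saved log, read the file into a string and pass it to `tally_topk` in place of `CLASS_LOG`.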

The image set used for this classification can also be found in the “DPU TRD ZCU102” package under “zcu102-dpu-trd-2018-2-190322/images/common”, or in the DPU Integration Tutorial on the Xilinx GitHub as image500_640_480:

Here is the Google Drive download link for the “Demo Test-DPU TRD for ZCU104” [141MB]: https://drive.google.com/open?id=1AyxDg1TRwMIcIaAdqrQdf2cvw4Nj0uen. The “Demo Test” file includes,

We are thankful to Salvador Canas Moreno for testing our demo test design of “DPU TRD for ZCU104” on a ZCU104; without his contribution we would not have been able to test and verify our result.

3. Download and test this design yourself

This is the DNNDK DPU TRD for the ZCU104; we have built the “Demo Test” for the ZCU104. We followed the PG338 DPU TRD for ZCU102 to build the DPU TRD for the ZCU104 FPGA board. We have tested and verified the result of the new DPU TRD on ZCU104. This TRD uses ResNet50, a convolutional neural network, for image classification. For more details, please see PG338; the DNNDK reference can be found in UG1327 from Xilinx.

Here is the Google Drive download link for the “Demo Test” [141MB]: https://drive.google.com/open?id=1AyxDg1TRwMIcIaAdqrQdf2cvw4Nj0uen

===RUN the Application on your ZCU104===

================Steps===================

  1. Download the “Demo Test” above, extract it, and copy all the contents of the “ZCU104_dpu_trd_test” folder onto the SD card. Then eject the card and insert it into the ZCU104.
  2. Prepare the ZCU104 as described in the Xilinx DNNDK User Guide, UG1327 [Page 18].
  3. Power cycle [turn on] the FPGA board.
  4. Open a serial terminal program and connect to the ZCU104 at a 115200 baud rate.
  5. Change directory to “home”.
  6. Run “./resnet50”.
  7. You should now see the result on the terminal.
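Step 1 asks for the contents of the folder, not the folder itself, to end up at the root of the SD card. If you prefer to script that copy from a PC, the following is a minimal sketch in Python. The function name `copy_contents` and the mount point in the usage comment are our own assumptions, not part of the demo package; adjust the paths to wherever you extracted the archive and wherever your system mounts the card.

```python
import shutil
from pathlib import Path


def copy_contents(src_dir, sd_mount):
    """Copy everything inside src_dir (not src_dir itself) to the root of sd_mount."""
    src = Path(src_dir)
    dst = Path(sd_mount)
    if not src.is_dir():
        raise FileNotFoundError(f"source folder not found: {src}")
    if not dst.is_dir():
        # Refuse to create the target, so a missing mount point is noticed early.
        raise NotADirectoryError(f"SD card mount point not found: {dst}")
    copied = []
    for path in sorted(src.rglob("*")):
        target = dst / path.relative_to(src)
        if path.is_dir():
            target.mkdir(parents=True, exist_ok=True)
        else:
            target.parent.mkdir(parents=True, exist_ok=True)
            # copyfile copies data only, which avoids permission errors on FAT cards.
            shutil.copyfile(path, target)
            copied.append(target)
    return copied


# Example call (both paths are hypothetical):
# copy_contents("ZCU104_dpu_trd_test", "/media/user/SDCARD")
```

After the copy finishes, unmount or eject the card properly before removing it so that all files are flushed to the card.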

To connect to the ZCU104 over the Ethernet interface, please follow “DNNDK User Guide UG1327, Page 24”.

=========================================

If you need any help implementing this project on the ZCU104 FPGA, please email us at: info@logictronix.com. We will respond within 24 hours!

4. References for this work:
  1. DPU IP Product Guide PG338 (v1.2), March 26, 2019
  2. https://www.xilinx.com/products/design-tools/ai-inference/ai-developer-hub.html#edge, from which we used “zcu102-dpu-trd-2018-2-190322.zip” for this development.

Note: We prepared this testable design of “DPU TRD for ZCU104” to help people who are stuck while developing it. The main reference and main design are provided by Xilinx itself, so our contribution to “DPU TRD for ZCU104” is limited to creating the design for ZCU104 by following the resources and details provided by Xilinx.