LogicTronix has built and tested the DPU TRD for the ZCU104, whereas the DPU IP Product Guide PG338 (v1.2), March 26, 2019, only provides the build steps for the ZCU102. This development uses DPU IP version DPU_v1.3.0 and was tested on the ZCU104 on May 5, 2019.
For the DPU IP Version 3.0 [released on August 13, 2019] TRD for ZCU104 with Vivado/PetaLinux 2019.1, please go through this tutorial: [DPU (3.0) TRD for ZCU104-Hackster].
1. Background:
According to PG338, “Xilinx DPU is a configurable engine dedicated for convolutional neural network. The computing parallelism can be configured according to the selected device and application. It includes a set of efficiently optimized instructions. It can support most convolutional neural networks, such as VGG, ResNet, GoogLeNet, YOLO, SSD, MobileNet, FPN, etc.”
This “DPU TRD for ZCU104” targets the ResNet implementation in the DNNDK package. For more information about the DNNDK package, you can refer to the DNNDK User Guide UG1327 (v1.4), April 29, 2019.
2. Let’s move to the main work:
A query about creating the DPU TRD for the ZCU104 was also posted on the Xilinx Forum, where a Xilinx employee shared the Tcl file for creating the “Vivado project for ZCU104 based on the sources given for ZCU102”. The forum thread can be accessed at: https://forums.xilinx.com/t5/Deephi-DNNDK/DPU-TRD-for-ZCU104/td-p/958695
Here is the block design of the system; it is the same as that of the ZCU102:
A snippet of the result obtained from the DPU TRD is shown here:
Load image : 2ILSVRC2012_test_00000035.JPEG, [gasmask, respirator, gas helmet] expected.
Run ResNet50 CONV layers ...
DPU CONV Execution time: 13648us
DPU CONV Performance: 564.918GOPS
Run ResNet50 FC layers ...
DPU FC Execution time: 281us
DPU FC Performance: 14.2349GOPS
top[0] prob = 0.472463 name = gasmask, respirator, gas helmet
top[1] prob = 0.135363 name = spotlight, spot
top[2] prob = 0.063941 name = traffic light, traffic signal, stoplight
top[3] prob = 0.049797 name = loudspeaker, speaker, speaker unit, loudspeaker system, speaker system
top[4] prob = 0.030204 name = reflex camera

Load image : 2ILSVRC2012_test_00000140.JPEG, [sidewinder, horned rattlesnake, Crotalus cerastes] expected.
Run ResNet50 CONV layers ...
DPU CONV Execution time: 13634us
DPU CONV Performance: 565.498GOPS
Run ResNet50 FC layers ...
DPU FC Execution time: 285us
DPU FC Performance: 14.0351GOPS
top[0] prob = 0.283723 name = sidewinder, horned rattlesnake, Crotalus cerastes
top[1] prob = 0.283723 name = night snake, Hypsiglena torquata
top[2] prob = 0.220963 name = horned viper, cerastes, sand viper, horned asp, Cerastes cornutus
top[3] prob = 0.081288 name = rock python, rock snake, Python sebae
top[4] prob = 0.049304 name = king snake, kingsnake
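To make the numbers in this log easier to interpret, here is a minimal Python sketch that extracts the DPU timing lines and derives the workload per stage (GOPS multiplied by execution time). The script is not part of the TRD or of DNNDK; the names `LOG`, `PATTERN` and `summarize` are our own illustrative choices, and the log excerpt is copied from the first image of the output above.

```python
import re

# Excerpt of the log above (first image only), copied verbatim.
LOG = """Load image : 2ILSVRC2012_test_00000035.JPEG, [gasmask, respirator, gas helmet] expected.
Run ResNet50 CONV layers ...
DPU CONV Execution time: 13648us
DPU CONV Performance: 564.918GOPS
Run ResNet50 FC layers ...
DPU FC Execution time: 281us
DPU FC Performance: 14.2349GOPS
top[0] prob = 0.472463 name = gasmask, respirator, gas helmet
"""

# Matches one "Execution time" line followed by its "Performance" line.
# \s+ lets the same pattern work whether the log is split on newlines
# or pasted as a single space-separated line.
PATTERN = re.compile(
    r"DPU (CONV|FC) Execution time: (\d+)us\s+"
    r"DPU \1 Performance: ([\d.]+)GOPS"
)


def summarize(log):
    """Return a list of (stage, time_us, gops, workload_gop) tuples."""
    rows = []
    for stage, time_us, gops in PATTERN.findall(log):
        time_us, gops = int(time_us), float(gops)
        # GOPS * seconds = giga-operations executed in that stage.
        workload_gop = gops * time_us * 1e-6
        rows.append((stage, time_us, gops, workload_gop))
    return rows


if __name__ == "__main__":
    rows = summarize(LOG)
    for stage, time_us, gops, workload in rows:
        print(f"{stage}: {time_us} us at {gops} GOPS -> {workload:.3f} GOP")
    # DPU-only time per image; image loading and pre/post-processing
    # on the CPU are NOT included in this figure.
    total_us = sum(r[1] for r in rows)
    print(f"DPU time per image: {total_us} us "
          f"(about {1e6 / total_us:.1f} images/s, DPU stages only)")
```

For this excerpt the CONV stage works out to roughly 7.71 GOP and the FC stage to roughly 0.004 GOP per image, which shows that almost all of the DPU time is spent in the convolution layers.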
A more detailed [not complete] output of the DPU TRD run on the ZCU104 is available at: Download Output of DPU TRD on ZCU104 [not complete output] - Link
The image set used for this classification can also be found in the “DPU TRD ZCU102” package under “zcu102-dpu-trd-2018-2-190322/images/common”, or in the Xilinx DPU Integration Tutorial on GitHub (image500_640_480).
Here is the Google Drive download link for the “Demo Test-DPU TRD for ZCU104” [141MB]: https://drive.google.com/open?id=1AyxDg1TRwMIcIaAdqrQdf2cvw4Nj0uen. The “Demo Test” file includes,
We are thankful to Salvador Canas Moreno for his contribution in testing our demo test design of “DPU TRD for ZCU104” on the ZCU104; without his contribution we would not have been able to test and verify our result.
3. Download and test this design yourself
This is the DNNDK DPU TRD for the ZCU104; we have built a “Demo Test” for the ZCU104. We followed the PG338 DPU TRD for the ZCU102 to build the DPU TRD for the ZCU104 FPGA, and we have tested and verified the result of the new DPU TRD on the ZCU104. In this TRD, a ResNet50 convolutional neural network is used for image classification. For more details, please see PG338; the DNNDK reference can be found in UG1327 from Xilinx.
Here is the Google Drive download link for the “Demo Test” [141MB]: https://drive.google.com/open?id=1AyxDg1TRwMIcIaAdqrQdf2cvw4Nj0uen
===RUN the Application on your ZCU104===
================Steps===================
- Download the “Demo Test” above, extract it, and copy all the contents of the “ZCU104_dpu_trd_test” folder onto the SD card; then eject the card and insert it into the ZCU104.
- Set up the ZCU104 as described in the Xilinx DNNDK User Guide, UG1327 [Page 18].
- Power cycle [turn on] the FPGA board.
- Open a serial terminal program and connect to the ZCU104 at a 115200 baud rate.
- Change directory to “home”.
- Run “./resnet50”.
- You should now see the result on the terminal.
To connect to the ZCU104 over the Ethernet interface, please follow “DNNDK User Guide UG1327, Page 24”.
=========================================
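The first step above (copying the contents of the extracted folder onto the SD card) can be sketched in Python as follows. This is only an illustration under assumptions: the function name `copy_to_sd_card` is our own, the two paths in the usage comment are hypothetical and depend on where you extracted the archive and where your operating system mounts the card, and the script requires Python 3.8 or newer for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def copy_to_sd_card(extracted_dir, sd_mount):
    """Copy the CONTENTS of extracted_dir (not the folder itself) to sd_mount.

    Returns the sorted list of top-level names that were copied.
    """
    src, dst = Path(extracted_dir), Path(sd_mount)
    if not src.is_dir():
        raise FileNotFoundError(f"extracted folder not found: {src}")
    if not dst.is_dir():
        raise FileNotFoundError(f"SD card mount point not found: {dst}")

    copied = []
    for item in sorted(src.iterdir()):
        target = dst / item.name
        if item.is_dir():
            # Merge into an existing directory instead of failing.
            shutil.copytree(item, target, dirs_exist_ok=True)
        else:
            shutil.copy2(item, target)
        copied.append(item.name)
    return copied


# Example usage (both paths are hypothetical; adjust them to your system):
#   copy_to_sd_card("Downloads/ZCU104_dpu_trd_test", "/media/user/SDCARD")
```

Copying the contents rather than the folder itself matters here, because the board expects the boot files at the top level of the card. Remember to eject the card cleanly afterwards so that all files are flushed to it.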
If you need any help implementing this project on the ZCU104 FPGA, please write us an email at: info@logictronix.com. We will respond to you within 24 hours!
4. References for this work:
- DPU IP Product Guide PG338 (v1.2) March 26, 2019
- https://www.xilinx.com/products/design-tools/ai-inference/ai-developer-hub.html#edge; we used “zcu102-dpu-trd-2018-2-190322.zip” from there for this development.
Note: We prepared this testable design of “DPU TRD for ZCU104” to help people who are stuck while developing it. The main reference and the main design are provided by Xilinx itself, so our main contribution to “DPU TRD for ZCU104” is just creating the design for the ZCU104 by following the resources and details provided by Xilinx.