TinyVTA - A General Purpose FPGA-based Deep Neural Network Accelerator
Click for more details about the project
tinyVTA is a high-performance FPGA-based tensor accelerator designed for deep neural network inference. It features optimized block matrix-multiply-accumulate (MMAC) and activation function (ACTIV) kernels, implemented using Vitis HLS and invoked through a custom instruction set architecture (ISA). The project includes a custom Pythonic compiler (tinyTVM) that translates PyTorch and TensorFlow fully-connected neural network models into tinyVTA executable instructions. It utilizes the Pynq API for memory management via PynqBuffer objects and PS/PL interfacing with AXI4 protocols. Additionally, APIs are created using tornado, enabling any client to compile their DNN models, remotely program a supported FPGA connected to a server, and perform ML inference through remote procedure calls. The accelerator was evaluated on the Xilinx UltraScale+ ZCU104 FPGA with a fully connected neural network for MNIST digit recognition, achieving precise hardware-software output consistency through extensive verification with a software testbench and hardware validation.