VGTU talpykla: mNet2FPGA: A Design Flow for Mapping a Fixed-Point CNN to Zynq SoC FPGA

VGTU Talpykla
VGTU Repository

Vilniaus Gedimino technikos universitetas (Vilnius Gediminas Technical University)

Please use this identifier to cite or link to this item: http://dspace.vgtu.lt/handle/1/4049

Title:	mNet2FPGA: A Design Flow for Mapping a Fixed-Point CNN to Zynq SoC FPGA
Authors:	Sledevič, Tomyslav; Serackis, Artūras
Keywords:	convolutional neural network (CNN); design flow; field-programmable gate array (FPGA); hardware-software co-design
Issue Date:	2020
Publisher:	MDPI
Citation:	Sledevič, T.; Serackis, A. mNet2FPGA: A Design Flow for Mapping a Fixed-Point CNN to Zynq SoC FPGA. Electronics 2020, 9, 1823.
Series/Report no.:	9;11
Abstract:	Convolutional neural networks (CNNs) are a computation- and memory-demanding class of deep neural networks. Because of this high computational complexity, field-programmable gate arrays (FPGAs) are often used to accelerate networks deployed on embedded platforms. In most cases, CNNs are trained with existing deep learning frameworks and then mapped to FPGAs with specialized toolflows. In this paper, we propose a CNN core architecture called mNet2FPGA that places a trained CNN on a SoC FPGA. The processing system (PS) configures the convolution and fully connected cores according to a list of prescheduled instructions. The programmable logic (PL) holds the convolution and fully connected layer cores. The hardware architecture is based on Advanced eXtensible Interface (AXI) stream processing with simultaneous bidirectional transfers between RAM and the CNN core. The core was tested on a cost-optimized Z-7020 FPGA with 16-bit fixed-point VGG networks. Kernel binarization and merging with the batch normalization layer were applied to reduce the number of DSPs in the multi-channel convolutional core. The convolutional core processes eight input feature maps at once and generates eight output channels of the same size and composition at 50 MHz. The fully connected (FC) layer core runs at 100 MHz with up to 4096 neurons per layer. In the current version of the CNN core, the convolutional kernel size is fixed at 3×3. The estimated average performance is 8.6 GOPS for VGG13 and close to 8.4 GOPS for VGG16/19 networks.
Description:	This article belongs to the Section Artificial Intelligence Circuits and Systems (AICAS)
URI:	http://dspace.vgtu.lt/handle/1/4049
ISSN:	2079-9292
Appears in Collections:	Moksliniai straipsniai / Research articles

Files in This Item:

File	Description	Size	Format
mNet2FPGA. A Design Flow for Mapping a Fixed-Point CNN to Zynq SoC FPGA.pdf		702.09 kB	Adobe PDF
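
The abstract mentions 16-bit fixed-point arithmetic, kernel binarization, and merging with the batch normalization layer, but gives no implementation details. The Python sketch below illustrates one common way these three ideas fit together for a single 3×3 output pixel. It is not the authors' implementation: the Q8.8 split of the 16 bits, the sign-based binarization rule, and all function names are assumptions made for illustration only.

```python
import math

FRAC_BITS = 8  # assumption: Q8.8 format; the abstract only says "16-bit fixed-point"
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def saturate(q):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, q))


def to_fixed(x, frac_bits=FRAC_BITS):
    """Quantize a float to a saturating signed 16-bit fixed-point integer."""
    return saturate(int(round(x * (1 << frac_bits))))


def to_float(q, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def binarize(kernel):
    """Assumed rule: map each weight to +1 or -1 by sign (zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in kernel]


def fold_batchnorm(gamma, beta, mean, var, eps=1e-5):
    """Return (scale, offset) such that BN(a) == scale * a + offset."""
    scale = gamma / math.sqrt(var + eps)
    offset = beta - scale * mean
    return scale, offset


def conv3x3_point(patch_q, signs, scale, offset):
    """One output pixel: binary 3x3 kernel followed by folded batch norm.

    patch_q holds nine fixed-point inputs; signs holds nine +1/-1 weights.
    """
    # With +1/-1 weights, multiply-accumulate reduces to add/subtract,
    # so no multiplier is needed per kernel tap.
    acc = sum(p if s > 0 else -p for p, s in zip(patch_q, signs))
    # The product of two fixed-point values carries 2*FRAC_BITS fractional
    # bits; shift right (floor) to return to FRAC_BITS.
    y = (acc * to_fixed(scale)) >> FRAC_BITS
    y += to_fixed(offset)
    return saturate(y)
```

In this sketch, binarization removes the nine per-tap multiplications, and folding batch normalization leaves a single multiply and add per output channel, which is consistent with the abstract's stated goal of reducing DSP usage. How the published core actually schedules these operations may differ.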

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.