Conference material: "Proceedings of the International Conference on Computer Graphics and Vision “Graphicon” (19-21 September 2022, Ryazan)"
Authors: Levkin T., Skonnikov P.N., Trofimov D.V.
Real-Time Implementation of Modified YOLO Object Detector on FPGA
This paper considers some operating principles of the YOLO object detector and analyzes the functions that process the array elements produced by the last convolutional layer. Three different types of functions are shown to be used, depending on the parameter being predicted: the logistic sigmoid for the positioning accuracy and for the vertical and horizontal offsets of the bounding boxes, the exponential function for the box height and width, and the SoftMax function for the class probabilities. Formulas for the derivatives of these functions, which are required for training the network, are given. To simplify hardware implementation, these functions are replaced with simpler ones: the logistic function with a rational sigmoid and the exponential function with a shifted fourth-order parabolic curve; the modified SoftMax function also uses a shifted parabola. The derivatives of the proposed functions, needed for error backpropagation, are presented. Using the obtained formulas, a YOLO detector based on the modified neural network was trained. The modified detector with the proposed functions was implemented on a specially designed Xilinx FPGA board. Experimental studies of the board showed a high detector speed (more than 60 frames per second) and high quality of object detection and classification.
Digital image processing, artificial neural networks, YOLO, backpropagation, output layer
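For orientation, the three standard output-layer functions named in the abstract and their textbook derivatives can be sketched in Python as below. This is only an illustrative sketch, not the authors' implementation. The abstract does not state the exact form of the proposed rational sigmoid, so `rational_sigmoid` uses one commonly seen form, 0.5 * x / (1 + |x|) + 0.5, purely as an assumption. The shifted fourth-order parabola is not reproduced because its coefficients are not given here. All function names are hypothetical.

```python
import math


def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the logistic sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def exp_grad(x):
    """The exponential function is its own derivative."""
    return math.exp(x)


def softmax(z):
    """SoftMax over a list of scores (max-shifted for numerical stability)."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def softmax_jacobian(z):
    """Jacobian of SoftMax: d p_i / d z_j = p_i * (delta_ij - p_j)."""
    p = softmax(z)
    n = len(p)
    return [[p[i] * ((1.0 if i == j else 0.0) - p[j]) for j in range(n)]
            for i in range(n)]


def rational_sigmoid(x):
    """ASSUMED form of a rational sigmoid (not taken from the paper).

    It maps the real line to (0, 1) without evaluating an exponential.
    """
    return 0.5 * x / (1.0 + abs(x)) + 0.5


def rational_sigmoid_grad(x):
    """Derivative of the assumed form above: 0.5 / (1 + |x|)^2."""
    return 0.5 / (1.0 + abs(x)) ** 2
```

The assumed rational form needs only an absolute value, an addition, and a division, which suggests why a function of this kind is attractive for hardware compared with evaluating an exponential.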