Energy-Efficient Fixed-Point Hardware Accelerator for Embedded DNNs
Subject Areas : AI and Robotics
Marzie Mastalizade 1, Ali Ansarmohammadi 2, Najme Nazari 3, Mostafa Salehi 4
1 - Computer Architecture, Faculty of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
2 - Ph.D. student, Faculty of Electrical and Computer Engineering, Tehran University, Tehran, Iran
3 - Ph.D. student, Faculty of Electrical and Computer Engineering, Tehran University, Tehran, Iran
4 - University of Tehran
Keywords: Deep Neural Network, Embedded Systems, Energy-Efficiency, Fixed-point Quantization
Abstract :
Deep Neural Networks (DNNs) have demonstrated remarkable performance in application domains such as computer vision, pattern recognition, and natural language processing. However, their extensive memory requirements and computational complexity make them difficult to deploy on low-power, resource-limited edge-computing devices. One promising technique for addressing this challenge is quantization, particularly fixed-point quantization. Previous studies have shown that fixed-point quantization can reduce the bit-width of weights and activations to, for example, 3 or 4 bits while preserving the classification accuracy of full-precision neural networks. Despite extensive research on the compression efficiency of fixed-point quantization techniques, their energy efficiency, a critical metric for embedded systems, has not been thoroughly explored. This research therefore assesses the energy efficiency of fixed-point quantization techniques while maintaining accuracy. To this end, we present a model and design an architecture for each quantization method, and then compare their area and energy efficiency at the same accuracy level. Our experimental results indicate that incorporating scaling factors and offsets into LSQ, a well-known quantization method, improves DNN accuracy by 0.1%. However, this improvement comes at the cost of a 3× decrease in hardware energy efficiency. These results highlight the importance of evaluating fixed-point quantization techniques in terms of energy efficiency as well as compression efficiency when they are applied to edge-computing devices.
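To make the role of the scaling factor and offset concrete, the following is a minimal sketch of generic uniform fixed-point quantization. It is not the authors' implementation and does not reproduce LSQ. The function names `quantize` and `dequantize`, and the fixed values chosen for `scale` and `offset`, are assumptions made for illustration only.

```python
import numpy as np

def quantize(x, bits, scale, offset=0.0, signed=True):
    """Map real values to integer codes that fit in `bits` bits.

    `scale` and `offset` are illustrative fixed values here; quantization
    methods differ in how (and whether) they choose or learn them.
    """
    if signed:
        qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    else:
        qmin, qmax = 0, 2 ** bits - 1
    # Shift by the offset, divide by the scale, round, then saturate.
    codes = np.clip(np.round((x - offset) / scale), qmin, qmax)
    return codes.astype(np.int32)

def dequantize(codes, scale, offset=0.0):
    """Recover the approximate real values represented by the codes."""
    return codes.astype(np.float64) * scale + offset

# 4-bit signed quantization with a scale only (no offset).
x = np.array([-1.0, -0.3, 0.0, 0.26, 2.0])
codes = quantize(x, bits=4, scale=0.25)
approx = dequantize(codes, scale=0.25)

# 3-bit unsigned quantization with both a scale and an offset.
y = np.array([0.5, 1.0])
codes_off = quantize(y, bits=3, scale=0.25, offset=0.5, signed=False)
approx_off = dequantize(codes_off, scale=0.25, offset=0.5)
```

In this sketch, a non-zero offset adds a subtraction on the way in and an addition on the way out. That is one simple way to see why extra quantization parameters can add arithmetic to a hardware datapath, although the actual cost depends on the architecture.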