Multi-level Ternary Quantization for Improving Sparsity and Computation of Deep Neural Networks in Embedded Applications

Subject areas: Artificial Intelligence and Robotics

Hosna Manavi Mofrad ¹, Seyed Ali Ansarmohammadi ², Mostafa Ersali Salehi Nasab ³

1 - Student
2 - PhD student, School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
3 - University of Tehran

Received: 1401/10/13; Accepted: 1402/03/04; Published: 1403/03/31

Keywords: deep neural networks, multi-level ternary quantization, sparse neural network, pruning, embedded devices

Abstract (translated from Persian):

Deep neural networks have attracted extraordinary interest because of their success in a wide range of applications. However, their computational complexity and memory footprint are regarded as the main obstacles to deploying them on many embedded devices. Quantization and pruning are among the most important optimization methods proposed in recent years to remove these obstacles. One well-known quantization method uses a multi-level binary number representation, which, in addition to exploiting bit-wise computation, reduces the accuracy loss of binary networks relative to full-precision networks. However, because this representation cannot express the value zero, such networks lose the benefits of data sparsity. On the other hand, deep neural networks are inherently sparse; sparsifying their parameters reduces the amount of data held in memory, and suitable techniques can then also speed up computation. In this paper, we aim to benefit from both multi-level quantization and data sparsity. To this end, we propose a multi-level ternary quantization for representing numbers that, in addition to improving accuracy over the multi-level binary network, makes the network amenable to pruning. We then increase the sparsity of the quantized network through pruning. The results show that the potential speedup of our network at the bit level and the word level can reach 15x and 45x, respectively, relative to the baseline multi-level binary network.

English abstract:

Deep neural networks (DNNs) have attracted great interest due to their success in various applications. However, their computational complexity and memory footprint are the main obstacles to implementing such models on embedded devices with limited memory and computational resources. Network compression techniques can overcome these challenges, and quantization and pruning are the most important among them. One well-known quantization method for DNNs is multi-level binary quantization, which not only exploits simple bit-wise logical operations but also reduces the accuracy gap between binary neural networks and full-precision DNNs. Because multi-level binary quantization cannot represent the value zero, it does not take advantage of sparsity. On the other hand, it has been shown that DNNs are sparse, and pruning their parameters reduces the amount of data stored in memory while also speeding up computation. In this paper, we propose a pruning and quantization-aware training method for multi-level ternary quantization that takes advantage of both multi-level quantization and data sparsity. In addition to increasing the accuracy of the network compared to multi-level binary networks, it gives the network the ability to be sparse. To reduce memory size and computational complexity, we increase the sparsity of the quantized network by pruning as long as the accuracy loss remains negligible. The results show that the potential computation speedup of our model at bit-level and word-level sparsity can reach 15x and 45x, respectively, compared to the baseline multi-level binary networks.
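
The multi-level ternary representation described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the training procedure, so everything below is an assumption made for illustration and not the authors' method: the greedy residual fit, the 0.7 threshold ratio (a heuristic borrowed from ternary weight networks), and all function names are hypothetical. The two sparsity measures are one plausible reading of the abstract's terms: "bit-level" counts zero entries across all ternary levels, and "word-level" counts weights that are zero at every level.

```python
from typing import List, Tuple


def ternarize(residual: List[float], ratio: float = 0.7) -> Tuple[float, List[int]]:
    """Fit one ternary level: residual ~ alpha * t, with t[i] in {-1, 0, +1}.

    Assumes a non-empty input. The threshold ratio * mean(|r|) is an
    assumed heuristic, not a value taken from the paper.
    """
    mean_abs = sum(abs(r) for r in residual) / len(residual)
    delta = ratio * mean_abs
    # Entries with small magnitude map to 0; the rest keep only their sign.
    t = [(1 if r > 0 else -1) if abs(r) > delta else 0 for r in residual]
    kept = [abs(r) for r, s in zip(residual, t) if s != 0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return alpha, t


def multilevel_ternary(weights: List[float], levels: int = 2) -> Tuple[List[float], List[List[int]]]:
    """Greedily approximate weights as sum_k alphas[k] * codes[k]."""
    residual = list(weights)
    alphas: List[float] = []
    codes: List[List[int]] = []
    for _ in range(levels):
        alpha, t = ternarize(residual)
        alphas.append(alpha)
        codes.append(t)
        # The next level quantizes whatever this level failed to capture.
        residual = [r - alpha * s for r, s in zip(residual, t)]
    return alphas, codes


def reconstruct(alphas: List[float], codes: List[List[int]]) -> List[float]:
    """Rebuild the approximate weights from scales and ternary codes."""
    n = len(codes[0])
    return [sum(a * t[i] for a, t in zip(alphas, codes)) for i in range(n)]


def sparsity(codes: List[List[int]]) -> Tuple[float, float]:
    """Return (bit-level, word-level) sparsity under the assumed definitions."""
    n = len(codes[0])
    zero_bits = sum(t.count(0) for t in codes)
    zero_words = sum(1 for i in range(n) if all(t[i] == 0 for t in codes))
    return zero_bits / (n * len(codes)), zero_words / n


if __name__ == "__main__":
    w = [0.9, -0.05, 0.4, 0.0, -0.7, 0.02]
    alphas, codes = multilevel_ternary(w, levels=2)
    print("scales:", alphas)
    print("codes:", codes)
    print("approximation:", reconstruct(alphas, codes))
    print("bit-level, word-level sparsity:", sparsity(codes))
```

A multi-level binary code would restrict every entry to -1 or +1, so both sparsity measures would always be zero; allowing the value 0 is what gives pruning something to exploit. In this sketch, a word-level zero lets the whole weight be skipped, while a bit-level zero skips only one level of one weight.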
