Application Identification through Intelligent Network Traffic Classification

Subject areas: Information and Communication Technology

1 - ICT Research Institute (پژوهشگاه ارتباطات و فناوری اطلاعات)

Received: 1402/02/22 Accepted: 1402/06/26 Published: 1403/03/31

Keywords: encrypted traffic classification, operational architecture, statistical features, application identification, machine learning

Abstract:

Traffic classification and analysis is one of the major challenges in data mining and machine learning, and it plays an important role in network security, quality assurance, and network management. Today, a large share of the traffic carried over networks is encrypted by secure communication protocols such as HTTPS. Encrypted traffic improves user security and privacy, but in exchange it reduces the ability to monitor and detect suspicious and malicious traffic in communication infrastructure. Classifying it without decrypting network communications is difficult, because the payload information is lost and only the header information remains accessible, part of which is itself encrypted in newer versions of network communication protocols (such as TLS 1.3). As a result, older traffic analysis approaches, such as the various port-based and payload-based methods, have lost their effectiveness, and newer approaches based on artificial intelligence and machine learning are used to analyze encrypted traffic. In this article, after reviewing traffic analysis methods, an operational architecture framework for intelligent traffic analysis and classification is designed. An intelligent model for identifying application traffic is then presented on the basis of the proposed architecture and evaluated using machine learning methods on the Kaggle141 traffic dataset and a local dataset. The results show that the random forest model, in addition to being highly interpretable compared with deep learning methods, achieved high accuracy in intelligent traffic classification (95% and 97%, respectively) relative to other machine learning methods on the Kaggle141 dataset and the local traffic.

English abstract:

Traffic classification and analysis is one of the major challenges in data mining and machine learning, and it plays an important role in network security, quality assurance, and network management. Today, a large share of network traffic is encrypted by secure communication protocols such as HTTPS. Encrypted traffic improves user security and privacy, but in exchange it reduces the ability to monitor and detect suspicious and malicious traffic in communication infrastructure. Classifying it without decrypting network communications is difficult, because the payload information is lost and only the header information is accessible, part of which is also encrypted in newer versions of network communication protocols such as TLS 1.3. Therefore, older traffic analysis approaches, such as the various port-based and payload-based methods, have lost their effectiveness, and newer approaches based on artificial intelligence and machine learning are used to analyze encrypted traffic. In this article, after reviewing traffic analysis methods, an operational architecture framework for intelligent traffic analysis and classification is designed. Then, an intelligent model for traffic classification and application identification is presented and evaluated using machine learning methods on the Kaggle141 dataset and a local dataset. The results show that the random forest model, in addition to being highly interpretable compared with deep learning methods, achieved high accuracy in traffic classification (95% on Kaggle141 and 97% on the local traffic) relative to other machine learning methods. Finally, tips and suggestions on using machine learning methods in operational traffic classification are provided.
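As an illustration of the "statistical features" named in the keywords, the sketch below computes a few per-flow statistics from packet sizes and arrival times only, which remain observable when the payload is encrypted. It is a minimal example under stated assumptions: the function name `flow_features`, the feature names, and the sample flow are hypothetical and are not taken from the article, whose actual feature set is not given in the abstract.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Compute simple statistical features for one flow.

    `packets` is a list of (timestamp_seconds, size_bytes) tuples in
    arrival order. No payload bytes are inspected.
    The feature names below are illustrative assumptions.
    """
    if not packets:
        raise ValueError("flow has no packets")

    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    # Inter-arrival gaps between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    return {
        "packet_count": len(sizes),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "min_size": min(sizes),
        "max_size": max(sizes),
        "duration": times[-1] - times[0],
        "mean_gap": mean(gaps) if gaps else 0.0,
        "std_gap": pstdev(gaps) if gaps else 0.0,
    }


# Hypothetical flow: four packets with made-up timestamps and sizes.
example_flow = [(0.00, 517), (0.05, 1460), (0.06, 1460), (0.30, 93)]
features = flow_features(example_flow)
```

A feature vector of this kind, one per flow, is the sort of input a classifier such as a random forest could be trained on, with the application name as the label.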
