Context Aware Multimodal Fusion YOLOv5 Framework for Pedestrian Detection under IoT Environment
Authors
Shu, Y.
Wang, Y.
Zhang, M.
Yang, J.
Wang, Y.
Wang, J.
Zhang, Y.
Publisher
Radioengineering Society
Abstract
Pedestrian detection based on deep networks has become a research hotspot in computer vision. With the rapid development of the Internet of Things (IoT) and autonomous driving technology, deploying pedestrian detection models on mobile devices places higher demands on detection accuracy and real-time performance. In addition, fully integrating multimodal information can further improve model robustness. To this end, this article proposes a novel multimodal fusion YOLOv5 network for pedestrian detection. Specifically, to improve multi-scale pedestrian detection, we enhance contextual awareness by embedding a multi-head self-attention (MSA) mechanism and graph convolution operations in the existing YOLOv5 framework. In addition, building on YOLOv5 lets us exploit its real-time advantages in pedestrian detection tasks. To improve multimodal information fusion, we introduce a joint cross-attention fusion mechanism that enhances knowledge interaction between modalities. To validate the proposed model, we conduct extensive experiments on two multimodal pedestrian detection datasets. The results show that the proposed model achieves the highest performance in multi-scale pedestrian detection. Moreover, it outperforms the other multimodal deep models it is compared against.
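As a rough illustration of the cross-attention idea mentioned in the abstract, the sketch below lets each modality's feature tokens attend over the other modality's tokens and then concatenates the two enhanced streams. It is not the authors' joint cross-attention module: the function names (`cross_attend`, `fuse`), the generic modality names, the absence of learned projections and multiple heads, the residual sum, and the channel-wise concatenation are all simplifying assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Scaled dot-product attention: each query token attends over context tokens.

    Assumption: tokens are used directly as queries, keys and values
    (no learned projection matrices, single head).
    """
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in context]
        weights = softmax(scores)
        attended.append(
            [sum(w * k[j] for w, k in zip(weights, context)) for j in range(dim)]
        )
    return attended


def fuse(modality_a, modality_b):
    """Hypothetical fusion step: cross-attend in both directions, add a
    residual connection, then concatenate the two streams per token.

    Assumption: both modalities have the same number of tokens.
    """
    a_from_b = cross_attend(modality_a, modality_b)
    b_from_a = cross_attend(modality_b, modality_a)
    a_enh = [[x + y for x, y in zip(t, c)] for t, c in zip(modality_a, a_from_b)]
    b_enh = [[x + y for x, y in zip(t, c)] for t, c in zip(modality_b, b_from_a)]
    return [a + b for a, b in zip(a_enh, b_enh)]
```

Calling `fuse` on two lists of equal-length token vectors returns one fused vector per token with twice the channel dimension. A real detector would apply learned projections and run this on feature maps from the backbone.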
Citation
Radioengineering. 2025 vol. 34, iss. 1, s. 118-131. ISSN 1210-2512
https://www.radioeng.cz/fulltexts/2025/25_01_0118_0131.pdf
Document type
Peer-reviewed
Document version
Published version
Language of document
en
Creative Commons license
Except where otherwise noted, this item's license is described as Creative Commons Attribution 4.0 International license

