Deep learning for multi-modal data fusion in IoT applications

  • Anila Saghir Department of Telecommunication Engineering, Sir Syed University of Engineering & Technology, Karachi
  • Anum Akbar Department of Computer Science, Sir Syed University of Engineering & Technology, Karachi
  • Asma Zafar Department of Mathematics, Sir Syed University of Engineering & Technology, Karachi
  • Asif Hassan Department of Telecommunication Engineering, Sir Syed University of Engineering & Technology, Karachi

Abstract

Rapid technological change has driven the emergence of the Internet of Things (IoT) across many diverse applications. IoT-based sensors in these applications generate and process a massive amount of data every day. This sensor data is categorized as either structured or unstructured. Structured data is simpler to process, while unstructured data is complex to process because of its diverse modalities. IoT applications such as autonomous navigation, environmental monitoring and smart surveillance require semantic segmentation, which relies on detailed scene understanding. Single-modal data such as RGB, thermal or depth images cannot provide this detailed information on its own. This research proposes a robust solution that fuses the multimodal data and employs a deep learning-based hybrid architecture combining a generative model with a deep convolutional network. The unified model fuses RGB, thermal and depth images for semantic segmentation to improve accuracy and reliability. The results validate the effectiveness of the proposed technique.
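
The abstract does not state how the three modalities are combined, so the following is only a minimal sketch of one common option, early fusion by channel concatenation, and should not be read as the authors' architecture. It assumes the RGB, thermal and depth images are already spatially aligned; the function names `normalize` and `fuse_early` and the random sample data are hypothetical and chosen purely for illustration.

```python
import numpy as np


def normalize(block):
    """Scale an array to [0, 1]; a constant array maps to all zeros."""
    block = block.astype(np.float32)
    lo, hi = block.min(), block.max()
    if hi == lo:
        return np.zeros_like(block)
    return (block - lo) / (hi - lo)


def fuse_early(rgb, thermal, depth):
    """Stack aligned RGB (H, W, 3), thermal (H, W) and depth (H, W)
    into a single (H, W, 5) array. Assumed fusion scheme, not the paper's."""
    if rgb.shape[:2] != thermal.shape or rgb.shape[:2] != depth.shape:
        raise ValueError("modalities must be spatially aligned")
    return np.concatenate(
        [
            normalize(rgb),
            normalize(thermal)[..., None],  # add a channel axis
            normalize(depth)[..., None],    # add a channel axis
        ],
        axis=-1,
    )


# Synthetic stand-ins for real sensor frames (illustrative values only).
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
thermal = rng.uniform(20.0, 40.0, size=(4, 6))
depth = rng.uniform(0.5, 10.0, size=(4, 6))

fused = fuse_early(rgb, thermal, depth)
print(fused.shape)  # (4, 6, 5)
```

Each modality is normalized separately so that differing value ranges (8-bit color, temperature, distance) do not dominate one another; the resulting five-channel array could then be passed to a segmentation network whose first layer accepts five input channels.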

Published
Jan 3, 2025
How to Cite
SAGHIR, Anila et al. Deep learning for multi-modal data fusion in IoT applications. Mehran University Research Journal of Engineering and Technology, [S.l.], v. 44, n. 1, p. 75-81, jan. 2025. ISSN 2413-7219. Available at: <https://publications.muet.edu.pk/index.php/muetrj/article/view/3171>. Date accessed: 08 jan. 2025. doi: http://dx.doi.org/10.22581/muet1982.3171.
This is an Open Access article published by Mehran University of Engineering and Technology, Jamshoro under the CC BY 4.0 International License.