Deep learning for multi-modal data fusion in IoT applications

Anila Saghir; Anum Akbar; Asma Zafar; Asif Hassan

doi:10.22581/muet1982.3171

Anila Saghir, Department of Telecommunication Engineering, Sir Syed University of Engineering & Technology, Karachi
Anum Akbar, Department of Computer Science, Sir Syed University of Engineering & Technology, Karachi
Asma Zafar, Department of Mathematics, Sir Syed University of Engineering & Technology, Karachi
Asif Hassan, Department of Telecommunication Engineering, Sir Syed University of Engineering & Technology, Karachi

Abstract

With rapid advances in technology, the Internet of Things (IoT) has emerged with many diverse applications. Every day, the IoT sensors in these applications generate massive amounts of data that must be processed. This sensor data is categorized as either structured or unstructured. Structured data is simpler to process, whereas unstructured data is complex to process because of its diverse modalities. IoT applications such as autonomous navigation, environmental monitoring, and smart surveillance require semantic segmentation, which relies on detailed scene understanding. Single-modal data, such as RGB, thermal, or depth images alone, cannot provide this detailed information. This research proposes a robust solution that fuses multimodal data using a deep learning-based hybrid architecture, which combines a generative model with a deep convolutional network. The unified model fuses RGB, thermal, and depth images for semantic segmentation to improve accuracy and reliability. The results validate the effectiveness of the proposed technique.