Reinforcement Learning Based Hierarchical Multi-Agent Robotic Search Team in Uncertain Environment

  • Shahzaib Hamid, Department of Electrical Engineering, Superior University, Lahore, Punjab, Pakistan.
  • Ali Nasir, Department of Electrical Engineering, University of Central Punjab, Lahore, Pakistan.
  • Yasir Saleem, Department of Computer Engineering, University of Engineering and Technology, Lahore, Pakistan.

Abstract

The field of robotics has been in the limelight because of recent advances in Artificial Intelligence (AI). As multi-agent systems grow more diverse, new models are being developed to handle their complexity. However, most of these models do not address problems such as uncertainty handling, efficient learning, agent coordination, and fault detection. This paper presents a novel approach to implementing Reinforcement Learning (RL) on hierarchical robotic search teams. The proposed algorithm handles uncertainties in the system by implementing Q-learning and shows improved efficiency as well as lower time consumption compared to prior models. The reason is that each agent can take actions on its own, so there is less dependency on the leader agent for the RL policy. The performance of the algorithm is measured by introducing agents into an unknown environment with both Markov Decision Process (MDP) and RL policies at their disposal. A simulation-based comparison of agent motion is presented using the results of the MDP and RL policies. Furthermore, a qualitative comparison of the proposed model with prior models is also presented.
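
As a rough illustration of the Q-learning step the abstract refers to, the sketch below shows the standard tabular Q-learning update with epsilon-greedy action selection for a single agent. It is not the authors' algorithm: the action set, the grid-cell states, the function names `q_update` and `choose_action`, and the values of `alpha`, `gamma`, and `epsilon` are all assumptions made for illustration.

```python
import random

# Hypothetical action set for an agent moving on a grid (an assumption).
ACTIONS = ["up", "down", "left", "right"]


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    q maps (state, action) pairs to values; missing entries count as 0.0.
    """
    # Value of the best action available from the next state.
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    # Standard rule: Q <- Q + alpha * (r + gamma * max_a' Q(s', a') - Q)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over the agent's own Q-table."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)  # explore
    # Exploit: pick the action with the highest current estimate.
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
```

In a team setting, each agent could hold its own table `q` and call `choose_action` locally, which is one simple way to picture agents acting without waiting on a leader; how the paper actually structures the hierarchy is not specified in the abstract.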

Published
Jul 1, 2021
How to Cite
HAMID, Shahzaib; NASIR, Ali; SALEEM, Yasir. Reinforcement Learning Based Hierarchical Multi-Agent Robotic Search Team in Uncertain Environment. Mehran University Research Journal of Engineering and Technology, [S.l.], v. 40, n. 3, p. 645-662, july 2021. ISSN 2413-7219. Available at: <https://publications.muet.edu.pk/index.php/muetrj/article/view/2187>. Date accessed: 28 july 2021. doi: http://dx.doi.org/10.22581/muet1982.2103.17.
Section
Articles
This is an open access article published by Mehran University of Engineering and Technology, Jamshoro under the CC BY 4.0 International License.