Mehran University Research Journal of Engineering & Technology

Article Information

Reinforcement Learning for DPM of Embedded Visual Sensor Nodes

Keywords: Embedded Visual Sensor Nodes, Dynamic Power Management, Online Learning, Timeout Policies

Mehran University Research Journal of Engineering & Technology

Volume 33, Issue 2

UMAIR ALI KHAN, FAREED AHMED JOKHIO, INTESAB HUSSAIN SADHAYO

Abstract

This paper proposes an RL (Reinforcement Learning) based DPM (Dynamic Power Management) technique that learns timeout policies during the operation of a visual sensor node with multiple power/performance states. As opposed to the widely used static timeout policies, the proposed DPM policy, also referred to as OLTP (Online Learning of Timeout Policies), learns to dynamically change the timeout decisions in the different node states, including the non-operational states. The selection of timeout values in the different power/performance states of a visual sensing platform is based on workload estimates derived from an ML-ANN (Multi-Layer Artificial Neural Network) and on an objective function given by weighted performance and power parameters. The DPM approach can also adjust the power-performance weights online to satisfy a given constraint on either power consumption or performance. Results show that the proposed learning algorithm explores the power-performance tradeoff under non-stationary workload and outperforms other DPM policies. It also adjusts the tradeoff parameters online in order to meet a user-specified constraint.
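
The weighted power/performance objective mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's OLTP algorithm: it involves no learning and no neural network, and it assumes a single sleep state and an idle-period estimate that is already available. All function names, parameters, and numeric values (`timeout_cost`, `choose_timeout`, `p_active`, `p_sleep`, `wake_latency`, `w_power`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; NOT the paper's OLTP algorithm.
# All names and default values are hypothetical assumptions.

def timeout_cost(timeout, idle_estimate, w_power,
                 p_active=1.0, p_sleep=0.1, wake_latency=0.5):
    """Weighted cost of waiting `timeout` seconds before going to sleep.

    If the estimated idle period ends before the timeout expires, the node
    stays active the whole time and pays no wake-up latency. Otherwise it
    stays active for `timeout`, sleeps for the remainder, and pays the
    wake-up latency when the next request arrives.
    """
    if idle_estimate <= timeout:
        energy = p_active * idle_estimate
        latency = 0.0
    else:
        energy = p_active * timeout + p_sleep * (idle_estimate - timeout)
        latency = wake_latency
    # w_power in [0, 1] trades energy against latency.
    return w_power * energy + (1.0 - w_power) * latency


def choose_timeout(candidates, idle_estimate, w_power):
    """Pick the candidate timeout with the lowest weighted cost."""
    return min(candidates,
               key=lambda t: timeout_cost(t, idle_estimate, w_power))
```

With candidate timeouts of 0, 1, and 5 seconds and an estimated idle period of 4 seconds, a purely power-oriented weight (`w_power=1.0`) selects the zero timeout, while a purely performance-oriented weight (`w_power=0.0`) selects the 5-second timeout, which avoids sleeping altogether. An online scheme such as the one the abstract describes would, in addition, update the workload estimate and the weight during operation.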