Artificial Urdu Text Detection and Localization from Individual Video Frames
Abstract
In the current era of technology, information acquisition from images and videos has become an important task owing to the rapid development of data mining and machine learning. The information can be textual, visual, or a combination of both. Text appearing in images or videos is a significant source of information and plays a vital role in understanding their content. Developing a unified method to detect text is hard, as textual properties (e.g., font, size, color, illumination, and orientation) vary and the background may be complex. So far, the multimedia and computer vision community has not standardized on any single approach for extracting text reliably. In this paper, a novel method is proposed to detect and localize artificial Urdu text in individual video frames. First, the Sobel and Canny edge detection operators are applied to the input frame, and the resulting edges are merged with the regions detected by MSER (Maximally Stable Extremal Regions). Next, geometric constraints are applied to eliminate obvious non-text regions with very large or very small variations. The remaining non-text regions are further filtered using the stroke width transform. An SVM (Support Vector Machine) classifier is trained to classify text and non-text objects. Finally, bounding boxes are used to localize the text. Experimental results show that the proposed method is more robust and efficient than state-of-the-art methods.
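As an illustration of the geometric-constraint step mentioned in the abstract, the following is a minimal sketch of how candidate regions might be filtered by area, aspect ratio, and extent (the fraction of the bounding box covered by region pixels). The `Region` type, the function names, and all threshold values are assumptions made for this example; they are not taken from the proposed method, and suitable thresholds would have to be tuned on real video frames.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """Hypothetical candidate region: bounding box plus foreground pixel count."""
    x: int
    y: int
    width: int
    height: int
    pixel_count: int


def passes_geometric_constraints(
    region: Region,
    min_area: int = 30,        # assumed threshold, illustrative only
    max_area: int = 20000,     # assumed threshold, illustrative only
    min_aspect: float = 0.1,   # assumed threshold, illustrative only
    max_aspect: float = 10.0,  # assumed threshold, illustrative only
    min_extent: float = 0.2,   # assumed threshold, illustrative only
) -> bool:
    """Return True if the region is plausibly text under simple geometric rules."""
    if region.width <= 0 or region.height <= 0:
        return False
    area = region.width * region.height
    aspect = region.width / region.height
    extent = region.pixel_count / area
    return (
        min_area <= area <= max_area
        and min_aspect <= aspect <= max_aspect
        and extent >= min_extent
    )


def filter_regions(regions: List[Region]) -> List[Region]:
    """Keep only the regions that satisfy every geometric constraint."""
    return [r for r in regions if passes_geometric_constraints(r)]


candidates = [
    Region(10, 10, 40, 20, 500),   # moderate size and shape
    Region(0, 0, 2, 2, 4),         # too small
    Region(0, 0, 300, 5, 1400),    # far too elongated
    Region(5, 5, 50, 50, 100),     # too sparse inside its box
]
kept = filter_regions(candidates)
```

In this sketch only the first candidate survives; in a full pipeline the surviving regions would then be passed on to the stroke width and SVM stages.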