Identification of Urdu Ghazal Poets using SVM

  • Nida Tariq, College of Information Technology, University of the Punjab, Lahore, Pakistan
  • Iqra Ijaz, College of Information Technology, University of the Punjab, Lahore, Pakistan
  • Muhammad Kamran Malik, College of Information Technology, University of the Punjab, Lahore, Pakistan
  • Zubair Malik, College of Information Technology, University of the Punjab, Lahore, Pakistan
  • Faisal Bukhari, College of Information Technology, University of the Punjab, Lahore, Pakistan

Abstract

Urdu literature has a rich tradition of poetry in many forms, one of which is the ghazal. Urdu poetic structures are mainly of Arabic origin. Poetry has a complex sentence structure that differs from everyday language, which makes it hard to classify. Our research focuses on identifying the poet of a ghazal given as input. To the best of our knowledge, no previous work has addressed this task. Two main factors that help categorize and classify a given text are its content and its writing style. Urdu poets such as Mirza Ghalib, Mir Taqi Mir, Iqbal and many others each have a distinct writing style and distinct topics of interest. Our model addresses these two factors and classifies ghazals using several classification models: SVM (Support Vector Machines), Decision Tree, Random Forest, Naïve Bayes and KNN (K-Nearest Neighbors). We also apply feature selection techniques, namely chi-square and L1-based feature selection. For experimentation, we prepared a dataset of about 4000 ghazals. We compare the accuracy of the different classifiers and report the best results obtained on the collected dataset.
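
As a rough illustration of the chi-square feature selection mentioned in the abstract, the sketch below scores each term by the chi-square statistic of its 2x2 document-level contingency table (term present or absent, poet matches or not) and keeps the highest-scoring terms. It is not the authors' implementation. The function names (`chi_square`, `select_top_terms`), the placeholder tokens (`w1`, `shared`, ...) and the labels (`poet_a`, `poet_b`) are hypothetical, and the presence/absence representation of each ghazal is an assumption made for brevity.

```python
def chi_square(docs, labels, term, target):
    """Chi-square statistic of the 2x2 table: (term present?) x (label == target?)."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1          # term present, target poet
        elif has_term:
            b += 1          # term present, other poet
        elif in_class:
            c += 1          # term absent, target poet
        else:
            d += 1          # term absent, other poet
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # A zero margin means the term (or class) carries no information here.
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_top_terms(docs, labels, k):
    """Return the k terms with the highest chi-square score over any class."""
    vocab = sorted({t for doc in docs for t in doc})
    classes = sorted(set(labels))
    scores = {
        t: max(chi_square(docs, labels, t, c) for c in classes) for t in vocab
    }
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(vocab, key=lambda t: (-scores[t], t))
    return ranked[:k], scores


# Toy corpus (hypothetical): each document is the set of tokens in one ghazal.
docs = [
    {"w1", "w2", "shared"},
    {"w1", "shared"},
    {"w3", "shared"},
    {"w3", "w4"},
]
labels = ["poet_a", "poet_a", "poet_b", "poet_b"]

top_terms, scores = select_top_terms(docs, labels, k=2)
print(top_terms)  # ['w1', 'w3']
```

In this toy corpus, `w1` and `w3` each occur in every ghazal of exactly one poet, so they receive the highest score, while `shared` appears under both poets and scores lower. The retained terms would then form the feature vectors passed to a classifier such as an SVM.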

Published
Oct 1, 2019
How to Cite
TARIQ, Nida et al. Identification of Urdu Ghazal Poets using SVM. Mehran University Research Journal of Engineering and Technology, [S.l.], v. 38, n. 4, p. 935-944, oct. 2019. ISSN 2413-7219. Available at: <https://publications.muet.edu.pk/index.php/muetrj/article/view/1240>. Date accessed: 20 apr. 2024. doi: http://dx.doi.org/10.22581/muet1982.1904.07.
This is an Open Access article published by Mehran University of Engineering and Technology, Jamshoro, under the CC BY 4.0 International License.