Sentiment Analysis for Roman Urdu

  • Ayesha Rafique, Department of Information Technology, University College of Information Technology, University of the Punjab, Lahore, Pakistan
  • Kamran Malik, Department of Information Technology, University College of Information Technology, University of the Punjab, Lahore, Pakistan
  • Zubair Nawaz, Department of Information Technology, University College of Information Technology, University of the Punjab, Lahore, Pakistan
  • Faisal Bukhari, Department of Information Technology, University College of Information Technology, University of the Punjab, Lahore, Pakistan
  • Akhtar Hussain Jalbani, Department of Information Technology, Quaid-e-Awam University of Engineering, Science and Technology, Nawabshah, Pakistan

Abstract

Most online comments and opinions are written as free-form text. Sentiment analysis can be used to measure the polarity (positive or negative) of such comments and opinions, which may be written in many languages, e.g. English, Urdu, Roman Urdu, Hindi, and Arabic. Most existing work addresses sentiment analysis of English; very little research has been done on Urdu or Roman Urdu, even though Hindi/Urdu is the third largest language in the world. In this paper, we focus on sentiment analysis of comments and opinions in Roman Urdu. Because no public-opinion dataset in Roman Urdu is publicly available, we prepare one by collecting comments and opinions written in Roman Urdu from different websites. Three supervised machine learning algorithms, namely NB (Naive Bayes), LRSGD (Logistic Regression with Stochastic Gradient Descent), and SVM (Support Vector Machine), are applied to this dataset. The experimental results show that SVM performs better than NB and LRSGD in terms of accuracy, achieving 87.22%.
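To make the classification setup concrete, the following is a minimal sketch of one of the three algorithm families named above, a multinomial Naive Bayes polarity classifier with add-one (Laplace) smoothing over whitespace tokens. It is not the authors' implementation: the tokenization, the smoothing choice, the function names (`tokenize`, `train_nb`, `predict_nb`) and the six toy comments in `TOY_COMMENTS` are all assumptions invented for illustration and are not taken from the paper's dataset.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data, invented for this sketch (NOT the paper's dataset).
TOY_COMMENTS = [
    ("yeh phone bohat acha hai", "positive"),
    ("service bohat achi hai", "positive"),
    ("camera zabardast hai", "positive"),
    ("yeh phone bekar hai", "negative"),
    ("service bohat kharab hai", "negative"),
    ("battery bilkul bekar hai", "negative"),
]


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def train_nb(samples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior under Naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothed log likelihood of the token given the class.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train_nb(TOY_COMMENTS)
    print(predict_nb(model, "camera acha hai"))
    print(predict_nb(model, "phone kharab hai"))
```

Roman Urdu has no standardized spelling, so the same word may appear in several forms (for example "acha" and "achi" above are treated as unrelated tokens); a real pipeline would likely need spelling normalization and a far larger labeled dataset before accuracy figures become meaningful.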

Published
Apr 1, 2019
How to Cite
RAFIQUE, Ayesha et al. Sentiment Analysis for Roman Urdu. Mehran University Research Journal of Engineering and Technology, [S.l.], v. 38, n. 2, p. 463-470, apr. 2019. ISSN 2413-7219. Available at: <http://publications.muet.edu.pk/index.php/muetrj/article/view/977>. Date accessed: 20 oct. 2019.
Section
Articles
This is an Open Access article published by Mehran University of Engineering and Technology, Jamshoro, under the CC BY 4.0 International License.