Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique

  • Saba Akmal Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan.
  • Hafiz Muhammad Shahzad Asif Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan.

Abstract

Clustering-based sentiment analysis opens new directions for analyzing real-world opinions without human participation or the overhead of pre-tagged training data. Clustering-based techniques do not rely on linguistic information and are more convenient than other traditional machine learning techniques. Combining dimensionality reduction techniques with clustering algorithms strongly affects the computational cost and improves the performance of sentiment analysis. In this research, we applied Principal Component Analysis (PCA) to reduce the size of the feature set. The reduced feature set improves the results of binary K-means clustering for sentiment analysis. In our experiments, we demonstrate that the clustering system with a reduced feature set provides high-quality sentiment analysis. However, K-means clustering has its own limitations, such as hard assignment and instability of results. To overcome these limitations of the traditional K-means algorithm, we applied a soft clustering approach (the Expectation-Maximization algorithm), which stabilizes clustering accuracy. This approach allows a soft assignment of documents to clusters. Consequently, our experimental accuracy is 95% with a standard deviation of 0.1%, which is sufficient to apply the clustering technique in real-world applications.
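
The contrast between hard and soft assignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, that each document has already been reduced to a single hypothetical score (as if projected onto one principal component) and fits a two-component one-dimensional Gaussian mixture with Expectation-Maximization. The toy scores, the function names, the deterministic initialisation, and the variance floor are all assumptions of this sketch.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_two_clusters(points, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (means, variances, weights, responsibilities), where
    responsibilities[i][k] is the soft membership of points[i] in cluster k.
    """
    # Assumed initialisation: start the two means at the smallest and
    # largest observed values so the run is deterministic.
    means = [min(points), max(points)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: soft assignment of every point to both clusters.
        resp = []
        for x in points:
            scores = [weights[k] * gaussian_pdf(x, means[k], variances[k])
                      for k in range(2)]
            total = sum(scores)
            resp.append([s / total for s in scores])
        # M-step: re-estimate each cluster from the weighted points.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, points)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, points)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsed component
            weights[k] = nk / len(points)
    return means, variances, weights, resp

def hard_assign(points, means):
    """K-means-style hard assignment step: index of the nearest mean."""
    return [min(range(2), key=lambda k: abs(x - means[k])) for x in points]

# Hypothetical 1-D scores; negative and positive values stand in for the
# two sentiment groups, with a borderline document at 0.0.
points = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.6, 1.1, 1.6, 2.1]
means, variances, weights, resp = em_two_clusters(points)
labels = hard_assign(points, means)

for x, r, label in zip(points, resp, labels):
    print(f"x={x:+.1f}  hard={label}  soft=({r[0]:.3f}, {r[1]:.3f})")
```

The hard labels force every document into exactly one cluster, whereas the responsibilities let a borderline document carry fractional membership in both, which is the property the abstract credits for the more stable results.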

Published
Jul 1, 2021
How to Cite
AKMAL, Saba; ASIF, Hafiz Muhammad Shahzad. Sentiment Analysis based on Soft Clustering through Dimensionality Reduction Technique. Mehran University Research Journal of Engineering and Technology, [S.l.], v. 40, n. 3, p. 630-644, July 2021. ISSN 2413-7219. Available at: <https://publications.muet.edu.pk/index.php/muetrj/article/view/2186>. Date accessed: 28 July 2021. doi: http://dx.doi.org/10.22581/muet1982.2103.16.
Section
Articles
This is an open access article published by Mehran University of Engineering and Technology, Jamshoro, under the CC BY 4.0 International License.