What Causes Wrong Sentiment Classifications of Game Reviews?

Markos Viggiato, Dayi Lin, Abram Hindle, Cor-Paul Bezemer

2021/04/05

Authors

Markos Viggiato, Dayi Lin, Abram Hindle, Cor-Paul Bezemer

Venue

IEEE Transactions on Games

Abstract

Sentiment analysis is a popular technique to identify the sentiment of a piece of text. Several different domains have been targeted by sentiment analysis research, such as Twitter, movie reviews, and mobile app reviews. Although several techniques have been proposed, the performance of current sentiment analysis techniques is still far from acceptable, mainly when applied to domains on which they were not trained. In addition, the causes of wrong classifications are not clear. In this paper, we study how sentiment analysis performs on game reviews. We first report the results of a large-scale empirical study on the performance of widely used sentiment classifiers on game reviews. Then, we investigate the root causes of the wrong classifications and quantify the impact of each cause on the overall performance. We study three existing classifiers: Stanford CoreNLP, NLTK, and SentiStrength. Our results show that most classifiers do not perform well on game reviews, with the best one being NLTK (with an AUC of 0.70). We also identified four main causes of wrong classifications, such as reviews that point out advantages and disadvantages of the game, which might confuse the classifier. The identified causes are not trivial to resolve, and we call upon sentiment analysis and game researchers and developers to prioritize a research agenda that investigates how the performance of sentiment analysis of game reviews can be improved, for instance by developing techniques that can automatically deal with specific game-related issues of reviews (e.g., reviews with advantages and disadvantages). Finally, we show that training sentiment classifiers on reviews that are stratified by the game genre is effective.

Bibtex

@article{viggiatoTG2021-game-review-sentiment,
 abstract = {Sentiment analysis is a popular technique to identify the sentiment of a piece of text. Several different domains have been targeted by sentiment analysis research, such as Twitter, movie reviews, and mobile app reviews. Although several techniques have been proposed, the performance of current sentiment analysis techniques is still far from acceptable, mainly when applied to domains on which they were not trained. In addition, the causes of wrong classifications are not clear. In this paper, we study how sentiment analysis performs on game reviews. We first report the results of a large-scale empirical study on the performance of widely used sentiment classifiers on game reviews. Then, we investigate the root causes of the wrong classifications and quantify the impact of each cause on the overall performance. We study three existing classifiers: Stanford CoreNLP, NLTK, and SentiStrength. Our results show that most classifiers do not perform well on game reviews, with the best one being NLTK (with an AUC of 0.70). We also identified four main causes of wrong classifications, such as reviews that point out advantages and disadvantages of the game, which might confuse the classifier. The identified causes are not trivial to resolve, and we call upon sentiment analysis and game researchers and developers to prioritize a research agenda that investigates how the performance of sentiment analysis of game reviews can be improved, for instance by developing techniques that can automatically deal with specific game-related issues of reviews (e.g., reviews with advantages and disadvantages). Finally, we show that training sentiment classifiers on reviews that are stratified by the game genre is effective.},
 accepted = {2021-04-05},
 author = {Markos Viggiato and Dayi Lin and Abram Hindle and Cor-Paul Bezemer},
 authors = {Markos Viggiato, Dayi Lin, Abram Hindle, Cor-Paul Bezemer},
 code = {viggiatoTG2021-game-review-sentiment},
 day = {05},
 funding = {NSERC Discovery},
 institution = {University of Alberta},
 journal = {IEEE Transactions on Games},
 month = {April},
 pagerange = {350--363},
 pages = {350--363},
 role = {Researcher / Co-author},
 title = {What Causes Wrong Sentiment Classifications of Game Reviews?},
 type = {article},
 url = {http://softwareprocess.ca/pubs/viggiatoTG2021-game-review-sentiment.pdf},
 venue = {IEEE Transactions on Games},
 year = {2021}
}