Can publicly available machine learning models be trusted to detect sentiment in texts?




Dunaeva, Daria Olegovna
Analyst, Center for Applied Big Data Analysis, Tomsk State University, Tomsk, Russia
ddo@data.tsu.ru

Basina, Polina Alexandrovna
Analyst, Center for Applied Big Data Analysis, Tomsk State University, Tomsk, Russia
basina@data.tsu.ru


Abstract
Determining the sentiment of texts, such as reviews or social media posts, is currently one of the most common tasks in both research and industry. However, not all researchers are able to build their own dataset and train their own machine learning model. A popular alternative is to use publicly available pretrained models for sentiment analysis. This raises a question: do users of such solutions get the results they expect?
Although a wide selection of ready-made sentiment analysis solutions is available, these models make errors because the linguistic expression of emotions is complex and context-dependent. This work compares the sentiment predictions of six publicly available machine learning models against data annotated by researchers.

Keywords: natural language processing, sentiment analysis, machine learning