Countries of the world in the global online information space: quantitative representation, dynamics, tonality

Alexander Vyacheslavovich Sharikov
Professor, Institute of Media, HSE University (NRU HSE), Moscow, Russia

In 2021, the Higher School of Economics launched the research project "Transformation of the world representation in the global online information space under the influence of the COVID-19 epidemic". Within this project, we studied how various countries are represented on the global Internet, specifically in informational (journalistic) materials. The analysis drew on a number of statistical resources and on the FACTIVA monitoring system, which supports text processing in 25 languages and, from the end of 2021, in 26 languages, with a good level of linguistic representation of resources (the statistical error for languages does not exceed 3%). These languages cover more than 90% of the information content of the Internet. The FACTIVA corpus grew by 150 to 330 thousand materials every day. For 9 languages (Chinese, English, French, German, Italian, Japanese, Portuguese, Russian, Spanish), sentiment analysis was available, i.e. it was possible to determine the tone (positive, negative, or neutral) of publications mentioning a country. Materials published from January 1, 2020 to May 31, 2022 were analyzed, with varying levels of access and varying ability to export texts under specified filters. The most complete information was obtained for 2020. The analysis proceeded in three directions: the frequency of publications mentioning a particular country, the dynamics of the appearance of texts mentioning countries, and the tonality of publications mentioning countries. The frequency of mentions was compared with several country-level statistical indicators (population, GDP, GDP per capita). The following trends were revealed:
· The global information space is structured primarily along linguistic lines. Every major language is used by a fairly large group of countries. For example, Russian, the second-largest language by number of sites on the global Internet, is represented in the FACTIVA database by resources from 27 countries.
· In 2020-2021, across all 25 languages combined, texts mentioning Russia appeared most often on the global Internet, with the United States second and Germany third. The representation of countries varies across language zones: within the segment of a given language, the countries where that language is native are mentioned more often.
· Media storms were observed on the global Internet in 2020-2022. Several types of media storms were identified by two criteria: the nature of variability (explosive or gradually increasing) and the duration of development (short-term, medium-term, long-term). Examples: the assassination of Iranian General Soleimani in January 2020 (short-term), the military conflict in Nagorno-Karabakh (medium-term), the COVID-19 pandemic (long-term, gradually increasing, without an explosive reaction of the press).
· The tone of event coverage is closely related to the language. In particular, materials in Chinese and Japanese are more often positive in tone, while materials in French and Russian are more often negative.
· An inverse relationship was found between the percentage of negative materials mentioning a country (relative to the total number of texts mentioning it) and its GDP per capita. That is, the poorer the population of a country, the more often negative materials associated with it appear.
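The last finding rests on two quantities per country: the share of negative texts among all texts mentioning it, and GDP per capita. The sketch below shows one way such an inverse relationship could be checked, using a Spearman rank correlation. It is an illustration only: the abstract does not state which statistical measure was used, and the country labels, counts, and GDP figures are hypothetical placeholders, not data from FACTIVA or from the study.

```python
from statistics import mean


def negative_share(negative_count, total_count):
    """Share of negative texts among all texts mentioning a country."""
    return negative_count / total_count


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder data:
# (negative texts, all texts mentioning the country, GDP per capita in USD).
sample = {
    "Country A": (120, 1000, 60000),
    "Country B": (200, 1000, 30000),
    "Country C": (350, 1000, 8000),
    "Country D": (500, 1000, 2000),
}

shares = [negative_share(neg, total) for neg, total, _ in sample.values()]
gdp_per_capita = [gdp for _, _, gdp in sample.values()]

# A coefficient close to -1 indicates a strong inverse relationship.
rho = spearman(shares, gdp_per_capita)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based coefficient is used here because GDP per capita is strongly skewed across countries, so a monotonic association is easier to detect than a linear one. With real data the coefficient would of course be far from the perfect value that this toy sample produces.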

Keywords: global online information space, representation of countries, media dynamics, media storm, tonality of publication