Language of constructions through statistics: corpus studies of language

Bukharova, Anna Arkadievna
Postgraduate Student, Department of English Lexicology, Moscow State Linguistic University, Moscow, Russia


Modern linguistics relies increasingly on statistics, and big data, in the form of electronic language corpora, is becoming a core research methodology. A corpus is a collection of speech samples that reflects the linguistic behaviour of language users. Combined with Construction Grammar, corpus linguistics offers an integrative approach to the analysis of linguistic material. Its advantages include reliance on quantitative data and the ability to sample language into classes of lexico-grammatical constructions. Such sampling makes it possible to analyze linguistic choices and to view language from a different perspective: in terms of the frequency and productivity of word combinations, and in terms of the semantic habits of language users. A corpus makes it possible to collect and process constructions of the same type and to analyze their properties, while statistics lends the conclusions a greater degree of objectivity. The aim of the presentation is to demonstrate the possibilities of corpus-based processing of linguistic material, using Russian and English data: to show how our linguistic choices function, how free we are in choosing words, and how we can analyze our own linguistic behaviour, which is often taken for granted, and thus see what we actually say on a daily basis.

Keywords: corpus, corpus linguistics, Construction Grammar, lexico-grammatical construction, corpus search, databases, linguistic behaviour