In this analytical report, we tried to take the candidates for President of Ukraine at their word (to "catch them by the tongue"). The objects of our study were the texts of the election programs published on the website of the Central Election Commission. We downloaded them and extracted the election promises. We used the R programming language to process the texts. Before processing, we removed double spaces, numbers, and "noise words" ("stop words"), and converted all letters to lower case.

The list of noise words, which we compiled ourselves, contained the following words: or; of course; in fact; however; would; in; of; all; where; something; for; before; is; with; and; from; her; their; his; barely; almost; on; even; above; her; them; him; but; here; therefore; under; when; about; and; yes; this; that; such; so; also; too; those; these; to; as; what; how; which. Note that we did not remove or merge words that share a root (for example, the inflected forms of the word "Ukraine" in Ukrainian). In our view, removing such words would make the source texts much shorter and would therefore considerably affect the final results. In addition, reducing such words to a common form would distort the meaning of the texts.
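The cleaning steps described above can be sketched as follows. The original processing was done in R; this is only a minimal Python illustration, and the names `STOP_WORDS` and `preprocess` are hypothetical. The stop list here is a small subset of the English glosses quoted above (the real list was Ukrainian), and splitting the text into runs of letters is an assumption, since the report does not say how the texts were tokenized.

```python
import re

# Assumption: a small illustrative subset of the noise-word list quoted above,
# given as English glosses. The authors' actual list was in Ukrainian.
STOP_WORDS = {
    "or", "however", "would", "in", "of", "all", "for", "is", "with",
    "and", "from", "on", "but", "this", "that", "to", "as",
}


def preprocess(text):
    """Lower-case the text, drop numbers and extra spaces, remove stop words."""
    text = text.lower()                       # all letters to lower case
    text = re.sub(r"\d+", " ", text)          # remove numbers
    text = re.sub(r"\s+", " ", text).strip()  # collapse double spaces
    # Assumption: a token is any run of letters (Unicode-aware, so Cyrillic works).
    tokens = re.findall(r"[^\W\d_]+", text)
    # No stemming: inflected forms of the same word stay separate tokens.
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("Power of Ukraine  in 2014 and development"))
```

Because nothing is stemmed, different inflected forms of the same word remain distinct tokens, which matches the choice described in the text.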

Based on the processed texts, we built "word clouds" showing the most frequently used words in each election program. We used a threshold of three occurrences for a word to be included.
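The frequency threshold behind the word clouds might look like the sketch below. It is a Python illustration, not the authors' R code. The function name `frequent_words` is hypothetical, and reading the threshold as "at least three occurrences" is an assumption.

```python
from collections import Counter


def frequent_words(tokens, min_count=3):
    """Return {word: count} for words occurring at least min_count times.

    Assumption: the threshold of three is inclusive (count >= 3).
    """
    counts = Counter(tokens)
    return {word: n for word, n in counts.items() if n >= min_count}


tokens = ["power", "ukraine", "power", "development", "power", "ukraine"]
print(frequent_words(tokens))
```

The resulting dictionary of counts is the kind of input a word-cloud renderer typically takes, with the count controlling the size of each word.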

In addition to identifying the most frequently used words, we compared the election programs for similarity. The comparison was based on the Ochiai coefficient ("cosine similarity"). The results of the comparison are given below.
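To make the similarity measure concrete, here is a minimal Python sketch (the original analysis was done in R, and all function names below are hypothetical). The report does not say whether the comparison used word counts or only word presence, so both variants are shown. On presence/absence data, cosine similarity reduces to the Ochiai coefficient: the number of shared words divided by the square root of the product of the two vocabulary sizes.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity of two texts represented as word-count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(n * n for n in a.values()))
    norm_b = math.sqrt(sum(n * n for n in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def ochiai(tokens_a, tokens_b):
    """Ochiai coefficient on word sets: |A & B| / sqrt(|A| * |B|)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / math.sqrt(len(set_a) * len(set_b))


def pairwise_similarity(documents, measure=cosine_similarity):
    """Apply a measure to every pair of named documents."""
    return {
        (x, y): measure(documents[x], documents[y])
        for x, y in combinations(sorted(documents), 2)
    }


# Toy data with made-up labels, not taken from the real programs.
docs = {
    "program_a": ["power", "ukraine", "development"],
    "program_b": ["power", "ukraine", "army"],
    "program_c": ["taxes", "pensions"],
}
for pair, score in pairwise_similarity(docs).items():
    print(pair, round(score, 2))
```

Both measures range from 0 (no words in common) to 1 (identical vocabulary, and for the count-based variant, identical proportions).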

[Figure: pairwise similarity of the election program texts]

The results of the similarity comparison are quite ambiguous. Since the texts are written on the same topic, they are, as expected, similar: the Ochiai coefficient does not fall below 0.55 for any pair of texts. The election programs of Hrytsenko and Tsariov, however, are quite unusual in that they cover "everything" in the literal sense of the word. Nevertheless, their programs are quite similar to the others. The texts also contain "diametrically opposed" views, for example in the pairs "Klymenko-Kuibida", "Dobkin-Symonenko", "Kuzmyn-Tiahnybok" and "Tiahnybok-Tymoshenko".

Finally, we selected the 15 words used most frequently across all of the analyzed election programs:

[Figure: histogram of the 15 most frequently used words]

Because we did not merge inflected word forms, the histogram contains several different forms of the word "Ukraine". This, however, takes nothing away from the great love for the Motherland that the presidential candidates feel. The other words speak for themselves. Unfortunately, even after the revolution, the word "power" is far ahead of "development", which hints at the true aim of the future presidency. In addition, "securing+providing" occurred very often, which brings to mind so-called "budget eating" and various "special allowances".
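A ranking like the Top 15 discussed above can be obtained by pooling the word counts of all programs. The Python sketch below is only an illustration of that idea, not the authors' R code. The name `top_words` is hypothetical, and the toy data is made up.

```python
from collections import Counter


def top_words(documents, n=15):
    """Pool the tokens of all documents and return the n most common words."""
    total = Counter()
    for tokens in documents.values():
        total.update(tokens)
    # Inflected forms are separate keys, so several forms of one word can rank.
    return total.most_common(n)


# Toy data with made-up labels, not taken from the real programs.
docs = {
    "program_a": ["ukraine", "power", "power"],
    "program_b": ["power", "ukraine", "development"],
}
print(top_words(docs, n=2))
```

Since each inflected form is counted as its own word, several forms of the same word can appear in the ranking side by side, as the histogram shows for "Ukraine".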

[Figures: word clouds by candidate: bogom, boy, Grynenko, Grytsenko, Dopkin, Klymenko, Konovalyuk, Korolevska, Kuzmin, Kuybida, Lyashko, Malomuzh, Poroshenko, Rabynovych, Saranov, Symonenko, Tigibko, Tymoshenko, Tyagnybok, Tsarov, Tsushko, Shkiryak, Yarosh]