Creating a corpus of all possible word-forms of modified Russian sound verbs using web-scraping methodology. Compilation and adjustment of summary tables for the future tense, imperative, and gerund forms. Forms of “dual action”. Experimental multi-dimensional scaling of web-scraping results.
Создание корпуса всех возможных словоформ модифицированных русских глаголов звучания методом веб-извлечения. Составление и корректировка сводных таблиц для форм будущего времени, повелительного наклонения и деепричастий. Формы совместного действия. Экспериментальное многомерное шкалирование результатов веб-извлечения.
Modern Russian dictionaries do not include all possible forms of words. Compiling such an index on paper, even for a single part of speech, is not technically feasible. With the appearance of electronic dictionaries, however, we can for the first time attempt to create an inventory of all possible forms for the lexical-semantic group of Russian sound verbs, using web-scraping methodology. This project attempts to develop a set of comprehensive tables for the prefixed (semantically modified at the word-formation level) sound verbs and all their forms. A novel four-position system for numbering the verbal forms has been introduced to support experimental multi-dimensional scaling of the results. The research output takes into account not only all documented (recorded) modifications of the source verbs but also all potential derivative forms. The compilation of the tables, the labelling of the elements, and the subsequent analysis of their correlations revealed technical execution and documentation challenges that are directly related to the semantics of verbal modifications (e.g., the concurrence of the future tense forms, the imperatives of verbs of sound and forms of “dual action”). The project outcomes may benefit the development of diverse web-based applications for gathering, visualizing, or analyzing data from a variety of digital lexicographic sources across one or multiple languages, corpora, or other digital text collections.
01 Jun 2021
30 Apr 2022
Ivliyeva, Irina V., Koob, Perry. Creating a corpus of all possible word-forms of modified Russian sound verbs using web-scraping methodology. Compilation and adjustment of summary tables for the future tense, imperative, and gerund forms. Forms of “dual action”. Experimental multi-dimensional scaling of web-scraping results. June 2021 – April 2022. Working theses. March 2022.
Dr. Irina V. Ivliyeva, firstname.lastname@example.org
Professor of Russian, Arts, Languages, and Philosophy Department
Missouri University of Science and Technology
Perry B. Koob, email@example.com
Database Administrator/System Administrator
Academic Technology Support Team
Missouri S&T Information Technology
Arts, Languages, and Philosophy