Research Data

Method of web-extraction (web scraping) of Russian verb paradigms from electronic dictionaries and databases. Matrix organization of lacunae, their codification and classification (on the material of the verbs of sound)

Alternative Title

Method of web extraction of Russian verb paradigms from electronic dictionaries and databases. Matrix organization of lacunae, their codification and classification (based on verbs of sound). Working theses. February

Ivliyeva, I.V.
Koob, Perry

Abstract

Linguists now have access to digital collections of dictionaries and language samples known as “corpora”. Data mining of these sources makes it possible to statistically verify the codification of words (word forms) and to determine their frequency of usage in grammatical and colloquial contexts. Data collected with the web scraping methodology in this research may be used on their own or in combination with data collected from other big data sources. This project is especially significant for languages with “rich morphology”. Different strategies may be applied for gathering, visualizing, or analyzing data from various online Russian dictionaries, corpora, or other big data sources using digital technologies (e.g., web portals and computer-assisted text collections).

Start Date

01 Nov 2020

End Date

28 Feb 2021

Recommended Citation

Ivliyeva, Irina V. and Koob, Perry, "Method of web-extraction (web scraping) of Russian verb paradigms from electronic dictionaries and databases. Matrix organization of lacunae, their codification and classification (on the material of the verbs of sound)" (2021). Research Data. 7.
https://scholarsmine.mst.edu/research_data/7

Contact Information

Dr. Irina V. Ivliyeva, ivliyeva@mst.edu
Professor of Russian, Arts, Languages, and Philosophy Department
Missouri University of Science and Technology

Perry B. Koob, koobp@mst.edu
Database Administrator/System Administrator
Academic Technology Support Team
Missouri S&T Information Technology

Department(s)

Arts, Languages, and Philosophy

Document Type

Data

Document Version

Final Version

File Format

text

Language(s)

Russian

Language 2

English

Publication Date

28 Feb 2021
