Replication Data for: Contextually determined or semantically distinct? The competition between instrumental, long form nominative and short form nominative in Russian predicate adjectives

Author: Janda Laura Alexis
Publication date: 04/02/2025

Dataset description: This post provides the data and R scripts for an analysis of the variation between long form nominative, short form nominative, and instrumental case in Russian predicate adjectives in sentences containing an overt copula verb. We analyze the various factors associated with the choice of form of the adjective.

Abstract of the article: Based on data from the syntactic subcorpus of the Russian National Corpus, we undertake a quantitative analysis of the competition between Russian predicate adjectives in the instrumental (e.g., pustym ‘empty’), the long form nominative (e.g., pustoj ‘empty’), and the short form nominative (e.g., pust ‘empty’). It is argued that the choice of adjective form is partly determined by the context. Four (nearly) categorical rules are proposed based on the following contextual factors: the form of the copula verb, the presence/absence of a complement, and the nature of the subject of the sentence. At the same time, a “space of competition” is identified, where all three adjective forms are attested. It is hypothesized that within the space of competition, the three forms are recruited to convey different meanings, and it is argued that our analysis lends support to the traditional idea that the short form nominative is closely related to verbs. Our findings are furthermore compatible with the idea that the short form nominative expresses temporary states, rather than inherent permanent characteristics.

DataverseNO

Replication Data for: The long and the short of it: Russian predicate adjectives with zero copula

Author: Janda Laura Alexis
Publication date: 01/09/2023

Description of Dataset: This is a study of examples of Russian predicate adjectives in clauses with zero-copula present tense, where the adjective is a short form (SF) or a long form nominative (LF). The data was collected in 2022 from SynTagRus (https://universaldependencies.org/treebanks/ru_syntagrus/index.html), the syntactic subcorpus of the Russian National Corpus (https://ruscorpora.ru/new/). The data merges the results of several searches conducted to extract examples of sentences with long form and short form adjectives in predicate position, as identified by the corpus. The examples were imported to a spreadsheet and annotated manually, based on the syntactic analyses given in the corpus. For present tense sentences with no copula (Река спокойна or Река спокойная), it was necessary to search for an adjective as the top (root) node in the syntactic structure. The syntactic and morphological categories used in the corpus are explained here: https://ruscorpora.ru/page/instruction-syntax/. In order for the R code to run from these files, one needs to set up an R project with the data files in a folder named "data" and the R markdown files in a folder named "scripts".

Method: Logistic regression analysis of corpus data carried out in R (R version 4.2.3 (2023-03-15) -- "Shortstop Beagle", Copyright (C) 2023 The R Foundation for Statistical Computing) and documented in an .Rmd file.

Publication Abstract: This article presents an empirical investigation of the choice between so-called long (e.g., prostoj ‘simple’) and short forms (e.g., prost ‘simple’) of predicate adjectives in Russian based on data from the syntactic subcorpus of the Russian National Corpus. The data under scrutiny suggest that short forms represent the dominant option for predicate adjectives. It is proposed that long forms are descriptions of thematic participants in sentences with no complement, while short forms may take complements and describe both participants (thematic and rhematic) and situations. Within the “space of competition” where both long and short forms are well attested, it is argued that the choice of form to some extent depends on subject type, gender/number, and frequency. On the methodological level, the approach adopted in the present study may be extended to other cases of competition in morphosyntax. It is suggested that one should first “peel off” contexts where (nearly) categorical rules are at work, before one undertakes a statistical analysis of the “space of competition”.

DataverseNO

Sources and Targets in Kuteva et al. 2019

Author: Janda Laura Alexis
Publication date: 15/09/2023

This dataset is based on examples found in Kuteva et al. 2019: Kuteva, Tania, Bernd Heine, Bo Hong, Haiping Long, Heiko Narrog, and Seongha Rhee. 2019. World Lexicon of Grammaticalization (2nd ed.). Cambridge: Cambridge University Press. Kuteva et al.’s World Lexicon of Grammaticalization (2019) is an inventory of examples of morphological reanalysis observed across a sample of over 900 languages. The goal of Kuteva et al.’s inventory is to represent grammaticalization changes that are documented in multiple languages. The examples are cataloged as types defined by Source to Target shifts, such as Ablative > Partitive, that are attested in two or more languages of the sample. The inventory lists 526 Source > Target types. The purpose of this dataset is to explore how many Sources also serve as Targets, and to provide a broad semantic classification of the Sources. The classification is intended only for general description of patterns, and does not represent a precise assignment to mutually exclusive classes. Its purpose is to give a qualitative overview and thus does not lend itself to further quantitative analysis. This dataset is the basis for the analysis in Section 4 of this publication: Janda, Laura A. To Appear. “Morphological reanalysis: recycling old form to new function”, as part of Volume 3 Morphology & Syntax, Part 1 Morphology of The Wiley Blackwell Companion to Diachronic Linguistics, edited by Edith Aldridge, Anne Breitbarth, Katalin É. Kiss, Adam Ledgeway, Joe Salmons, and Alexandra Simonenko.

DataverseNO

Replication Data for: Going Beyond Words: Engaging Grammar for Insights into Political Discourse

Author: Janda Laura Alexis
Publication date: 24/02/2025

Dataset description: This dataset contains data in connection with a selection of three of Putin's speeches from 2023 and 2024. The related book chapter also includes analysis of data from Putin's speeches in 2022, and that data is available here: Obukhova, A. (2022). Replication Data for: the Case for Case in Putin’s Speeches. https://doi.org/10.18710/APDMDZ. DataverseNO, V2.

Abstract for related publication: We present Keymorph Analysis as a method to reveal the role of grammar in political discourse, demonstrating how this method can be used to gain a more in-depth understanding of language in discourse and securitization analyses. We present a longitudinal study of Putin’s speeches, comparing his use of grammatical case to usual grammatical behavior in Russian, and tracking changes from the beginning of the full-scale invasion of Ukraine in early 2022 through the end of 2024.

DataverseNO

Replication Data for: Understanding ‘many’ through the lens of Ukrainian багато

Author: Janda Laura Alexis
Publication date: 17/09/2024

Dataset description: The General Regionally Annotated Corpus of Ukrainian (GRAC, Shvedova et al. 2017-2024, uacorpus.org) was consulted to collect data for further analysis concerning the distribution of Singular vs. Plural verb forms in the target bahato construction. GRAC is a Sketch Engine corpus of over 1.8 billion words, representing texts from over 30,000 authors created between 1816 and 2023. This corpus is designed to serve as source material for linguistic research on Standard Ukrainian. Our data was collected during the month of February 2024. We extracted and annotated 28,491 examples of the bahato construction. An additional set of examples was collected from the Russian National Corpus (ruscorpora.ru) during the month of August 2024 to provide comparison with the Russian mnogo construction. For this purpose, 6,612 examples were extracted and annotated for word order and Singular vs. Plural verb agreement. Both the Ukrainian and the Russian data are included in this dataset, along with the R scripts used to analyze this data.

Article abstract: We reveal an ongoing language change in Ukrainian involving a construction with a subject comprised of the indefinite quantifier багато ‘many’ modifying a noun phrase in the Genitive Plural. Number agreement on the verb varies, allowing both Singular (in 69.1% of attestations) and Plural (in 30.9% of attestations). Based on statistical analysis of corpus data, we investigate the influence of the factors of year of creation, word order of subject and verb, and animacy of the subject on the choice of verb number. We find that, while all combinations of word order and animacy are robustly attested, VS word order and inanimate subjects tend to prefer Singular, whereas SV word order and animate subjects tend to prefer Plural. Since about the 1950s, the proportion of Plural has been increasing, overtaking Singular in the current decade. We propose that this Singular vs. Plural variation is motivated by the human embodied experience of construing a group of items as either a homogeneous mass (and therefore Singular) or a multiplicity of individuals (and therefore Plural). This proposal is supported by the identification of micro-constructions that prefer Singular and show reduced individuation of human beings

DataverseNO

Replication Data for: Looking into the Russian future

Authors: Kosheleva Daria; Janda Laura Alexis
Publication date: 14/01/2022

This dataset contains the data for an article on the meanings of future tense forms in Russian.

Abstract: The relationship between future time and future tense forms in Russian is complex. The forms traditionally attributed to the future tense in certain cases do not refer to future time. Those cases have been previously presented as a list and/or attributed to the sphere of modality. In this article, we suggest a data-driven approach applied to the spectrum of meanings of Russian future tense forms. We analyzed corpus data and discovered that 44% of perfective future forms and 22% of imperfective future forms do not unambiguously express future time meaning. Among the non-future time meanings that Russian future tense forms can express are Gnomic, Performative, Implicative, Hypothetical, Alternation, and Stable scenario. Furthermore, we propose that the meanings of the future tense constitute a radial category. Future time reference is the prototypical meaning of the future tense. The remaining meanings comprise extensions connected to the prototypical meaning. We describe the radial category with reference to Langacker’s (2008) model of tense and potentiality. Additionally, we explore the interaction of future tense and modality.

DataverseNO

Replication Data for: Typology of reduplication in Russian: constructions within and beyond a single clause

Authors: Endresen Anna; Janda Laura Alexis; Zhukova Valentina
Publication date: 02/11/2023

We analyze repetition in Russian from the perspective of the Russian Constructicon, which represents over 2200 grammatical constructions described in terms of anchors (fixed elements) and slots (for various filler elements) and fully annotated for their syntactic and semantic characteristics. The Russian Constructicon facilitates the first large-scale investigation of reduplication across a representative sample of an entire language, enabling us to map out a typology invoking these and other factors in the context of Construction Grammar. Our data on repetitions includes 118 constructions tagged in the Russian Constructicon for Reduplication, meaning that repetition occurs within a clause, and 28 entries tagged as Discourse “Echo” Constructions because they require the repetition of a word or phrase from a previous clause (often provided by an interlocutor). Five constructions carry both tags. We propose a theoretical expansion of the definition of reduplication to include the Discourse “Echo” type, arguing that constructions are not limited to a single clause or even to a single speaker. Our typology further explores the distribution of various formal and semantic factors observed in constructions with repetition and compares them with both previous typological research on reduplication and their distribution across the entire Russian Constructicon. Despite the fact that Russian does not use reduplication as a productive grammatical marker, we argue that reduplication is widespread and systematic in Russian.

DataverseNO

Picking apart Russian particles. An empirical study on the meaning and use of že and ved’

Author: McDonald James David
Publication date: 01/01/2021

This thesis explores the meaning and use of the Russian particles že and ved’, as well as their relationship. I carry out three investigations using cognitive linguistic methods, corpus data and statistical methods. First, I explore the meaning and use of že and ved’ and how they are translated into English using parallel corpus data. I propose a radial category for both že and ved’ and show how these networks relate. In my second investigation I use corpus data and statistical tools to examine which factors may influence the replaceability of že with ved’, focusing on the part of speech to the left of že (POS), the submeanings I recognised in my first investigation and the way the Russian National Corpus (RNC) tags že. I show that the POS appears to be the most influential factor. My third investigation looks further at this. I conducted a questionnaire focusing on seven combinations of the POS and submeanings of že from my second investigation to see whether ved’ can be used instead of že. My findings are not conclusive, but show that že appears to be most replaceable with ved’ when the POS is adverb and the submeaning is Emphasiser. Whilst these results are not conclusive on the relationship between že and ved’, this thesis presents že and ved’ in a way that can help learners of Russian understand these lexemes better.

Munin - Open Research Archive

NORA - Norwegian Open Research Archives

Asymmetries in Linguistic Construal: Russian Prefixes and the Locative Alternation

Author: Sokolova Svetlana
Publication date: 01/01/2012

The present dissertation is an empirical corpus study of Russian Locative Alternation verbs, such as gruzit’ ‘load’, which appear both in the Theme-Object (load the hay onto the truck) and the Goal-Object (load the truck with hay) constructions. In addition to the semantics of the verb and the syntactic structures at stake, we study the way both of them can be modified. We show that the semantics of verbal roots should be described in terms of classifying Themes and Goals, which facilitates a more fine-grained description of verbal semantics. Our data provide evidence that in addition to the Theme-Object and the Goal-Object constructions, four adjacent constructions are pertinent to the phenomenon. We show the relations among these relevant constructions by presenting them in constructional maps that illustrate the network of constructions that is typical for each verb. As additional factors, we observe the contribution of the prefixes za-, na-, and po- and three basic modifications of constructions, which so far have not been the focus of the research on the Locative Alternation: metaphorical extensions, reduction within constructions (when one of the participants is omitted), and elaboration (i.e. interaction with other constructions). A logistic regression analysis explores the statistical relationships among factors.

Munin - Open Research Archive

NORA - Norwegian Open Research Archives

Possessive constructions in North Saami

Authors: Janda Laura Alexis; Antonsen Lene
Publication date: 01/01/2014

We have investigated the distribution of two possessive constructions in North Saami: 1) with the possessive suffix appended to a noun, and 2) with the reflexive pronoun ieža-:

1) Son manai latnjasis ja velledii seŋgii. [3Sg.prn.NOM go.PRET.3Sg room.ILLSg.POSS3Sg and lay-down.PRET.3Sg bed.ILLSg] ‘He went to his room and lay down on the bed.’

2) Hihtásit son manai sisa, gavccui loktii iežas latnjii ja velledii seŋgii moddját. [Slowly prn.3Sg.NOM go.PRET.3Sg in, climb.PRET.3Sg upstairs.ILLSg Reflprn.3Sg.GEN room.ILLSg and lay-down.PRET.3Sg bed.ILLSg smile.INFIN] ‘She went slowly inside, climbed upstairs to her room and lay down on the bed and smiled.’

Both examples are from Kirsti Paltto’s novel Ája. We see that the same author can use both the construction with the possessive suffix appended to a noun and the iežas pronoun, even though the referent is the same (son ‘s/he’) and the noun is the same (latnja ‘room’). Are these two constructions in free variation or is there a semantic/syntactic difference? If there is a difference, what is it? And is this difference also dependent upon the region and age of the author?

We have gathered over 1800 sentences with these two constructions from works of fiction. The authors come from both Kautokeino and the Finnish side of the Tana river and represent three generations. In addition we have gathered 1500 examples from a new translation of the New Testament. All examples have been manually tagged for a variety of factors, such as the semantic class, case, and number for both the possessed item and the possessor, the type of reference (anaphoric, endophoric, exophoric), the author, etc.

We analyze the use of the two possessive constructions in North Saami in relation to construction grammar (Goldberg 1995 & 2006), cognitive linguistics (Taylor 1996, Langacker 2008), and typological comparisons of possessive constructions in the world’s languages (Heine 1997, McGregor 2009, Aikhenvald & Dixon 2013). We use statistical methods (“CART” = Classification & Regression Trees and Random Forests, Strobl et al. 2009) to evaluate the influence of the various factors on the choice between the two possessive constructions. We find that there is a language change taking place in North Saami and that the possessive suffix is used less and less while the ieža-form, which was used in the older generation almost exclusively to mark a strong contrast, is now more and more neutral in the middle and younger generations. The most important factor in the choice between the constructions is the semantic class of the item that is possessed: the use of the possessive suffix remains strong only for inalienables (body parts and kin). After semantic class the next most important factor is the case marking for both the item that is possessed and the possessor.

References

Aikhenvald, Alexandra Y. and R. M. W. Dixon, eds. 2013. Possession and Ownership. Oxford: Oxford University Press.
Goldberg, Adele. 1995. Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Goldberg, Adele. 2006. Constructions at Work: The Nature of Generalizations in Language. Oxford: Oxford University Press.
Heine, Bernd. 1997. Possession. Cambridge: Cambridge University Press.
Langacker, Ronald W. 2008. Cognitive Grammar: A Basic Introduction. Oxford: Oxford University Press.
McGregor, William B. 2009. “Introduction”. In McGregor, William B., ed. The Expression of Possession. Berlin: Mouton de Gruyter. 1-12.
Strobl, C., G. Tutz & J. Malley. 2009. An Introduction to Recursive Partitioning: Rationale, Application, and Characteristics of Classification and Regression Trees, Bagging, and Random Forests. Psychological Methods 14. 323-348.
Taylor, John R. 1996. Possessives in English. Oxford: Clarendon Press.

Munin - Open Research Archive

NORA - Norwegian Open Research Archives