Distant reading applied to the study of the history of a discipline: Publications trends in the Journal of Personality and Social Psychology
The End of Year Addresses of the Presidents of the Italian Republic (1948-2006): discoursal similarities and differences
Identifying specific textual units of documents taken from large corpora. Comparing methods.
JADT'2006 proceedings online (8th International Conference on Textual Data Statistical Analysis)
Chronological analysis of textual data and curve clustering: preliminary results based on wavelets
In textual analysis, many corpora include texts which have a chronological
order. The temporal evolution of (key) words is relevant in order to highlight the
distinctive features of the chronological corpus. In a typical bag-of-words approach
data are organized in word-type x time-point contingency tables. Such discrete data
can be thought of as continuous objects represented by functional relationships. The
aims of this study are to identify a specific sequential pattern for each word as a
functional object, and to determine prototype patterns representing clusters of words
portraying a similar evolution. We propose the application of a flexible wavelet-based
model for curve clustering to a corpus of end-of-year addresses delivered by
the ten Presidents of the Italian Republic in the period 1949-2011.
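The word-type x time-point contingency table mentioned in this abstract can be illustrated with a minimal sketch. The toy corpus, the function names `word_time_table` and `haar_step`, and the whitespace tokenization are assumptions made for illustration only. The single unnormalized Haar averaging/differencing step merely shows how a word's frequency curve can be split into a coarse trend and local detail; it is not the wavelet-based clustering model proposed by the authors.

```python
from collections import Counter


def word_time_table(docs):
    """Build a word-type x time-point contingency table.

    docs: iterable of (time_point, text) pairs (hypothetical toy input).
    Returns (words, times, table), where table[i][j] is the number of
    occurrences of words[i] in the texts dated times[j].
    """
    counts = {}
    for time_point, text in docs:
        # Naive whitespace tokenization, assumed here for brevity.
        counts.setdefault(time_point, Counter()).update(text.lower().split())
    times = sorted(counts)
    words = sorted({w for c in counts.values() for w in c})
    table = [[counts[t][w] for t in times] for w in words]
    return words, times, table


def haar_step(curve):
    """One level of an unnormalized Haar transform.

    Returns pairwise averages (coarse trend) and pairwise half-differences
    (local detail) of a curve with an even number of points.
    """
    if len(curve) % 2:
        raise ValueError("curve length must be even")
    pairs = list(zip(curve[0::2], curve[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


# Invented example texts; not taken from the presidential addresses.
docs = [
    (1949, "peace work peace"),
    (1950, "work republic"),
    (1951, "republic europe europe"),
    (1952, "europe peace"),
]
words, times, table = word_time_table(docs)
europe_curve = table[words.index("europe")]
approx, detail = haar_step(europe_curve)
```

Each row of `table` is the discrete curve of one word over time. Curves summarized by such coefficients could then be compared or clustered, which is the general idea the abstract refers to.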
Can Correspondence Analysis Challenge Transformers in Authorship Attribution Tasks?
With reference to a large corpus of 76 contemporary Italian popular mystery novels by 16 different authors, this study aims to assess the performance of large language models in an authorship attribution test. The results obtained through both transformer and correspondence analysis vector representations are compared and contrasted in machine learning classification tasks. Although in previous works transformers have been shown to perform better than other alternatives, in this case correspondence analysis wins the challenge. The results support the hypothesis that specialized large corpora require tailor-made representations.
Portraying the life cycle of ideas in social psychology through functional (textual) data analysis: a toolkit for digital history
This paper presents a method for the digital history of a discipline (social psychology in this application) through the analysis of scientific publications. The titles of a comprehensive set of papers published in the Journal of Personality and Social Psychology (1965–2021) were collected, yielding a total of 10,222 items. The corpus thus constructed underwent several stages of preprocessing until the final conversion into a terms x time-points matrix, where terms are stemmed words and multi-words. After normalizing frequencies via a chi square-like transformation, clusters of words portraying similar temporal patterns were identified by functional (textual) data analysis and distance-based curve clustering. Among the best candidates in terms of the number of clusters, the solutions with six, nine and thirteen clusters (from lower to higher resolution) were chosen and their nesting relationship was demonstrated. They reveal, at different levels of granularity, increasing, decreasing, and stable keyword trends, highlighting methods, theories, and application domains that have become more popular in recent years, lost popularity, or have remained in common use. Moreover, this method makes it possible to highlight historical issues (such as crises in the discipline or debates over the use of terms). The results highlight the core topics of social psychology in the past and today, underlining the crucial contribution of this method to the digital history of a discipline.
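The abstract does not give the exact form of its chi square-like transformation, so the sketch below assumes one common choice: standardized (Pearson) residuals, (observed - expected) / sqrt(expected), computed on a terms x time-points table. The function names, the toy counts, and the use of plain Euclidean distance between the normalized curves are all illustrative assumptions, not the authors' procedure.

```python
import math


def chi_square_like(table):
    """Standardized residuals of a terms x time-points count table.

    Assumes every row total and column total is positive; this particular
    normalization is an assumption, not the formula used in the paper.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    normalized = []
    for i, row in enumerate(table):
        new_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            new_row.append((observed - expected) / math.sqrt(expected))
        normalized.append(new_row)
    return normalized


def euclidean(curve_a, curve_b):
    """Euclidean distance between two curves of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)))


# Hypothetical counts for three terms over three time points.
terms = ["rising", "falling", "rising_frequent"]
table = [
    [1, 2, 3],  # increasing trend
    [3, 2, 1],  # decreasing trend
    [2, 4, 6],  # increasing trend, twice as frequent
]
z = chi_square_like(table)
d_same_trend = euclidean(z[0], z[2])
d_opposite_trend = euclidean(z[0], z[1])
```

In this toy table the two increasing terms end up closer to each other than to the decreasing one, even though their raw frequencies differ. That is the property a distance-based curve clustering relies on.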
