1,721,006 research outputs found

    A Statistical Theory of Language Translation Based on Communication Theory

    Full text link
    We propose the first statistical theory of language translation based on communication theory. The theory is based on New Testament translations from Greek to Latin and to other 35 modern languages. In a text translated into another language, all linguistic variables do numerically change. To study the chaotic data that emerge, we model any translation as a complex communication channel affected by “noise”, studied according to Communication Theory applied for the first time to this channel. This theory deals with aspects of languages more complex than those currently considered in machine translations. The input language is the “signal”, the output language is a “replica” of the input language, but largely perturbed by noise, indispensable, however, for conveying the meaning of the input language to its readers. We have defined a noise-to-signal power ratio and found that channels are differently affected by translation noise. Communication channels are also characterized by channel capacity. The translation of novels has more constraints than the New Testament translations. We propose a global readability formula for alphabetical languages, not available for most of them, and conclude with a general theory of language translation which shows that direct and reverse channels are not symmetric. The general theory can also be applied to channels of texts belonging to the same language both to study how texts of the same author may have changed over time, or to compare texts of different authors. In conclusion, a common underlying mathematical structure governing human textual/verbal communication channels seems to emerge. Language does not play the only role in translation; this role is shared with reader’s reading ability and short-term memory capacity. Different versions of New Testament within the same language can even seem, mathematically, to belong to different languages. These conclusions are everlasting because valid also for ancient Roman and Greek readers

    Going Beyond Counting First Authors in Author Co-citation Analysis

    Full text link
    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed

    Variations on the Author

    Full text link
    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer that rather than express opinions propels discourses; an interviewer that is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent form the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship

    Appropriate Similarity Measures for Author Cocitation Analysis

    Full text link
    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.information science;Pearson correlation;cosine;similarity measure;author cocitation analysis

    Readability across Time and Languages: The Case of Matthew’s Gospel Translations

    No full text
    We have studied how the readability of a text can change in translation by considering Matthew’s Gospel, written in Greek, translated into Latin and 35 modern languages. We have found that the deep-language parameters CP (characters per word), PF (words per sentence), IP (words per interpunctions), MF (interpunctions per sentence) and a universal readability index GU  of each translation are so diverse from language to language, and even within a given language for which there are many versions of Matthew—such as in English and Spanish—that the resulting texts mathematically seem to be diverse. The several tens of versions of Matthew’s Gospel studied appear to address very diverse audiences. If a reader could understand all of them well, he/she would have the impression of reading texts written by diverse authors, although all of them tell the same story

    Linguistic Communication Channels Reveal Connections between Texts: The New Testament and Greek Literature

    No full text
    We studied two fundamental linguistic channels—the sentences and the interpunctions channels—and showed they can reveal deeper connections between texts. The applied theory does not follow the actual paradigm of linguistic studies. As a study case, we considered the Greek New Testament, with the purpose of determining mathematical connections between its texts and possible differences in the writing style (mathematically defined) of the writers and in the reading skill required of their readers. The analysis was based on deep-language parameters and communication/information theory. To set the New Testament texts in the larger Greek classical literature, we considered texts written by Aesop, Polybius, Flavius Josephus, and Plutarch. The results largely confirmed what scholars have found about the New Testament texts, therefore giving credibility to the theory. The Gospel according to John is very similar to the fables written by Aesop. Surprisingly, the Epistle to the Hebrews and Apocalypse are each other’s “photocopies” in the two linguistic channels and not linked to all other texts. These two texts deserve further study by historians of the early Christian church literature at the level of meaning, readers, and possible Old Testament texts that might have influenced them. The theory can guide scholars to study any literary corpus

    Readability Indices Do Not Say It All on a Text Readability

    No full text
    We propose a universal readability index, GU, applicable to any alphabetical language and related to cognitive psychology, the theory of communication, phonics and linguistics. This index also considers readers’ short-term-memory processing capacity, here modeled by the word interval IP, namely, the number of words between two interpunctions. Any current readability formula does not consider Ip, but scatterplots of Ip versus a readability index show that texts with the same readability index can have very different Ip, ranging from 4 to 9, practically Miller’s range, which refers to 95% of readers. It is unlikely that IP has no impact on reading difficulty. The examples shown are taken from Italian and English Literatures, and from the translations of The New Testament in Latin and in contemporary languages. We also propose an extremely compact formula, relating the capacity of human short-term memory to the difficulty of reading a text. It should synthetically model human reading difficulty, a kind of “footprint” of humans. However, further experimental and multidisciplinary work is necessary to confirm our conjecture about the dependence of a readability index on a reader’s short-term-memory capacity

    Linguistic Mathematical Relationships Saved or Lost in Translating Texts: Extension of the Statistical Theory of Translation and Its Application to the New Testament

    No full text
    The purpose of the paper is to extend the general theory of translation to texts written in the same language and show some possible applications. The main result shows that the mutual mathematical relationships of texts in a language have been saved or lost in translating them into another language and consequently texts have been mathematically distorted. To make objective comparisons, we have defined a “likeness index”—based on probability and communication theory of noisy binary digital channels-and have shown that it can reveal similarities and differences of texts. We have applied the extended theory to the New Testament translations and have assessed how much the mutual mathematical relationships present in the original Greek texts have been saved or lost in 36 languages. To avoid the inaccuracy, due to the small sample size from which the input data (regression lines) are calculated, we have adopted a “renormalization” based on Monte Carlo simulations whose results we consider as “experimental”. In general, we have found that in many languages/translations the original linguistic relationships have been lost and texts mathematically distorted. The theory can be applied to texts translated by machines. Because the theory deals with linear regression lines, the concepts of signal-to-noise-ratio and likenss index can be applied any time a scientific/technical problem involves two or more linear regression lines, therefore it is not limited to linguistic variables but it is universal
    corecore