    Text Mining on Elementary Forms in Complex Lexical Structures

    After showing the advantages of formulating lexical structures with variable elements in terms of symbolic objects, the authors propose introducing the information that determines their construction into the analysis of elementary units. It is worth noting that, when dealing with symbolic data, the observed textual units disappear in the collapsing procedure. To visualize the forms, an analysis of the elementary data is proposed that introduces external information on the complex structure to which they belong. This analysis is a useful complement to the analysis of symbolic objects, because it makes it possible to study how the forms depend on the contextual information in which they were used. To enrich the analysis of textual data, other external information, related to the fragments in which the forms appear, can be introduced. Doing so through a double partial analysis represents, in low-dimensional spaces, the relational structure between the two sets of information introduced. Forms and fragments can be represented as supplementary points in order to study the role they play in those relations. An application to a very large set of lexical structures with variable elements, extracted from the Italian newspaper “La Repubblica” during the 1990s, shows the relation between years and contextual information in the different contexts identified, and the individual forms mainly involved.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two types of co-citation counting: the traditional type, which counts only the first author of a cited work, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced by the non-traditional counting contains more coherent author groups and is therefore considerably clearer. However, when the same number of top-ranked authors is selected and analyzed, this picture represents fewer specialties in the research field being studied than the one produced by traditional first-author counting. Reasons for these effects are discussed.
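
The difference between the two counting schemes can be illustrated with a minimal Python sketch. It is not the study's implementation: the function name `cocitation_counts`, the `max_authors` parameter, and the toy data are hypothetical, and the sketch assumes that a pair of authors is counted once per citing paper and that co-authors of the same cited work also count as co-cited.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations (illustrative sketch, not the study's code).

    citing_papers: list of reference lists; each reference is an ordered
                   list of author names for one cited work.
    max_authors:   number of leading authors credited per cited work
                   (1 = first-author counting, 5 = first-five counting).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect every author credited by this citing paper.
        credited = set()
        for authors in references:
            credited.update(authors[:max_authors])
        # Assumption: each unordered pair counts once per citing paper.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, each citing two works.
papers = [
    [["Smith", "Jones"], ["Lee"]],
    [["Smith"], ["Lee", "Kim"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(first_author)
print(first_five)
```

With first-author counting only the pair (Lee, Smith) appears; extending the count to later authors brings Jones and Kim into the co-citation data, which is the kind of change in input that the abstract says alters the resulting maps.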

    The Potential of TaLTaC for Anonymizing Court Judgments

    Documents originating from the Public Administration are an important source of data for the analysis of social and economic phenomena, but the references to natural persons in the text make anonymization procedures necessary in order to comply with privacy regulations. These references vary in nature, depending also on the type of document available. In criminal court judgments, for example, the natural persons involved have very different roles (defendant, injured party, civil party, witness, etc.) and are identified by first name and surname, place and date of birth and, in more recent judgments, also by tax code (codice fiscale). In civil divorce judgments, moreover, the natural persons mentioned may also include children, some of them minors. A further privacy safeguard would require anonymizing the mentions of professional figures (lawyers, experts, and the judges themselves) included in any judgment, whether civil or criminal.