
    On automatic decipherment of lost ancient scripts relying on combinatorial optimisation and coupled simulated annealing

    This paper introduces a novel method for addressing the challenge of deciphering ancient scripts. The approach relies on combinatorial optimisation along with coupled simulated annealing, an advanced technique for non-convex optimisation. Encoding solutions through k-permutations facilitates the representation of null, one-to-many, and many-to-one mappings between signs. In comparison to current state-of-the-art systems evaluated on established benchmarks from literature and three new benchmarks introduced in this study, the proposed system demonstrates superior performance in enhancing cognate identification results
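The k-permutation encoding mentioned in the abstract above can be illustrated with a small sketch. The abstract does not spell out the paper's actual encoding, so everything here is an assumption made for illustration: the slot-and-pool scheme, the `NULL` marker, the `decode` helper, and the toy signs are all hypothetical. The idea shown is only that an ordered choice of k distinct items from a pool of n can express null, one-to-many, and many-to-one mappings once signs are duplicated and null entries are added to the pool.

```python
from itertools import permutations

NULL = None  # hypothetical marker: a slot that receives NULL maps to no known sign


def decode(kperm, lost_slots):
    """Convert a k-permutation of pool items into {lost sign: set of known signs}."""
    mapping = {lost: set() for lost in lost_slots}
    for lost, (known, _copy) in zip(lost_slots, kperm):
        if known is not NULL:
            mapping[lost].add(known)
    return mapping


# Toy setup (all names are illustrative assumptions, not taken from the paper).
# Lost sign "A" owns two slots, so it may map to two known signs (one-to-many).
lost_slots = ["A", "A", "B", "C"]
# Known sign "x" appears twice, so two lost signs may share it (many-to-one).
# The NULL entries let a slot stay unmapped.
pool = [("x", 0), ("x", 1), ("y", 0), ("z", 0), (NULL, 0), (NULL, 1)]

# One candidate solution: an ordered choice of k = 4 distinct items from the n = 6 pool items.
candidate = (("x", 0), ("y", 0), ("x", 1), (NULL, 0))
example_mapping = decode(candidate, lost_slots)

# Number of candidates an optimiser would search over in this toy case: 6 * 5 * 4 * 3.
search_space_size = sum(1 for _ in permutations(pool, len(lost_slots)))
```

In this toy candidate, "A" maps to both "x" and "y" (one-to-many), "A" and "B" share "x" (many-to-one), and "C" maps to nothing (null). An optimiser such as simulated annealing would move between candidates of this kind; the scoring function is not described in the abstract and is left out here.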

    Automatic detection of prosodic prominence by means of acoustic analyses

    Prosodic prominence is commonly regarded as the perceptual salience of a linguistic unit relative to its environment. However, we are far from having a consensus on how it is measured subjectively and how it relates to objectively measurable acoustic events or linguistic structures such as lexical stress, prosodic focus, etc. Here we will concentrate mainly on the identification of prominence by means of acoustic parameters and automatic techniques. Considering this topic, some questions are still open in the community: (a) How can we reliably define and portray prosodic prominence? (b) What is the best prominence domain in acoustics? (c) Is prominence a continuous or a discrete phenomenon? (d) What are the acoustic parameters that support it and how can we combine them to reliably identify prominence? (e) To what extent are acoustic parameters language specific? Can we identify universals across languages? (f) What is the best paradigm for the automatic identification of prominence: Rule-Based or Machine Learning Systems? (g) How can we evaluate automatic systems in the right way? This contribution will briefly address these points

    Semgrex-Plus (v1.0)

    Creating a treebank by annotating each sentence with its syntactic structure is a time-consuming and error-prone task. For these reasons, treebanks often require maintenance and revision to correct mistakes or to adapt them to different needs. In large projects such as Universal Dependencies (http://universaldependencies.org/), guideline updates prompted by the addition of new languages, changes in the theoretical treatment of a specific phenomenon, mistakes, or other changes often require specific tools that automate, as far as possible, the rewriting of treebank substructures. Moreover, treebanks developed for a specific language often need to be completely converted to adhere to other standards, for example to comply with the UD specifications and conventions. With the Semgrex-Plus tool, scholars can define appropriate sets of rules to convert dependency treebanks into different formats. The tool allows for the definition of formal rules for rewriting dependencies and token tags, and provides an algorithm for treebank rewriting that avoids rule interference during the conversion process. The tool is publicly available (https://github.com/ftamburin/Semgrex-Plus.git)

    Il ruolo del linguista nel trattamento automatico delle lingue [The role of the linguist in automatic language processing]

    In this short essay, I will attempt to trace the boundaries, admittedly very blurred, of a discipline that, born around the middle of the last century, addresses extremely topical problems, especially in these years in which information processing has become one of the central aspects of our daily life. I will also attempt to characterise the linguist's contribution to the development of automatic tools for language analysis and how this contribution has changed over the years. The obvious limits of space will force me to sketch only briefly topics that would deserve, and have deserved in the past, much broader discussion, and for this I apologise in advance

    Semgrex-Plus: a Tool for Automatic Dependency-Graph Rewriting

    This paper describes the Semgrex-Plus tool, an automatic procedure we developed to convert dependency treebanks into different formats. It allows for the definition of formal rules for rewriting dependencies and token tags, and provides an algorithm for treebank rewriting that avoids rule interference during the conversion process. This tool is publicly available

    Using Deep Neural Networks for Smoothing Pitch Profiles in Connected Speech

    This paper presents a new pitch tracking smoother based on deep neural networks (DNN). It leverages Long Short-Term Memory networks, a particular kind of recurrent neural network, to correct pitch detection errors produced by state-of-the-art Pitch Detection Algorithms. The proposed system has been extensively tested on two reference benchmarks for English and performed very well in correcting the outputs of pitch detection algorithms when compared with the gold standard obtained with laryngographs

    CLUB Working Papers in Linguistics. Volume 6

    This sixth volume of the series “CLUB Working Papers in Linguistics” collects some of the contributions presented during the initiatives organised by the Circolo Linguistico dell’Università di Bologna in the academic year 2020-2021. The first three essays, by Elisa Corino (Università di Torino), Marina Benedetti (Università per Stranieri di Siena) and Andrea Sansò (Università dell’Insubria) respectively, come from the official programme. The following three contributions were originally presented at the Circolo's periodic seminars; they are the works of Silvia Brambilla and Idea Basile (Università di Bologna and Università Roma “La Sapienza”), Marta Maffia and Massimo Pettorino (Università di Napoli “L’Orientale”) and Anna Dall’Acqua (Università di Bologna and Injenia S.r.L.). The volume closes with an article by Ottavia Cepraga, winner of the 2021 ‘Una tesi in linguistica’ prize

    State-of-the-art Italian dependency parsers based on neural and ensemble systems

    In this paper we present work that tests the most advanced, state-of-the-art syntactic dependency parsers based on deep neural networks (DNN) on Italian. We ran a large set of experiments using two Italian treebanks containing different text types, downloaded from the Universal Dependencies project, and propose a new solution based on ensemble systems. We implemented the proposed ensemble solutions by testing different techniques described in the literature, obtaining very good parsing results, well above the state of the art for Italian

    Are Quantum Classifiers Promising?

    This paper presents work in progress on the development of a new general-purpose classifier based on Quantum Probability Theory. We propose a kernel-based formulation of this classifier that is able to compete with state-of-the-art machine learning methods when classifying instances from two hard artificial problems and two real tasks taken from the speech processing domain

    Decipherment of Lost Ancient Scripts as Combinatorial Optimisation using Coupled Simulated Annealing

    This paper presents a new approach to the ancient scripts decipherment problem based on combinatorial optimisation and coupled simulated annealing, an advanced non-convex optimisation procedure. Solutions are encoded by using k-permutations allowing for null, one-to-many, and many-to-one mappings between signs. The proposed system is able to produce enhanced results in cognate identification when compared to the state-of-the-art systems on standard evaluation benchmarks used in literature