Proceedings of The 3rd Workshop on Multi-word Units in Machine Translation and Translation Technology (MUMTTT 2017)
This volume documents the proceedings of the 3rd Workshop on Multi-word Units in Machine Translation and Translation Technology (MUMTTT 2017), held on 14 November 2017 as part of the EUROPHRAS 2017 conference: "Computational and Corpus-based Approaches to Phraseology: Recent advances and interdisciplinary approaches" (London, 13-14 November 2017), jointly organised by the European Association for Phraseology (EUROPHRAS), the University of Wolverhampton (Research Institute of Information and Language Processing) and the Association for Computational Linguistics – Bulgaria. The workshop was held under the auspices of the European Society of Phraseology (EUROPHRAS), the Special Interest Group on the Lexicon of the Association for Computational Linguistics (SIGLEX), and SIGLEX's Multiword Expressions Section (SIGLEX-MWE). The workshop was co-chaired by Ruslan Mitkov (University of Wolverhampton), Johanna Monti (Università degli Studi di Sassari), Gloria Corpas Pastor (Universidad de Málaga) and Violeta Seretan (Université de Genève).
The topic of the workshop was the integration of multi-word units in machine translation and translation technology tools. Despite the progress achieved for particular types of units, such as verb-particle constructions, the identification, interpretation and translation of multi-word units in general still represent open challenges, both from a theoretical and a practical point of view. The idiosyncratic morpho-syntactic, semantic and translational properties of multi-word units pose many obstacles even to human translators, mainly because of intrinsic ambiguities, structural and lexical asymmetries between languages, and cultural differences. The aim of the workshop was to bring together researchers and practitioners working on MWU processing from various perspectives, in order to enable cross-fertilisation and foster the creation of innovative solutions that can only arise from interdisciplinary collaboration. The present edition of the workshop provided a forum for researchers and practitioners in the fields of (Computational) Linguistics, (Computational) Phraseology, Translation Studies and Translation Technology to discuss recent advances in multi-word unit processing and to coordinate research efforts across disciplines in order to improve the integration of multi-word units in machine translation and translation technology tools. The programme included five oral presentations and featured an invited talk by Carlos Ramisch, Aix-Marseille University, France. The accepted papers are indicative of the current efforts of researchers and developers who are actively engaged in improving the state of the art of multi-word unit translation. We would like to thank all authors who contributed papers to this workshop edition and the Programme Committee members who provided valuable feedback during the review process.
Multi-word Units in Machine Translation and Translation Technology
The correct interpretation of Multiword Units (MWUs) is crucial to many applications in Natural Language Processing but is a challenging and complex task. In recent years, the computational treatment of MWUs has received considerable attention but there is much more to be done before we can claim that NLP and Machine Translation (MT) systems process MWUs successfully.
This volume provides a general overview of the field with particular reference to Machine Translation and Translation Technology and focuses on languages such as English, Basque, French, Romanian, German, Dutch and Croatian, among others. The chapters of the volume illustrate a variety of topics that address this challenge, such as the use of rule-based approaches, compound splitting techniques, MWU identification methodologies in multilingual applications, and MWU alignment issues.
Multiword units in machine translation and translation technology
The correct interpretation of Multiword Units (MWUs) is crucial to many applications in Natural Language Processing but is a challenging and complex task. In recent years, the computational treatment of MWUs has received considerable attention, but we believe that there is much more to be done before we can claim that NLP and Machine Translation (MT) systems process MWUs successfully. In this chapter, we present a survey of the field with particular reference to Machine Translation and Translation Technology.
Multi-word unit processing in Machine Translation
The correct interpretation of Multiword Units (MWUs) is crucial to many applications in Natural Language Processing but is a challenging and complex task. In recent years, the computational treatment of MWUs has received considerable attention but there is much more to be done before we can claim that NLP and Machine Translation (MT) systems process MWUs successfully.
This volume provides a general overview of the field with particular reference to Machine Translation and Translation Technology and focuses on languages such as English, Basque, French, Romanian, German, Dutch and Croatian, among others. The chapters of the volume illustrate a variety of topics that address this challenge, such as the use of rule-based approaches, compound splitting techniques, MWU identification methodologies in multilingual applications, and MWU alignment issues.
Accurate Collocation Extraction Using a Multilingual Parser
This paper focuses on the use of advanced techniques of text analysis as support for collocation extraction. A hybrid system is presented that combines statistical methods and multilingual parsing for detecting accurate collocational information from English, French, Spanish and Italian corpora. The advantage of relying on full parsing over using a traditional window method (which ignores the syntactic information) is first theoretically motivated, then empirically validated by a comparative evaluation experiment.
