
    User-focused Task-oriented Machine Translation Evaluation for Wikis: A Case Study

    This paper reports MT evaluation experiments that were conducted at the end of year 1 of the EU-funded CoSyne project for three language combinations, considering translations from German, Italian and Dutch into English. We present a comparative evaluation of the MT software developed within the project against four of the leading free web-based MT systems across a range of state-of-the-art automatic evaluation metrics. The data sets from the news domain that were created and used for training purposes and also for this evaluation exercise, which are available to the research community, are also described. The evaluation results for the news domain are very encouraging: the CoSyne MT software consistently beats the rule-based MT systems, and for translations from Italian and Dutch into English in particular the scores given by some of the standard automatic evaluation metrics are not too distant from those obtained by well-established statistical online MT systems. © 2011 European Association for Machine Translation

    A Framework for Diagnostic Evaluation of Machine Translation Based on Linguistic Checkpoints

    This paper describes an approach to the diagnostic evaluation of machine translation (MT) based on linguistic checkpoints, which can provide valuable information both to the developers and to the end-users of MT systems. We present a flexible framework and a new tool, DELiC4MT, for fine-grained diagnostic MT evaluation which can be extended to any language pair and applied to any evaluation target, once the phenomena of interest are covered by the linguistic analysis. As a case study, we evaluate the CoSyne MT software against four leading web-based MT systems across a set of linguistic phenomena for three language pairs (from German, Italian and Dutch into English).

    A Comparative Evaluation of Research vs. Online Machine Translation Systems

    This paper reports MT evaluation experiments that were conducted at the end of year 1 of the EU-funded CoSyne project for three language combinations, considering translations from German, Italian and Dutch into English. We present a comparative evaluation of the MT software developed within the project against four of the leading free web-based MT systems across a range of state-of-the-art automatic evaluation metrics. The data sets from the news domain that were created and used for training purposes and also for this evaluation exercise, which are available to the research community, are also described. The evaluation results for the news domain are very encouraging: the CoSyne MT software consistently beats the rule-based MT systems, and for translations from Italian and Dutch into English in particular the scores given by some of the standard automatic evaluation metrics are not too distant from those obtained by well-established statistical online MT systems.

    Meta-Evaluation of a Diagnostic Quality Metric for Machine Translation

    Diagnostic evaluation of machine translation (MT) is an approach to evaluation that provides finer-grained information compared to state-of-the-art automatic metrics. This paper evaluates DELiC4MT, a diagnostic metric that assesses the performance of MT systems on user-defined linguistic phenomena. We present the results obtained using this diagnostic metric when evaluating three MT systems that translate from English to French, with a comparison against both human judgements and a set of representative automatic evaluation metrics. In addition, as the diagnostic metric relies on word alignments, the paper compares the margin of error in diagnostic evaluation when using automatic word alignments as opposed to gold standard manual alignments. We observed that this diagnostic metric is capable of accurately reflecting translation quality, can be used reliably with automatic word alignments and, in general, correlates well with automatic metrics and, more importantly, with human judgements

    A Web Application for the Diagnostic Evaluation of Machine Translation over Specific Linguistic Phenomena

    This paper presents a web application and a web service for the diagnostic evaluation of Machine Translation (MT). These web-based tools are built on top of DELiC4MT, an open-source software package that assesses the performance of MT systems over user-defined linguistic phenomena (lexical, morphological, syntactic and semantic). The advantage of the web-based scenario is clear: compared to the standalone tool, the user does not need to carry out any installation, configuration or maintenance of the tool.

    Perception vs Reality: Measuring Machine Translation Post-Editing Productivity

    This paper presents a study of user-perceived vs real machine translation (MT) post-editing effort and productivity gains, focusing on two bidirectional language pairs: English-German and English-Dutch. Twenty experienced media professionals post-edited statistical MT output and also manually translated comparative texts within a production environment. The paper compares the actual post-editing time against the users’ perception of the effort and time required to post-edit the MT output to achieve publishable quality, thus measuring real (vs perceived) productivity gains. Although for all the language pairs users perceived MT post-editing to be slower, in fact it proved to be a faster option than manual translation for two translation directions out of four, i.e. for Dutch→English, and (marginally) for English→German. For further objective scrutiny, the paper also checks the correlation of three state-of-the-art automatic MT evaluation metrics (BLEU, METEOR and TER) with the actual post-editing time.

    Neural Automatic Post-Editing Using Prior Alignment and Reranking

    We present a second-stage machine translation (MT) system based on a neural machine translation (NMT) approach to automatic post-editing (APE) that improves the translation quality provided by a first-stage MT system. Our APE system (APE_Sym) is an extended version of an attention-based NMT model with bilingual symmetry employing bidirectional models, mt → pe and pe → mt. APE translations produced by our system show statistically significant improvements over the first-stage MT, phrase-based APE and the best reported score on the WMT 2016 APE dataset by a previous neural APE system. Re-ranking (APE_Rerank) of the n-best translations from the phrase-based APE and APE_Sym systems provides further substantial improvements over the symmetric neural APE model. Human evaluation confirms that the APE_Rerank generated PE translations improve on the previous best neural APE system at WMT 2016. Santanu Pal is supported by the People Programme (Marie Curie Actions) of the European Union’s Framework Programme (FP7/2007-2013) under REA grant agreement no 317471. Sudip Kumar Naskar is supported by Media Lab Asia, MeitY, Government of India, under the Young Faculty Research Fellowship of the Visvesvaraya PhD Scheme for Electronics & IT. Qun Liu and Josef van Genabith are supported by funding from the European Union Horizon 2020 research and innovation programme under grant agreement no 645452 (QT21).

    Named entities: recognition, classification, and use

    Table of contents: -- A survey of named entity recognition and classification / David Nadeau and Satoshi Sekine -- Diversity in logarithmic opinion pools / Andrew D.M. Smith and Miles Osborne -- Handling conjunctions in named entities / Pawel Mazur and Robert Dale -- Complex named entities in Spanish texts: structures and properties / Sofia N. Galicia-Haro and Alexander Gelbukh -- Named entity recognition and transliteration in Bengali / Asif Ekbal, Sudip Kumar Naskar and Sivaji Bandyopadhyay -- A note on the semantic and morphological properties of proper names in the Prolex project / Duško Vitas, Cvetana Krstev, and Denis Maurel -- Cross-lingual named entity recognition / Ralf Steinberger and Bruno Pouliquen. Previously published in Lingvisticae Investigationes 30 (2007) 1. Includes index and bibliographical references.

    Supertags as source language context in hierarchical phrase-based SMT

    Statistical machine translation (SMT) models have recently begun to include source context modeling, under the assumption that the proper lexical choice of the translation for an ambiguous word can be determined from the context in which it appears. Various types of lexical and syntactic features have been explored as effective source context to improve phrase selection in SMT. In the present work, we introduce lexico-syntactic descriptions in the form of supertags as source-side context features in the state-of-the-art hierarchical phrase-based SMT (HPB) model. These features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. In our experiments two kinds of supertags are employed: those from lexicalized tree-adjoining grammar (LTAG) and combinatory categorial grammar (CCG). We use a memory-based classification framework that enables the efficient estimation of these features. Despite the differences between the two supertagging approaches, they give similar improvements. We evaluate the performance of our approach on an English-to-Dutch translation task, and report statistically significant improvements of 4.48% and 6.3% BLEU scores in translation quality when adding CCG and LTAG supertags, respectively, as context-informed features