1,721,651 research outputs found
Machine Translation Markers in Post-Edited Machine Translation Output
The author has conducted an experiment for two consecutive years with postgraduate university students in which half do an unaided human translation (HT) and the other half post-edit machine translation output (PEMT). Comparison of the texts produced shows, rather unsurprisingly, that post-editors faced with an acceptable solution tend not to edit it, even when often more than 60% of translators tackling the same text prefer an array of other different solutions. As a consequence, certain turns of phrase, expressions and choices of words occur with greater frequency in PEMT than in HT, making it theoretically possible to design tests to tell them apart. To verify this, the author successfully carried out one such test on a small group of professional translators. This implies that PEMT may lack the variety and inventiveness of HT, and consequently may not actually reach the same standard. It is evident that the additional post-editing effort required to eliminate what are effectively MT markers is likely to nullify a great deal, if not all, of the time and cost-saving advantages of PEMT. However, the author argues that failure to eradicate these markers may eventually lead to lexical impoverishment of the target language.
Building a Custom Machine Translation Engine as part of a Postgraduate University Course: a Case Study
In 2015, I was asked to design a postgraduate course on machine translation (MT) and post-editing. Following a preliminary theoretical part, the module concentrated on the building and practical use of custom machine translation (CMT) engines. This was a particularly ambitious proposition since it was not certain that students with undergraduate degrees in languages, translation and interpreting, without particular knowledge of computer science or computational linguistics, would succeed in assembling the necessary corpora and building a CMT engine. This paper looks at how the task was successfully achieved using KantanMT to build the CMT engines and Wordfast Anywhere to convert and align the training data.
The course was clearly a success since all students were able to train a working CMT engine and assess its output. The majority agreed their raw CMT engine output was better than Google Translate’s for the kinds of text it was trained for, and better than the raw output (pre-translation) from a translation memory tool.
There was some initial scepticism among the students regarding the actual usefulness of MT, but the mood had clearly changed by the end of the course, with virtually all students agreeing that post-edited MT has a legitimate role to play.
I, robot
Article about machine translation and generative artificial intelligence published in the ITI Bulletin.
Solving Terminology Problems More Quickly with 'IntelliWebSearch (Almost) Unlimited'
Michael Farrell received several descriptions of university courses to translate from Italian into English in early 2005. The syllabuses boiled down to a list of topics and laws of mathematics and physics: not many complex sentences, but a great deal of terminology which needed translating and double checking with the utmost care and attention.
To do this, he found himself repeatedly copying terms to his PC clipboard, opening his browser, opening the most appropriate on-line resources, pasting terms into search boxes, setting search parameters, clicking search buttons, analysing results, copying the best solutions back to the clipboard, returning to the translation environment and pasting the terms found into the text.
He quickly realized that he needed to find a way to semi-automate the terminology search process in order to complete the translation in a reasonable time and for his own sanity. He immediately started looking around for a tool, but surprisingly there seemed to be nothing similar to what he needed on the market. Having already created some simple macros with a free scripting language called AutoHotkey, he set about writing something that would do the trick.
The first simple macro he knocked out gradually grew and developed until it became a fully fledged software tool: IntelliWebSearch. After speaking to several colleagues about it, he was persuaded to share his work and put together a small group of volunteer beta testers. After a few weeks of testing on various Windows systems, he released the tool as freeware towards the end of 2005.
At the beginning of his workshop, Michael Farrell will explain what prompted him to create the tool and how he went about it. He will then go on to describe its use and its limitations, and show how it can save translators and terminologists a lot of time with a live demonstration, connectivity permitting.
The workshop will conclude with a presentation revealing for the first time in public some of the features of a new version which is currently being developed under the code name "IntelliWebSearch (Almost) Unlimited" (pre-alpha at the time of writing).
The workshop is aimed at professional translators, interpreters and terminologists in all fields, especially those interested in increasing efficiency through the use of technology without lowering quality standards.
Raw Output Evaluator, a Freeware Tool for Manually Assessing Raw Outputs from Different Machine Translation Engines
Raw Output Evaluator is a freeware tool that runs under Microsoft Windows. It allows quality evaluators to compare and manually assess raw outputs from different machine translation engines. The outputs may be assessed in comparison to each other and to other translations of the same input source text, and in absolute terms using standard industry metrics or ones designed specifically by the evaluators themselves. The errors found may be highlighted using various colours. Thanks to a built-in stopwatch, the same program can also be used as a simple post-editing tool in order to compare the time required to post-edit MT output with how long it takes to produce an unaided human translation of the same input text. The MT outputs may be imported into the tool in a variety of formats, or pasted in from the PC clipboard. The project files created by the tool may also be exported and re-imported in several file formats. Raw Output Evaluator was developed for use during a postgraduate course module on machine translation and post-editing.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
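The two counting schemes contrasted in this abstract can be illustrated with a minimal Python sketch. It is not the study's own procedure: the function name `cocitation_counts`, the `max_authors` parameter and the toy data are all hypothetical, and the sketch assumes that co-authors of the same cited work also count as co-cited with each other, which is only one of several possible conventions.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often each pair of authors is cited together.

    citing_papers: one reference list per citing paper; each reference is
                   a list of author names in byline order.
    max_authors:   number of leading authors of each cited work to count
                   (1 gives first-author counting, 5 the first-five variant).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect the distinct authors this citing paper is deemed to cite.
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Every unordered pair of those authors is co-cited once.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Toy data (hypothetical): two citing papers and their reference lists.
papers = [
    [["Smith", "Jones"], ["Lee", "Kim"]],
    [["Smith", "Jones"], ["Kim"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting, Jones never appears in any pair, whereas the first-five variant records Jones and Smith as co-cited twice. This is the kind of extra information the non-traditional scheme brings in.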
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
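For readers unfamiliar with the two named measures, the following minimal Python sketch computes the Pearson correlation and the cosine for a pair of cocitation profiles and shows that they can give different values. The abstract does not spell out its argument against Pearson, so the sketch makes no claim about it. The profile vectors are made-up illustrative data, and the other two measures the paper proposes are not shown because the abstract does not name them.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length profile vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation: the cosine of the mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles: each entry is the number of times the
# author is cocited with one of four other authors.
profile_a = [3, 1, 0, 0]
profile_b = [1, 3, 0, 0]

cos_ab = cosine(profile_a, profile_b)       # about 0.6
pearson_ab = pearson(profile_a, profile_b)  # about 0.33
```

Because Pearson centres each profile on its mean before comparing, the zero entries shared by both profiles affect its value, while the cosine depends only on the raw vectors.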
