82022 - Machine Translation and Post-Editing (CL2)

Academic Year 2018/2019

  • Teaching Mode: Traditional lectures
  • Campus: Forlì
  • Programme: Second cycle degree programme (LM) in Specialized translation (code 9174)

Learning outcomes

The student:

  • understands the theoretical principles of machine translation; is able to use one or more machine translation engines effectively and to post-edit (or revise) automatically translated texts from a variety of sublanguages;
  • is able to devise, manage and evaluate complex machine translation and post-editing projects, involving several professionals and a variety of skills and competences, in a way that is consistent with professional ethics;
  • is able to acquire higher-level knowledge and competences in the areas of machine translation and post-editing independently, and to apply them to novel tasks.

Course contents

The "Machine Translation and Post-editing" (MatPed) module is delivered in the second semester. It is one of the three modules that make up the "Translation Technology and methods" course, together with "Terminology and Information Mining" (TerMine), taught by Prof. Adriano Ferraresi in the first semester, and "Computer-Assisted Translation and Web Localization" (CatLoc), which I teach in the second semester.

The MatPed module consists of two closely connected and interdependent parts, one devoted to machine translation (MT) and the other to post-editing (PE).

The first part introduces the key theoretical principles of MT. It starts with a brief overview of the major milestones in its history, from the 1940s up to the latest developments, and then presents the main architectures of MT systems. An in-depth look at linguistic and translation phenomena that are particularly challenging for MT raises the students' awareness, so that they can critically and objectively evaluate both the potential and the limitations of this technology. The evaluation of MT quality and of the effectiveness of MT systems is then discussed, presenting both standard human evaluation methods (e.g. fluency and adequacy judgements) and state-of-the-art automatic evaluation metrics such as BLEU, NIST, METEOR and TER. This is followed by an exploration of related issues, such as evaluating the time and effort required to introduce MT into translation workflows, and MT quality estimation and prediction. Once the theoretical concepts and key notions have been explained, these topics are also covered through practical exercises and reflective activities concerning a range of scenarios in which MT can be deployed. This helps the students to weigh its pros and cons systematically, simulating the professional environments of companies and institutions that use MT software on a regular basis.
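As a rough illustration of how an edit-based automatic metric works, the sketch below computes a word-level edit rate between an MT output and a reference translation. It is a simplification and not the official TER implementation: real TER also counts block shifts as single edits, and the function name `word_edit_rate` and the whitespace tokenization are assumptions made for this example.

```python
def word_edit_rate(hypothesis: str, reference: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Simplified, TER-like score: block shifts are NOT modelled here.
    """
    hyp = hypothesis.split()  # naive whitespace tokenization (assumption)
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edits needed to turn the hyp words seen so far
    # into the first j reference words
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        current = [i]
        for j, ref_word in enumerate(ref, start=1):
            substitution = previous[j - 1] + (hyp_word != ref_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One missing word against a six-word reference
    print(word_edit_rate("the cat sat on mat", "the cat sat on the mat"))
```

A lower score means fewer edits are needed to turn the MT output into the reference, which is why metrics of this family are often read as a proxy for post-editing effort.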

The second part focuses on the PE of texts translated with MT systems, comparing it with other strategies that help make automatically translated texts useful and accessible. These include pre-editing, the use of controlled language to draft restricted input, and the 'sublanguage' approach for certain specialized domains, in particular when a single source text has to be automatically translated into several target languages. Different levels of PE intervention (minimal, medium, full, etc.) are discussed in relation to variables such as the specific conditions of the revision task, the post-editor's profile (bilingual, monolingual speaker of the target language, domain expert, etc.), the type of translation, the publication venue and planned circulation of the revised target text, and its potential readers and users. Various PE strategies for improving the raw output of MT systems are also presented. Their aim is to obtain a target text that meets the specific requirements of the translation context, e.g. by making as few changes as possible or by aiming for high (publishable) quality in the final text, depending on the circumstances. The quality and effectiveness of post-editing are considered in terms of the time gains it allows, which depend on the linguistic standard required for a particular target text and, in turn, on its planned use. Finally, the students are guided to explore the connections between the use of MT with post-editing and the work of professional translators who normally use computer-assisted translation tools, in particular translation memories.

In both parts, methods and resources presented during the TerMine module will be used.

Readings/Bibliography

The lessons on theoretical aspects draw on the following bibliographical references:

Arnold, D.J., L. Balkan, S. Meijer, R. Lee Humphreys & L. Sadler (1994) "Machine Translation: An Introductory Guide". London: Blackwells-NCC. Available online: www.essex.ac.uk/linguistics/external/clmt/MTbook

Bentivogli, L., Bisazza, A., Cettolo, M., Federico, M. (2016) "Neural versus Phrase-Based Machine Translation Quality: a Case Study". arXiv preprint arXiv:1608.04631.

Bersani Berselli, G. (edited by) (2011) "Usare la Traduzione Automatica". Bologna: CLUEB.

Carl, J., Gutermuth, S. & Hansen Schirra, S. (2015) "Post-Editing Machine Translation. Efficiency, strategies and revision processes in professional translation settings". In Psycholinguistic and Cognitive Inquiries into Translation and Interpreting. Amsterdam and Philadelphia: John Benjamins.

Hutchins, John (1986) "Machine Translation: Past, Present, Future". Chichester: Ellis Horwood. Available online: www.hutchinsweb.me.uk/PPF-TOC.htm

Hutchins, W.J. & H.L. Somers (1992) "An Introduction to Machine Translation". London: Academic Press. Available online: www.hutchinsweb.me.uk/IntroMT-TOC.htm

Hutchins, W.J. & H.L. Somers (1995) "Introduccion a la Traduccion Automatica". Madrid: Visor [Spanish translation of Hutchins & Somers (1992)].

Cho, K., van Merrienboer, B., Bahdanau, D., Bengio, Y. (2014) "On the Properties of Neural Machine Translation: Encoder-Decoder Approaches". arXiv preprint arXiv:1409.1259.

Loffler-Laurian, Anne Marie (1996) "La Traduction Automatique". Villeneuve d'Ascq: Presses Universitaires du Septentrion.

Quah, C.K. (2006) "Translation and Technology". Basingstoke: Palgrave Macmillan.

Somers, Harold (ed.) (2003) "Computers and Translation: A Translator's Guide". Amsterdam and Philadelphia: John Benjamins.

Guerberof, Ana (2009) "Productivity and Quality in the Post-editing of Outputs from Translation Memories and Machine Translation". Localisation Focus 7(1): 11-21. Available online: http://isg.urv.es/library/papers/2009_Ana_Guerberof_Vol_7-11.pdf

NIST (2007) Post Editing Guidelines for GALE Machine Translation Evaluation. Available online: http://projects.ldc.upenn.edu/gale/Translation/Editors/GALEpostedit_guidelines-3.0.2.pdf

O'Brien, Sharon (2002) “Teaching Post-editing: A Proposal for Course Content”. Proceedings of the 6th EAMT Workshop on “Teaching Machine Translation”. EAMT/BCS, UMIST, Manchester, UK. 99-106. Available online: http://mt-archive.info/EAMT-2002-OBrien.pdf

Poulis, Alexandros and David Kolovratnik (2012) "To Post-edit or not to Post-edit? Estimating the Benefits of MT Post-editing for a European Organization". Proceedings of the AMTA 2012 Workshop on Post-editing Technology and Practice (WPTP 2012). The Tenth Biennial Conference of the Association for Machine Translation in the Americas, October 28-November 1 2012, San Diego, CA, USA. Available online: http://amta2012.amtaweb.org/AMTA2012Files/html/9/9_paper.pdf

Toral, A., Sanchez-Cartagena V.M. (2017) "A Multifaceted Evaluation of Neural versus Phrase-Based Machine Translation for 9 Language Directions". Available online: https://arxiv.org/pdf/1701.02901.pdf

Teaching methods

In addition to covering the theoretical aspects, the lectures take a participatory approach and are run as workshops.

Theoretical aspects are presented by the lecturer and explored in more depth independently by the students through readings assigned during the course. The applied part consists of hands-on practice in the lab led by the lecturer and take-home exercises to be done autonomously or in groups by the students.

Attendance is compulsory (students must attend at least 70% of lessons).

Assessment methods

Assessment will be based on a written test lasting two hours. It will consist of a theoretical and a practical question focusing on the theoretical principles covered in class as well as their applications in professional translation, with a critical analysis of the relevant processes and potential.

The practical question is worth 20 points. Serious errors (failure to meet the requirements stated in the instructions provided) lead to a deduction of 2 points, less serious errors (wrong settings) to a deduction of 1 point, and careless slips to a deduction of 0.5 points.

The theoretical question is worth 10 points and will be evaluated according to the following criteria: correctness, completeness and clarity of the information presented.

The final mark of the "Translation Technology and methods" course will be calculated as the arithmetic mean of the marks obtained in the TerMine, CatLoc and MatPed modules.
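The marking arithmetic described above can be sketched as follows. This is only an illustration under stated assumptions: the syllabus does not say that the practical score is floored at zero or that the module mark is the plain sum of the two questions (20 + 10 points), and all function and constant names are hypothetical.

```python
from statistics import mean
from typing import Mapping

PRACTICAL_MAX = 20.0
# Deductions per error type, as stated for the practical question
DEDUCTIONS = {"serious": 2.0, "less_serious": 1.0, "slip": 0.5}


def practical_score(errors: Mapping[str, int]) -> float:
    """Subtract the deduction for each error from the 20 available points."""
    penalty = sum(DEDUCTIONS[kind] * count for kind, count in errors.items())
    # Assumption: the score cannot drop below zero
    return max(0.0, PRACTICAL_MAX - penalty)


def module_mark(practical: float, theoretical: float) -> float:
    """Assumption: the module mark is the sum of both questions (max 30)."""
    return practical + theoretical


def course_mark(termine: float, catloc: float, matped: float) -> float:
    """Arithmetic mean of the three module marks."""
    return mean([termine, catloc, matped])


if __name__ == "__main__":
    practical = practical_score({"serious": 1, "less_serious": 2, "slip": 1})
    matped = module_mark(practical, theoretical=9.0)
    print(practical, matped, course_mark(28.0, 30.0, matped))
```

For example, one serious error, two less serious errors and one slip cost 4.5 points, leaving 15.5 of the 20 points for the practical question.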

The final marks are published on the Moodle platform a few days after the test.

Teaching tools

Lessons are held in a computer lab with an Internet connection and a projector. Students will also use (online) translation software, tools and resources.

Teaching materials are made available on the Moodle platform.

Links to further information

http://moodle.sslmit.unibo.it/course/view.php?id=1116

Office hours

See the website of Claudia Lecci