96639 - MACHINE TRANSLATION

Academic Year 2022/2023

  • Teaching Mode: Traditional lectures
  • Campus: Forli
  • Course: Second cycle degree programme (LM) in Specialized translation (cod. 9174)

Learning outcomes

The student knows the history, theoretical principles and state-of-the-art developments of machine translation; s/he is able to devise, carry out, manage and evaluate complex machine translation projects involving several professionals and a variety of skills and competences, including those for pre- and post-editing in a variety of registers and sublanguages, in a way that is consistent with professional ethics; s/he is able to acquire higher-level knowledge and competences related to machine translation technology independently, and to employ them for the optimization of related language industry processes.

Course contents

The "Machine Translation" (MT) module is delivered in the second semester.

This course introduces the key theoretical principles of MT. It starts with a brief overview of the major milestones in its history, from the 1940s to the latest developments, and then presents the main architectures of MT systems.
An in-depth look follows at linguistic and translation phenomena that are particularly challenging for MT, so that students can critically and objectively evaluate both the potential and the limitations of this technology. Complex topics concerning the evaluation of MT quality and the effectiveness of MT systems are discussed, presenting both standard human evaluation methods (e.g. fluency and adequacy judgements) and state-of-the-art automatic evaluation metrics, such as BLEU and BERTScore.
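
As an illustration of how an automatic metric such as BLEU works, the following is a minimal sketch rather than a reference implementation. It assumes whitespace tokenization, a single reference per segment and no smoothing; the function names `ngrams` and `corpus_bleu` are illustrative. Evaluations meant for reporting would normally rely on an established tool (for example sacreBLEU) instead.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per segment and no smoothing.

    Assumption: both inputs are lists of plain strings, split on whitespace.
    """
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # n-grams produced by the system, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU = 0

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty lowers the score of output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    system_output = ["the cat sat on the mat today"]
    reference = ["the cat sat on the mat"]
    print(round(corpus_bleu(system_output, reference), 4))
```

The score lies between 0 and 1 and only measures n-gram overlap with the reference, which is one reason why it is discussed alongside human judgements of fluency and adequacy.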

Going beyond theoretical and methodological presuppositions, students will then learn how to collect, process, and select data to train and adapt an MT system.

This is followed by an exploration of related issues, such as evaluating the time, effort and cost required to introduce MT into translation workflows, MT quality estimation and prediction, and the relative and absolute performance of MT systems across different language pairs and linguistic domains.

Based on the explanations of the theoretical concepts and of the key methodological notions, these topics are addressed with practical exercises and reflective activities concerning a range of scenarios in which MT can be deployed. This helps the students to systematically consider its pros and cons, simulating the professional environments of companies and institutions that use MT software on a regular basis.

Readings/Bibliography

In the lessons covering theoretical aspects, the lecturer will draw on the following bibliographical references:

Bentivogli, L., Bisazza, A., Cettolo, M., Federico, M. (2016) "Neural versus phrase-based machine translation quality: a case study".
arXiv preprint arXiv:1608.04631.

Thierry Poibeau (2017), "Machine Translation", MIT Press

Joss Moorkens, S. Castilho, F. Gaspari, S. Doherty (2018) “Translation Quality Assessment: From Principles to Practice”, Springer

Bernard Scott (2018), “Translation, Brains and the Computer: A Neurolinguistic Solution to Ambiguity and Complexity in Machine Translation”, Springer

Philipp Koehn (2020), “Neural Machine Translation”, Cambridge University Press

Cho, K., van Merriënboer, B., Bahdanau, D., Bengio, Y. (2014) "On the Properties of Neural Machine Translation: Encoder-Decoder Approaches". arXiv preprint arXiv:1409.1259.

Toral, A., Sánchez-Cartagena, V.M. (2017) "A Multifaceted Evaluation of Neural versus Phrase-Based Machine Translation for 9 Language Directions". Available online: https://arxiv.org/pdf/1701.02901.pdf

Teaching methods

Lessons will be partly held as lectures (mainly covering the theoretical and methodological aspects), and partly as workshops, using a participative approach.

Theoretical and methodological aspects are presented by the lecturer and explored in more depth independently by the students through readings assigned during the course (given the fast pace at which the field evolves, the reading list included here is to be considered provisional, to be updated and refined during the course). The applied part consists of hands-on practice sessions in the lab, led by the lecturer with input and interaction by the students, and take-home exercises to be done autonomously or in groups by the students.

Attendance is compulsory (at least 70% of lessons need to be attended).

Given the teaching methods used in this course unit, all students must complete the online Modules 1 and 2 on Health and Safety [https://www.unibo.it/en/services-and-opportunities/health-and-assistance/health-and-safety/online-course-on-health-and-safety-in-study-and-internship-areas].

Assessment methods

The assessment consists of two parts: an individual project and an oral test.

The individual project will be assigned to the student during the last month of the course, and it consists of training (adapting) a machine translation system and analyzing its performance.

During the oral test, the student may discuss the approach used and results obtained in their project, as well as answer a few questions about the course contents.

Evaluation model

30 – 30L excellent results, demonstrating a thorough understanding of the course contents, as well as a good ability to explain and evaluate the pros and cons of different approaches to machine translation and its applications.

27 – 29 above-average results, with minor errors that are balanced by a good knowledge of fundamental concepts and applications.

24 – 26 good results, with some errors or knowledge gaps that show a partial understanding of contents and required skills.

21 – 23 sufficient results, but with notable gaps in the knowledge or skills covered by the course.

18 – 20 results that only prove minimal knowledge of the course contents.

< 18 insufficient: basic concepts have not been understood or demonstrated, and the student has to retake the test.

Teaching tools

Lessons are held in a computer lab with Internet connection and projector.
Students will also use (online) machine translation software, tools and resources.

Office hours

See the website of Federico Garcea