82011 - Corpus Linguistics

Course Unit Page

  • Teacher Silvia Bernardini

  • Credits 6

  • SSD L-LIN/12

  • Teaching Mode Traditional lectures

  • Language English

  • Campus of Forli

  • Degree Programme Second cycle degree programme (LM) in Specialized translation (cod. 9174)

  • Teaching resources on Virtuale


This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.

Quality education; Partnerships for the goals

Academic Year 2021/2022

Learning outcomes

The student knows the basic features (terms, concepts, methods and techniques) needed to build and analyse corpora; s/he is able to understand, analyse and evaluate the structures, functions and textual and discursive organization of the English language; s/he is able to employ the knowledge acquired through the empirical analysis of texts to inform translation choices and the production of coherent and complex written texts and oral speeches in a variety of specialized text types and genres.

Course contents

Corpus linguistics is an approach to the empirical analysis of languages based on techniques and methods for the qualitative and quantitative analysis of collections of texts in electronic format. Combining lectures and workshops, the course offers a theoretical and practical introduction to corpus linguistics that will allow students to apply the acquired knowledge to practical tasks: collecting texts, structuring them and annotating them with metadata, generating research hypotheses, and analysing and describing the results obtained. Different types of corpora (monolingual, comparable, parallel) and different tools for their consultation (corpus query tools or concordancers) will be presented. The main focus will be on the English language, also in relation to other languages known by the students, with constant reference to the applications of corpus linguistics to translation studies (also known as corpus-based translation studies). At the end of the course students will be able to extend their acquired knowledge to application and research areas such as the creation of textual resources for machine translation, the didactic use of corpora, discourse analysis and stylistics.
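As a rough illustration of what a concordancer does, the following minimal Python sketch prints keyword-in-context (KWIC) lines for a search word. It is not taken from the course materials: the function name `kwic`, the `width` parameter and the sample sentences are assumptions made for this example, and real tools such as those used in the workshops offer far richer query options.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of keyword.

    Hypothetical helper for illustration only; `width` is the number of
    characters of context shown on each side of the match.
    """
    lines = []
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so that the keywords line up.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines


# Invented sample text, standing in for a real corpus.
sample = (
    "A corpus is a collection of texts. "
    "A parallel corpus pairs source texts with their translations."
)

for line in kwic(sample, "corpus", width=20):
    print(line)
```

Aligning the search word in a central column is what lets recurring patterns to its left and right stand out when many lines are read together.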

Academic writing competences are the focus of a dedicated 20-hour seminar ("lettorato").


Readings/Bibliography

Crawford, W. and Csomay, E. 2016. Doing corpus linguistics. Oxford and New York: Routledge.

Egbert, J., Larsson, T. and Biber, D. 2020. Doing linguistics with a corpus. Cambridge: Cambridge University Press.

McEnery, T. and Hardie, A. 2012. Corpus linguistics. Method, theory and practice. Cambridge: Cambridge University Press.

Mikhailov, M. and Cooper, R. 2016. Corpus linguistics for translation and contrastive studies. Oxford and New York: Routledge.

Other readings will be chosen jointly by the lecturer and the students, based on the areas of application of corpus linguistics focused upon. Students will be encouraged to actively search for relevant literature, and to share it with the class.

Teaching methods

The module is structured around a) a series of lectures covering the main theoretical and methodological aspects of corpus linguistics, and b) extensive hands-on, workshop-like lessons in which students apply the knowledge gained in the lectures by building and using their own corpora and by consulting existing ones available in the public domain.

Hands-on activities are problem-based, i.e. they revolve around authentic problems that students solve working autonomously or in small groups. Peer support and the lecturer's scaffolding create a relaxed learner-centred environment conducive to the development of relational and problem-solving skills.

Assessment methods

Learning progress is monitored through observation and interaction in class and through unassessed coursework, such as oral presentations and short writing exercises, modelled on the final exam.

The end-of-course exam consists of preparing an abstract that outlines a research project (in linguistics, translation studies, language teaching, etc.) involving the use of language corpora.

Abstracts should be between 800 and 1,000 words and include a list of references (not included in the word count). They should provide a clear outline of the aim of the paper, including clearly articulated research question(s), details about the research approach and method(s), and (preliminary) results.

The abstracts will be submitted to the course teacher, who will make a preliminary assessment. A brief interview with the candidate about the work will follow, resulting in a final grade.

A maximum of 10 points out of 30 is assigned on the basis of formal/linguistic aspects: written and oral academic English language skills (lexis and grammar, structure, register and genre). The remaining 20 points are assigned on the basis of content (competences and skills related to the subject matter of the module): understanding of theoretical notions, command of techniques for searching, analysing and reporting corpus data, capacity for original thought and argumentation.

Formal/linguistic aspects (language skills)

  • 10 points: excellent language skills
  • 8-9 points: good/very good language skills
  • 7 points: sufficient language skills
  • 0-6 points: inadequate/seriously deficient language skills

Content-related aspects (competences and skills related to the subject matter of the module)

  • 20 points: excellent subject-related competences and capacities
  • 16-19 points: very good subject-related competences and capacities
  • 12-15 points: good subject-related competences and capacities
  • 11 points: sufficient subject-related competences and capacities
  • 0-10 points: inadequate/seriously deficient subject-related competences and capacities
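The arithmetic of the two scales above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: points are taken to be whole numbers, the final mark is taken to be the plain sum of the two components (as the 10 + 20 = 30 split suggests), and the function names are invented for this example.

```python
def language_band(points):
    """Map language points (0-10) to the descriptor listed above."""
    if not 0 <= points <= 10:
        raise ValueError("language points must be between 0 and 10")
    if points == 10:
        return "excellent"
    if points >= 8:
        return "good/very good"
    if points == 7:
        return "sufficient"
    return "inadequate/seriously deficient"


def content_band(points):
    """Map content points (0-20) to the descriptor listed above."""
    if not 0 <= points <= 20:
        raise ValueError("content points must be between 0 and 20")
    if points == 20:
        return "excellent"
    if points >= 16:
        return "very good"
    if points >= 12:
        return "good"
    if points == 11:
        return "sufficient"
    return "inadequate/seriously deficient"


def final_mark(language_points, content_points):
    """Assumed here to be the simple sum of the two components, out of 30."""
    # Validate both components by reusing the band functions.
    language_band(language_points)
    content_band(content_points)
    return language_points + content_points


# Example: good language skills and very good content.
print(language_band(9), "+", content_band(17), "=", final_mark(9, 17))
```

For instance, the lowest "sufficient" score on both scales (7 and 11) adds up to 18 out of 30 under this assumption.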

Teaching tools

Both lecture-like and workshop-like sessions take place in a computer lab equipped with PCs and a data projector, so that classes can switch back and forth between the two teaching methods.

Slides are used for lectures and subsequently made available to the students via the Moodle/Virtuale platform, in PDF or MS PPT format.

During workshop sessions, students have individual hands-on access to software for constructing and analysing corpora (e.g., Intertext editor, AntConc, NoSketch Engine).

Given the teaching methods of this course unit, all students must attend the online Modules 1 and 2 on Health and Safety.

Office hours

See the website of Silvia Bernardini