29398 - Computational Linguistics (1) (LM)

Academic Year 2019/2020

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Degree programme: Second cycle degree programme (LM) in Digital Humanities and Digital Knowledge (cod. 9224)

    Also valid for Second cycle degree programme (LM) in Computer Science (cod. 8028)

Learning outcomes

At the end of this course, the student acquires foundational notions of Natural Language Processing, with particular attention to statistical and algorithmic techniques. The methods and tools of Natural Language Processing are then applied at each level of linguistic analysis.

Course contents

  • Part I: Foundations
    • Introduction
      • Natural Language Processing - Problems and perspectives
      • Introduction to (or review of) probability calculus
        • N-grams and Language Models
        • Markov Models
      • Recurrent Neural Network Language Models
      • The evaluation of NLP applications
    • Corpora
      • Corpora and their construction: representativeness
      • Concordances, collocations and measures of word association
      • Methods for Text Retrieval
  • Part II: Natural Language Processing
    • Computational Phonetics
      • Speech samples: properties and acoustic measures
      • Analysis in the frequency domain, Spectrograms
      • Applications in acoustic phonetics
      • Speech recognition with HMM and Deep Neural Networks
    • Computational Morphology
      • Morphological operations
      • Static lexica, Two-level morphology
    • Computational Syntax
      • Part-of-speech tagging
      • Grammars for natural language
      • Natural language parsing
      • Supplementary worksheet: formal grammars for NL
        • Formal languages and Natural languages. Natural language complexity
        • Phrase structure grammars, Dependency Grammars
        • Treebanks
        • Modern formalisms for parsing natural languages
    • Computational Semantics
      • Lexical semantics: WordNet and FrameNet
      • Word Sense Disambiguation
      • Word-Space models
      • Logical approaches to sentence semantics
  • Part III: Applications and case studies
    • Emotions and Sentiment in Speech and language
    • Topic modelling
    • (Automatic detection of Prosodic Prominence)
    • (Stylometry and Dialectometrics)

Readings/Bibliography

Some chapters extracted from:

  • McEnery T., Wilson A. (1996). Corpus Linguistics, Edinburgh University Press.
  • Jurafsky D., Martin J.H. (2008). Speech and Language Processing, Prentice Hall.
  • Clark A., Fox C., Lappin S. (2010). The Handbook of Computational Linguistics and Natural Language Processing, Blackwell Handbooks in Linguistics.
  • Mitkov R. (ed.) (2003). The Oxford Handbook of Computational Linguistics, Oxford University Press.
  • Ritchie C., Mellish C. (2000). Techniques in Natural Language Processing.

Slides, handouts and papers can be downloaded from the course web site.

Teaching methods

Face-to-face classes and lab sessions, 30 hours in total.

Assessment methods

Students attending this course can choose between two different exam types:

  • develop a project approved in advance by the teacher, write a report on it (at least 10 pages), and discuss it at the oral exam, together with some questions on other course topics. A list of project proposals can be found here. Students may also suggest other project topics to the teacher;
  • a classical oral colloquium consisting of at least three questions on the course contents.


The oral colloquium is designed to evaluate the critical skills and methodological knowledge gained by the student.

A clear overall view of the course topics, expressed with correct terminology, earns the highest marks. Rote knowledge of the course topics, or terminology that is not fully appropriate, earns intermediate marks. Gaps in knowledge of the topics or inappropriate use of terminology earn, depending on the seriousness of the omissions, minimal or insufficient marks.

With regard to the project report, a critical analysis of the problem and of the proposed solution(s), together with a proper evaluation of the proposed solution(s), is required to earn the highest marks.

It is compulsory to register for the exam using the online procedure.

Teaching tools

The course web site is the central point for any kind of information about the course. It contains the handouts and the readings discussed during the lessons as well as a rich software repository useful for laboratory practice.

A live USB key containing a complete computing environment has been prepared so that students can practise the procedures presented during the course; it can be downloaded from the course website. It will also be used in the laboratory sessions.

Links to further information

http://corpora.ficlit.unibo.it/NLP/

Office hours

See the website of Fabio Tamburini

SDGs

  • Quality education
  • Partnerships for the goals

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.