- Teacher: Angelo Di Iorio
- Credits: 8
- SSD: INF/01
- Language: Italian
- Teaching Mode: Traditional lectures
- Campus: Bologna
- Course: Second cycle degree programme (LM) in Digital Innovation Policies and Governance (cod. 5889)
Learning outcomes
The course aims to provide knowledge about models, processes, and tools for the representation and processing of digital documents, particularly administrative and regulatory documents, and their organization into accessible and interoperable document repositories.
At the end of the course, the student will know the main techniques for extracting information from text documents.
The student will be able to represent and link digital text documents, design complex document repositories, and analyze texts to automatically extract meaningful information.
Course contents
The course includes an introductory part on the main techniques for document markup and the identification of their structural components, with a particular focus on XML technologies and validation languages.
The main document formats will also be studied, along with their characteristics, fields of application, and limitations.
Subsequently, some languages and tools for the transformation between document formats will be covered.
Additionally, the course will address some models and tools for the automatic extraction of information from text, as well as the main Natural Language Processing techniques.
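As a purely illustrative sketch of the kind of task described above, the following Python snippet reads a small XML-marked regulatory document, identifies its structural components, and extracts dates from the text with a regular expression. The element names (`act`, `article`, `heading`, `p`), the sample text, and the function `extract_articles` are hypothetical and are not taken from the course materials; only the Python standard library is used.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document: the vocabulary (act/article/heading/p)
# is invented for illustration and is not a real standard.
SAMPLE = """<act id="a1">
  <title>Regulation on digital archives</title>
  <article num="1">
    <heading>Scope</heading>
    <p>This regulation enters into force on 2024-03-01.</p>
  </article>
  <article num="2">
    <heading>Deadlines</heading>
    <p>Offices must comply by 2025-01-15 and report by 2025-06-30.</p>
  </article>
</act>"""

# Matches ISO-style dates such as 2024-03-01 (format only, no calendar check).
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_articles(xml_text):
    """Return one record per <article>: its number, heading, and dates found."""
    root = ET.fromstring(xml_text)
    records = []
    for article in root.findall("article"):
        # Join the text of all paragraphs belonging to this article.
        text = " ".join(p.text or "" for p in article.findall("p"))
        records.append({
            "num": article.get("num"),
            "heading": article.findtext("heading", default=""),
            "dates": DATE_RE.findall(text),
        })
    return records


articles = extract_articles(SAMPLE)
for record in articles:
    print(record["num"], record["heading"], record["dates"])
```

Here the markup supplies the document structure, so the extraction step only needs to look inside each article. A real project would likely rely on an established document format and on dedicated NLP libraries rather than a single regular expression.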
Students with specific learning disorders (SLD) or with temporary or permanent disabilities are advised to contact the relevant University office (https://site.unibo.it/studenti-con-disabilita-e-dsa/en) and the lecturer as soon as possible, in order to identify together the most effective strategies for following the lessons and/or preparing for the examination.
Readings/Bibliography
Due to the rapid evolution of the subject, no single textbook is used.
All teaching materials will be made available on Virtuale, including references to online resources for further study and useful practice for the project work.
Teaching methods
The course includes lectures, with slides provided on Virtuale along with other teaching materials.
Laboratory activities are also planned, focusing on the languages studied during the course and the use of some Python libraries.
Assessment methods
The final evaluation will consist of the preparation and presentation of a project work, carried out individually or in groups of at most two students.
See the Italian version for details.
Teaching tools
Slides and lab exercises
Office hours
See the website of Angelo Di Iorio
SDGs
This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.