85446 - Knowledge Representation and Extraction (1) (LM)

Course Unit Page

  • Teacher Andrea Giovanni Nuzzolese

  • Credits 6

  • SSD INF/01

  • Language English

  • Campus of Bologna

  • Degree Programme Second cycle degree programme (LM) in Digital Humanities and Digital Knowledge (cod. 9224)

  • Course Timetable from Mar 21, 2022 to May 03, 2022

SDGs

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.

Quality education

Academic Year 2021/2022

Learning outcomes

This course introduces methods for interpreting data and content as knowledge sources. At the end of the course, students will be able to: master the basics of knowledge representation and reasoning, with application to the Semantic Web (ontologies, linked data, knowledge patterns); be familiar with the state of the art in knowledge representation and extraction technologies; use applications to automatically extract knowledge from text; analyse the knowledge requirements of a customer and produce a plan to implement them.

Course contents

Students will master the basics of the representation and extraction of knowledge, understood as data suitable for machine querying together with automated reasoning and generalised inference. The course covers the following themes:

  • Knowledge Representation as computational logic methods (knowledge graphs, ontology design patterns, description logic, linked data frameworks)
  • Knowledge Extraction as a hybridization of rule-based heuristics (scraping, linguistic patterns, graph-based data analysis) and statistical methods (machine learning, data mining) for extracting data from arbitrary content
  • Representation and extraction will be presented both from a foundational (philosophical, cognitive) viewpoint and in terms of optimally satisfying task-oriented requirements
  • The Web will be the computational platform for learning, testing, and applying the methods, with examples from multiple domains and a focus on social sciences and humanities
  • Multiple software components will be introduced during the course to familiarise students with semantic technologies
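
As a purely illustrative sketch of how the first two themes fit together (not part of the official course materials), the Python fragment below extracts subject-predicate-object triples from text with a single hand-written linguistic pattern, stores them as a tiny knowledge graph, and applies one subclass inference rule. The pattern, the sample sentences, the added schema triple, and the function names are all assumptions made for this example; real projects would typically rely on dedicated semantic technologies rather than plain Python sets.

```python
import re

# Hypothetical linguistic pattern: "<Subject> is a/an <class>".
IS_A = re.compile(r"\b([A-Z][\w ]*?) is an? ([a-z][\w ]*?)(?=[.,;]|$)")


def extract_triples(text):
    """Rule-based extraction: return (subject, predicate, object) triples."""
    return {(s.strip(), "rdf:type", o.strip()) for s, o in IS_A.findall(text)}


def infer_types(graph):
    """Apply one rule to a fixpoint: x type C, C subClassOf D => x type D."""
    graph = set(graph)
    changed = True
    while changed:
        new = {(x, "rdf:type", d)
               for (x, p1, c) in graph if p1 == "rdf:type"
               for (c2, p2, d) in graph if p2 == "rdfs:subClassOf" and c2 == c}
        changed = not new <= graph
        graph |= new
    return graph


def query(graph, s=None, p=None, o=None):
    """Return matching triples in sorted order; None acts as a wildcard."""
    return sorted(t for t in graph
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


# Sample text and schema triple are invented for illustration only.
text = "Dante is a poet. Bologna is a city."
graph = extract_triples(text) | {("poet", "rdfs:subClassOf", "writer")}
closed = infer_types(graph)

for triple in query(closed, s="Dante", p="rdf:type"):
    print(triple)
```

The query at the end returns both the extracted fact and the inferred one, which is the sense in which represented knowledge supports machine querying together with automated reasoning.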

Readings/Bibliography

Specific teaching materials, including papers, slides and exercises, will be available on the course site.
In addition, the following texts can be used as general references for the course:

  • Logic in Action course by Johan van Benthem's group (chapters 1 to 4 only): http://www.logicinaction.org
  • World Wide Web Consortium Semantic Web standards portal: https://www.w3.org/standards/semanticweb/
  • P. Hitzler, A. Gangemi, K. Janowicz, A. Krisnadhi, V. Presutti (eds.). Ontology Engineering with Ontology Design Patterns: Foundations and Applications, IOS Press, Amsterdam (2016)
  • A. Gangemi. A Comparison of Knowledge Extraction Tools for the Semantic Web. Proceedings of ESWC 2013, LNCS, Springer (2013)

Teaching methods

The course will be taught in lectures of 3 hours each, possibly including guest lectures.
A tutor will complement the course with hands-on sessions and small projects to improve practical skills in realistic settings.

Time permitting, a small project will be implemented by groups of students. Informal contests, such as a semantic treasure hunt, may be proposed to students.

Projects and informal contests will contribute to the final grade.

Assessment methods

The final exam will consist of a project to be presented to and defended before the teacher. Students will work in small teams (typically 3 members), choosing a realistic domain or problem that can be modelled, investigated, evaluated, and published on the Web.
The choice will be guided by the teacher and the tutor, who will also support the teams in their work when needed. The work should be fairly distributed among the members, and justified during the exam presentation.
In well-justified cases, a student can take a written test instead (discouraged due to the practical nature of the course).
Students from past years can still choose between a project and a written test.


The final grades will be communicated to the students, and later averaged with the other module's grades.

Teaching tools

Besides the teaching facilities installed in the lab, students will use software tools for the design, reasoning, extraction, querying, and visualization of formal knowledge on the existing machines, working alone, in pairs, or in groups. The tools will let students experience a realistic setting for real-world semantic technology projects.


Social media will also be used for informal interaction among students and with the teacher.

Office hours

See the website of Andrea Giovanni Nuzzolese