85446 - Knowledge Representation and Extraction (1) (LM)

Academic Year 2017/2018

  • Teacher: Aldo Gangemi
  • Credits: 6
  • SSD: INF/01
  • Language: English
  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Degree programme: Second cycle degree programme (LM) in Digital Humanities and Digital Knowledge (cod. 9224)

Learning outcomes

This course introduces methods for interpreting data and content as knowledge sources. At the end of the course, students will be able to: master the basics of knowledge representation and reasoning, with application to the Semantic Web (ontologies, linked data, knowledge patterns); be familiar with the state of the art in knowledge representation and extraction technologies; use applications to automatically extract knowledge from text; analyse the knowledge requirements of a customer, and produce a plan to implement them.

Course contents

The students will master the basics of knowledge representation and extraction, where knowledge is understood as data suitable for machine querying combined with automated reasoning and generalised inference. The course covers the following themes:

  • Knowledge Representation as computational logical methods (knowledge graphs, ontology design patterns, description logic, linked data frameworks)
  • Knowledge Extraction as a hybridization of rule-based heuristics (scraping, linguistic patterns, graph-based data analysis) and statistical methods (machine learning, data mining) for extracting data from arbitrary content
  • Representation and extraction will be presented from a foundational (philosophical, cognitive) viewpoint, as well as from the perspective of optimally satisfying task-oriented requirements
  • The Web will be the computational platform on which the methods are learnt, tested, and applied, with examples from cultural heritage and beyond
  • Multiple software components will be introduced during the course to familiarise students with semantic technologies

The course will be given in classroom lectures of 3 hours each, possibly including practical hands-on sessions on lab machines and talks by expert guests.

Readings/Bibliography

The following texts and Web material can be used as a reference for the course:

  • World Wide Web Consortium Semantic Web standards portal: https://www.w3.org/standards/semanticweb/
  • P. Hitzler, A. Gangemi, K. Janowicz, A. Krisnadhi, V. Presutti (eds.). Ontology Engineering with Ontology Design Patterns: Foundations and Applications, IOS Press, Amsterdam (2016)
  • A. Gangemi. A Comparison of Knowledge Extraction Tools for the Semantic Web. Proceedings of ESWC2013, LNCS, Springer (2013)
  • E. Hyvönen. Publishing and Using Cultural Heritage Linked Data on the Semantic Web. Morgan & Claypool (2012)

More teaching materials, including papers, slides and exercises, will be available on the AMS Campus site.

Teaching methods

The course is taught in 3-hour slots that combine highly interactive classroom lectures, practical hands-on sessions on lab machines, and question time with expert guests.

Time permitting, a small project will be implemented by groups of students. Informal contests, such as a semantic treasure hunt, may be proposed to students.

Projects and informal contests will contribute to the final grades.

Assessment methods

The final exam will consist of a 2-hour written test, typically based on a simple use case that requires extraction and representation decisions. Multiple-choice questions on the methods addressed during the course may also be included (the slides distributed on the AMS Campus site correspond to the main subjects).

The grades will be based on the final exam (2/3), as well as possible projects/contests run during the course (1/3).

The final grades will be published on the teacher's webpage, normally within two weeks after the exam.

Students are required to come to the teacher’s office to have the exam registered on their transcript.

Teaching tools

Besides the teaching facilities installed in the lab, students will use software tools for the design, reasoning, extraction, querying, and visualization of knowledge on the lab machines, working alone, in pairs, or in groups. The tools will let students experience a realistic setting for real-world semantic technology projects.

Social media will also be used for informal interaction among students and with the teacher.

Office hours: see Aldo Gangemi's website.
