85575 - Semantic Digital Libraries (1) (LM)

Academic Year 2023/2024

Learning outcomes

Libraries have always been an inspiration for the standards and technologies developed by semantic web activities. By the end of the course, students will know how to manage the process of creating a digital library (DL): from the choice of metadata to the selection of ontologies; from network issues to the implementation of the architecture; from the preservation of data to the curation of the life cycle of digital cultural objects.

Course contents

Learning outcomes

Libraries provide fertile ground for applying semantic web standards and technologies, as well as more recent developments in semantic services powered by machine learning technology. This course offers an overview of the most recent trends in semantic technologies for libraries.

After completing this course, the student is able to:

  • Understand how semantic technologies can be used in the context of libraries and other cultural organizations.

  • Assess novel semantic technologies and apply them in the context of libraries.

  • Select and apply semantic web standards to enrich library services.

  • Select and apply machine learning techniques to enrich library services.

Course contents

The course is tentatively organized into the following weekly parts:

Week 1: Introduction

  • Introduction to the course, including:

    1. A brief delineation of semantic digital libraries, their historical development, and main components.

    2. Course logistics, assessment, and projects.

  • An overview of the main LOD (Linked Open Data) standards for semantic digital libraries.

  • Examples of semantic digital library platforms: Islandora, Omeka.

Laboratory (graded activity): Take a user perspective and assess the offering of a digital library; it can be any national, institutional, or private library system. What functionalities does the digital library offer its users? Which of these functionalities are “semantic”? Why? Present your results to the class.

Week 2: Semantic Web and interoperability standards

  • Ontologies for Semantic Publishing and Referencing (SPAR).

  • Textual contents (OCR-D).

  • Annotations: Open Annotation and the Web Annotation Data Model.

  • Interoperability: International Image Interoperability Framework (IIIF), InterPlanetary File System (IPFS), Distributed Text Services (DTS).

Laboratory (graded activity): Explore and compare the use of Semantic Web standards, technologies, and tools in different national library systems. Also look for a Semantic Web strategy or vision document, and compare it with the library's actual implementation or roadmap. Present your results to the class.

Project ideas brainstorming session.

Week 3: Artificial Intelligence I

  • AI for Semantic Digital Libraries.

  • How to run an annotation campaign.

Laboratory (graded activity): Explore and compare the usage of AI technologies and tools in different national library systems. Present your results to the class.

Project ideas brainstorming session, project updates.

Week 4: Artificial Intelligence II

  • Information extraction: H/OCR with Transkribus.

  • Information extraction: layout parsing.

Laboratory (graded activity): Design and execute an H/OCR information extraction project with Transkribus.

Project ideas brainstorming session, project updates.

Week 5: Case studies and advanced topics

  • Blockchains and Non-Fungible Tokens (NFTs) in digital art.

Laboratory: group work on projects and office hours.

Note that this list of topics is tentative and might still change slightly.

    Readings/Bibliography

    A list of texts will be provided by the lecturer before the beginning of each week.

    General references

    • Kruk and McDaniel (eds.). 2009. Semantic Digital Libraries. Springer.

    • Banerjee and Reese. 2018 (2nd ed.). Building Digital Libraries. ALA Neal-Schuman.

    • van Hooland and Verborgh. 2019. Linked Data for Libraries, Archives and Museums. Facet Publishing.

    Primers

    For those who need a refresher on, or first exposure to, the background for this course:

    • A brief history of computers [https://www.explainthatstuff.com/historyofcomputers.html] (this is not a prerequisite, just a reference)

    • Introduction to XML [https://www.w3schools.com/xml/xml_whatis.asp] (this is not a prerequisite, just a reference)

    • Metadata basics [https://www.dublincore.org/resources/metadata-basics]

    • Introduction to Linked Data [http://linkeddatabook.com/editions/1.0] (ch. 1 and 2)

    • RDF primer [https://www.w3.org/TR/rdf11-primer]

    • Introduction to information retrieval [https://nlp.stanford.edu/IR-book/html/htmledition/boolean-retrieval-1.html] (this is not a prerequisite, just a reference)

    • Introduction to machine learning [http://ciml.info/] (ch. 1 and 2) (this is not a prerequisite, just a reference)

    Teaching methods

    Recommended prior knowledge

    An understanding of library services and workflows is recommended, for example as acquired by attending the course ‘Knowledge Organization in Libraries and Archives’.

    Teaching method and contact hours

    Lectures, seminars, and laboratories. All sessions take place in person.

    Students can reach out after class, during office hours (please refer to the lecturer's page), or, as a last resort, via email.

    Assessment methods

    In-class laboratories conducted during the first part of the course will yield a small (up to 2 points) contribution toward the final grade.

    The final examination consists of the presentation of an original project. The students propose the project. For example, it can focus on an in-depth, technical analysis of an existing (semantic) digital library platform; the design and specification of the user and technical requirements for a new one; the exploration of a currently open research area related to semantic digital libraries; or a case study using a tool/technology discussed in class (e.g., Transkribus). A project may be theoretical, applied, or both.

    The project guidelines will be shared at the beginning of the course. Students will be asked to work in small groups (2-3). Individual projects are allowed when justified. Students are strongly encouraged to work on their projects while the course is running and to present their results shortly after it ends, at a dedicated session.

    The personal contribution of each member of a group will be assessed during an individual oral colloquium at a regular exam session. In the oral colloquium, both the project and the course contents will be assessed. The project and the oral examination contribute 50% of the final grade each.

    The program for non-attending students is the same, except that the project may be done individually. Furthermore, non-attending students will have to complete extra readings to complement the in-class contents. Readings will be provided before the start of each week throughout the course.

    Teaching tools

    Slides, live coding, demonstrations, readings, and seminar discussions.

    Classes are held in a classroom equipped with personal computers connected to the Intranet and Internet.

    Office hours

    See the website of Giovanni Colavizza