B1873 - Machine Learning for Humanities (1) (LM)

Academic Year 2024/2025

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Degree programme: Second cycle degree programme (LM) in Digital Humanities and Digital Knowledge (cod. 9224)

Learning outcomes

At the end of the course, the student is familiar with the theoretical principles underpinning modern machine learning. The student is further able to understand, apply and evaluate the main machine learning techniques and implementations relevant to addressing practical problems and tasks in the domains of Cultural Heritage and GLAM. Lastly, the student is able to critically reflect on the preconditions and implications of using machine learning in these domains.

Course contents

This course offers an introduction to Machine Learning (ML), with a focus on applications in the Arts and Humanities. The course will introduce foundational ML concepts and methods, and will be complemented by laboratory activities where methods will be implemented.

After completing this course, the student can:

  • Understand basic and advanced Machine Learning concepts and methods.

  • Find software libraries that can be used to develop ML applications.

  • Implement ML applications using Python.

  • Evaluate whether ML could be used in an Arts and Humanities task.

Machine Learning (ML) is increasingly used in the context of Arts & Humanities research and GLAM applications (Galleries, Libraries, Archives, Museums). Examples range from text recognition and information extraction from historical sources to image search and analysis on artwork collections, from automatic 3D reconstructions of built heritage to the automatic detection of archeological sites from satellite or drone images. This course will lay the foundations for the students to explore and implement similar applications and more.

The breakdown of the topics is as follows (per week):

  1. Week 1: Introduction to Machine Learning, part 1. We discuss the course setup, the fundamentals of machine learning, the types of ML tasks, the key components of an ML workflow, some foundational mathematical concepts, and linear regression. We implement linear regression in NumPy.

  2. Week 2: Introduction to Machine Learning, part 2. We discuss worked examples of linear regression, linear classification, and the Multi-Layer Perceptron (MLP), and implement them in NumPy.

  3. Week 3: PyTorch and Machine Vision. We introduce PyTorch, discuss tasks in machine vision and the main architectures to work with images (convolutional and residual networks). We implement an image-based task in PyTorch.

  4. Week 4: Language Processing. We introduce tasks in natural language processing and the Transformer, the main architecture to work with texts. We implement text-based tasks in PyTorch.

  5. Week 5: Generative AI. We discuss generative AI models, in particular focusing on Large Language Models (LLMs). We see how to use LLMs in practice and mention advanced applications such as Retrieval Augmented Generation (RAG). We implement a RAG-enabled chatbot using LlamaIndex and Chainlit.

Note that this list of topics is tentative and might still change slightly.
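
As an illustration of the kind of exercise Week 1 describes (linear regression implemented in NumPy), here is a minimal sketch. It is not taken from the course materials: the function name `fit_linear_regression`, the synthetic data, and the learning rate and epoch count are all assumptions chosen for the example. It fits a line by batch gradient descent on the mean squared error and compares the result with the closed-form least-squares solution.

```python
import numpy as np


def fit_linear_regression(X, y, lr=0.1, epochs=500):
    """Fit y ~ X @ w + b by batch gradient descent on the mean squared error.

    Hypothetical helper for illustration only; lr and epochs are arbitrary
    defaults that happen to work for the small synthetic dataset below.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        residual = X @ w + b - y                      # prediction error per sample
        grad_w = 2.0 / n_samples * (X.T @ residual)   # dMSE/dw
        grad_b = 2.0 * residual.mean()                # dMSE/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic data (an assumption): true slope 3.0, true intercept 0.5, small noise.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 3.0 * X[:, 0] + 0.5 + rng.normal(0.0, 0.05, size=200)

w, b = fit_linear_regression(X, y)

# Closed-form least squares on the same data, with a column of ones for the intercept.
design = np.column_stack([X, np.ones(len(X))])
closed_form, *_ = np.linalg.lstsq(design, y, rcond=None)

print("gradient descent:", w[0], b)
print("closed form:     ", closed_form[0], closed_form[1])
```

Both estimates should land close to the slope and intercept used to generate the data. Gradient descent is used here because the same loop carries over to the classifiers and neural networks in the following weeks, where no closed-form solution is available.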

Readings/Bibliography

The materials will be provided via GitHub. The students are expected to take their own notes.

Readings

BOOK: Zhang et al., Dive Into Deep Learning, MIT Press, 2023. https://d2l.ai/index.html.

Visual Neural Networks course by 3Blue1Brown: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi.

Further resources: Awesome AI 4 LAM. https://ai4lam.github.io/awesome-ai4lam.

Teaching methods

Lectures and live coding sessions. Attending students are expected to come prepared to class.

Assessment methods

Oral exam on all course contents (50%) and individual project (50%). The student may select a topic for the project in agreement with the lecturer. The project must be applied, i.e., it must involve writing a substantial amount of code and developing an ML application of the student's choice. The project must be sent to the lecturer at least 5 days before the oral exam date. Project guidelines will be provided at the beginning of the course.

The program for non-attending students is the same.

Recommended prior knowledge

You need to know how to code in Python. High-school algebra and calculus are also expected (you can refresh them using the first reading listed under Readings/Bibliography). Students will also benefit from having attended first-year courses such as ‘Computational Thinking’.

Teaching tools

Slides, live coding, demonstrations, readings, and seminar discussions.

Classes are held in a classroom equipped with personal computers connected to the Internet.

Links to further information

https://github.com/Giovanni1085/UNIBO_MachineLearning

Office hours

See the website of Giovanni Colavizza