95662 - INTRODUCTION TO MACHINE LEARNING

Academic Year 2025/2026

Learning outcomes

At the end of the course, students will have a comprehensive understanding of key machine learning techniques, the skills to apply and adapt these methods in diverse settings, and a solid grasp of the probabilistic concepts that underpin them.

Course contents

- Introduction to Machine Learning: regression and classification problems, main paradigms of machine learning, overfitting and regularization, an overview of the interpretation of overfitting in terms of model complexity.
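
The trade-off between model complexity and overfitting mentioned in this item can be illustrated numerically. A minimal sketch (the data, noise level, and polynomial degrees are illustrative choices, not course material):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)  # noisy samples

def train_error(degree):
    """Mean squared training error of a least-squares polynomial fit."""
    coeffs = np.polyfit(x, y, degree)
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# Training error can only decrease as model complexity (degree) grows,
# but high-degree fits chase the noise rather than the underlying sine.
errors = {d: train_error(d) for d in (1, 3, 9)}
```

On held-out data the high-degree fit typically does worse, which is the usual empirical signature of overfitting.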

- Basics of Information Theory: entropy of discrete and absolutely continuous random variables, source coding theorem, relative entropy and its application to machine learning.
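
The discrete entropy and relative entropy in this item can be computed directly from a probability vector. A minimal sketch (function names are my own):

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum p log2 p of a discrete distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def relative_entropy(p, q):
    """KL divergence D(p || q) = sum p log2(p / q); assumes q > 0 wherever p > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A fair coin carries exactly one bit of entropy.
print(entropy([0.5, 0.5]))  # 1.0
```

Relative entropy is zero exactly when the two distributions coincide, which is why it serves as a loss between a model distribution and the data distribution.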

- Linear Models for Supervised Learning: linear regression, logistic regression, geometric and probabilistic interpretation of loss functions and regularization.
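
The probabilistic interpretation of the logistic loss in this item is that minimizing cross-entropy equals maximizing the Bernoulli likelihood. A minimal one-dimensional sketch (function names are my own):

```python
import math

def sigmoid(z):
    """Logistic function mapping a score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_nll(w, b, data):
    """Negative log-likelihood (cross-entropy loss) of a 1-D logistic model."""
    loss = 0.0
    for x, y in data:  # labels y in {0, 1}
        p = sigmoid(w * x + b)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return loss
```

Adding a penalty such as `lam * w**2` to this loss corresponds to a Gaussian prior on the weights, which is the probabilistic reading of regularization.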

- Stochastic Gradient Descent: review of the deterministic gradient descent method, stochastic gradient descent and convergence results, applications to model training.
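
The per-sample update at the heart of this item can be shown on least-squares linear regression. A minimal sketch (the model, data, and hyperparameters are illustrative assumptions):

```python
import random

def sgd_linear_regression(data, lr=0.01, epochs=100, seed=0):
    """Fit y = w*x + b by stochastic gradient descent on the squared error."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)                  # one random pass over the samples
        for x, y in data:
            err = (w * x + b) - y          # residual of the current model
            w -= lr * 2 * err * x          # gradient step on w
            b -= lr * 2 * err              # gradient step on b
    return w, b

# Noiseless data from y = 2x + 1: SGD should recover w close to 2, b close to 1.
pts = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
w, b = sgd_linear_regression(pts, lr=0.05, epochs=500)
```

Each step uses the gradient of a single sample's loss rather than the full sum, which is what distinguishes the method from the deterministic gradient descent reviewed first.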

- Deep Learning: fully connected neural networks, density results of neural networks in functional spaces, convolutional networks and transformers, characterization of equivariant layers under group actions.
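
The fully connected networks in this item are compositions of affine maps and nonlinearities. A minimal forward pass in plain Python (the weights are arbitrary illustrative values):

```python
def relu(v):
    """Elementwise rectified linear unit."""
    return [max(0.0, x) for x in v]

def linear(W, b, x):
    """Affine layer W x + b, with W given as a list of rows."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def mlp(x):
    """A tiny fixed-weight two-layer fully connected network (weights arbitrary)."""
    W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
    W2, b2 = [[1.0, 1.0]], [0.1]
    return linear(W2, b2, relu(linear(W1, b1, x)))
```

The density results mentioned above concern exactly such compositions: with enough hidden units they approximate continuous functions on compact sets arbitrarily well.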

- Review of Stochastic Process Theory: Gaussian processes, Markov chains and transition probabilities, diffusive stochastic processes.
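
The transition probabilities in this item determine the long-run behavior of a Markov chain. A minimal sketch of the stationary distribution via power iteration (the chain is an illustrative two-state example):

```python
def stationary(P, iters=200):
    """Approximate the stationary distribution pi = pi P by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Two-state chain: row s gives the transition probabilities out of state s.
P = [[0.9, 0.1], [0.5, 0.5]]
pi = stationary(P)  # converges to (5/6, 1/6)
```

For this chain the balance equation 0.1 pi_0 = 0.5 pi_1, together with normalization, gives pi = (5/6, 1/6).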

- Language Models: introduction to the language modeling problem, language entropy, neural network-based approaches, large language models, pre-training and post-training.
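
The language entropy mentioned in this item can be estimated with the simplest language model of all, a smoothed bigram model. A minimal sketch over characters (the smoothing scheme and function names are my own choices):

```python
import math
from collections import Counter

def bigram_cross_entropy(text, alpha=1.0):
    """Per-character cross-entropy (bits) of an add-alpha smoothed bigram model,
    trained and evaluated on the same text (for illustration only)."""
    vocab = sorted(set(text))
    pairs = Counter(zip(text, text[1:]))   # bigram counts
    ctx = Counter(text[:-1])               # context (first-character) counts
    total = 0.0
    for (a, b), n in pairs.items():
        p = (pairs[(a, b)] + alpha) / (ctx[a] + alpha * len(vocab))
        total -= n * math.log2(p)
    return total / (len(text) - 1)
```

Highly predictable text scores near zero bits per character; neural language models are, in this sense, much better compressors of the same statistic.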

- Reinforcement Learning: basic algorithms, deep Q-learning, reinforcement learning from human feedback, an overview of applications of reinforcement learning to the formalization of mathematics and automated theorem proving.
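
The basic algorithms in this item include tabular Q-learning, sketched here on a toy chain environment (the environment and hyperparameters are illustrative assumptions, not course material):

```python
import random

def q_learning(n_states=5, episodes=2000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a chain: actions move left/right, reward 1 at the right end."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # Q[s][a], a=0 left, a=1 right
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:               # episode ends at the rightmost state
            if rng.random() < eps:             # epsilon-greedy exploration
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] >= Q[s][1] else 1
            s2 = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Bellman update toward r + gamma * max_a' Q(s', a')
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning()  # greedy policy should be "move right" in every state
```

Deep Q-learning replaces the table `Q[s][a]` with a neural network evaluated at the state, trained with the same Bellman target.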

- Overview of Gaussian and Diffusion Models: Bayesian approach to regression and classification, Gaussian discriminative models, inverse Markov processes and generative diffusion models.
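
The Bayesian approach in this item can be shown in its simplest conjugate form: inferring a Gaussian mean with known noise variance. A minimal sketch (function and parameter names are my own):

```python
def posterior_mean_var(prior_mu, prior_var, noise_var, observations):
    """Conjugate Gaussian update: posterior over the mean of N(mu, noise_var) data,
    given a N(prior_mu, prior_var) prior. Precisions (inverse variances) add."""
    n = len(observations)
    precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / precision
    post_mu = post_var * (prior_mu / prior_var + sum(observations) / noise_var)
    return post_mu, post_var

# One observation y=1 under a standard normal prior: posterior is N(0.5, 0.5).
mu, var = posterior_mean_var(0.0, 1.0, 1.0, [1.0])
```

The posterior mean interpolates between the prior mean and the sample mean, weighted by their precisions; as more data arrive, the posterior concentrates.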

Prerequisites: fundamental notions of linear algebra, differential calculus in multiple variables, integral calculus, probability theory, Python coding, and group theory.

Recommended preparatory courses: Analisi Matematica I and II, Geometria I, Probabilità e statistica matematica, Informatica, Aritmetica e gruppi.


Readings/Bibliography

- Christopher Bishop, Pattern Recognition and Machine Learning

- Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola, Dive into Deep Learning

- Christopher Bishop, Deep Learning

- Carl Edward Rasmussen and Christopher K. I. Williams, Gaussian Processes for Machine Learning

- Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction

- Dan Jurafsky and James H. Martin, Speech and Language Processing

Teaching methods

- In-person lectures, delivered on the board and/or with slides.

- Coding and simulation activities in the laboratory.

Assessment methods

Submission of a final group project in Python, followed by an oral interview to assess each student's individual contribution.

Teaching tools

- Office hours and tutoring.

- PDF lecture notes covering some parts of the program.

- Coding sessions supervised by a tutor.

Office hours

See the website of Stefano Pagliarani

See the website of Giovanni Paolini