- Teacher: Laura Anderlucci
- Credits: 6
- SSD: SECS-S/01
- Language: English
- Teaching Mode: Traditional lectures
- Campus: Bologna
- Course: Second cycle degree programme (LM) in Statistical Sciences (cod. 9222)
Learning outcomes
By the end of the course, the student knows the fundamentals of the most important multivariate techniques for building supervised statistical models that predict or estimate an output from one or more inputs. The student is able to represent and organize knowledge about large-scale data collections, and to turn data into actionable knowledge.
Course contents
Part 0: Introduction to Supervised Statistical Learning
Part 1: Resampling methods
- Cross-Validation
Part 2: Classification
- Naive Bayes
- k-Nearest Neighbours
- Logistic Regression
- Linear Discriminant Analysis
Part 3: Dimension Reduction and Regularisation
Part 4: Tree-based methods
- Regression and Classification trees
- Bagging; Random Forests; Boosting
Part 5: Overview of the main machine learning methods
- Support Vector Machines
- Neural Networks
Readings/Bibliography
The primary text for the course:
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning. New York: Springer.
Freely available at: http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Seventh%20Printing.pdf
In addition, we will use:
- Hastie, T., Tibshirani, R., & Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Verlag.
Freely available at: https://web.stanford.edu/~hastie/Papers/ESLII.pdf
Teaching methods
Lectures and practical sessions.
Assessment methods
Written exam with theoretical questions and practical exercises to be solved in R.
Teaching tools
The following material will be provided: slides of the lectures, exercises with solutions, mock exam.
Office hours
See the website of Laura Anderlucci
SDGs

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.