93050 - SUPERVISED STATISTICAL LEARNING

Academic Year 2021/2022

  • Lecturer: Laura Anderlucci
  • Credits: 6
  • SSD: SECS-S/01
  • Language of instruction: English
  • Teaching mode: Traditional, in-person lectures
  • Campus: Bologna
  • Degree programme: Second cycle degree (Laurea Magistrale) in Statistical Sciences (code 9222)

Learning outcomes

By the end of the course the student knows the fundamentals of the most important multivariate techniques to build supervised statistical models for predicting or estimating an output based on one or more inputs. The student is able to represent and organize knowledge about large-scale data collections, and to turn data into actionable knowledge.

Course contents

Part 0: Introduction to Supervised Statistical Learning

Part 1: Resampling methods

  • Cross-Validation

Part 2: Classification

  • Naive Bayes
  • k-Nearest Neighbours
  • Logistic Regression
  • Linear Discriminant Analysis

Part 3: Dimension Reduction and Regularisation

Part 4: Tree-based methods

  • Regression and Classification trees
  • Bagging; Random Forests; Boosting

Part 5: Overview of the main machine learning methods

  • Support Vector Machines
  • Neural Networks

Readings/Bibliography

The primary text for the course:

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning: with Applications in R. Second Edition. New York: Springer. ISBN: 978-1-0716-1417-4. E-book ISBN: 978-1-0716-1418-1

    The book is freely available here:
    https://hastie.su.domains/ISLR2/ISLRv2_website.pdf

In addition, we will use:

  • Hastie, T., Tibshirani, R., & Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Verlag.
    Freely available at: https://web.stanford.edu/~hastie/Papers/ESLII.pdf

Teaching methods

Lectures and practical sessions.


Assessment methods

The learning assessment consists of a written test lasting 70 minutes. The test assesses the student's ability to use the definitions, concepts and properties learned in the course and to solve exercises. During the written exam, students may use only the cheat sheet provided on virtuale.unibo.it, which contains references to R packages and functions. Students may not use the textbook, personal notes or mobile phones; smart watches and similar electronic data storage or communication devices are not allowed either.

The written test consists of 5-7 questions, both multiple choice and open, some of which are to be solved in R. The final grade is out of thirty.

Students who have passed the exam but feel that the result does not represent them may ask for an additional (optional) oral exam, which can change the grade by +/-3 points.

Teaching tools

The following material will be provided: lecture slides, exercises with solutions, and a mock exam.


Office hours

See the website of Laura Anderlucci

SDGs

Quality education

This course contributes to the pursuit of the Sustainable Development Goals of the UN 2030 Agenda.