Academic Year 2020/2021
- Teacher: Christian Martin Hennig
- Credits: 10
- SSD: SECS-S/01
- Language of instruction: English
- Teaching mode: Traditional, in-person lectures
- Campus: Bologna
- Course: Second-cycle degree (Laurea Magistrale) in Statistical Sciences (code 9222)
Learning outcomes
By the end of the course, the student understands the theory and computation of modern statistical methods, with particular emphasis on methods for analysing large amounts of data (big data). More specifically, the student acquires knowledge of the most important methods of statistical learning and prediction, and the skills required to solve real-world and decision-making problems.
Course contents
Cluster analysis: k-means, construction of distances, hierarchical clustering, partitioning around medoids, average silhouette width, mixture models, with algorithms, R coding, theory, applications and in-depth discussion.
Dimension reduction: variable and feature selection in regression, cross-validation, model selection criteria, Lasso, with algorithms, R coding, theory, applications and in-depth discussion.
Robust statistics: influence function, breakdown point, robust estimation of univariate and multivariate location and scale and regression, with algorithms, R coding, theory, applications and in-depth discussion.
Readings/Bibliography
Everitt, B. S., Landau, S., Leese, M., Cluster Analysis (fourth edition), E. Arnold 2001.
Hastie, T., Tibshirani, R., Friedman, J., The Elements of Statistical Learning (second edition), Springer 2009.
Hennig, C., Meila, M., Murtagh, F., Rocci, R., Handbook of Cluster Analysis, Taylor & Francis 2016.
Maronna, R. A., Martin, R. D., Yohai, V. J., Salibián-Barrera, M., Robust Statistics: Theory and Methods (with R), 2nd Edition, Wiley 2019.
Teaching methods
Classroom lectures, tutorials, computer workshops
Assessment methods
The assessment has four components, for a total of 30 marks: 5 marks for regular homework, 5 marks for a literature question to be completed at home, 9 marks for a data analysis project to be completed at home, and 11 marks for a 2.5-hour exam comprising a theoretical question and a second data analysis project.
Teaching tools
Lecture notes and supporting material provided on the web
Office hours
See the website of Christian Martin Hennig