- Teacher: Christian Martin Hennig
- Credits: 10
- SSD: SECS-S/01
- Language of instruction: English
- Teaching mode: Traditional - in-person lectures
- Campus: Bologna
- Degree programme: Second cycle degree (Laurea Magistrale) in Statistical Sciences (code 9222)
Learning outcomes
By the end of the course, the student gains an understanding of the theory and computation of modern statistical methods, with particular emphasis on methods for analysing large amounts of data (big data). More specifically, the student acquires knowledge of the most important methods of statistical learning and prediction and the skills required to solve real-world and decision-making problems.
Course contents
Cluster analysis: k-means, construction of distances, hierarchical clustering, partitioning around medoids, average silhouette width, mixture models, with algorithms, R-coding, theory, applications and in-depth discussion; advanced techniques of cluster analysis; outlook on big data issues
Robust statistics: Influence function, breakdown point, robust estimation of univariate and multivariate location and scale and regression, with algorithms, R-coding, theory, applications and in-depth discussion
Readings/Bibliography
Everitt, B. S., Landau, S., Leese, M., Cluster Analysis (fourth edition), E. Arnold 2001.
Hastie, T., Tibshirani, R., Friedman, J., The Elements of Statistical Learning (second edition), Springer 2009.
Hennig, C., Meila, M., Murtagh, F., Rocci, R., Handbook of Cluster Analysis, Taylor & Francis 2016.
Maronna, R. A., Martin, R. D., Yohai, V. J., Salibián-Barrera, M., Robust Statistics: Theory and Methods (with R) (second edition), Wiley 2019.
Teaching methods
Classroom lessons, tutorials, computer workshop
Given the nature of the activities, participation in this course requires prior completion of the e-learning activity Modules 1 and 2 on safety in places of study.
Assessment methods
The assessment has two components plus a bonus component. About 10/30 marks are assigned to a literature question to be completed at home. These marks are awarded for the ability to understand the scientific explanation of new methodology that is based on, and closely related to, methodology introduced in the course; understanding of the course methodology is thereby implicitly assessed as well. About 20/30 marks are assigned to a 3-hour exam comprising a theoretical exercise, a data analysis project, and an exercise that asks questions about interpreting the given output of another data analysis. The aspects examined are understanding of the theoretical background and how it is derived (carrying a rather low percentage of the marks), the ability to apply methodology learnt in the course to a real dataset, and the ability to understand and draw practically relevant conclusions from the computer output of such methodology.
Teaching tools
Lecture notes and supporting material, including datasets, are provided on the Virtuale website.
Office hours
See the website of Christian Martin Hennig
