91966 - MODERN STATISTICS AND BIG DATA ANALYTICS

Academic Year 2019/2020

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Degree programme: Second cycle degree programme (LM) in Statistical Sciences (cod. 9222)

Course contents

Cluster analysis: k-means, construction of distances, hierarchical clustering, partitioning around medoids, average silhouette width, and mixture models. Each topic is covered with algorithms, R coding, theory, applications, and in-depth discussion.
 
Dimension reduction: variable and feature selection in regression, cross-validation, model selection criteria, and the Lasso. Each topic is covered with algorithms, R coding, theory, applications, and in-depth discussion.

Readings/Bibliography

Everitt, B. S., Landau, S., Leese, M., Cluster Analysis (fourth edition), E. Arnold 2001.

Hennig, C., Meila, M., Murtagh, F., and Rocci, R., Handbook of Cluster Analysis, Taylor & Francis 2016.

Hastie, T., Tibshirani, R., Friedman, J., The Elements of Statistical Learning (second edition), Springer 2009.

Lecture Notes

Teaching methods

Classroom lessons, tutorials, computer workshop

Assessment methods

Two-hour written exam. Up to 5 of the 30 marks can be earned through homework.

Teaching tools

Lecture notes and supporting material provided on the web

Office hours

See the website of Christian Martin Hennig