90355 - STATISTICS FOR HIGH DIMENSIONAL DATA

Course Unit Page

Academic Year 2020/2021

Learning outcomes

At the end of the course the student has acquired knowledge of multivariate methods for analysing high dimensional data. In particular, the student is able to:

- interpret methods of dimension reduction, including principal component analysis and factor analysis;

- interpret methods of clustering and discrimination;

- choose the appropriate multivariate method and carry out an independent analysis of high dimensional datasets using the software R.

Course contents

- Multivariate and high dimensional problems. Basics of linear and matrix algebra. Random vectors and Gaussian random vectors.

- Principal component analysis: principal component method, visualising principal components, choosing the number of principal components.

- Factor analysis: factor model specification, identification, estimation, rotation, factor scores.

- Discriminant analysis: linear discriminant analysis, quadratic discriminant analysis, Fisher’s discriminant rule, linear discrimination for two normal populations and classes, evaluation of discriminant rules.

- Cluster analysis: distance and similarity measures, hierarchical agglomerative clustering, k-means clustering.


Readings/Bibliography

Koch I. Analysis of Multivariate and High Dimensional Data, Cambridge University Press, 2014

Teaching methods

Lectures and tutorials with the software R

Assessment methods

Assessment consists of a final project and an oral exam. The project is an analysis of real data, carried out with the software R, on one or more topics of the course. The oral exam consists of a discussion of the project and theoretical questions.

Teaching tools

Teacher's notes are available at https://virtuale.unibo.it/

Office hours

See the website of Silvia Cagnone