90355 - Statistics For High Dimensional Data

Academic Year 2020/2021

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Programme: Second cycle degree programme (LM) in Economics (code 8408)

Learning outcomes

At the end of the course the student has acquired knowledge of multivariate methods for analysing high dimensional data. In particular, the student is able to:

- interpret dimension reduction methods, including principal component analysis and factor analysis;

- interpret clustering and discrimination methods;

- choose the appropriate multivariate method and carry out an independent analysis of high dimensional datasets using the software R.

Course contents

- Multivariate and high dimensional problems. Basics of linear and matrix algebra. Random vectors and Gaussian random vectors.

- Principal component analysis: principal component method, visualising principal components, choosing the number of principal components.

- Factor analysis: factor model specification, identification, estimation, rotation, factor scores.

- Discriminant analysis: linear discriminant analysis, quadratic discriminant analysis, Fisher’s discriminant rule, linear discrimination for two normal populations and classes, evaluation of discriminant rules.

- Cluster analysis: distance and similarity measures, hierarchical agglomerative clustering, k-means clustering.
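
As a rough illustration of the k-means topic listed above, the following is a minimal sketch of Lloyd's algorithm, which alternates between assigning each observation to its nearest centroid and recomputing each centroid as the mean of its cluster. It is written in plain Python only to keep it self-contained; the course itself works in R, where the built-in `kmeans` function plays this role. The function name `k_means`, the toy data, and the choice of initialising the centroids at the first k observations are assumptions made for this example, not course material.

```python
from math import dist


def k_means(points, k, max_iter=100):
    """Cluster points with Lloyd's algorithm.

    Assumption for reproducibility: the initial centroids are simply
    the first k points, rather than a random selection.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        # (Euclidean distance).
        labels = [
            min(range(k), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two well-separated groups in the plane.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.1, 4.9), (0.9, 1.1), (4.8, 5.2)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

In practice the result depends on the starting centroids, so it is common to run the algorithm from several random starts and keep the solution with the smallest within-cluster sum of squares.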


Readings/Bibliography

Koch I. Analysis of Multivariate and High Dimensional Data, Cambridge University Press, 2014

Teaching methods

Lectures and tutorials with the software R

Assessment methods

Assessment consists of a final project and an oral exam. The project analyses real data with the software R and covers one or more topics of the course. The oral exam consists of a discussion of the project and theoretical questions.

Teaching tools

Teacher's notes are available at https://virtuale.unibo.it/

Office hours

See the website of Silvia Cagnone