90591 - BIG DATA ANALYTICS

Academic Year 2022/2023

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Course: Second cycle degree programme (LM) in Business Administration (cod. 0897)

Learning outcomes

This course will present multivariate statistical methods in several ways:

1) It offers an overview of the main multivariate methods in economics and business.

2) It helps in choosing the best statistical method for different economic data sets.

3) It briefly explains some advanced methods, such as spatial classification and clustering.

 

Course contents

1. Data and Introduction to Multivariate Statistics

1.1 Data and main problems

1.2 Aims of Multivariate Statistics

1.3 Time series Interpolation

2. Multivariate Regression Analysis

2.1 Univariate Regressions: a reference

2.2 Marginal Effects and dummy variables

2.3 Residual Analysis and Specification Tests

2.4 Violation of Gauss-Markov Hypotheses

2.5 WLS and 2SLS estimators

3. Multivariate Analysis

3.1 Variable selection: LASSO

3.2 Principal Component Analysis on residuals

4. Discrimination and Classification

4.1 Nonlinear Models 

4.2 Multinomial Logistic Regression

4.3 Spatial Probit Model

5. Clustering

5.1 Regression Trees

5.2 Hierarchical Clustering

5.3 K-means Clustering

5.4 Spatial Clustering

6. Time series

6.1 Time series components

6.2 Time series decomposition

6.3 Estimation of time series components

6.4 Simple forecasting methods 

7. Nonlinear Models, Power Transformations and Panel Data

7.1 Nonlinear Models

7.2 Power Transformations of distributions

7.3 Panel Data

Readings/Bibliography

The following references are recommended:

Tsai Chun-Wei et al. (2015), Big Data Analytics: a survey, Journal of Big Data, 2:21.

Daniel Zelterman (2014), Applied Multivariate Statistics with R, Springer.

Marno Verbeek (2005), Econometria, first edition, Zanichelli Editore.

William Greene (2019), Econometric Analysis, Eighth Edition (Global Edition), Pearson.

Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani (2021), An Introduction to Statistical Learning with Applications in R, Springer.

Teaching methods

Lectures cover both methodological and empirical aspects of economics, using the statistical software R.

All the economic datasets used are available in R.

Assessment methods

Oral examination.

The examination consists of the evaluation of group work.

Students are divided into groups, and each group prepares a short thesis describing the type of dataset used, the statistical method(s) applied, and the results obtained using R.

Each group will also give a short presentation/seminar, during which it will be asked some questions.

 

Teaching tools

PC; video projector; computer laboratory

Office hours

See the website of Anna Gloria Billè

SDGs

Quality education; Industry, innovation and infrastructure

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.