17241 - Data Analysis Laboratory

Academic Year 2023/2024

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Course: First cycle degree programme (L) in Economics, Markets and Institutions (cod. 8038)

Learning outcomes

This course provides a gentle introduction to the software R, covers selected topics in basic statistics and probability, and demonstrates how to implement in R the theory covered in the Statistics course.

It is a beginner-level course designed to teach the fundamentals of the R programming language.

An emphasis is put on exploratory data analysis, and a unified approach to linear models is given. This course aims to serve a hybrid purpose: to cover both statistical topics and the R software.

At the end of the course, the student is expected to be able to:

  • Perform basic data manipulation, use visualization techniques, and carry out exploratory data analysis, estimation and statistical inference with R;
  • Use R autonomously to implement the statistical tools for describing and quantitatively analysing economic and social phenomena;
  • Effectively present the results of the analysis conducted.

Course contents

1. Data manipulation, visualization and exploration

Importing and exporting datasets in R; working with objects, vectors and matrices. Dealing with data and learning about it: how to summarize it, how to present it, and how to draw inferences from it.
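As an illustration of the topics in this unit, the following is a minimal sketch rather than official course material. It assumes the built-in `mtcars` data as a stand-in for a course dataset and uses a temporary file for the export and import round trip; the object names are hypothetical.

```r
# Sketch only: `mtcars` (shipped with R) stands in for a real course dataset.
tmp <- tempfile(fileext = ".csv")

write.csv(mtcars, tmp, row.names = TRUE)   # export a data frame to CSV
cars <- read.csv(tmp, row.names = 1)       # import it back; column 1 holds the row names

x <- c(2, 4, 6)                            # a numeric vector
m <- matrix(1:6, nrow = 2)                 # a 2 x 3 matrix, filled column by column

str(cars)                                  # structure: variables and their types
summary(cars$mpg)                          # five-number summary plus the mean
```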

2. Fundamentals of probability and statistics

The most important commands for descriptive statistics: discrete random variables (frequencies and contingency tables), continuous random variables (histograms and densities) and their respective probability distributions. Empirical distribution and quantile functions. Expected values and variances.
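The commands named above might look as follows in base R. This is a hedged sketch, again assuming the built-in `mtcars` data as an example; it is not taken from the course notes.

```r
# Sketch only: `mtcars` is an assumed example dataset.
freq <- table(mtcars$cyl)                  # frequencies of a discrete variable
tab  <- table(mtcars$cyl, mtcars$gear)     # contingency table of two discrete variables
prop.table(freq)                           # relative frequencies

h <- hist(mtcars$mpg, plot = FALSE)        # histogram bins; use plot(h) to draw them
d <- density(mtcars$mpg)                   # kernel density estimate; plot(d) to draw it

Fn <- ecdf(mtcars$mpg)                     # empirical distribution function
Fn(20)                                     # share of cars with mpg <= 20
quantile(mtcars$mpg, probs = c(0.25, 0.5, 0.75))  # empirical quartiles

mean(mtcars$mpg)                           # sample mean
var(mtcars$mpg)                            # sample variance (divides by n - 1)
```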

3. Statistical inference

Basic sampling, estimation and testing. Point estimates and confidence intervals, t-tests and p-values. Monte Carlo simulations and finite sample properties of estimators.
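A minimal sketch of these ideas in base R, under assumed values: the sample sizes, the true mean of 1, the standard deviation of 2, and the seed are all chosen here for illustration and do not come from the course.

```r
set.seed(123)                              # assumed seed, for reproducibility only

# One simulated sample from a normal population (assumed mean 1, sd 2)
x  <- rnorm(50, mean = 1, sd = 2)
tt <- t.test(x, mu = 0)                    # one-sample t-test of H0: mean = 0
tt$estimate                                # point estimate of the mean
tt$conf.int                                # 95% confidence interval
tt$p.value                                 # p-value of the test

# Monte Carlo: finite-sample behaviour of the sample mean with n = 20
n_rep <- 2000
means <- replicate(n_rep, mean(rnorm(20, mean = 1, sd = 2)))
mean(means)                                # should be close to the true mean, 1
sd(means)                                  # should be close to 2 / sqrt(20)
```

Repeating the simulation with a larger sample size would be one way to see the sampling variability of the estimator shrink.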

4. The simple regression model

Estimating the population parameters of the simple linear regression model from a random sample of the dependent and independent variables. The ordinary least squares estimators of the coefficients, fitted values and residuals. Goodness of fit and residual diagnostics.
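The following sketch shows how such a model could be fitted in base R. The choice of `mpg` regressed on `wt` from the built-in `mtcars` data is an assumption made purely for illustration. The slope and intercept are also computed by hand from the textbook OLS formulas so they can be compared with the output of `lm()`.

```r
# Sketch only: regress fuel consumption on weight in the assumed `mtcars` data.
fit <- lm(mpg ~ wt, data = mtcars)

coef(fit)                                  # OLS intercept and slope
head(fitted(fit))                          # fitted values
head(resid(fit))                           # residuals
summary(fit)$r.squared                     # goodness of fit (R-squared)

# The same estimates from the simple-regression formulas
b1 <- cov(mtcars$wt, mtcars$mpg) / var(mtcars$wt)   # slope
b0 <- mean(mtcars$mpg) - b1 * mean(mtcars$wt)       # intercept

# Residual diagnostics: plot(fit, which = 1) draws residuals against fitted values
```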

Readings/Bibliography

  • Verzani, J. (2014). Using R for Introductory Statistics (2nd ed.). Chapman and Hall/CRC.
  • Heiss, F. (2020). Using R for Introductory Econometrics (2nd ed.). CreateSpace Independent Publishing Platform. Companion website: http://www.URfIE.net

Teaching methods

Lectures in R, followed by in-class applied examples and tutorials.

Assessment methods

End-of-course practical test in the computer lab. The test covers the whole course programme and assesses the student's ability to use R for the statistical analysis of empirical data.

Grades are assigned on the following scale:

  • <18: failed,
  • 18-24: sufficient,
  • 25-29: good,
  • 30 with honours (30 e lode): excellent.

The final grade of the integrated course is the weighted average of Statistics (weight 2/3) and of the Lab (weight 1/3).

Teaching tools

Notes, case studies and examples.

You are required to bring your own laptop and to make sure that the following software is installed before the course starts:

  • A recent version of R [https://www.r-project.org/] and RStudio [https://www.rstudio.com/products/rstudio/download/] (the free version is more than enough).

Office hours

See the website of Enzo D'Innocenzo