81610 - MACHINE LEARNING

Academic Year 2023/2024

  • Instructor: Sergio Pastorello
  • Credits: 2
  • SSD: SECS-P/05
  • Teaching language: English
  • Teaching mode: Traditional - In-person lectures
  • Campus: Bologna
  • Degree programme: Laurea in Economics and Finance (code 8835)

Learning outcomes

The course introduces students to some of the most important Machine Learning predictive models that can potentially contribute to empirical economics, such as regularization methods, tree-based methods, Support Vector Machines and Neural Networks. For each topic we will outline its structure, discuss its pros and cons, and focus on the issues that arise in its empirical application to several economic problems using the software R.

Course contents

This course is taught entirely in English.

Module 2

  1. Regularization methods: Ridge regression, the LASSO, Elastic Net
  2. Tree-based methods: Bagging, Random Forests, Boosting, Ensemble Methods
  3. Support Vector Machines in regression and classification tasks
  4. Neural Networks, with applications to numeric and textual data

Readings/Bibliography

James, Witten, Hastie and Tibshirani, An Introduction to Statistical Learning, Springer 2021 (second edition).

Teaching methods

For each topic we will first introduce the relevant theory, and then move as soon as possible to its empirical application in the R language. Special emphasis will be placed on the economic interpretation and relevance of the results. Attending classes is important, especially for learning the empirical topics of the course.

Assessment methods

The exam is joint with Module 1.

The exam tests three things: the ability to apply the methods learnt to simulated or real data using R, knowledge of the theoretical concepts, and the ability to interpret estimation results in light of the underlying theory.

The exam consists of a group project plus discussion.

The structure of the final project should be the following:

1. Data description and motivation

- Motivation. State your final objective: outcome, predictors, ...

- Explore the data. Tools: plots, PCA, clustering

2. Set up and assess a prediction model

- Prediction. Set up and assess forecasting models

- Tools. Linear/logistic regression, Discriminant analysis, KNN, PCR, PLS, stepwise selection, regularization methods (Ridge, LASSO, Elastic Net, ...), splines, trees, SVM, NN, ...
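
As a rough illustration of steps 1 and 2 above, the following R sketch explores a data set, fits one prediction model on a training sample and assesses it on held-out data. The built-in mtcars data, the 70/30 split and the choice of a linear regression with two predictors are placeholders assumed here for illustration only; none of them is prescribed by the course, and an actual project would use the group's own data and compare several of the tools listed above.

```r
# Minimal sketch of the project workflow. The built-in mtcars data set is
# a stand-in for the group's own data (an assumption for illustration).
set.seed(1)                                   # make the random split reproducible
data(mtcars)

# 1. Explore the data: summary statistics and principal components
summary(mtcars$mpg)                           # distribution of the outcome
pca <- prcomp(mtcars[, -1], scale. = TRUE)    # PCA on standardized predictors
summary(pca)                                  # variance explained by each component

# 2. Set up a prediction model on a training sample
n     <- nrow(mtcars)
train <- sample(n, size = round(0.7 * n))     # 70% training, 30% test (arbitrary choice)
fit   <- lm(mpg ~ wt + hp, data = mtcars[train, ])

# 3. Assess the model out of sample with the test mean squared error
pred     <- predict(fit, newdata = mtcars[-train, ])
test_mse <- mean((mtcars$mpg[-train] - pred)^2)
test_mse
```

The same split can be reused to compare competing models on equal terms, since each is then judged on the same held-out observations.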

Additional information about the final project:

3. Data

- Use your own data and provide it with the project (during the course we will present several interesting data repositories on the Web: Kaggle, the UCI Machine Learning Repository, and others)

- Data must allow exploration and prediction

- Data don’t need to be huge!

4. Tools:

- You should use the most important tools from class (not all)

- Make sure the data and your goals are compatible

- Work in groups of 3 to 5 students. Let the instructors know the groups’ composition as soon as possible

- The final project must be handed in 5 days before the discussion, including: (i) the data in a format easily readable by R; (ii) the R code that reads the data, does the computations and outputs the results; (iii) the PDF document that illustrates the project.

- The PDF document must contain at most 20 pages, including tables and figures.

5. Assessment:

- The final project assessment will take into account the difficulty posed by data cleaning and preparation

- The final grade will be a weighted average of the project assessment and the oral discussion.

The maximum possible score is 30 cum laude. Grades are scaled as follows:

<18 failed
18-23 sufficient
24-27 good
28-30 very good
30 cum laude excellent

6. Grade rejection:

- Students can reject the grade obtained at the exam once. To do so, they must email a request to the instructors by the date set for registration. The instructors will confirm receipt of the request by the same date.

- Rejection applies to the whole exam, whose grade is the weighted average of the grades obtained in the oral discussion and the final project. If the grade is rejected, the student must retake the oral discussion only.

Teaching tools

We will discuss several empirical analyses using the statistical software R.

Office hours

See the website of Sergio Pastorello