78830 - Multivariate Statistical Methods for Credit Scoring

Academic Year 2023/2024

  • Teaching Mode: Blended Learning
  • Campus: Rimini
  • Degree programme: First cycle degree programme (L) in Finance, Insurance and Business (code 8872)

Learning outcomes

By the end of the course, students know the basic statistical methods for making business and financial decisions, and the models applied to accounting and financial data for credit scoring.

Course contents

  • Introduction to statistical methods for credit scoring.
  • Review of categorical variables: marginal and conditional independence, measures of association.
  • Logistic regression model: model specification, estimation and interpretation of model parameters, variable selection, goodness of fit.
  • Discriminant analysis: canonical discriminant analysis, model-based discriminant analysis.
  • Classification trees: CART and CHAID methods.
  • Methods for estimating the classification error and evaluating the performance of a classifier.
  • Basics of neural networks.
  • Basics of latent variable models: latent class analysis.
  • Review of cluster analysis and its use in credit scoring.

For each topic in the list, case studies will be analysed using the R software (at least 2 hours per week).

Readings/Bibliography

Compulsory reading

  • Elena Stanghellini (2009) "Introduzione ai metodi statistici per il credit scoring", Springer-Verlag. Available as a Unibo e-book.
  • Teacher's lecture notes, available on the Virtual Learning Environment platform at virtuale.unibo.it

Suggested textbooks

  • Stefania Mignani, Angela Montanari (1997) "Appunti di analisi statistica multivariata", Esculapio (chap. 5 discriminant analysis, chap. 7 cluster analysis)
  • Sergio Zani, Andrea Cerioli (2007) "Analisi dei dati e data mining per le decisioni aziendali", Giuffrè Editore (chap. 8 distances and similarity indexes, chap. 9 cluster analysis, chap. 11 classification trees, chap. 12 neural networks)

Teaching methods

Theoretical lectures and practical activities in the computer laboratory with R and RStudio (individual or group work).

Attendance is not mandatory but is strongly recommended.

Given the teaching methods of this course unit, all students must attend Modules 1 and 2 on Health and Safety online.

The course is part of the University's experimental teaching programme (hybrid model: 50 hours in person and 10 hours online). Detailed information will be given during the first lecture and will be available in the course introduction slides.

Assessment methods

A final written test assesses knowledge of the statistical methods from both a methodological and a practical point of view. No intermediate tests are planned.

The written test consists of open-ended items covering both statistical theory and the interpretation of output produced with R. A sample written test is included in the teaching material. The test lasts between 90 and 120 minutes. Only a pocket calculator may be used during the test.

Each item is assigned a score, and the maximum total score is 32. The final mark is expressed on a scale of 30 and is calculated as the sum of the scores obtained on the items. The "lode" (honours) is awarded to students whose total score equals 32.

The final mark corresponds to the following levels of overall achievement:

< 18: not sufficient (exam failed)

18-23: sufficient

24-25: satisfactory

26-28: good

29-30: very good

30 e lode (30 cum laude): excellent
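
As an illustration only, the scoring rule and the achievement bands above can be sketched in a few lines of Python. This is not part of the course material (the course itself uses R). The function name `final_mark`, the constants and the example item scores are hypothetical. The sketch also assumes that a total of 31 is reported as a mark of 30 without lode, which the rules above do not state explicitly.

```python
# Illustrative sketch only; all names and example scores are hypothetical.
MAX_TOTAL = 32  # maximum sum of the item scores
MAX_MARK = 30   # final marks are expressed on a scale of 30

# (lower bound of the band, description), checked from the top band down
BANDS = [
    (29, "very good"),
    (26, "good"),
    (24, "satisfactory"),
    (18, "sufficient"),
]


def final_mark(item_scores):
    """Return (mark, lode, description) for a list of item scores."""
    total = sum(item_scores)
    if total < 0 or total > MAX_TOTAL:
        raise ValueError("total score must lie between 0 and 32")

    # The lode is awarded only when the total score equals 32.
    lode = total == MAX_TOTAL

    # Assumption: a total above 30 without lode (i.e. 31) is reported as 30.
    mark = min(total, MAX_MARK)

    if lode:
        return mark, True, "excellent"
    for lower_bound, description in BANDS:
        if mark >= lower_bound:
            return mark, False, description
    return mark, False, "not sufficient (exam failed)"


# Example usage with made-up scores for a four-item test
print(final_mark([8, 8, 8, 8]))  # total 32
print(final_mark([6, 6, 6, 6]))  # total 24
```

In this sketch a total of 32 gives 30 e lode, while a total of 24 falls in the "satisfactory" band.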

Prerequisites: mathematics, statistics, probability, inference.

Teaching tools

Slides and lab material; R and RStudio software.

Office hours

See the website of Mariagiulia Matteucci.