40720 - Data Mining

Academic Year 2017/2018

  • Teacher: Ida D'Attoma
  • Credits: 6
  • SSD: SECS-S/06
  • Language: English
  • Teaching Mode: Traditional lectures
  • Campus: Forlì
  • Course: Second cycle degree programme (LM) in Economics and management (cod. 9203)

Learning outcomes

This course presents the main statistical methods used for knowledge discovery in business databases. Special attention is paid to techniques that help identify relationships of interdependence and patterns in business phenomena. In particular, the course seeks to enable the student to:

  • correctly plan a data mining process;
  • choose the statistical methodology best suited to the problem at hand;
  • critically interpret empirical results;
  • use these results in the business decision process.

Course contents

  1. Introduction to data mining.

  2. Organization of data: data objects and attribute types, data matrices and their transformations.

  3. Data preprocessing and exploratory analysis: data cleaning, bivariate exploratory analysis of qualitative and quantitative data.

  4. Measures of Distance.

  5. Hierarchical and non-hierarchical Cluster Analysis.

  6. Classification and prediction methods: an introduction to logistic regression. 

Readings/Bibliography

  • Stéphane Tufféry. Data Mining and Statistics for Decision Making. 2011. John Wiley & Sons.

Teaching methods

The module consists of theoretical sessions on methods and practical tutorials devoted to applications on real economic data, using the SAS statistical software.

Assessment methods

Written exam consisting of a multiple-choice section and a section requiring the production and interpretation of statistical outputs. The multiple-choice section tests the student's knowledge of the theoretical topics. The second section tests the ability to produce and interpret statistical outputs and to translate them into applied conclusions.

Teaching tools

SAS software demonstrations of data analysis will be provided. Notes can be downloaded from the lecturer's web page.

Office hours

See the website of Ida D'Attoma