40720 - Data Mining

Academic Year 2020/2021

  • Teacher: Ida D'Attoma
  • Credits: 6
  • SSD: SECS-S/03
  • Language: English
  • Teaching Mode: Traditional lectures
  • Campus: Forli
  • Course: Second cycle degree programme (LM) in Economics and management (cod. 9203)

    Also valid for Second cycle degree programme (LM) in Economics and management (cod. 9203)

Learning outcomes

This course presents the main statistical methods used for knowledge discovery in business databases; special attention is paid to techniques that help identify relationships of interdependence and patterns in business phenomena. In particular, the course seeks to enable the student to:

  • correctly plan a data mining process;
  • choose the statistical methodology best suited to the problem at hand;
  • critically interpret empirical results;
  • use these results in the business decision process.

Course contents

  1. Introduction to data mining.

  2. Organization of data: data objects and attribute types, data matrices and their transformations.

  3. Data Preprocessing and Exploratory Analysis: data cleaning, bivariate exploratory analysis of qualitative and quantitative data.

  4. Proximity measures: distance and similarity.

  5. Hierarchical and Non-Hierarchical Cluster Analysis.

  6. Classification and prediction methods: an introduction to logistic regression. 

Readings/Bibliography

Lectures are based on selected material from the textbook listed below:

  • Tufféry, S. (2011) Data Mining and Statistics for Decision Making. John Wiley & Sons, Ltd.  

You can check its availability at: 

  • http://sol.unibo.it/SebinaOpac/Opac

Additional teaching material will be made available to students through the e-learning platform.

Teaching methods

Lectures cover theoretical and applied aspects of the various data mining methods. After each theoretical session, a practical tutorial is devoted to applications on real economic data. Applications are discussed and replicated during the computer laboratory session using SAS statistical software.

Self-evaluation tests will be given in class and online (through the e-learning platform).

Assessment methods

Attending and non-attending students take a written examination consisting of a multiple-choice section (1/3 of the grade) and a section requiring the production and interpretation of statistical outputs (2/3 of the grade). The multiple-choice section tests the student's knowledge of the theoretical topics. The second section tests the ability to produce and interpret statistical outputs and to translate them into applied conclusions. Typical exam questions will be made available during the course. All students are given tasks of the same difficulty and the same amount of time. The exam is a 2-hour written test with 6 multiple-choice questions on theory and 2 or 3 practical exercises using the SAS software. It is held in the computer lab. The points awarded for a correct answer to each question will be made available. The exam is "closed-book": students are not allowed to consult references or theoretical information sources while performing the tasks, but they may consult the SAS programs prepared by the teacher. Detailed information on the registration and preservation of marks will be given on the first day of class, in accordance with the university guidelines on the preservation of marks obtained in single disciplines of integrated courses.

Check the virtual space for details.

Teaching tools

  • Lecture notes, additional teaching material, exercises, typical exam questions, SAS software demonstrations on data analysis, and online self-evaluation tests will be made available through the e-learning platform.
  • SAS 9.4 software, available at TH (Room 5) and at LABIC.
  • Kahoot software.

Office hours

See the website of Ida D'Attoma

SDGs

Quality education

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.