- Teacher: Matteo Golfarelli
- Credits: 6
- SSD: ING-INF/05
- Language: Italian
- Modules: Matteo Golfarelli (Module 1), Gianluca Moro (Module 2)
- Teaching Mode: Traditional lectures (Module 1), Traditional lectures (Module 2)
- Campus: Cesena
- Course: Second cycle degree programme (LM) in Computer Science and Engineering (code 8614)
Learning outcomes
After the course, the student:
- knows the main data and text mining techniques;
- knows the methodologies for handling a mining project;
- develops practical skills in the analysis and interpretation of results through practical exercises with commercial and/or open source tools.
Course contents
1. Introduction to Data Mining: areas of applicability
2. The knowledge discovery process
o Designing a Data Mining Process
o The CRISP-DM methodology
3. Understanding and preparing data
o Features of different data types
o Statistical data analysis
o Data quality
o Preprocessing: attributes selection and creation
o Measuring similarities and dissimilarities
4. Data mining techniques
o Classification through decision trees and Bayesian networks
o Association rules
o Clustering
o Outlier detection
5. Text Mining techniques
o Information Retrieval for Text Mining
o Text categorization
o Opinion Mining
6. Data understanding and validation
7. The Weka software [http://www.cs.waikato.ac.nz/ml/weka/]
8. Case studies analysis
Readings/Bibliography
Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Introduction to Data Mining. Pearson International, 2006.
Christopher Manning, Hinrich Schütze, Prabhakar Raghavan. Introduction to Information Retrieval. Cambridge University Press, 2008.
Teaching methods
Lectures and practical exercises
Assessment methods
The assessment consists of an oral examination and the discussion of a project. The project must be agreed upon with the lecturer and can be either the implementation of a mining algorithm or the analysis of a dataset using data and text mining techniques.
The goal of the assessment is to verify comprehension of the techniques studied, as well as the practical ability to analyze data and to understand and discover the information hidden in it.
Teaching tools
Practical exercises will be carried out using the open source tools Weka and R.
Links to further information
http://bias.csr.unibo.it/golfarelli/DataMining/
Office hours
See the website of Matteo Golfarelli
See the website of Gianluca Moro