- Teacher: Stefano Lodi
- Credits: 5
- SSD: ING-INF/05
- Language: Italian
- Teaching Mode: Traditional lectures
- Campus: Rimini
- Course: Second cycle degree programme (LS) in Business Information Systems (code 0366)
Learning outcomes
The objective of the course is to describe and explain data
warehousing technologies and the process of knowledge discovery in
databases, and their role in business intelligence. In particular,
prominent data mining algorithms are described, and their
operational features and application prerequisites are analysed.
The differences among the computational requirements of centralized,
distributed, and stream data environments, and the basic techniques
for meeting them, are discussed. The course includes laboratory
classes in which implementations of the most popular data mining
algorithms are applied to real data sets.
Course contents
Data warehousing. Data warehousing and business
intelligence. OLTP and OLAP. Architecture of a data warehouse.
Schemata and operations in a data warehouse.
Knowledge discovery. Knowledge discovery and business
intelligence. The knowledge discovery process.
- Data mining algorithms: association rules (the APRIORI algorithm); clustering vector data (one-pass algorithms, the BIRCH algorithm); density-based clustering (the DBSCAN algorithm, the DENCLUE algorithm); clustering categorical data (outline); clustering metric data (outline)
- Algorithms for distributed data mining: privacy and cooperation in mining distributed data; distributed algorithms for association rules, for clustering, and for classification
- Algorithms for mining data streams: computation of aggregates on stream data; clustering data streams (pyramidal time frames and the CluStream algorithm)
- Laboratory classes: IBM Intelligent Miner, Microsoft SQL Server
Readings/Bibliography
Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, The Morgan Kaufmann Series in Data Management Systems, Jim Gray, Series Editor. Morgan Kaufmann Publishers, August 2000. 550 pages. ISBN 1-55860-489-8
Teaching methods
The operation, limits of applicability, and computational complexity
of the most prominent data mining algorithms are explained in
lectures. In laboratory classes, commercial software tools are
applied to real and artificial data sets, and the results are
discussed collectively.
Assessment methods
- Practical examination with data warehousing and data mining tools
- Oral examination
Teaching tools
- PC and overhead projector
- Laboratory with desktop PCs
- Software: IBM DB2 Express-C (database management system), IBM Intelligent Miner, Microsoft SQL Server
Links to further information
http://www-db.deis.unibo.it/~slodi/SISD/2008-2009/sisd.html
Office hours
See the website of Stefano Lodi