You are here:

48113 - Decision-Making Support Information Systems

Academic Year 2014/2015

Docente: Stefano Lodi
Credits: 10
SSD: ING-INF/05
Language: Italian

Teaching Mode: In-person learning (entirely or partially)
Campus: Bologna
Corso: Second cycle degree programme (LM) in STATISTICAL SCIENCES (cod. 8055)

Learning outcomes

At the end of the course the student will know the design of information system and applications on database management systems (DBMS). In particular, the student will

know the role and main features and functionalities of information systems in an organization
know a methodology for the design of databases, and how to apply it to user requirements in natural language
know a graphical software tool for design support
be able to read and interpret design schemata
be able to use a relational DBMS

Course contents

Relational databases

The relational model of data

Attributes
Relation and database schema
Tuple, relation and database
Relational algebra

Integrity constraints

Keys and superkeys
Referential integrity

DBMS architecture (outline)

The SQL language

Database creation, querying and updating
Transaction and authentication management

Relational database design

Dependency theory
Normal forms
Armstrong's axioms

Entity-Relationship conceptual design

Entity, relationship, attribute
Generalization hierarchy
Identifier
ER schema design from natural language requirements

Logical design by mapping an ER schema to a relational schema

Selection of primary keys, elimination of generalization hierarchies, elimination of external identifiers
Elimination of composite and multivalued attributes
Translation of entities and relationships into relation schemata

Data warehousing and OLAP

OLTP and OLAP
The multidimensional model of data: Facts, measures, dimensions, hierarchies, cuboids
Star schema, snowflake schema, galaxy schema
Operations in the multidimensional model: roll-up, drill-down, slice and dice, pivot, data cube
Data warehouse: Definition, design, architecture

Knowledge Discovery in Databases

Association rule discovery

Classification of association rules
Apriori algorithm
FP-growth algorithm (outline)

Data clustering

One-pass algorithm
BIRCH algorithm
k-means algorithm
EM algorithm

Supervised classification

Classification trees
Support vector models

Laboratory classes

Relational DBMS IBM DB2, Microsoft SQL Server, Microsoft Access, SQL language.
Building a cube in Microsoft SQL Server
Data mining tasks in Microsoft SQL Server and IBM InfoSphere Warehouse Edition

Readings/Bibliography

Atzeni, P., Ceri, S., Paraboschi, S., & Torlone, R. (2009). Basi di dati. Modelli e linguaggi di interrogazione. Milano: McGraw-Hill.

Further recommended reading:

Van der Lans, R. F. (2001). Introduzione a SQL (2a ed.). Milano: Addison-Wesley.
Han, J., & Kamber, M. (2011). Data Mining. Concepts and Techniques. San Francisco, CA: Morgan Kaufmann.
Tan, P.-N., Steinbach, M., & Kumar, V. (2006). Introduction to Data Mining. Pearson, Boston.

Teaching methods

The course is composed of

frontal lessons in a lecture room, employing one third of the total number of course hours,
lessons in a laboratory, employing two thirds of the total number of course hours, each comprising both frontal expositions and exercises on the techniques for the design of databases and the solution of query and data analysis problems, presented in the expositions.

The following division of the course topics among frontal lessons in a lecture room and lessons in a laboratory will be adopted:

The theoretical and practical notions for the design of queries, for the management of a database, for the design of database and data warehouse schemata, and for knowledge discovery in databases are explained in frontal lessons.
In laboratory lessons, students are encouraged to design increasingly difficult SQL and multimensional queries and to test their correctness on the DBMS at hand, and to design the generation of data mining models using the tools integrated into the DBMSs.

Assessment methods

The examination is composed of two parts:

Preliminary examination on the design of SQL queries in laboratory.

The student is given: A hard copy text containing the description of a relational database schema expressed as CREATE TABLE statement in the SQL language and three queries expressed in natural language concerning the relation of th schema; a digital document conforming to the hard copy text.
The student must produce: Three queries written in the SQL language that retrieve the data required by the the queries expressed in natural language which were received in the hard copy text.
Notes: The student may: produce the solution on paper or as a digital document; use one or more database management systems among the ones which were employed during the course lessons to test the solution.

Oral examination.
The student must answer three questions which may concerning any part of the contents of the course. In particular, the student must show: Mastery of the theoretical notions of the discipline and of the logic and mathematical formalism employed in it; ability in the design of portions of a ER schema corresponding to notable cases of design requirements expressed in natural language; mastery of the application of the logical design techniques from ER schemata to relational schemata; knowledge of the elements of data warehousing and of the data mining techniques implemented in the tools used during the lessons and the ability to use such tools.

Computation of the final mark and constraints among the examinations.

The marks of the two examinations are contained in the interval from zero to thirty, including the extremes. The mark achieved in the preliminary examination is valid until the end of the session in which the preliminary examination has been taken.The assessment of the overall outcome of the examination and the computation of the final mark take place at the end of the oral examination. The final mark is computed as a weighted average of the marks achieved in the two examinations, using the most recent valid mark for the preliminary examination, or zero if no valid mark exists. For the computation of the final mark, the following weights are used:

Preliminary examination on the design of SQL queries in laboratory: 12/30
Oral examination: 18/30

Teaching tools

Presentation of the course topics using a overhead projector
Laboratory with desktop PCs equipped with Microsoft SQL Server and Microsoft Access; teacher's PC connected to an overhead projector to guide laboratory exercises
Documents used in the presentations, distributed at the site http://campus.unibo.it/. Access to the documents is allowed only to students of the course who subscribed to the course mailing list. Credentials to subscribe to the list are given in the first lesson of the course.

Office hours

See the website of Stefano Lodi