48113 - Decision-Making Support Information Systems

Academic Year 2014/2015

  • Docente: Stefano Lodi
  • Credits: 10
  • SSD: ING-INF/05
  • Language: Italian
  • Teaching Mode: In-person learning (entirely or partially)
  • Campus: Bologna
  • Corso: Second cycle degree programme (LM) in STATISTICAL SCIENCES (cod. 8055)

Learning outcomes

At the end of the course the student will know the design of information system and applications on database management systems (DBMS). In particular, the student will

  • know the role and main features and functionalities of information systems in an organization
  • know a methodology for the design of databases, and how to apply it to user requirements in natural language
  • know a graphical software tool for design support
  • be able to read and interpret design schemata
  • be able to use a relational DBMS

Course contents

Relational databases

  • The relational model of data
    • Attributes
    • Relation and database schema
    • Tuple, relation and database
    • Relational algebra
  • Integrity constraints
    • Keys and superkeys
    • Referential integrity
  • DBMS architecture (outline)
  The SQL language
  • Database creation, querying and updating
  • Transaction and authentication management
Relational database design
  • Dependency theory
  • Normal forms
  • Armstrong's axioms
Entity-Relationship conceptual design
  • Entity, relationship, attribute
  • Generalization hierarchy
  • Identifier
  • ER schema design from natural language requirements
Logical design by mapping an ER schema to a relational schema
  • Selection of primary keys, elimination of generalization hierarchies, elimination of external identifiers
  • Elimination of composite and multivalued attributes
  • Translation of entities and relationships into relation schemata
Data warehousing and OLAP
  • OLTP and OLAP
  • The multidimensional model of data: Facts, measures, dimensions, hierarchies, cuboids
  • Star schema, snowflake schema, galaxy schema
  • Operations in the multidimensional model: roll-up, drill-down, slice and dice, pivot, data cube
  • Data warehouse: Definition, design, architecture
Knowledge Discovery in Databases
  • Association rule discovery
    • Classification of association rules
    • Apriori algorithm
    • FP-growth algorithm (outline)
  • Data clustering
    • One-pass algorithm
    • BIRCH algorithm
    • k-means algorithm
    • EM algorithm
  • Supervised classification
    • Classification trees
    • Support vector models
Laboratory classes
  • Relational DBMS IBM DB2, Microsoft SQL Server, Microsoft Access, SQL language.
  • Building a cube in Microsoft SQL Server
  • Data mining tasks in Microsoft SQL Server and IBM InfoSphere Warehouse Edition

Readings/Bibliography

  • Atzeni, P., Ceri, S., Paraboschi, S., & Torlone, R. (2009). Basi di dati. Modelli e linguaggi di interrogazione. Milano: McGraw-Hill.

Further recommended reading:

  • Van der Lans, R. F. (2001). Introduzione a SQL (2a ed.). Milano: Addison-Wesley.
  • Han, J., & Kamber, M. (2011). Data Mining. Concepts and Techniques. San Francisco, CA: Morgan Kaufmann.
  • Tan, P.-N., Steinbach, M., & Kumar, V. (2006). Introduction to Data Mining. Pearson, Boston.

Teaching methods

The course is composed of

  • frontal lessons in a lecture room, employing one third of the total number of course hours,
  • lessons in a laboratory, employing two thirds of the total number of course hours, each comprising both frontal expositions and exercises on the techniques for the design of databases and the solution of query and data analysis problems, presented in the expositions.

The following division of the course topics among frontal lessons in a lecture room and lessons in a laboratory will be adopted:

  • The theoretical and practical notions for the design of queries, for the management of a database, for the design of database and data warehouse schemata, and for knowledge discovery in databases are explained in frontal lessons.
  • In laboratory lessons, students are encouraged to design increasingly difficult SQL and multimensional queries and to test their correctness on the DBMS at hand, and to design the generation of data mining models using the tools integrated into the DBMSs.

Assessment methods

The examination is composed of two parts:

  • Preliminary examination on the design of SQL queries in laboratory.
    • The student is given: A hard copy text containing the description of a relational database schema expressed as CREATE TABLE statement in the SQL language and three queries expressed in natural language concerning the relation of th schema; a digital document conforming to the hard copy text.
    • The student must produce: Three queries written in the SQL language that retrieve the data required by the the queries expressed in natural language which were received in the hard copy text.
    • Notes: The student may: produce the solution on paper or as a digital document; use one or more database management systems among the ones which were employed during the course lessons to test the solution.
  • Oral examination.
    The student must answer three questions which may concerning any part of the contents of the course. In particular, the student must show: Mastery of the theoretical notions of the discipline and of the logic and mathematical formalism employed in it; ability in the design of portions of a ER schema corresponding to notable cases of design requirements expressed in natural language; mastery of the application of the logical design techniques from ER schemata to relational schemata; knowledge of the elements of data warehousing and of the data mining techniques implemented in the tools used during the lessons and the ability to use such tools.

Computation of the final mark and constraints among the examinations.

The marks of the two examinations are contained in the interval from zero to thirty, including the extremes. The mark achieved in the preliminary examination is valid until the end of the session in which the preliminary examination has been taken.The assessment of the overall outcome of the examination and the computation of the final mark take place at the end of the oral examination. The final mark is computed as a weighted average of the marks achieved in the two examinations, using the most recent valid mark for the preliminary examination, or zero if no valid mark exists. For the computation of the final mark, the following weights are used:

  • Preliminary examination on the design of SQL queries in laboratory: 12/30
  • Oral examination: 18/30

Teaching tools

  • Presentation of the course topics using a overhead projector
  • Laboratory with desktop PCs equipped with Microsoft SQL Server and Microsoft Access; teacher's PC connected to an overhead projector to guide laboratory exercises
  • Documents used in the presentations, distributed at the site http://campus.unibo.it/. Access to the documents is allowed only to students of the course who subscribed to the course mailing list. Credentials to subscribe to the list are given in the first lesson of the course.

Office hours

See the website of Stefano Lodi