96142 - BUSINESS INTELLIGENCE E BIG DATA M

Academic Year 2022/2023

  • Teacher: Stefano Rizzi
  • Credits: 6
  • SSD: ING-INF/05
  • Language: Italian
  • Modules: Stefano Rizzi (Module 1), Federico Ravaldi (Module 2)
  • Teaching Mode: Traditional lectures (Module 1), Traditional lectures (Module 2)
  • Campus: Bologna
  • Course: Second cycle degree programme (LM) in Engineering Management (code 0936)

Learning outcomes

The course investigates data-driven decision making and introduces the architectures of classical business intelligence, with specific reference to data warehousing systems and the methods for designing them. It then discusses enterprise data strategy more broadly, focusing on the methods, architectures, and technologies for big data.

Course contents

Requirements/Prior knowledge

Prior knowledge and understanding of database systems and the relational model is required to attend this course profitably. These notions are normally acquired by passing an exam in Databases or Information Systems.

Fluent spoken and written Italian is a prerequisite: all lectures, tutorials, and study material will be in Italian.

Course contents: BI module

  • the role of BI in the corporate information system
  • the BI pyramid
  • introduction to data warehousing
  • architectures
  • techniques for data analysis: reporting and OLAP
  • lifecycle:
    1. data source analysis
    2. requirement analysis
    3. conceptual design
    4. workload and data volume
    5. logical design
    6. ETL design

Course contents: BD module

  • the data revolution
  • big data: characteristics, definitions and state of the art
  • information sources: Internet of Things (IoT), Industry 4.0, ERP, Social networks, Geo Data, etc.
  • NoSQL databases
  • architectures: data platform, data lake, and cloud
  • workshop: Hadoop ecosystem & Spark
  • real-time analysis: fast data, lambda architecture, Kafka
  • data strategy
  • case studies

Readings/Bibliography

  • Course slides.
  • M. Golfarelli, S. Rizzi. Data Warehouse: teoria e pratica della progettazione. McGraw-Hill, second edition, 2006.

Recommended readings:

  • B. Devlin. Data warehouse: from architecture to implementation. Addison-Wesley Longman, 1997.
  • W.H. Inmon. Building the data warehouse. John Wiley & Sons, 1996.
  • M. Jarke, M. Lenzerini, Y. Vassiliou, P. Vassiliadis. Fundamentals of data warehouses. Springer, 2000.
  • R. Kimball, L. Reeves, M. Ross, W. Thornthwaite. The data warehouse lifecycle toolkit. John Wiley & Sons, 1998.
  • P. Sadalage, M. Fowler. NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence. Addison-Wesley, 2012.
  • T. White. Hadoop: The Definitive Guide, Storage and Analysis at Internet Scale - 4th Edition. O'Reilly Media, 2015 (suggested chapters: 1, 2, 3, 4, 19).

Teaching methods

  • Classroom lectures and exercises are supported by slides (PC and projector presentations). Lectures may be delivered online or in mixed mode using the Teams platform.
  • The program will be complemented by lectures from enterprise consultants.
  • Group exercises on virtual collaborative blackboards.

Assessment methods

The final exam aims to verify the knowledge acquired by the student with regard to the specific contents of each teaching module. Exams will be held in person or online, depending on the health situation and the provisions of the University. If held online, the Zoom and EOL tools will be used.

Final examinations for the two modules are carried out separately. Each module examination consists of a 1-hour written test to be taken without the aid of books or written notes. For the Data Warehousing module, the test comprises a practical part, involving the solution of data warehouse conceptual and logical design exercises, and a theoretical part, including questions on the whole course program. For the Big Data module, the test comprises open-answer questions on theoretical and design course contents. Further details may be communicated during the lectures and in the "notes" appearing in the exam sessions published on AlmaEsami. Each module test is passed with a score of at least 18/30; the maximum score is 31/30.

To take an exam, registration through AlmaEsami is required within the assigned deadlines. Students who are unable to register by the deadline must promptly (and in any case before the registration lists officially close) notify the teaching secretariat of the issue. The teacher may then decide whether to admit them to the test. The tests of the two component modules can be taken in the same or in different scheduled sessions, in any order. Once the outcome of a test has been published, each student has one week to notify the teacher by email of his/her intention to reject the mark obtained. The final exam grade of the course is the average of the scores obtained for the two component modules.
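
As an illustration of the grading arithmetic described above, the following minimal Python sketch computes the course grade from the two module scores. The function name `final_grade` and the constants are hypothetical, introduced only for this example. The sketch assumes that a course grade exists only when both module tests are passed, and it returns the plain average without rounding, since the rounding policy is not stated here.

```python
from typing import Optional

PASS_MARK = 18  # minimum passing score for a module test (out of 30)
MAX_MARK = 31   # maximum score for a module test, as stated in the text


def final_grade(bi_score: float, bd_score: float) -> Optional[float]:
    """Average the two module scores.

    Returns None if either module test is not passed (assumption: both
    modules must be passed before a course grade is computed).
    """
    for score in (bi_score, bd_score):
        if not 0 <= score <= MAX_MARK:
            raise ValueError(f"score {score} is outside 0..{MAX_MARK}")
    if bi_score < PASS_MARK or bd_score < PASS_MARK:
        return None
    # Unrounded average: the rounding rule is not specified in this page.
    return (bi_score + bd_score) / 2


if __name__ == "__main__":
    print(final_grade(24, 28))
    print(final_grade(17, 30))
```

For example, module scores of 24 and 28 give an average of 26, while a score of 17 in either module yields no course grade under the assumption above.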

To obtain a passing grade, students must demonstrate at least knowledge of the key concepts of the subject, independent design skills, and a comprehensible use of technical language. Higher grades will be awarded to students who demonstrate an organic understanding of the subject, a clear and concise presentation of the contents, strong problem-solving ability, and consistent design capabilities.

Teaching tools

Materials on the course topics are available on the Virtual platform.

Teams platform for remote teaching.

Miro platform for exercises.

Office hours

See the website of Stefano Rizzi

See the website of Federico Ravaldi