87195 - Lab of Big Data Architectures M



This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.

Industry, innovation and infrastructure

Academic Year 2022/2023

Learning outcomes

The Lab of Big Data Architectures extends and integrates what students learned in the course “Statistics and Architectures for Big Data Processing” with more in-depth, practical knowledge of big data technologies and architectures. Students will learn how to design a big data system, the key concepts and differentiators behind state-of-the-art technologies and architectures, and how to use them effectively. This is done through a series of practical exercises with interactive explanations, in which students learn by solving practical problems and working through examples.

Course contents

The course follows Brendan Gregg's approach to system performance monitoring and optimization. It is intended to give students practical knowledge of monitoring and optimizing the performance of Linux-based big data, cloud, and HPC computing systems.

1. Working with a big data cluster: practical experience connecting to the Monte Cimone cluster; installing and running simple benchmarks (STREAM, HPL) with Spack.

2. System performance and O.S. basics: the Utilization, Saturation, and Errors (USE) method; Linux O.S. basics.

3. Observability tools: Linux perf, Ftrace, eBPF, performance counters, flame graphs, the roofline model.

4. CPU profiling: exercises

5. Memory profiling: exercises

6. File system and Disk profiling: exercises

7. Network profiling: exercises
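
As an illustration of the USE method named in point 2, the following minimal Python sketch checks each resource for errors, utilization, and saturation. It is not course material: the `ResourceSample` type, the `use_check` function, the sample values, and the 0.7 utilization threshold are all hypothetical and chosen only for the example. On a real system the numbers would come from observability tools such as those listed in point 3.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One observation of a resource (hypothetical structure for this sketch)."""
    name: str
    utilization: float  # fraction of time the resource was busy, 0.0 to 1.0
    saturation: float   # amount of queued work, e.g. runnable tasks waiting per CPU
    errors: int         # number of error events observed


def use_check(sample: ResourceSample, util_threshold: float = 0.7) -> list[str]:
    """Return which of the three USE metrics look suspicious for one resource.

    The threshold is an illustrative assumption, not a recommended value.
    """
    findings = []
    if sample.errors > 0:
        findings.append("errors")
    if sample.utilization >= util_threshold:
        findings.append("utilization")
    if sample.saturation > 0:
        findings.append("saturation")
    return findings


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    samples = [
        ResourceSample("cpu", utilization=0.95, saturation=2.0, errors=0),
        ResourceSample("memory", utilization=0.40, saturation=0.0, errors=0),
        ResourceSample("network", utilization=0.10, saturation=0.0, errors=3),
    ]
    for s in samples:
        findings = use_check(s)
        status = ", ".join(findings) if findings else "ok"
        print(f"{s.name}: {status}")
```

The point of the sketch is the iteration pattern: every resource gets the same three questions, so no resource is skipped because another one already looks like the bottleneck.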


Readings/Bibliography

Brendan Gregg, Systems Performance: Enterprise and the Cloud, 2nd Edition (2020)

Teaching methods

The class consists of a set of practical tutorials and assignments carried out on students' own laptops and on the remote Monte Cimone cluster.

Assessment methods

Assessment is based on a report in which the student applies the system profiling methods covered in class to an application or benchmark of their choice.

Teaching tools

The course will be conducted on the Monte Cimone cluster.

Given the type of activity and the teaching methods adopted, attending this course unit requires all students to have first completed Modules 1 and 2 of the training on safety in study places, delivered in e-learning mode.

Office hours

See the website of Andrea Bartolini