Academic Year 2022/2023

  • Instructor: Davide Salomoni
  • Credits: 4
  • Scientific disciplinary sector (SSD): FIS/01
  • Teaching language: English
  • Teaching mode: Conventional - In-person lectures
  • Campus: Bologna
  • Degree programme: Laurea Magistrale (second-cycle degree) in Bioinformatics (code 8020)

Learning outcomes

At the end of the course, the student has practical and theoretical knowledge of distributed computing and storage infrastructures, cloud computing and virtualization, and parallel computing, and of their application to Big Data analysis.


The course "Infrastructures for Big Data Processing" (BDP2) builds on the course "Introduction to Big Data Processing Infrastructures" (BDP1). Before taking this course, students should have completed BDP1, or at least be familiar with the topics covered there.

The BDP2 course first recaps the foundations of Cloud computing and storage services beyond IaaS (PaaS and SaaS). It then discusses how to exploit distributed infrastructures to deploy applications and process big data.

A distinctive feature of BDP2 is its substantial number of hands-on sessions, which connect directly to the theoretical parts. In this way, students immediately apply the concepts presented in the lectures to real-world use cases. To benefit fully from this method, students are strongly advised to attend all lectures.

A prerequisite for this course is that each student brings their own laptop to the lectures. The laptop should run Microsoft Windows, Linux or macOS. Tablets are not supported. University of Bologna credentials are required to access the course material and the computing facilities used during the course.

Introduction to BDP2

  • Course introduction and objectives
  • Clouds beyond the IaaS: general points
  • How to use the Cloud infrastructure for this course

Cloud Storage

  • File systems and POSIX storage
  • The Network File System (NFS)
  • Object storage, the REST architecture and the JSON format
  • Virtual file systems
  • Simple examples of local and remote data processing

Advanced Docker Containers

  • Recap of basic concepts about containers (from BDP1)
  • Networking in containers
  • Process management, logging and security
  • A complete application development workflow

Authentication and Authorization

  • Principles of Cloud authentication and authorization
  • X.500, LDAP, Radius, Kerberos
  • X.509 and public-key cryptography
  • OAuth and OpenID-Connect
  • Adapting an application to use INDIGO-IAM

Cloud Automation

  • What is Cloud Automation
  • Microservices and monoliths
  • The DevOps concept
  • Container orchestration: Docker Swarm and Kubernetes, with extensive hands-on sessions on Kubernetes
  • Infrastructure as Code: serverless technologies
  • Template-based orchestration of applications
  • Function as a Service (FaaS): hands-on, with the development and deployment of a simple FaaS application targeted to bioinformatics
  • Cloud automation and Machine Learning



Course material will be shared with students; external MOOCs and books will also be suggested during the course.

Teaching methods

The teaching method rests on theoretical foundations, complemented extensively by practical considerations on real infrastructures used for big data processing and by several hands-on sessions.

It is strongly recommended to attend all lectures.

Because of the type of activities and teaching methods involved, all students must complete the following e-learning Modules 1 and 2 before attending this course:

Module 1 – Safety General Training

Module 2 – Safety Specific Training (part I)

Assessment methods

There will be intermediate mock exams during the course. The final exam will cover all the topics presented during the course and will consist of an oral discussion.

Teaching tools

Slides for the theory; real-world infrastructures for the hands-on sessions.

Note that a personal laptop (running Windows, Linux or macOS; no tablets) is required to follow both lectures and hands-on sessions.

Office hours

See the website of Davide Salomoni


Good health and well-being; Quality education; Industry, innovation and infrastructure; Partnerships for the goals

This course contributes to the pursuit of the Sustainable Development Goals of the UN 2030 Agenda.