- Instructor: Daniele Cesini
- Credits: 4
- SSD: FIS/01
- Language of instruction: English
- Teaching mode: Traditional - In-person lectures
- Campus: Bologna
- Degree programme: Second-cycle degree (Laurea Magistrale) in Bioinformatics (code 8020)
Learning outcomes
At the end of the course, the student has basic theoretical and practical knowledge of infrastructures for scientific computing, distributed and parallel systems, batch systems and security technologies.
Course contents
The course will provide the basic concepts of infrastructures for Big Data processing, including Cloud computing at the Infrastructure-as-a-Service level. The course will start with a description of the building blocks of modern data centers and how they are abstracted by the Cloud paradigm. A real-life computational challenge will be presented, and during the course students will create a cloud-based computing model to solve it. A very brief introduction to High Performance Computing (HPC) will also be given. Notions about the emerging “fog” and “edge” computing paradigms and how they are linked to Cloud infrastructures will conclude the course.
Program:
1) Introduction to the course and the computational challenge
- Introduction to Big Data
- Presentation of the computational challenge that will accompany us during the course.
Hands on:
- Setup of the testbed for exercises
2) From your laptop to the datacenter - datacenter building blocks
- CPU Farm
i. Batch system, queues, allocation policies, quotas, etc.
- Storage
I. DAS vs NAS
II. SAN
III. TAN
IV. Parallel FS
V. Data lifecycle, QoS
- Migration, recall, ACL
- Network: main protocols (eth, infiniband, fc)
- Monitoring and Provisioning
Hands on: Job submission on a small cluster already available to students
3) Infrastructures for Parallel Computing
HTC vs HPC
HTC
- Distributed systems
- Grid Computing
HPC
- Shared memory vs distributed memory
- OpenMP/Open MPI
- Accelerators for parallel computing
- Hybrid and non-standard resources
Energy efficiency and Low-power computing
- Towards exascale computing
Hands on: Live demo - Creation of speedup curves for the NAMD STMV/ApoA1 use cases. Computing on a GPU. Computing on low-power systems.
4) Cloud IaaS
Cloud Computing: Introduction
Cloud IaaS
i. Advantages and Disadvantages
ii. Application Porting to the Cloud
iii. Openstack introduction
iv. Amazon vs Openstack
Cloud Storage - provisioning of block devices and POSIX file systems
Hands on: IaaS instantiation with Openstack - create the infrastructure to run the course exercises
Instantiation of multiple machines - experience with cloud elasticity - Create a mini-cluster - Run the course exercise on that cluster
Create storage volumes on the Cloud and make them available to the cluster
5) Creating a computing model in distributed infrastructures and multi-site Clouds
Job Submission strategies
i. Push vs pull
ii. Compute driven model
iii. Workload Management Systems
Data Management strategies
i. Replicas, QoS
ii. Data driven computing models
Failover and Disaster Recovery strategies
6) Computing Continuum
- Low Power devices
- Introduction to Edge Computing
- Introduction to Fog Computing
- The Computing Continuum for Big Data Infrastructures
For interested students, the course will include a visit to the INFN-CNAF datacenter in Bologna.
Readings/Bibliography
Course material will be shared; external MOOCs and books will also be suggested during the course.
Teaching methods
Teaching will combine theoretical foundations with practical considerations drawn from real infrastructures used for big data processing, along with hands-on sessions.
Assessment methods
There will be an oral exam, focusing on the topics presented during the course.
Students will be requested to prepare a small project that will be discussed during the exam.
Teaching tools
Slides for the theory, use of real-world infrastructures for the hands-on sessions
Office hours
See the website of Daniele Cesini