- Teacher: Matteo Francia
- Credits: 6
- SSD: ING-INF/05
- Teaching language: English
- Modules: Enrico Gallinucci (Module 1), Matteo Francia (Module 2)
- Teaching mode: Traditional, in-person lectures (Module 1); traditional, in-person lectures (Module 2)
- Campus: Cesena
- Degree programme: Second-cycle degree (Laurea Magistrale) in Digital Transformation Management (code 5815)
- Course timetable (Module 1): from 19/09/2023 to 25/10/2023
- Course timetable (Module 2): from 14/11/2023 to 13/12/2023
Learning outcomes
At the end of the course, the student:
- Knows the applications of Big Data technologies and the respective challenges
- Knows the hardware and software architectures proposed to handle Big Data
- Knows the techniques to store the data and the fundamental aspects of new-generation database systems
- Knows the programming paradigms generally adopted in this kind of system and the main analysis methodologies (batch, interactive, streaming)
- Learns the design patterns that regulate the deployment of complex ICT solutions in the Cloud
- Learns some of the most relevant components of the Cloud Platforms, with a specific focus on those services that enable Big Data management and IoT applications
- Is able to make decisions concerning the appropriate Cloud Platform and the related services to be adopted
- Knows the billing models that lie behind Cloud Computing services and learns how to estimate the cost of a specific solution, to support project management, to prepare quotations, or to support the management control system
- Acquires practical expertise through laboratory activities in using some of the main open-source Big Data software tools, as well as some of the most widely adopted Cloud Computing services available on the market
Course contents
Big Data Architectures and Paradigms
- Hardware infrastructures and software architectures
- Data storage in distributed file systems and NoSQL databases
- The MapReduce programming paradigm
- Main principles of application design and optimization based on Apache Spark
- Architectures and algorithms to handle streams of data
Handling Big Data in the Cloud
- Introduction to data platforms: shifting from databases to well-integrated data ecosystems
- Definition of cloud and taxonomy of cloud services
- Introduction to the most relevant Cloud Platforms, with a specific focus on those services that enable data platforms and IoT applications
- Introduction to the billing models that lie behind Cloud Computing services; cluster migration; on-premises vs. cloud clusters
- Deployment of real case studies on a cloud provider
Seminars by companies working with cloud and big data platforms
Readings/Bibliography
- Slides
Recommended readings:
- Ian Foster, Dennis Gannon. Cloud Computing for Science and Engineering. MIT Press, 2017
- Danil Zburivsky, Lynda Partner. Designing Cloud Data Platforms. Simon and Schuster, 2021
- Tom White. Hadoop - The Definitive Guide (4th edition). O'Reilly, 2015
- Matei Zaharia, Holden Karau, Andy Konwinski, Patrick Wendell. Learning Spark, 2nd Edition. O'Reilly, 2020
- Andrew G. Psaltis. Streaming Data - Understanding the real-time pipeline. Manning, 2017
Further readings will be mentioned during the course.
Teaching methods
Lessons and (mainly guided) practical exercises.
Because of the teaching methods of this course unit, all students must attend Modules 1 and 2 of the online Health and Safety course.
Assessment methods
The exam consists of an oral examination on all the covered topics.
Teaching tools
Cloud/big data services are accessed through Amazon Web Services and/or Google Cloud Platform via coupons.
Office hours
See the website of Matteo Francia
See the website of Enrico Gallinucci
SDGs
This course contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.