87195 - Lab of Big Data Architectures M

Course Unit Page

Academic Year 2019/2020

Learning outcomes

The Lab of Big Data Architectures extends and integrates what the student has learnt in the course “statistics and architectures for big data processing” with more in-depth, practical knowledge of big data technologies and architectures. Students will learn how to design a big data system, the key concepts and differentiators behind state-of-the-art technologies and architectures, and how to use them effectively. This is done through a series of practical exercises with interactive explanations, in which students learn by solving practical problems and working through examples.

Course contents

Configuring a Python environment

Connecting to a remote Big-Data Cluster

Creating a Big-Data Pipeline

Working with large datasets: from Pandas data frame to Spark data frame

Machine learning on large-scale time-series datasets

Teaching methods

The class consists of a set of practical tutorials and assignments carried out on the student's own laptop and on a remote big data cluster hosted by the Italian Supercomputing Centre CINECA.

Assessment methods

Based on students' reports

Office hours

See the website of Andrea Bartolini