87216 - Big Data Communications M

Course Unit Page

Academic Year 2019/2020

Learning outcomes

The course aims to provide the basic concepts of Big Data (i.e., volume, velocity, variability, variety, veracity, value). In addition, students will become familiar with three of the main paradigm shifts in communications compelled by Big Data: finite-length information theory, needed for a proper study of the machine-type communications required by the IoT; multidimensional stochastic sampling (instead of regular sampling in the time domain), required by crowdsensing; and the use of neural networks (instead of the von Neumann machine) for applications and services in wireless communications (e.g., broadband, 5G).

Course contents

0. Introduction to Big Data Communications (5 hours)

  1. introduction to Big Data (Big Data sources, companies involved, applications, how "big" Big Data are, the six "V"s) - 2.5 hours;
  2. introduction to data communications with high reliability, capacity, and spectral efficiency (performance vs spectral efficiency, distance-limited vs capacity limited systems, coverage evaluation) - 2.5 hours;

1. Neural networks for Big Data Communications (15 hours)

  1. Resource scheduling and spectrum management in 5G (machine learning for optimization) - 2 hours
  2. Machine Learning Basics - 2 hours
  3. Neural Networks (Rosenblatt's perceptron model, single layer vs multilayer, Cybenko's theorem, examples) - 2 hours
  4. Laboratory experience: "make your own neural network" - 2 hours
  5. Convolutional neural networks - 2 hours
  6. Probabilistic graphical models - 3 hours
  7. Deep Networks - 2 hours

2. Network services with Big Data (15 hours)

  1. Basic networking: network layers, SDN and VNF;
  2. Batch processing vs Stream processing: constraints and network designs;
  3. Consumer requirements and network design: the chain of value;
  4. Data center networks: Structure and components, topology, spanning trees, addressing and routing, traffic characteristics

3. Crowdsensing and spatially distributed services: analytical characterization in realistic scenarios (15 hours)

  1. Crowdsensing and environmental monitoring via wireless sensor networks: an example - 2 hours
  2. From Shannon Sampling Theory to Random Sampling Theory - 3 hours
  3. Spatial point process theory - 3 hours
  4. Signal reconstruction in multidimensional space - 3 hours
  5. Realistic scenarios for random sampling (position uncertainties, sample losses, measurement errors, inhomogeneous sample displacement) - 4 hours

4. Finite-length information theory for machine-type communications (10 hours)

  1. 5G and massive machine-type communications, Big data communications and small data networks: an apparent paradox - 1 hour
  2. Mathematical tools and basic concepts for information theory: random variables, entropy, mutual information - 2 hours
  3. Infinite-length information theory: Shannon's channel coding theorem and the concept of "channel capacity" - 2 hours
  4. Finite-length information theory: the reliability vs rate tradeoff, Gallager's random coding bound - 2 hours
  5. Finite-length information theory: Strassen's classic result for error probability, Polyanskiy's and Hayashi's recent interpretation of the concept of "channel dispersion" - 2 hours
  6. Capacity and dispersion evaluation: simple examples - 1 hour



Suggested books:

  1. I. Goodfellow et al., "Deep Learning", The MIT Press, 2016, www.deeplearningbook.org
  2. S. Theodoridis, K. Koutroumbas, "Pattern Recognition", Elsevier AP, 2009
  3. L. E. Sucar, "Probabilistic Graphical Models", Springer, 2015

Teaching methods

Traditional lectures and a hands-on laboratory activity ("make your own neural network") carried out on the teacher's and students' laptops.

Assessment methods

Oral exam at the end of the course.

Students must achieve a passing grade in each of the four main topics.

Teaching tools

Blackboard, slides, laptop.

Office hours

See the website of Flavio Zabini