87216 - Big Data Communications M

Academic Year 2023/2024

  • Teacher: Flavio Zabini
  • Credits: 6
  • SSD: ING-INF/03
  • Language: English

Learning outcomes

The course aims to provide the basic concepts of Big Data (i.e., volume, velocity, variability, variety, veracity, value). In addition, students will become familiar with three of the main paradigm shifts in communications compelled by Big Data: finite-length information theory, for a proper study of the machine-type communications required by the IoT; multidimensional stochastic sampling (instead of regular sampling in the time domain), required by crowdsensing; and the use of neural networks (instead of the von Neumann machine) for applications and services in wireless communications (e.g., broadband, 5G).

Course contents

1. Introduction and Examples (6 hours)

  1. Considerations and scenarios. What Big Data means (the six "V"s) - 1 hour
  2. Examples of paradigm shifts: from the von Neumann machine to neural networks; from regular sampling theory to stochastic sampling theory - 1 hour
  3. Examples of environmental monitoring (wide ground with real or virtual sensors) and coverage - 4 hours

 

2. Network services with Big Data (6 hours)

  1. Basic networking: network layers, SDN and VNF;
  2. Batch processing vs Stream processing: constraints and network designs;
  3. Consumer requirements and network design: the chain of value;
  4. Data center networks: Structure and components, topology, spanning trees, addressing and routing, traffic characteristics

 

3. Random Sampling and Reconstruction with Big Data (20 hours)

  1. Regular and irregular sampling. Cauchy formulation. WKS Sampling Theorem. Levinson Theorem - 1 hour
  2. From Shannon Sampling Theory to Random Sampling Theory: WKS Sampling Theorem revisited - 2 hours
  3. One-dimensional Poisson sampling (problem formulation, Poisson sampling process in the time domain, Marvasti's spectral theorem) - 2 hours
  4. Spatial Point Process Theory - 3 hours
  5. Multidimensional Signal Reconstruction via random sampling - 3 hours
  6. Uncertainties in realistic scenarios and Big Data related topics - 2 hours
  7. Application to the example of environmental monitoring - 3 hours
  8. Application to the example of coverage - 3 hours
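
As an informal illustration of the one-dimensional Poisson sampling topic listed above (not part of the official course material), the sketch below draws sampling instants from a homogeneous Poisson process by accumulating exponential inter-arrival times, then samples a signal at those instants. The rate, duration, test signal, and function names are all assumptions chosen for the example.

```python
import math
import random


def poisson_sampling_instants(rate, duration, seed=0):
    """Sampling instants of a homogeneous Poisson process on [0, duration).

    `rate` is the average number of samples per unit time (assumed value).
    """
    rng = random.Random(seed)
    instants = []
    # Inter-arrival times of a Poisson process are exponential with mean 1/rate.
    t = rng.expovariate(rate)
    while t < duration:
        instants.append(t)
        t += rng.expovariate(rate)
    return instants


def sample_signal(signal, instants):
    """Evaluate `signal` at irregular instants, returning (time, value) pairs."""
    return [(t, signal(t)) for t in instants]


# Hypothetical example: a 1 Hz sinusoid sampled at an average of 50 samples/s.
DURATION = 10.0
instants = poisson_sampling_instants(rate=50.0, duration=DURATION)
samples = sample_signal(lambda t: math.sin(2 * math.pi * 1.0 * t), instants)
mean_rate = len(instants) / DURATION  # empirical average sampling rate
```

Unlike regular sampling, the spacing between consecutive instants here is random; only the average rate is fixed.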

 

4. Neural networks for Big Data Communications (10 hours)

  1. Introduction to Neural Networks: components and architectures. Rosenblatt's perceptron model - 1 hour
  2. Multilayer perceptron model. Cybenko's theorem - 1 hour
  3. Logical functions (AND, OR, XOR) with neural networks: exercises. Examples of activation functions - 2 hours
  4. Supervised learning and approximation. Delta rule: proof. Example of supervised learning: the least mean square error problem with a perceptron - 2 hours
  5. Backpropagation. Limitations of traditional networks - 1 hour
  6. Laboratory experience ("make your own neural network") - 3 hours
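
To give a flavour of the supervised-learning topics above, here is a minimal sketch (an illustration, not the course's laboratory code) of the delta rule applied to a single linear unit that learns the logical AND in the least-mean-square sense. The learning rate, number of epochs, the 0.5 decision threshold, and all names are assumptions.

```python
def train_delta_rule(patterns, eta=0.05, epochs=2000):
    """Train a single linear unit y = w.x + b with the delta (Widrow-Hoff) rule."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, target in patterns:
            y = w[0] * x[0] + w[1] * x[1] + b  # linear output
            err = target - y                   # error term (the "delta")
            # Move each weight along the negative gradient of the squared error.
            w[0] += eta * err * x[0]
            w[1] += eta * err * x[1]
            b += eta * err
    return w, b


# Truth table of the logical AND: (inputs, target).
AND_PATTERNS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w, b = train_delta_rule(AND_PATTERNS)


def predict(x):
    """Threshold the linear output at 0.5 (assumed) to obtain a binary decision."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0.5 else 0


predictions = [predict(x) for x, _ in AND_PATTERNS]
```

The same single unit cannot reproduce XOR, which is not linearly separable; that case needs at least one hidden layer.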

 

5. Deep Neural Networks (3 hours)

  1. Why Deep Networks
  2. Choice of the network size: example
  3. Laboratory experience

 

6. Bayesian Networks (7 hours)

  1. Structure - 3 hours
  2. Inference - 2 hours
  3. Applications - 2 hours

 

7. Boltzmann Machines (8 hours)

  1. Hidden Markov Networks - 1 hour
  2. Unit state probability. Proof of the sigmoidal pdf. Equilibrium state. - 2 hours
  3. Restricted Boltzmann Machines (part I: definition, energy function, conditional independence). - 2 hours
  4. Restricted Boltzmann Machines (part II: marginal probabilities, visible units distribution, interpretation, sigmoidal function with proof, the gradient of the log-likelihood). - 3 hours
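
For orientation, the standard relations behind the Restricted Boltzmann Machine items above can be summarised as follows. This is a sketch assuming binary visible units $v_i$ and hidden units $h_j$; the symbols $a_i$, $b_j$ (biases), $W_{ij}$ (weights), and $Z$ (partition function) are notation chosen here and may differ from the notation used in the lectures.

```latex
% Energy function and joint distribution
E(\mathbf{v}, \mathbf{h}) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j ,
\qquad
p(\mathbf{v}, \mathbf{h}) = \frac{e^{-E(\mathbf{v}, \mathbf{h})}}{Z}

% Conditional independence gives sigmoidal unit probabilities
p(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_i v_i W_{ij}\Big),
\qquad
p(v_i = 1 \mid \mathbf{h}) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}

% Gradient of the log-likelihood with respect to a weight
\frac{\partial \log p(\mathbf{v})}{\partial W_{ij}}
  = v_i \, p(h_j = 1 \mid \mathbf{v})
  - \mathbb{E}_{p(\mathbf{v}', \mathbf{h}')}\big[ v'_i h'_j \big]
```

The first gradient term depends only on the observed data, while the second is an expectation under the model distribution.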

 

Readings/Bibliography

Suggested books:

  1. Goodfellow et al., "Deep Learning", The MIT Press, 2016, www.deeplearningbook.org
  2. S. Theodoridis, K. Koutroumbas, "Pattern Recognition", Elsevier AP, 2009
  3. L. E. Sucar, "Probabilistic Graphical Models", Springer, 2015

Teaching methods

Traditional lectures and experimental activity ("make your own neural network") on the teacher's and students' laptops.

Assessment methods

Oral exam at the end of the course.

Students must reach a passing level in each of the main topics.

Teaching tools

Blackboard, slides, laptop, DataCamp.

Office hours

See the website of Flavio Zabini