86464 - Algorithms and Systems for Big Data Processing

Course Unit Page

Academic Year 2019/2020

Course contents

The two dimensions of "Big" in Big Data.

Data dimensionality

  • geometrical effects of high dimensionality and their consequences

Dimensionality reduction

  • multidimensional Gaussian vectors and their properties
  • dimensionality reduction by Johnson-Lindenstrauss
  • dimensionality reduction by SVD/PCA (relationship with Gaussian clustering) 
  • dimensionality reduction by sparse signal recovery/compressed sensing
  • other uses of SVD/eigenstructures: hub-authority ranking, the core idea of PageRank, document collection summaries
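
As an illustration of the Johnson-Lindenstrauss topic listed above, here is a minimal sketch of a Gaussian random projection. It is not taken from the course material: the function names, the dimensions (1000 down to 200) and the seeds are assumptions chosen only for the example. It shows that the distance between two points is approximately preserved after projection.

```python
import math
import random


def random_projection(points, k, seed=0):
    """Project d-dimensional points to k dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    d = len(points[0])
    # Entries are i.i.d. N(0, 1/k), so squared norms are preserved in expectation.
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(row[j] * p[j] for j in range(d)) for row in matrix] for p in points]


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical data: 5 random points in 1000 dimensions.
data_rng = random.Random(1)
points = [[data_rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(5)]
projected = random_projection(points, k=200)

# Ratio of projected to original distance; it should be close to 1.
ratio = distance(projected[0], projected[1]) / distance(points[0], points[1])
```

The target dimension k controls the distortion: a larger k gives a ratio closer to 1 at the cost of a larger projected representation.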

Interpolation

  • grid-data multilinear interpolation
  • grid-data piecewise-linear interpolation
  • scattered-data interpolation by radial-basis functions

Streaming algorithms

  • the streaming computation model
  • streaming random picks and multiplication of huge matrices
  • streaming estimation of features of the occurrence histogram
  • hashing for flattening of distributions
  • randomized computation: estimates instead of exact results
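
As an illustration of streaming random picks, here is a minimal sketch of reservoir sampling, one standard way to draw a uniform sample of k items from a stream of unknown length in a single pass with O(k) memory. It is not taken from the course material: the function name, the stream and the seed are assumptions made for the example.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return k items drawn uniformly at random from an iterable, in one pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream: the integers 0..9999, consumed as an iterator.
sample = reservoir_sample(iter(range(10_000)), k=5, seed=42)
```

If the stream holds fewer than k items, the function simply returns all of them.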

 

Teaching methods

Class teaching

Assessment methods

Oral examination

Office hours

See the website of Riccardo Rovatti

See the website of Luca Benini