97267 - Matrix Tensor Techniques for Data Science

Academic Year 2022/2023

  • Teacher: Margherita Porcelli
  • Credits: 6
  • SSD: MAT/08
  • Language: English
  • Modules: Margherita Porcelli (Module 1); Davide Palitta (Module 2)
  • Teaching Mode: Traditional lectures (Module 1); Traditional lectures (Module 2)
  • Campus: Bologna
  • Course: Second cycle degree programme (LM) in Mathematics (cod. 5827)

Learning outcomes

At the end of the course, students have theoretical and computational knowledge of matrix and tensor techniques for analysing large amounts of data. In particular, students are able to examine large samples of discrete data and extract interpretable information of relevance in image and data processing, in medical and scientific applications, and in social and security sciences.

Course contents

The course presents fundamental matrix and tensor techniques commonly employed in Data Mining methods, together with optimization strategies designed to handle constrained problems typically arising in data science.

Mathematical tools will be developed for the following topics:

  • Compression and sparse approximation strategies (PCA, non-negative factorizations, OMP, etc.)
  • Dictionary learning
  • Matrix completion

Part 1 presents fundamental matrix and tensor techniques commonly employed in the analysis of large data sets in data science. These also serve as preparatory material for Part 2, which covers optimization strategies for the analysis of big data.


Part 1

  • Vector and matrix norms (including sparsity promoting)
  • Linear regression and Least squares
  • Eigenvalues, SVD, pseudoinverse
  • Reduction and low rank representation
  • Sparse representation with l_0-norm
  • Dictionary Learning: the Orthogonal Matching Pursuit (OMP) algorithm
  • Principal Component Analysis (PCA) and Factor analysis
  • Tensors
  • Dealing with tensors and various representations
  • HOSVD, Tensor OMP, Dictionary Learning with tensors (Alternating Least-Squares)
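
As an illustration of the Orthogonal Matching Pursuit (OMP) item listed above, the following is a minimal sketch in Python with NumPy, not course material. The function name `omp`, the identity dictionary `D`, and the toy signal `y` are assumptions chosen for the example; the course itself uses Matlab/Octave.

```python
import numpy as np

def omp(D, y, k):
    """Greedy sparse coding: approximate y using at most k columns (atoms) of D."""
    support = []                       # indices of the atoms selected so far
    coeffs = np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # do not select the same atom twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected coefficients by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
        if np.linalg.norm(residual) < 1e-12:
            break
    x = np.zeros(D.shape[1])
    if support:
        x[support] = coeffs
    return x

# Toy data (hypothetical): with an orthonormal dictionary the 2-sparse
# signal is recovered exactly in two greedy steps.
D = np.eye(4)
y = np.array([0.0, 3.0, 0.0, -2.0])
x = omp(D, y, k=2)
```

The least-squares re-fit over the whole support at each step is what distinguishes OMP from plain matching pursuit, which only updates the newly selected coefficient.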

Part 2

Numerical optimization

  • Basic concepts of unconstrained optimization and optimality conditions.
  • Algorithms for unconstrained smooth optimization (first and second order methods, line-search strategies, convergence and complexity analysis)
  • Algorithms for convex constrained optimization (projected gradient methods, convergence and complexity analysis)
  • Algorithms for composite optimization (proximal gradient methods, convergence and complexity analysis)
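
To make the proximal gradient item above concrete, here is a minimal sketch in Python with NumPy, assuming the composite problem is the lasso, min 0.5*||Ax - b||^2 + lam*||x||_1. The function names, the fixed step size 1/L, and the toy data are assumptions for illustration only, not the course's implementation.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient_lasso(A, b, lam, n_iter=500):
    """Proximal gradient iterations for 0.5*||Ax - b||^2 + lam*||x||_1."""
    # The gradient of the smooth term is Lipschitz with constant ||A||_2^2.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)                          # gradient step on the smooth part
        x = soft_threshold(x - step * grad, step * lam)   # prox step on the l1 part
    return x

# Toy data (hypothetical): with A = I the minimizer is soft_threshold(b, lam).
A = np.eye(3)
b = np.array([2.0, 0.5, -3.0])
x = proximal_gradient_lasso(A, b, lam=1.0)
```

Replacing `soft_threshold` with a projection onto a convex set gives the projected gradient method from the preceding item, since a projection is the proximal operator of the set's indicator function.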

Matrix and tensor optimization

  • The Matrix Completion problem (problem analysis and proximal gradient approaches)
  • Applications: recommender systems, data predictions
  • The Dictionary Learning problem (problem analysis, greedy algorithms and proximal gradient approaches)
  • Applications: object and text classification, face recognition


Computational experience in Matlab/Octave on realistic data will accompany the lectures.

Readings/Bibliography

  • Nocedal, Jorge, and Stephen Wright. Numerical optimization. Springer Science & Business Media, 2006.
  • Beck, Amir. First-order methods in optimization. Society for Industrial and Applied Mathematics, 2017.
  • Gillis, Nicolas. Nonnegative Matrix Factorization. SIAM, 2020.
  • Dumitrescu, Bogdan, and Paul Irofti. Dictionary learning algorithms and applications. Springer, 2018.

Other textbooks, recent scientific articles and case studies from real world applications will be made available during the course.

Teaching methods

Blackboard, slides and computer lab sessions.

Assessment methods

Oral presentation of a take-home project (with slides).

Teaching tools

Slides are made available as PDF files on the course webpage. The Matlab computational environment and various toolboxes are used.

Office hours

See the website of Margherita Porcelli

See the website of Davide Palitta

SDGs

Quality education; Partnerships for the goals

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.