97267 - Matrix Tensor Techniques for Data Science

Academic Year 2022/2023

                
Teacher: Margherita Porcelli
Credits: 6
SSD: MAT/08
Language: English
Modules: Margherita Porcelli (Module 1), Davide Palitta (Module 2)
Teaching Mode: Traditional lectures (Module 1), Traditional lectures (Module 2)
Campus: Bologna
Course: Second cycle degree programme (LM) in Mathematics (cod. 5827)

Teaching resources on Virtuale

Course Timetable: from Mar 29, 2023 to May 26, 2023
Course Timetable: from Feb 22, 2023 to Mar 24, 2023

Learning outcomes

At the end of the course, students have theoretical and computational knowledge on matrix and tensor techniques for analysing large amounts of data. In particular, students are able to examine large samples of discrete data and extract interpretable information of relevance in image and data processing, in medical and scientific applications, and in social and security sciences.

Course contents

The course presents fundamental matrix and tensor techniques commonly employed in Data Mining methods, together with optimization strategies designed to handle constrained problems typically arising in data science.

Mathematical tools will be developed for the following topics:

Compression and sparse approximation strategies (PCA, non-negative factorizations, OMP, etc.)
Dictionary learning
Matrix completion

Part 1 presents fundamental matrix and tensor techniques commonly employed in the analysis of large data sets in data science. These techniques also serve as preparatory material for Part 2, which covers optimization strategies for the analysis of big data.

Part 1

Vector and matrix norms (including sparsity-promoting norms)
Linear regression and Least squares
Eigenvalues, SVD, pseudoinverse
Reduction and low rank representation
Sparse representation with the l_0-norm
Dictionary Learning: the Orthogonal Matching Pursuit (OMP) algorithm
Principal Component Analysis (PCA) and Factor analysis
Tensors
Dealing with tensors and various representations
HOSVD, Tensor OMP, Dictionary Learning with tensors (Alternating Least-Squares)
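
As a small illustration of the eigenvalue and PCA topics listed in Part 1, the sketch below estimates the leading principal component of a toy data set by power iteration on the sample covariance matrix. It is written in plain Python rather than the Matlab/Octave used in the course labs. The function name `leading_principal_component`, the iteration count and the data are assumptions made for illustration only and are not taken from the course material.

```python
import math


def leading_principal_component(rows, iters=200):
    """Estimate the first principal direction of `rows` by power iteration.

    `rows` is a list of samples, each a list of d floats (hypothetical layout).
    Returns the unit direction and the variance explained along it.
    """
    n, d = len(rows), len(rows[0])
    # Centre the data: PCA works on deviations from the column means.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix C = X^T X / (n - 1).
    cov = [[sum(x[i] * x[j] for x in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply C and renormalise.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    # Rayleigh quotient v^T C v gives the variance along v.
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance


# Toy data lying close to the line y = x.
data = [[2.0, 1.9], [1.0, 1.1], [0.0, 0.1], [-1.0, -0.9], [-2.0, -2.2]]
direction, variance = leading_principal_component(data)
print(direction, variance)
```

For data this close to a line, the returned direction is nearly parallel to (1, 1) and carries almost all of the total variance. In practice an SVD routine would normally be preferred over an explicit covariance matrix for numerical stability.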

Part 2

Numerical optimization

Basic concepts of unconstrained optimization and optimality conditions.
Algorithms for unconstrained smooth optimization (first and second order methods, line-search strategies, convergence and complexity analysis)
Algorithms for convex constrained optimization (projected gradient methods, convergence and complexity analysis)
Algorithms for composite optimization (proximal gradient methods, convergence and complexity analysis)
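
To make the composite optimization item above more concrete, here is a minimal sketch of a proximal gradient iteration (often called ISTA) for the l_1-regularised least-squares problem min 0.5*||Ax - b||^2 + lam*||x||_1. It is plain Python rather than the Matlab/Octave used in the course. The names `soft_threshold` and `ista`, the fixed step size 1/L and the small test problem are assumptions chosen for illustration, not the course's own implementation.

```python
def soft_threshold(z, tau):
    """Proximal operator of tau * |.| applied to a scalar z."""
    if z > tau:
        return z - tau
    if z < -tau:
        return z + tau
    return 0.0


def ista(A, b, lam, step, iters=500):
    """Proximal gradient steps for 0.5*||Ax - b||^2 + lam*||x||_1.

    `step` should not exceed 1/L, where L is the largest eigenvalue of A^T A.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Gradient of the smooth part: A^T (A x - b).
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i]
                    for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m))
                for j in range(n)]
        # Gradient step followed by the prox of the l_1 term.
        x = [soft_threshold(x[j] - step * grad[j], step * lam)
             for j in range(n)]
    return x


# Diagonal test problem: the largest eigenvalue of A^T A is 4, so step = 1/4.
A = [[2.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
b = [3.0, 0.2, -1.5]
x = ista(A, b, lam=0.5, step=0.25)
print(x)
```

Because A is diagonal here, the minimiser can be checked by hand coordinate by coordinate: the small entry of b is thresholded to exactly zero, which is the sparsity-promoting effect of the l_1 term.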

Matrix and tensor optimization

The Matrix Completion problem (problem analysis and proximal gradient approaches)
Applications: recommender systems, data predictions
The Dictionary Learning problem (problem analysis, greedy algorithms and proximal gradient approaches)
Applications: object and text classification, face recognition

Computational experience in Matlab/Octave on realistic data will accompany the lectures.

Readings/Bibliography

Nocedal, Jorge, and Stephen Wright. Numerical optimization. Springer Science & Business Media, 2006.
Beck, Amir. First-order methods in optimization. Society for Industrial and Applied Mathematics, 2017.
Gillis, Nicolas. Nonnegative Matrix Factorization. SIAM, 2020.
Dumitrescu, Bogdan, and Paul Irofti. Dictionary learning algorithms and applications. Springer, 2018.

Other textbooks, recent scientific articles, and case studies from real-world applications will be made available during the course.

Teaching methods

Blackboard, slides and computer lab sessions.

Assessment methods

Oral presentation of a take-home project (with slides).

Teaching tools

Slides are made available as PDF files on the course webpage. The Matlab computational environment and various toolboxes are used.

Office hours

See the website of Margherita Porcelli

See the website of Davide Palitta

SDGs

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.