92991 - Learning and Estimation of Dynamical Systems M

Course Unit Page

Academic Year 2021/2022

Learning outcomes

The course provides students with the main data-driven approaches for learning mathematical models of dynamic systems. The learned models can then be used in automation, control and systems engineering applications. The course covers system identification and machine learning techniques such as linear regression, logistic regression, the prediction error method, instrumental variables, maximum likelihood and support vector machines. The basics of estimation theory are also covered, and the Kalman filter is presented as a tool for estimating the state of a dynamic system from input-output data. At the end of the course, students are able to apply the main system identification and machine learning algorithms to solve application problems and to evaluate the quality of the learned models.

Course contents

Introduction
Systems and models. Mathematical models. Physical modeling and black-box (data-driven) modeling. System identification and machine learning. Types of learning. Regression and classification. Learning from data flowchart.

Brief review of stochastic processes

Stationary processes and their properties. Ergodic processes. Sample estimates of first and second order moments.

Stochastic models
Static models and dynamic models. Stationary processes and linear filters. Modeling disturbances. Equation error models.

The estimation problem
Definition of the estimation problem. Parameter estimation and model complexity estimation. Estimation problems as optimization problems. Estimator properties.

Linear regression
The linear regression form. The least squares (LS) method and its properties. Geometrical interpretations. Least squares identification of dynamic equation error models (FIR, ARX and AR). Identifiability properties of LS estimates: persistency of excitation (PE) of input signals. Optimal prediction and least squares. Recursive least squares identification. The role of regularization in learning from data. Ridge regression.
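
As an illustration of least squares identification of an equation error model, the following is a minimal sketch and not part of the official course material. It assumes a first-order ARX model y(t) = -a y(t-1) + b u(t-1) + e(t) driven by a white-noise input (which is persistently exciting); the function names, parameter values and use of NumPy are choices made for this example only.

```python
import numpy as np

def simulate_arx(a, b, n, noise_std, rng):
    """Simulate y(t) = -a*y(t-1) + b*u(t-1) + e(t) with a white-noise input."""
    u = rng.standard_normal(n)
    e = noise_std * rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = -a * y[t - 1] + b * u[t - 1] + e[t]
    return u, y

def ls_arx1(u, y):
    """Least squares estimate of theta = [a, b] for the first-order ARX model."""
    # Each row of Phi is the regressor [-y(t-1), u(t-1)] for t = 1, ..., n-1.
    Phi = np.column_stack([-y[:-1], u[:-1]])
    Y = y[1:]
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    return theta

# Illustrative (assumed) true parameters: a = -0.7, b = 1.5.
rng = np.random.default_rng(0)
u, y = simulate_arx(a=-0.7, b=1.5, n=5000, noise_std=0.1, rng=rng)
a_hat, b_hat = ls_arx1(u, y)
print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")
```

With a long data record and low noise, the estimates should land close to the values used in the simulation.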

Statistical hypothesis testing

Binary tests: the null and the alternative hypotheses. Type I and type II errors. Size and power of the test. Tests of Gaussian distribution and chi-square distribution.

Estimation of the model complexity

Underfitting and overfitting. Training set and validation set. The F-test. FPE, AIC and MDL criteria.

Model assessment (validation)

Whiteness tests on residuals. Tests of uncorrelatedness between input and residuals. Cross-validation.

Prediction error method
Nonlinear regression and the prediction error method (PEM). Optimal one-step-ahead prediction. Minimizing the loss function: the Newton-Raphson algorithm and the Gauss-Newton algorithm. PEM identification of ARMAX, ARMA and ARARX models.

Maximum likelihood
The likelihood function. Maximum likelihood (ML) estimation. Equivalence between ML and PEM in the Gaussian case. The Cramér-Rao lower bound.

Classification problems: probabilistic models

Classification problems and decision boundaries. The Bayes classifier. Logistic regression. Solution of the two-class classification problem by using maximum likelihood and the Newton-Raphson method. The gradient descent method. Multiclass problems: the one-vs-all approach. Dealing with nonlinear decision boundaries. Linear discriminant analysis: solution of the two-class and multiclass classification problems. Quadratic discriminant analysis.

Classification problems: deterministic models

Solving constrained optimization problems: the method of Lagrange multipliers. Equality and inequality constraints. The Karush-Kuhn-Tucker conditions. Separating hyperplanes. The perceptron algorithm. The maximal margin classifier. Support vectors. The soft margin classifier: linear support vector machine. Dealing with nonlinear boundaries: the kernel trick. Support vector machine. Multiclass problems: one-vs-all and one-vs-one approaches.

Optimal estimation of signals
The fundamental theorem of estimation theory. Stochastic state space models. Optimal estimator and its properties. Kalman filter: predictor-corrector form. The Kalman predictor and the difference Riccati equation. The steady-state Kalman predictor.
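
As an illustration of the predictor-corrector form of the Kalman filter, the following is a minimal sketch and not part of the official course material. It assumes a linear time-invariant state space model x(k+1) = A x(k) + w(k), y(k) = C x(k) + v(k) with uncorrelated white noises of covariance Q and R; the scalar example system, the function name and the use of NumPy are assumptions made for this example only.

```python
import numpy as np

def kalman_filter(y, A, C, Q, R, x0, P0):
    """Kalman filter in predictor-corrector form.

    Returns the filtered state estimates and the final prediction covariance.
    """
    x_pred, P_pred = x0, P0
    estimates = []
    for yk in y:
        # Corrector: update the prediction with the new measurement.
        S = C @ P_pred @ C.T + R
        K = P_pred @ C.T @ np.linalg.inv(S)
        x_filt = x_pred + K @ (yk - C @ x_pred)
        P_filt = (np.eye(len(x0)) - K @ C) @ P_pred
        estimates.append(x_filt)
        # Predictor: propagate estimate and covariance one step ahead.
        x_pred = A @ x_filt
        P_pred = A @ P_filt @ A.T + Q
    return np.array(estimates), P_pred

# Illustrative (assumed) scalar system.
rng = np.random.default_rng(1)
A = np.array([[0.9]])
C = np.array([[1.0]])
Q = np.array([[0.1]])
R = np.array([[1.0]])
n = 2000
x = np.zeros((n, 1))
y = np.zeros((n, 1))
for k in range(n):
    if k > 0:
        x[k] = A @ x[k - 1] + np.sqrt(Q[0, 0]) * rng.standard_normal(1)
    y[k] = C @ x[k] + np.sqrt(R[0, 0]) * rng.standard_normal(1)

x_hat, P_inf = kalman_filter(y, A, C, Q, R, x0=np.zeros(1), P0=np.eye(1))
mse_filter = np.mean((x_hat - x) ** 2)
mse_measurement = np.mean((y - x) ** 2)
print(f"filter MSE = {mse_filter:.3f}, measurement MSE = {mse_measurement:.3f}")
```

After many steps, the prediction covariance returned by this sketch approaches a fixed point of the difference Riccati equation, which is the covariance used by the steady-state Kalman predictor.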

Readings/Bibliography

T. Söderström and P. Stoica, System Identification, Prentice Hall, Englewood Cliffs, N.J., 1989. This book is now out of print and can be downloaded here [http://user.it.uu.se/~ts/sysidbook.pdf].

R. Guidorzi, Multivariable System Identification: From Observations to Models, Bononia University Press, Bologna, 2003.

S. Bittanti, Model Identification and Data Analysis, John Wiley & Sons, 2019.

G. James, D. Witten, T. Hastie and R. Tibshirani, An Introduction to Statistical Learning, Springer, 2013. This book can be downloaded here [https://faculty.marshall.usc.edu/gareth-james/ISL/ISLR%20Seventh%20Printing.pdf].

Teaching methods

Traditional lectures.

Assessment methods

The final evaluation is based on a written exam and a project.

Teaching tools

Video projector, blackboard.

Office hours

See the website of Roberto Diversi