- Teacher: Domenico Di Sante
- Credits: 6
- SSD: FIS/07
- Language: Italian
- Teaching Mode: Traditional lectures
- Campus: Bologna
- Course: First cycle degree programme (L) in Materials Science (code 5940)
- From Sep 18, 2024 to Jan 15, 2025
Learning outcomes
By the end of the course, students understand the fundamental methods for analysing complex data. They are familiar with the main modern data-analysis algorithms and can implement them in a programming language. Laboratory activities let students carry out data analysis on a computer and apply the methods studied to test cases.
Course contents
Summary Syllabus:
- Part I: Introduction to Fundamental Methods of Scientific Computing (linear algebra, optimization, differential equations, integral calculus)
- Part II: Statistics (probability distributions, Bayesian statistics, linear models)
- Part III: Machine Learning
Detailed Syllabus:
Part I: Introduction to Fundamental Methods of Scientific Computing
- Introduction: General and administrative information.
- Linear Algebra:
  - Vector spaces, matrix operations
  - Eigenvalue problems
  - Singular value decomposition (SVD)
  - Other types of decomposition (LU, QR, Cholesky)
  - Systems of linear equations
- Optimization:
  - Derivatives, gradients, Jacobian, Hessian
  - Different types of optimization (local, global, convex, non-convex)
  - Optimization methods (first-order, second-order)
  - Stochastic Gradient Descent (SGD)
- Differential Equations:
  - First-order ordinary differential equations (Euler and Runge-Kutta methods)
  - Systems of ordinary differential equations
  - Partial differential equations
- Integral Calculus:
  - Numerical integration (midpoint, trapezoidal, and Simpson's rules)
  - Interpolatory quadrature formulas
  - Adaptive Simpson's rule
Part II: Statistics
- Univariate Probability Distributions:
  - Univariate models
  - Random variables
  - Bayes' rule
  - Bernoulli and binomial distributions
  - Multinomial distribution
  - Gaussian distribution
  - Other distributions (Student's t, Cauchy)
- Multivariate Probability Distributions:
  - Covariance and correlation
  - Multivariate Gaussian distribution
- Linear Models:
  - Binary (logistic) regression
  - Least-squares linear regression
  - Regularization (L1 and sparsity)
  - Splines
  - Generalized linear models
Part III: Machine Learning
- Structuring Data Without Neural Networks:
  - Dimensionality reduction
  - Principal Component Analysis (PCA)
  - Kernel PCA
  - Clustering algorithms (k-means)
- Supervised Learning:
  - Neural networks
  - Training and regularization
  - Convolutional Neural Networks (CNN)
  - Recurrent Neural Networks (RNN)
- Unsupervised Learning:
  - Maximum Likelihood Estimation (MLE)
  - Restricted Boltzmann Machine
  - (Variational) Autoencoders (VAE)
  - Generative Adversarial Networks (GANs)
Readings/Bibliography
Necessary Readings:
Lecture notes.
Suggested Readings:
- Calcolo scientifico. Esercizi e problemi risolti con MATLAB e Octave, A. Quarteroni et al., Springer Verlag (2017)
- Data-driven modeling & scientific computation: methods for complex systems & big data, J. N. Kutz, Oxford University Press (2013)
- Probabilistic Machine Learning: An Introduction, K. P. Murphy, The MIT Press (2022)
- Lecture Notes: Machine Learning for the Sciences, T. Neupert et al., arXiv:2102.04883v2 (2022)
- Modern applications of machine learning in quantum sciences, A. Dawid et al., arXiv:2204.04198 (2022)
Teaching methods
Blackboard lectures and hands-on sessions in the computer lab.
Given the type of activities and teaching methods adopted, all students must have completed modules 1 and 2 of the safety training for study environments, delivered via e-learning, before taking part in this course.
Assessment methods
The final evaluation consists of two components:
- Assessment of laboratory reports (30%)
- Oral exam on the course content (70%)
Teaching tools
Blackboard, projector, computer laboratory.
Office hours
See the website of Domenico Di Sante