- Teacher: Maximiliano Sioli
- Credits: 6
- SSD: FIS/01
- Language: English
- Modules: Maximiliano Sioli (Module 1), Matteo Negrini (Module 2), Gabriele Sirri (Module 3)
- Teaching Mode: Traditional lectures (Module 1), Traditional lectures (Module 2), Traditional lectures (Module 3)
- Campus: Bologna
- Course: Second cycle degree programme (LM) in Physics (cod. 9245)
Also valid for Second cycle degree programme (LM) in Advanced Methods in Particle Physics (cod. 5810)
- from Sep 22, 2023 to Dec 07, 2023
- from Oct 13, 2023 to Dec 06, 2023
- from Nov 10, 2023 to Dec 15, 2023
Learning outcomes
At the end of the course the student will be acquainted with the main statistical concepts used in physics. After a review of the fundamentals of probability theory, parametric inferential statistics will be introduced, from point estimates and confidence intervals to hypothesis testing and goodness-of-fit. Each item will be addressed both in the Bayesian and frequentist approaches. Dedicated practical sessions will allow the student to become familiar with these conceptual tools by studying applications in nuclear and subnuclear physics.
Course contents
The course is structured as follows.
For all students:
- Module 1, theory (lecturer M. Sioli)
Only for Applied Physics Students:
- Module 2a, exercises and complements (lecturer C. Sala)
Only for Nuclear and Subnuclear Physics Students:
- Module 2b, exercises and complements (lecturer M. Negrini)
- Module 3b, laboratory (lecturer G. Sirri)
Program of Module 1:
Concept of probability: axiomatic, combinatorial, frequentist and subjective. Conditional probability. Statistical independence. Bayes' theorem.
Random variables and probability density functions.
Multivariate distributions. Marginal and conditional densities.
Functions of random variables. Distribution moments: expectation value, variance, covariance. Error propagation in the presence of correlated variables.
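As an illustration of error propagation with correlated variables, the following Python sketch applies the first-order formula sigma_f^2 = sum_ij (df/dx_i) V_ij (df/dx_j) to the ratio f = x / y. The function name propagate_variance and all numerical values are hypothetical, chosen only for this example and not taken from the course material.

```python
import math


def propagate_variance(gradient, covariance):
    """First-order variance of f: g^T V g, with g the gradient of f."""
    n = len(gradient)
    return sum(
        gradient[i] * covariance[i][j] * gradient[j]
        for i in range(n)
        for j in range(n)
    )


# Hypothetical measurements and uncertainties (illustrative values only)
x, y = 10.0, 5.0
sigma_x, sigma_y, rho = 0.3, 0.2, 0.5

cov_xy = rho * sigma_x * sigma_y
covariance = [[sigma_x**2, cov_xy], [cov_xy, sigma_y**2]]
diagonal = [[sigma_x**2, 0.0], [0.0, sigma_y**2]]

# f = x / y, so the gradient is (1/y, -x/y^2)
gradient = [1.0 / y, -x / y**2]

sigma_f = math.sqrt(propagate_variance(gradient, covariance))
sigma_f_uncorrelated = math.sqrt(propagate_variance(gradient, diagonal))

print(f"sigma_f with correlation:    {sigma_f:.4f}")
print(f"sigma_f without correlation: {sigma_f_uncorrelated:.4f}")
```

With a positive correlation, the uncertainty on the ratio comes out smaller than in the uncorrelated case, because the fluctuations of x and y partly cancel.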
Examples of probability distributions: Binomial, Multinomial, Poisson, Exponential, Normal (multivariate), Chi-square, Breit-Wigner, Landau.
Characteristic functions and their applications. Central Limit Theorem.
Statistical inference. Fisher information. Test statistics and sufficient test statistics.
Monte Carlo method: convergence criteria, law of large numbers, calculation of integrals and their uncertainties. Variance reduction. Random number generators. Sampling a generic distribution.
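A minimal Python sketch of Monte Carlo integration with its statistical uncertainty is given here as an illustration. It assumes plain uniform sampling (no variance reduction); the function name mc_integrate, the integrand, the seed and the sample size are hypothetical choices for this example.

```python
import math
import random


def mc_integrate(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] from n uniform samples.

    Returns (estimate, standard_error).
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        value = f(rng.uniform(a, b))
        total += value
        total_sq += value * value
    mean = total / n
    # Unbiased sample variance of f(x) over the sampled points
    variance = (total_sq - n * mean * mean) / (n - 1)
    width = b - a
    # The uncertainty on the estimate scales as 1 / sqrt(n)
    return width * mean, width * math.sqrt(variance / n)


rng = random.Random(12345)  # fixed seed for reproducibility
estimate, error = mc_integrate(math.sin, 0.0, math.pi, 100_000, rng)
print(f"integral = {estimate:.4f} +/- {error:.4f} (exact value: 2)")
```

The standard error decreases as 1/sqrt(n), in line with the law of large numbers and the Central Limit Theorem.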
Generalities on statistical estimators. Test statistics and estimators. Estimators for the expectation value, variance and correlation. Variance of the estimators. The maximum likelihood method. Score and Fisher information. Multi-parametric estimator uncertainties with correlations. Extended Maximum Likelihood. Bayesian estimators, Jeffreys priors. Least squares method.
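To make the maximum likelihood method and the role of the Fisher information concrete, here is a short Python sketch for an exponential distribution with mean tau. For this model the estimator is the sample mean, and the Fisher information I(tau) = n / tau^2 gives an asymptotic uncertainty of tau / sqrt(n). The function name fit_exponential and the simulated data set are hypothetical.

```python
import math
import random


def fit_exponential(samples):
    """Maximum likelihood fit of the mean tau of an exponential pdf.

    Returns (tau_hat, sigma), where sigma is the asymptotic uncertainty
    derived from the Fisher information evaluated at tau_hat.
    """
    n = len(samples)
    tau_hat = sum(samples) / n  # solves d(log L)/d(tau) = 0
    sigma = tau_hat / math.sqrt(n)  # 1 / sqrt(I(tau_hat)), I = n / tau^2
    return tau_hat, sigma


rng = random.Random(2023)  # fixed seed for reproducibility
true_tau = 2.0  # hypothetical true value used to simulate the data
data = [rng.expovariate(1.0 / true_tau) for _ in range(10_000)]

tau_hat, sigma_tau = fit_exponential(data)
print(f"tau = {tau_hat:.3f} +/- {sigma_tau:.3f} (true value: {true_tau})")
```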
Hypothesis testing. Simple hypotheses. Efficiency and power of the test. Neyman-Pearson lemma. Linear test, Fisher's discriminant. Multivariate methods: Neural Networks, Boosted Decision Trees, k-Nearest Neighbors. Statistical significance. P-values. Look-Elsewhere Effect. Chi-square method for hypothesis testing.
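As an illustration of p-values and statistical significance, the sketch below computes the p-value of a Poisson counting experiment under the background-only hypothesis and converts it to a one-sided Gaussian significance Z. The function names and the numbers (expected background of 3 events, 10 observed) are hypothetical and serve only as an example.

```python
import math
from statistics import NormalDist


def poisson_p_value(n_obs, background):
    """P(n >= n_obs) for a Poisson distribution with mean `background`."""
    cdf = 0.0
    term = math.exp(-background)  # P(n = 0)
    for k in range(n_obs):
        cdf += term  # accumulate P(n = k) for k = 0 .. n_obs - 1
        term *= background / (k + 1)  # P(n = k + 1) from P(n = k)
    return 1.0 - cdf


def significance(p_value):
    """One-sided Gaussian significance Z corresponding to a p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)


background = 3.0  # hypothetical expected background
n_obs = 10  # hypothetical observed count

p_value = poisson_p_value(n_obs, background)
z_value = significance(p_value)
print(f"p-value = {p_value:.2e}, Z = {z_value:.2f} sigma")
```

This local significance does not account for the Look-Elsewhere Effect, which requires a separate correction when several regions are searched.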
Exact methods for the construction of confidence intervals. Gauss and Poisson case. Unified approach. Bayesian method. CLs method. Systematic errors and nuisance parameters in the calculation of confidence intervals. Frequentist and Bayesian methods. Asymptotic properties.
Program of Module 2a:
Introduction to R and RStudio.
Generation of random variables and probability distributions. Law of large numbers. Central limit theorem.
Hypothesis testing. Student's t-test. Fisher's F-test. P-value: statistical significance and power.
Maximum Likelihood Estimation. Linear regression. Correlation. Analysis of Variance (ANOVA). Generalized linear models.
Multivariate linear regression. Multicollinearity. Lasso and Ridge penalizations.
Program of Module 2b:
Exercises and complements.
Program of Module 3b:
Lab: Elements of C++ and ROOT. RooFit Workspace, Factory, composite models, multi-dimensional models. Use of RooStats to compute confidence intervals: Profile Likelihood, Feldman-Cousins, Bayesian intervals, with and without nuisance parameters. Use of TMVA as a classifier, description of TMVAGui.
Readings/Bibliography
- Frederick James, Statistical Methods in Experimental Physics, World Scientific, 2007
Bibliography for Module 2a:
- John Maindonald and W. John Braun, Data Analysis and Graphics Using R: An Example-Based Approach, Cambridge University Press, 2003
- Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, 2013
Bibliography for Module 2b and Module 3b:
- Glen Cowan, Statistical Data Analysis, Oxford Univ. Press, 1998
- O. Behnke et al., Data Analysis in High Energy Physics: A Practical Guide to Statistical Methods, Wiley, 2013
- A. G. Frodesen, O. Skjeggestad, H. Toft, Probability and Statistics in Particle Physics, Universitetsforlaget, 1979
- G. D'Agostini, Bayesian reasoning in data analysis - A critical introduction, World Scientific Publishing, 2003
Teaching methods
Lectures and laboratory sessions in which statistical tools are used to solve practical problems.
All students attending Modules 2a and 3b of the course must complete Modules 1 and 2 of the online Health and Safety course [https://www.unibo.it/en/services-and-opportunities/health-and-assistance/health-and-safety/online-course-on-health-and-safety-in-study-and-internship-areas].
Assessment methods
The assessment method is a written exam (two hours long) with:
1. a theory question
2. an exercise
3. a question on the Lab part, where you will be asked to comment on a block of code
Some parts of the written exam may differ depending on the channel chosen (Module 2a or Modules 2b+3b).
To obtain 30/30 cum laude, you must score 30/30 in the written exam and take an additional oral exam.
Note that only students who have completed and submitted the compulsory laboratory exercises are admitted to the written exam (even though the exercises do not count towards the final grade).
Teaching tools
Lecture notes are available on Virtuale. In case of problems, write an email to the respective lecturer.
Office hours
See the website of Maximiliano Sioli
See the website of Matteo Negrini
See the website of Gabriele Sirri