69703 - Statistical Analysis of Data in Nuclear and Subnuclear Physics

Academic Year 2013/2014

  • Modules: Maximiliano Sioli (Module 1), Tommaso Chiarusi (Module 2), Gabriele Sirri (Module 3)
  • Teaching Mode: Traditional lectures (Modules 1, 2 and 3)
  • Campus: Bologna
  • Course: Second cycle degree programme (LM) in Physics (code 8025)

Learning outcomes

By the end of the course, students will know the main statistical tools used in high-energy physics, both at accelerators and in non-accelerator experiments. The course is complemented by exercise and laboratory sessions.

Course contents

Concept of probability: axiomatic, combinatorial, frequentist and subjective. Conditional probability. Statistical independence. Bayes' theorem.

Random variables and probability density functions. Multivariate distributions. Marginal and conditional densities. Functions of random variables. Distribution moments: expectation value, variance, covariance. Error propagation in the presence of correlated variables. Examples of probability distributions: Binomial, Multinomial, Poisson, Exponential, Normal (multivariate), Chi-square, Breit-Wigner, Landau.
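As an illustration of error propagation with correlated variables, the following minimal sketch evaluates the first-order variance of a function f from its gradient and the covariance matrix of its inputs. The function f = x·y and all numerical values are assumptions chosen for the example; they are not taken from the course material.

```python
import math

def propagate(grad, cov):
    """First-order variance of f: sum over i, j of grad[i] * cov[i][j] * grad[j]."""
    n = len(grad)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))

# Illustrative inputs (assumed): f = x * y with correlated x and y.
x, y = 2.0, 3.0
sx, sy, rho = 0.1, 0.2, 0.5          # standard deviations and correlation
cov = [[sx**2, rho * sx * sy],
       [rho * sx * sy, sy**2]]       # covariance matrix of (x, y)
grad = [y, x]                        # df/dx = y, df/dy = x

var_f = propagate(grad, cov)
sigma_f = math.sqrt(var_f)
```

With these inputs the variance is 0.09 + 0.16 + 0.12 = 0.37; the last term is the contribution of the correlation and vanishes for rho = 0.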

Characteristic functions and their applications. Central Limit Theorem.

Monte Carlo method: convergence criteria, law of large numbers, calculation of integrals and their uncertainties. Random number generators. Sampling a generic distribution.
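A minimal sketch of Monte Carlo integration with its statistical uncertainty is given below. The integrand (sin x on [0, π], whose exact integral is 2), the sample size and the seed are assumptions chosen for the example; the function name mc_integrate is hypothetical.

```python
import math
import random

def mc_integrate(f, a, b, n, rng):
    """Estimate the integral of f on [a, b] from n uniform samples.

    Returns (estimate, standard_error), where the standard error comes
    from the sample variance of f and scales as 1 / sqrt(n).
    """
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        v = f(rng.uniform(a, b))
        total += v
        total_sq += v * v
    mean = total / n
    var = (total_sq / n - mean**2) * n / (n - 1)   # unbiased sample variance
    width = b - a
    return width * mean, width * math.sqrt(var / n)

rng = random.Random(12345)            # fixed seed for reproducibility
estimate, error = mc_integrate(math.sin, 0.0, math.pi, 100_000, rng)
```

The estimate should agree with the exact value within a few standard errors, and quadrupling n should roughly halve the error, in line with the law of large numbers and the Central Limit Theorem.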

Hypothesis testing. Simple hypotheses. Efficiency and power of a test. Neyman-Pearson lemma. Linear test statistics, Fisher's discriminant. Multivariate methods: Neural Networks, Boosted Decision Trees, k-Nearest Neighbors. Statistical significance. P-values. Look-Elsewhere Effect. Chi-square method for hypothesis testing.
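The relation between a p-value and the equivalent Gaussian significance can be sketched as follows. This assumes the one-sided convention, in which Z is the number of standard deviations whose upper tail probability equals p; the function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def p_value_from_z(z):
    """One-sided tail probability P(X > z) for a standard normal X."""
    return _STD_NORMAL.cdf(-z)

def z_from_p_value(p):
    """Significance Z such that P(X > Z) = p (one-sided convention)."""
    return -_STD_NORMAL.inv_cdf(p)

p_3sigma = p_value_from_z(3.0)   # about 1.35e-3
p_5sigma = p_value_from_z(5.0)   # about 2.9e-7
```

Using the lower tail (cdf(-z) rather than 1 - cdf(z)) avoids loss of precision for the very small p-values typical of discovery claims.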

General properties of statistical estimators. Test statistics and estimators. Estimators for the expectation value, variance and correlation. Variance of the estimators. The maximum likelihood method. Score and Fisher information. Uncertainties on multi-parameter estimates, including correlations. Extended Maximum Likelihood. Bayesian estimators, Jeffreys priors. Least squares method.

Exact methods for the construction of confidence intervals. Gaussian and Poisson cases. Unified approach. Bayesian method. CLs method. Systematic errors and nuisance parameters in the calculation of confidence intervals. Frequentist and Bayesian methods.
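As an example of an exact construction in the Poisson case, the sketch below computes the classical one-sided upper limit on a Poisson mean for a given number of observed events. It assumes no background and no nuisance parameters, and it solves P(N ≤ n_obs; s_up) = 1 − CL by bisection; the function names are hypothetical.

```python
import math

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson variable with mean mu."""
    term = math.exp(-mu)      # P(N = 0)
    total = term
    for k in range(1, n + 1):
        term *= mu / k        # P(N = k) from P(N = k - 1)
        total += term
    return total

def upper_limit(n_obs, cl=0.95, tol=1e-9):
    """Classical upper limit s_up with P(N <= n_obs; s_up) = 1 - cl."""
    lo, hi = 0.0, 1.0
    while poisson_cdf(n_obs, hi) > 1.0 - cl:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:                       # bisection: the CDF decreases with mu
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

limit_zero_events = upper_limit(0)   # equals -ln(0.05), about 3.0
```

For zero observed events the equation reduces to exp(−s_up) = 0.05, which gives the familiar 95% CL limit of about 3 events.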

Lab: Elements of C++ and ROOT. RooFit Workspace, Factory, composite models, multi-dimensional models. Use of RooStats to compute confidence intervals: Profile Likelihood, Feldman-Cousins and Bayesian intervals, with and without nuisance parameters. Use of TMVA for classification, description of TMVAGui.

Readings/Bibliography

  • Glen Cowan, Statistical Data Analysis, Oxford Univ. Press, 1998
  • Frederick James, Statistical Methods in Experimental Physics, World Scientific, 2007
  • G. D'Agostini, Bayesian Reasoning in Data Analysis: A Critical Introduction, World Scientific, 2003
  • B. P. Roe, Probability and Statistics in Experimental Physics, Springer, 1992

Teaching methods

Lectures, exercises and laboratory sessions in which statistical tools are used to solve practical problems.

Assessment methods

Oral examination. Students will be asked to tackle a typical HEP problem from both a theoretical and a practical point of view, referring also to the software tools presented in the Lab part of the course.

Links to further information

http://www.bo.infn.it/~sioli/asd.htm

Office hours

See the website of Maximiliano Sioli

See the website of Tommaso Chiarusi

See the website of Gabriele Sirri