# 69164 - Statistics and Data Analysis

• Teacher: Luigi Ragni
• Credits: 5
• SSD: AGR/09
• Language: Italian
• Campus: Cesena
• Course: Second cycle degree programme (LM) in Food Science and Technology (cod. 8531)
• from Sep 18, 2023 to Dec 11, 2023

## Learning outcomes

At the end of the course, the student should know the criteria of inductive statistics that underlie the methods for treating experimental data, and should be able to apply them to problems in quality control and the testing process.

## Course contents

STATISTICS

A) Review and in-depth topics

• Types of variables (qualitative and quantitative), data (discrete and continuous), scales (nominal, ordinal, interval, ratio)

• Central tendency or position parameters: mean (arithmetic, weighted, geometric, harmonic), mode, and median

• Data classification by frequency distribution; determination of the optimal number of classes (Sturges and Scott methods)

• Dispersion or variability parameters: range, percentiles, deviations from the mean (mean absolute deviation, deviance, variance, degrees of freedom, standard deviation, coefficient of variation)

• Shape indices: skewness and kurtosis (moments, Pearson and Fisher indices)

• Data representation, graphical and tabular

• Precision and accuracy of a measurement

• Regression: independent and dependent variables, simple and multiple linear regression, codeviance, regression equation, significance of the slope with Fisher's F and Student's t, significance of the intercept with Student's t; indices of the predictive power of the regression (coefficient of determination, adjusted coefficient of determination, standard error, PRESS); analysis of residuals

• Correlation (correlated variables, coefficient of correlation, covariance, relation between regression and correlation, significance of the correlation coefficient)
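
As an illustration of the Sturges and Scott methods named in section A, the following is a minimal sketch rather than course material. It assumes the common textbook forms of the two rules (Sturges: k = 1 + log2(n); Scott: class width h = 3.49 · s · n^(-1/3)); the function names and the sample values are hypothetical, invented for the example.

```python
import math
import statistics


def sturges_classes(n):
    """Number of classes by Sturges' rule, k = 1 + log2(n), rounded up."""
    return math.ceil(1 + math.log2(n))


def scott_classes(data):
    """Number of classes implied by Scott's class width h = 3.49 * s * n^(-1/3)."""
    n = len(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    width = 3.49 * s * n ** (-1 / 3)
    # Divide the observed range by the class width and round up
    return math.ceil((max(data) - min(data)) / width)


# Hypothetical measurements, invented for illustration only
sample = [4.1, 4.3, 4.4, 4.6, 4.6, 4.8, 4.9, 5.0,
          5.1, 5.1, 5.2, 5.3, 5.5, 5.6, 5.8, 6.0]

print("Sturges:", sturges_classes(len(sample)))
print("Scott:", scott_classes(sample))
```

Sturges' rule depends only on the number of observations, while Scott's rule also uses their standard deviation, so the two can suggest different numbers of classes for the same data.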

B) Probability and distributions

• Definitions of probability: classical, statistical, subjective

• Basic combinatorics: simple permutations, arrangements, and combinations

• Distributions: the discrete binomial distribution (applications, properties), the continuous normal distribution (probability density function, standardization, moments and shape indices, properties)

• Sampling distributions for inference (chi-square, Student's t, Fisher's F): probability density functions, properties

C) Analysis of frequencies with the chi-square test: comparison between observed and expected distributions, null and alternative hypotheses, range of applicability, critical values of the chi-square distribution (tables)
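
The comparison between observed and expected frequencies in section C can be sketched as follows. This is only an illustrative example: the counts are hypothetical, and the critical value 7.815 is the standard tabulated chi-square value for 3 degrees of freedom at a 0.05 significance level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts in four classes, invented for illustration only
observed = [18, 22, 27, 33]
# Expected counts under the null hypothesis of equal frequencies
expected = [25, 25, 25, 25]

# Tabulated critical value: alpha = 0.05, df = 4 classes - 1 = 3
CRITICAL_005_DF3 = 7.815

chi2 = chi_square_statistic(observed, expected)
reject_null = chi2 > CRITICAL_005_DF3

print(f"chi-square = {chi2:.2f}, reject null hypothesis: {reject_null}")
```

The null hypothesis is rejected only when the computed statistic exceeds the tabulated critical value for the chosen significance level and degrees of freedom.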

D) Significance levels and probability: type I and type II errors, power of a statistical test (significance level, verification of the significance of a difference, variability, one-sided or two-sided hypotheses, sample size, parametric or non-parametric tests), confidence or fiducial intervals

E) Inferences on means

• Student's t: comparison between observed and expected means, between an observation and the sample mean, and between the means of two independent or dependent samples; one-sided or two-sided tests; critical values of Student's t (tables)

• Analysis of variance with one or more classification criteria: the F-test, critical values of F (tables), "between" and "within" variance and deviance, the errors, "a priori" and "a posteriori" comparisons (LSD, Tukey)

• Non-parametric tests (Kruskal-Wallis and Mann-Whitney U, tables of critical values)

• Applicability conditions and validity of the inferential methods (characteristics of the sampling distribution, F-test, Levene test, homogeneity of the variance, characteristics of errors)

F) Introduction to multivariate statistics

• Principal component analysis (PCA)
• Partial least squares regression (PLSR)

DATA PROCESSING

A) Review and in-depth topics

• Excel for Windows: commands, syntax, statistical formulas, math and trigonometry functions, logical functions, filters, the "Data Analysis" tool, macros and the Visual Basic language, creating reports

B) Artificial neural networks: theory (the artificial neuron, transfer functions, architecture, topology, learning, back-propagation); software examples; applications in the food industry

C) Image analysis: acquisition, image processing, measurable parameters, the ImageJ software

D) Bibliographic resources on the Internet

## Electronic resources and books

Lecture notes for the course and completed exercises (downloadable from the exchange folder and from "Virtuale")

Reference material:

http://www.dsa.unipr.it/soliani/soliani.html

http://office.microsoft.com/it-it/support/guida-introduttiva-a-microsoft-office-2010-FX100996114.aspx

Murray R. Spiegel, Statistica, McGraw-Hill, 1994

Mario Castino, Ezio Roletto, Statistica Applicata, Piccin, 1991

Giancarlo Bettuzzi, Strumenti per l’indagine statistica, Volume primo: Statistica descrittiva univariata, Clueb, 1995

D. Costantini, G. M. Giorgi, A. Herzel, P. Monari, I. Scardovi, Metodi Statistici per le scienze economiche e sociali, Monduzzi editore, 1994

Manuale di Statistica, Edizioni Simone, 2010

## Teaching methods

Lectures on theoretical aspects in the classroom and practical exercises in the computer room.

Given the teaching methods of this course unit, all students must attend Modules 1 and 2 on Health and Safety online.

## Assessment methods

The final exam is a computer-based multiple-choice quiz with a single correct answer per question (12 questions, duration of the test: 95 min).

See "Virtuale" for details on the exam method.

The exam is designed to assess the student's learning of all the topics covered in class. In particular, the following aspects are evaluated:
1) knowledge of the theoretical fundamentals of statistical tests;
2) the ability to choose a statistical test critically;
3) technical knowledge of the computer tools for data processing and analysis.
During the course, the teacher can also gauge the students' attention and critical participation in lectures and exercises by seeking, although it is not required, their involvement in analysis and discussion. This helps adapt the teaching to the students' responses; the outcome of this interaction does not prejudice the final evaluation.

Number of exam dates in each session: two.

## Teaching tools

Personal computers connected to the Internet, used in the classroom by the teacher. Blackboard. Audio system. Video projector. Library. Computer room.

## Office hours

See the website of Luigi Ragni