79191 - ANALYSIS OF DATA

Academic Year 2024/2025

  • Teacher: Marco Novelli
  • Credits: 4
  • Teaching language: English
  • Modules: Marco Novelli (Module 1); Marco Novelli (Module 2)
  • Teaching mode: Traditional, in-person lectures (Module 1); Traditional, in-person lectures (Module 2)
  • Campus: Bologna
  • Degree programme: First cycle degree (Laurea) in Scienze statistiche (code 8873)

Learning outcomes

The course gives students further experience of analysing data in a wide variety of contexts, using the R software environment. By the end of the course, students will be able to:

  • implement simple statistical techniques, such as the normal linear model, in R;
  • interpret the results of statistical procedures and draw appropriate conclusions;
  • develop and implement an appropriate modelling approach to answer questions of interest about a given data set;
  • critically assess the quality of a statistical analysis conducted by someone else;
  • write up the results of a statistical analysis concisely in the form of a report.

Course contents

This is not an introductory course on R: students without the required background in basic programming are expected to fill the gap before preparing for the examination.

1) Univariate, bivariate descriptive statistics and graphics with R.

2) Probability distributions in R: random sampling, discrete and continuous distributions, densities, cumulative distribution functions and quantiles.

3) Point estimation, confidence intervals, hypothesis testing and the power function in R.

4) Writing functions and advanced data handling routines.

5) Linear regression applications in R.

6) R Markdown and ggplot2.
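
As a rough illustration of the kind of R code that topics 2, 3 and 5 refer to, the sketch below uses only base R and simulated data. It is not taken from the course material: the variable names, sample sizes and parameter values are arbitrary assumptions chosen for the example.

```r
# Illustrative sketch only: simulated data, not course material.
set.seed(1)

# Topic 2: the d/p/q/r prefixes, shown for the normal distribution
dens  <- dnorm(0)                       # density at 0
prob  <- pnorm(1.96)                    # P(Z <= 1.96)
quant <- qnorm(0.975)                   # 97.5% quantile
x     <- rnorm(50, mean = 10, sd = 2)   # random sample of size 50

# Topic 3: t-based confidence interval and test, plus power of the test
tt <- t.test(x, mu = 10, conf.level = 0.95)
ci <- tt$conf.int                       # lower and upper confidence limits
pw <- power.t.test(n = 50, delta = 1, sd = 2,
                   sig.level = 0.05, type = "one.sample")

# Topic 5: simple linear regression on simulated data
dat   <- data.frame(u = runif(50))
dat$y <- 1 + 2 * dat$u + rnorm(50, sd = 0.5)
fit   <- lm(y ~ u, data = dat)
summary(fit)                            # coefficient table and fit statistics
```

The same prefix convention (d, p, q, r) applies to the other distributions shipped with R, for example `dbinom()`, `pbinom()`, `qbinom()` and `rbinom()` for the binomial.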

Students without the required background knowledge in statistics, probability and/or basic programming are expected to fill these gaps before preparing for the examination.

Required background knowledge in informatics/programming in R

  • Basics of programming: definition and design of an algorithm, data types.
  • Structured programming: sequence, iteration, selection; procedures and functions.
  • Generation and usage of random values.
  • Design of programs. Implementation and test of programs with the R language.

Required background knowledge in statistics

  • Empirical frequency distributions.
  • Measures of location (mode, median, arithmetic mean).
  • Measures of dispersion, linear correlation and regression.
  • Fundamentals of parametric estimation and hypothesis testing.
  • Statistical tables for the standard normal, chi-squared and Student's t distributions.

Required background knowledge in mathematics

  • Rules for product and summation notation. Factorial, binomial coefficient and their properties.
  • Real functions, limits, derivatives and integrals.

Required background knowledge in probability

  • Random experiments and their sample spaces. Simple, compound and disjoint events. Impossible and certain events. Events obtained by intersection, union and negation.
  • Definitions and axioms of probability. Conditional probability. Independent events. The law of total probability. Bayes' theorem.
  • Random variables. Rules for computing probabilities for any random variable. The distribution function of a random variable. Probability mass function. Probability density function.
  • Sequences of random variables. Limit theorems and convergence.

Readings/Bibliography

The teacher's lecture notes and scripts are available to all enrolled students on the University of Bologna's "Virtual learning environment" platform (https://virtuale.unibo.it/). Students access the platform with their username and password.

Background information can be found in several chapters of the following books:

P. Dalgaard (2008) Introductory Statistics with R, 2nd ed. New York: Springer.

J.D. Long, P. Teetor (2019) R Cookbook, 2nd ed. Freely available at: https://rc2e.com/

Teaching methods

All lectures are held in the computer lab, where several applications are developed using R.

Although attending lessons is not mandatory, it is strongly recommended.

Attending lectures is the first and easiest way to start learning, and taking an active part in all teaching activities is crucial for all students. For this reason, lessons are not recorded.

Given the type of activity and the teaching methods adopted, attending this course requires all students to have first completed, in e-learning mode, training modules 1 and 2 on safety in study places (https://elearning-sicurezza.unibo.it/).

Assessment methods

Exams can only be taken in the official exam sessions.

For all students (whether or not they have attended lectures), the exam consists of a practical test in the computer lab lasting 60 minutes. The final exam aims to assess the achievement of the following learning objectives:

- to know the main features of R

- to write functions, routines and algorithms

- to be able to perform statistical applications in R

Students are not permitted to use textbooks, personal notes, mobile phones, smart watches or any similar electronic data storage or communication device. The test consists of 3-4 exercises, each divided into several parts, and is graded pass/fail.

Further useful information about the exams:

  • To take the exam, students must register for it through the Almaesami platform.
  • During the exam students must use the computers in the lab; personal laptops are not allowed.
  • An identity card (or the UNIBO student card) is required to take part in the exam.

Teaching tools

Computer and R scripts.

Office hours

See the website of Marco Novelli