B0925 - DATA SCIENCE APPLICATIONS

Academic Year 2023/2024

  • Teacher: Cinzia Viroli
  • Credits: 6
  • SSD: SECS-S/05
  • Language: English
  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Degree programme: Second cycle degree programme (LM) in Statistical Sciences (cod. 9222)

Learning outcomes

By the end of the course the student will have developed advanced expertise in analysing complex real-world data. In particular, the student will be able to:

  • identify and apply appropriate statistical techniques to real application problems;
  • implement the various stages of an advanced statistical analysis;
  • work in a team to develop a data analysis project;
  • present the results of analyses in a short talk and/or poster and demonstrate effective communication skills.

Course contents

During this course the students will work on data science applications related to specific disciplines such as biology, epidemiology, genetics, engineering, finance, and the social sciences.

The case studies will be analyzed in R using a range of statistical methods, such as linear and mixed models, time series analysis, Bayesian methods, and methods for handling missing data.

Specifically, the course outline is as follows:

  • Collection: Gathering raw data from diverse sources, which can include databases, sensors, social media, and more.
  • Preprocessing and Exploration (EDA): Correcting inaccuracies, handling missing values, and treating outliers to prepare the data for analysis. Using visualization techniques to understand underlying patterns, trends, and anomalies in the data.
  • Modeling: Applying statistical models and machine learning algorithms to the data to test hypotheses or make predictions. This stage involves selecting models, training them with datasets, and evaluating their performance.
  • Interpretation: Translating the outcomes of data models into insights that are meaningful and actionable. This involves understanding the significance of the data analysis results in the context of the research questions or business problems being addressed.
  • Application: Applying the model in real-world settings to make informed decisions and predictions.
  • Simulation
  • Implementing algorithms
  • Large data and efficiency
  • Software design, development, and testing

Teaching methods

To be posted on Virtuale.

Assessment methods

The course is assessed through the reports that the groups write and submit on the case studies during the course. Students who cannot give a presentation during the course will be examined orally on individually agreed dates. Missing reports must be submitted individually (without group work) at later deadlines.

The course is not graded numerically; the outcome is recorded as "idoneo/non idoneo" (pass/fail).

Teaching tools

Supporting material to be posted on Virtuale.

The course makes use of the R statistical system. Basic knowledge of R is required.

Office hours

See the website of Cinzia Viroli