85189 - Systems and Algorithms for Data Science

Academic Year 2025/2026

  • Teacher: Stefano Lodi
  • Credits: 10
  • SSD: ING-INF/05
  • Language: English

Learning outcomes

By the end of the course, the student should:
  • understand the fundamentals of supervised and unsupervised machine learning algorithms, focusing on deep learning algorithms
  • understand the fundamental programming principles of the Python language and be able to apply them primarily to data management and analysis, under the umbrella of data science
  • understand the role, purpose and features of Python libraries for numerical computation, data representation, and machine learning, and their interconnectivity with frameworks such as Jupyter Notebook
  • be able to apply data science practices and methods to construct models and solve problems for various data-science applications.

Course contents

Module 1

The Python language: Expressions, tuples, lists, comprehensions, sets, dictionaries. Repetitive and branching instructions. The NumPy and pandas packages.

Supervised Machine Learning: Deep Neural Networks, convolutional and recurrent networks, LSTM, generative models (GAN, Autoencoder, Transformer).

Laboratory classes: Integrated development environments for Python

Module 2

The course focuses on the paradigm and core features of Python, as well as environments suited for data manipulation in the context of data science, such as PyTorch and TensorFlow. Emphasis is placed on exploring the Python libraries that support the use and definition of machine learning models in various domains, including text analysis, time series, and image analysis.

The practical part of the course involves the use of development tools and platforms such as Jupyter Notebook, Google Colab, Hugging Face, and GitLab, which make it possible to share and support data analysis work. The course also includes access to various datasets to illustrate the applicability of the material through real-world examples.

Part 1 – Neural Networks (from convolutional to recurrent)
Part 2 – Generative Models (from GANs and Autoencoders to Transformers)
Part 3 – Natural Language Processing (from the basics to topic modeling)
Part 4 – Large Language Models

Readings/Bibliography

Module 1

Course slides and exercises are available on Virtuale [http://virtuale.unibo.it/].

By subject:

Python

Recommended reading (on-line):

Parker, J. R. (2016). Python: An Introduction to Programming. Mercury Learning & Information. E-book, free to download using student institutional credentials and searchable at http://sba.unibo.it [http://sba.unibo.it/] | Online resources | E-books | Ricerca un e-book nel Catalogo A-Link

Machine Learning

Recommended reading (on-line):

Zhang, A., Lipton, Z. C., Li, M., & Smola, A. J. (2023). Dive into Deep Learning (No. arXiv:2106.11342). arXiv. http://arxiv.org/abs/2106.11342

Module 2

Online materials and other suggested readings will be indicated during the course.

 

Teaching methods

NOTE: As concerns the teaching methods of this course unit, all students must attend Modules 1 and 2 of the online course on Health and Safety [https://www.unibo.it/en/services-and-opportunities/health-and-assistance/health-and-safety/online-course-on-health-and-safety-in-study-and-internship-areas].

Module 1

The lessons of the course are divided into:
• lectures in a lecture room
• laboratory lessons, each comprising both lecture-style exposition and exercises on techniques for solving data analysis problems.

The topics of the course will be divided by lesson type:
• The theoretical and practical notions of machine learning are explained in lectures.
• In laboratory lessons, students implement scripts for machine learning using the Python programming language.

Module 2

  • Theoretical lessons in the classroom
  • Tutorials in the lab

During the classes the students will be guided in the implementation and practice of the presented concepts.

Assessment methods

Attendance does not contribute to the assessment in any way.

Module 1

The examination is composed of three parts.

Python programming
The student is given a digital text on Esami OnLine [http://eol.unibo.it/], containing the description of a simple analysis problem; the student must produce on Esami OnLine [http://eol.unibo.it/] a Python program solving the analysis problem.

  • Consulting books and bound notes is allowed.

Multiple choice test

The student is given a collection of sentences, each of which has 3 possible completions, of which only one is correct. The test is performed entirely on Esami OnLine [http://eol.unibo.it/].

  • Consulting any material is not allowed.

Oral examination
The student must answer three questions, which may concern any part of the contents of the course. In particular, the student must show: mastery of the theoretical notions of the discipline and of the logical, set-theoretic, and mathematical formalism employed in it; knowledge of the elements of the machine learning techniques that were presented during lessons and implemented in the tools used during lessons, and the ability to use such tools; knowledge of the Python language.

Computation of the grade of the module and validity of the parts

The grades of all parts are contained in the interval from 0 to 30, including the minimum and maximum.

The grades achieved in the Python programming part and the multiple choice test part are valid until the end of the exam period (there are three exam periods: January-February, June-July, and September) in which the part has been taken.

The assessment of the overall outcome of the module and the computation of the final grade of the module take place at the end of the oral examination.

The final grade of the module is computed as the average of the latest grades achieved in the Python programming part, in the multiple choice test part, and in the oral examination.
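
As an illustration of the rule above, the following minimal sketch computes the Module 1 grade as the plain average of the three latest part grades. The function name and the range check are assumptions made for this example; the text does not specify any rounding, so none is applied here.

```python
def module1_final_grade(python_programming: float,
                        multiple_choice: float,
                        oral: float) -> float:
    """Average of the latest grades of the three parts (hypothetical helper)."""
    grades = (python_programming, multiple_choice, oral)
    for grade in grades:
        # Every part is graded in the interval from 0 to 30, bounds included.
        if not 0 <= grade <= 30:
            raise ValueError(f"grade {grade} is outside the interval [0, 30]")
    # Plain arithmetic mean; rounding is not specified, so it is left out.
    return sum(grades) / len(grades)


print(module1_final_grade(27, 30, 24))  # 27.0
```

For example, grades of 27, 30 and 24 in the three parts give a module grade of 27.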

Module 2

Students will be evaluated through various assignments.

Homework 1 (group work) – to be submitted by the 2nd week of the module

In this group project, focused on a programming task related to Part 1, students will demonstrate their understanding of Part 1 by using Python libraries, PyTorch, and TensorFlow. They will answer task-related questions and share their project results on a public online repository, such as GitLab.

Homework 2 (group work) – to be submitted by the 3rd week of the module
In this group project, focused on a programming task related to Part 2, students will demonstrate their understanding of Part 2 by using Python libraries, PyTorch, and TensorFlow. They will answer task-related questions and share their project results on a public online repository, such as GitLab.

Homework 3 (group work) – to be submitted by the 4th week of the module
In this group project, focused on a programming task related to Part 3, students will demonstrate their understanding of Part 3 by using Python libraries, PyTorch, and TensorFlow. They will answer task-related questions and share their project results on a public online repository, such as GitLab.

Homework 4 (group work) – to be submitted by the 5th week of the module
In this group project, focused on a programming task related to Part 4, students will demonstrate their understanding of Part 4 by using Python libraries, PyTorch, and TensorFlow. They will answer task-related questions and share their project results on a public online repository, such as GitLab.

Assignment 1 (group work)
In this group project, students will demonstrate their ability to analyse an assigned dataset using Python libraries, PyTorch, and TensorFlow. They will answer task-related questions and share their project results on a public online repository, such as GitLab.

Assignment 2 (individual assessment in the form of an oral exam)
In the individual assessment, students will present the results of their work on the homework assignments and Assignment 1. Each student will give a presentation discussing aspects of the projects and homework, and respond to relevant questions.

Grading Criteria and Deadlines

The final grade for Module 2 will be between 0 and 30, inclusive.
Each homework assignment is worth 1 point. For the homework to be considered valid, students must submit at least 3 of the 4 homework assignments; otherwise, the homework will not be counted.
The project (Assignment 1) must be submitted no later than 3 days before the scheduled date of the oral exam (Assignment 2).
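
The homework rule and the project deadline can be sketched as follows. This is only an illustration under stated assumptions: each submitted homework is taken to earn its full point, "3 days" is read as calendar days, and the function names are hypothetical. How the homework points combine with the rest of the Module 2 grade is not stated above and is not modelled here.

```python
from datetime import date, timedelta

HOMEWORK_COUNT = 4   # Homework 1 to 4
MIN_SUBMITTED = 3    # at least 3 of the 4 must be submitted


def homework_points(submitted: list[bool]) -> int:
    """Points from homework, assuming 1 full point per submitted homework."""
    if len(submitted) != HOMEWORK_COUNT:
        raise ValueError(f"expected {HOMEWORK_COUNT} entries, one per homework")
    count = sum(submitted)
    # Below the minimum number of submissions, no homework is counted.
    return count if count >= MIN_SUBMITTED else 0


def latest_project_submission(oral_exam: date) -> date:
    """Last day for Assignment 1, assuming '3 days' means calendar days."""
    return oral_exam - timedelta(days=3)


print(homework_points([True, True, True, False]))   # 3
print(homework_points([True, True, False, False]))  # 0
print(latest_project_submission(date(2026, 6, 15)))  # 2026-06-12
```

The oral exam date used in the example is invented for illustration only.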

Teaching tools

Module 1

Presentation of the course topics using an overhead projector
Laboratory with desktop PCs equipped with Anaconda; teacher's PC connected to an overhead projector to guide laboratory exercises
Documents used in the presentations, distributed on the site http://virtuale.unibo.it [http://virtuale.unibo.it/]. Access to the documents is allowed only to students of the course.

Module 2

Course notes. Open source projects used as teaching examples.

Office hours

See the website of Stefano Lodi

See the website of Elisabetta Ronchieri