93478 - COMPUTER VISION

Academic Year 2025/2026

Teacher: Giuseppe Lisanti
Credits: 6
SSD: INF/01
Language: English

Teaching Mode: In-person learning (entirely or partially)
Campus: Bologna
Degree programme: Second cycle degree programme (LM) in Computer Science (cod. 6698)

Learning outcomes

By the end of the course, students will be able to implement algorithms for relevant computer vision tasks such as object detection, semantic segmentation, and image and video captioning. They will learn the basics of image and video analysis and of computer vision, and gain knowledge of the design and implementation of convolutional and recurrent neural networks and of how to combine them. They will also become familiar with the frameworks used to design modern deep architectures.

Course contents

The course introduces concepts and tools for the design and implementation of image acquisition, processing, and analysis techniques. The contents are:

Image Formation and Acquisition: geometry of image formation; lenses; field of view and depth of field; image sampling and quantization.

Spatial Filtering: convolution and correlation; mean and Gaussian filtering; median filtering; bilateral filtering.

Edge Detection: image gradient; non-maxima suppression; Laplacian of Gaussian; Canny edge detector.

Local Invariant Features: detectors and descriptors; Harris corners; scale-invariant features; SIFT features.

Camera Calibration: projective coordinates and perspective projection matrix; intrinsic and extrinsic camera parameters; Zhang's algorithm.

Recap of Convolutional Neural Networks: filters, striding, and padding; pooling layers; normalization layers; successful CNN architectures.

Attention Mechanism: self-attention in RNN; Transformer architecture; Vision Transformer.

Object Detection: two-stage, one-stage, and anchor-free detectors; RoI pooling operator; feature pyramid networks.

Semantic Segmentation: fully convolutional networks; transposed and dilated convolutions; RoI Align operator; semantic, instance, and panoptic segmentation.

Metric Learning: deep metric learning; contrastive and triplet losses; application to different recognition/identification tasks.

Prerequisites:

- Linear Algebra
- Basic knowledge of Machine Learning and Deep Learning
- Programming and Python

Readings/Bibliography

All the slides from the lectures of the course will be made available on the Virtuale platform. There is no official textbook; further details on some of the topics of the course can be found in:

- Gonzalez, R. C., & Woods, R. E., "Digital Image Processing", Pearson Education, 2009.

- Hartley, R., & Zisserman, A., "Multiple View Geometry in Computer Vision", Cambridge University Press, 2003.

- Zhang, A., Lipton, Z. C., Li, M., & Smola, A. J., "Dive into Deep Learning", 2020. http://d2l.ai/

Further readings, such as scientific papers and online resources, might be recommended during the lectures of the course.

Teaching methods

Teaching methods include taught lessons and laboratory sessions. The datasets and code snippets for the laboratory sessions will be provided.

The code used in the lab sessions is written in Python and relies on the OpenCV library, the scikit-learn library, and the PyTorch framework.

Assessment methods

Student assessment consists of two components: a project and an oral exam.

Project:

The student must submit a list of scientific articles, preferably accompanied by publicly available source code. The articles should be selected from those published in the conferences or journals indicated during the introductory lecture of the course.
One article will be selected from the student’s proposed list.
The student is expected to thoroughly study the selected article and its associated source code, with the goal of replicating at least one experiment described in the paper or performing a test by modifying the proposed technique.
The student will then prepare a seminar explaining the technique introduced in the chosen article, as well as the experiment carried out.

During the seminar, the student will be asked questions regarding the studied work and the experiment performed.

The maximum grade for the project and the accompanying seminar is 30 with honors (30 cum laude).

Oral exam:

The oral examination consists of theoretical questions on the topics covered during the course.
The maximum grade for the oral exam is 30 with honors (30 cum laude).

The final grade is calculated as the average of the grades obtained in the two parts of the exam. The minimum grade required for the exam to be officially recorded is 18.
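
As an illustration of the averaging rule above, here is a minimal Python sketch. The function names `final_grade` and `can_be_recorded` are hypothetical. The sketch assumes that both grades are integers on the 30-point scale and that the minimum of 18 applies to the averaged grade. It does not round the result and does not model honors, because this page does not specify how either is handled.

```python
def final_grade(project: int, oral: int) -> float:
    """Return the plain average of the project and oral grades.

    Assumption: both grades are on the 0-30 scale; honors (30 cum laude)
    and any rounding rule are not modelled here.
    """
    for name, grade in (("project", project), ("oral", oral)):
        if not 0 <= grade <= 30:
            raise ValueError(f"{name} grade must be between 0 and 30, got {grade}")
    return (project + oral) / 2


def can_be_recorded(grade: float, minimum: int = 18) -> bool:
    """Check the averaged grade against the recording threshold (assumed to apply to the average)."""
    return grade >= minimum


if __name__ == "__main__":
    grade = final_grade(project=28, oral=25)
    print(grade, can_be_recorded(grade))
```

For example, a project grade of 28 and an oral grade of 25 average to 26.5, which is above the threshold of 18.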

The oral exam and project discussion take place on the same date, published via the ALMAESAMI portal.

Students must register for the exam session through the same portal, and must submit the project source code and the seminar's slides via the Virtuale platform at least one week before the scheduled exam/discussion date.

Teaching tools

The slides used in the course will be made available in PDF format on the website of the course before each lecture.

The Python scripts and datasets required for the lab sessions will be made available on the website of the course.

Office hours

See the website of Giuseppe Lisanti