69661 - Image Processing And Computer Vision M

Academic Year 2022/2023

Learning outcomes

The course introduces basic knowledge about algorithms, tools and systems for the management, processing and analysis of digital images. The main topics are the filtering of digital images, image processing algorithms, and algorithms for the segmentation and classification of objects in digital images. The theoretical aspects introduced in the course are then applied to the design and development of simple systems oriented to real-world applications. At the end of the course, students are able to master basic digital image processing techniques and know the potential of this technology in applied research and industrial contexts.

Course contents

  1. Introduction – Basic definitions related to image processing and computer vision. An overview across major application domains.
  2. Image Formation and Acquisition – Geometry of image formation. Pinhole camera and perspective projection. Geometry of stereopsis. Using lenses. Field of view and depth of field. Projective coordinates and perspective projection matrix. Camera calibration: intrinsic and extrinsic parameters, lens distortion. Camera calibration based on planar targets and homography estimation (Zhang's algorithm). Image rectification and stereo calibration. Basic notions on image sensing, sampling and quantization.
  3. Image Filtering – Convolution and correlation. Mean and Gaussian filtering. Median filtering. Bilateral filtering. Non-local means.
  4. Image Segmentation and Blob Analysis – Gray-level Histogram. Binarization by global thresholding. Automatic threshold estimation. Spatially adaptive binarization. Colour-based segmentation. Binary Morphology Operators. Connected components labeling and blob analysis.
  5. Local Features – Edge features and image gradient. Smooth derivatives (Sobel). Canny edge detector. Keypoint detectors and descriptors. Harris corners. Scale-invariant features. SIFT features. Efficient feature matching by kd-trees.
  6. Instance Detection – Pattern matching by SSD, SAD, NCC and ZNCC. Shape-based matching. Hough Transform for analytic shapes, Generalized Hough Transform. Object detection by local invariant features: Hough-based voting, least-squares similarity estimation.
  7. Deep Learning for Computer Vision – Review of machine learning basics. Image Classification. Linear Classifiers and Fully Connected Neural Networks. Convolutional Neural Networks (CNN). Successful CNN architectures for image classification: AlexNet, VGG, Inception, ResNet. Transfer Learning. CNN for Object Detection: R-CNN, Feature Pyramid Network (FPN), Faster R-CNN, YOLO.
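
As a rough illustration of the similarity measures named in item 6 (SSD, SAD, NCC and ZNCC), the following is a minimal Python sketch. It is not taken from the course material or from the OpenCV-based framework: the function names, the representation of image patches as flat lists of gray levels, and the sample values are all assumptions made for the example.

```python
import math

# Hypothetical helpers: patches are flat lists of gray levels of equal length.

def ssd(a, b):
    """Sum of Squared Differences: 0 for identical patches, grows with dissimilarity."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sad(a, b):
    """Sum of Absolute Differences: 0 for identical patches, grows with dissimilarity."""
    return sum(abs(x - y) for x, y in zip(a, b))

def ncc(a, b):
    """Normalized Cross-Correlation: invariant to a gain (scaling) of the intensities."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0

def zncc(a, b):
    """Zero-mean NCC: subtracts each patch mean, so it also tolerates an intensity offset."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return ncc([x - mean_a for x in a], [y - mean_b for y in b])

# Illustrative data: the same pattern under a gain of 2 and an offset of 5.
template = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in template]

print("SSD :", ssd(template, brighter))
print("SAD :", sad(template, brighter))
print("NCC :", ncc(template, brighter))
print("ZNCC:", zncc(template, brighter))
```

In this sketch SSD and SAD report a large difference between the two patches even though they show the same pattern, NCC comes close to 1 but does not reach it because of the offset, and ZNCC is 1 up to rounding, which is why zero-mean normalization is often preferred when lighting varies between the template and the image.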

Readings/Bibliography

  • Gonzalez R., Woods R.: “Digital Image Processing”, Third Edition, Pearson Prentice-Hall, 2002.
  • Richard Hartley, Andrew Zisserman. “Multiple View Geometry in Computer Vision”, 2nd edition, Cambridge University Press, 2011.
  • Carsten Steger, Markus Ulrich, Christian Wiedemann. “Machine Vision Algorithms and Applications”, 2nd Edition, Wiley, 2018.
  • Richard Szeliski. “Computer Vision: Algorithms and Applications”, 2nd Edition, Springer, 2021.
  • Ian Goodfellow, Yoshua Bengio, Aaron Courville. “Deep Learning”, MIT Press, 2016.

Teaching methods

Theory taught in lectures is complemented by assisted hands-on lab sessions covering selected topics. Students are provided with the software tools, image/video archives and support that enable practical implementation and testing of most of the topics discussed in class, so as to significantly deepen their understanding of the course subject matter.

Assessment methods

Students are required to carry out and present a software project related to solving a real-world image processing or computer vision problem. The project can either be chosen from a list provided by the teacher through the course website or be proposed by the student.

The exam is oral and comprises both a discussion of the project and an assessment of theoretical knowledge.

Teaching tools

Available on the course website are:

  • All slides related to lectures and lab sessions.
  • A software development framework based on OpenCV, which allows students to implement in practice the algorithms and methods taught in lectures.
  • Images and videos allowing students to easily test their implementations.
  • Links to image processing and computer vision resources freely available on the web (such as software tools and image/video archives).

Office hours

See the website of Luigi Di Stefano