73302 - Computer Vision And Image Processing M

Academic Year 2022/2023

Learning outcomes

At the end of the course the students know the basic principles of computer vision and image processing algorithms. Thereby, they are able to understand and apply a variety of algorithms and operators aimed at either extracting relevant semantic information from digital images or improving image quality. They also understand the diverse challenges and design choices characterizing the main applications and acquire familiarity with software tools widely adopted in these scenarios.

Course contents

  1. Introduction – Basic definitions related to image processing and computer vision. An overview across major application domains.
  2. Image Formation and Acquisition – Geometry of image formation. Pinhole camera and perspective projection. Geometry of stereopsis. Using lenses. Field of view and depth of field. Projective coordinates and perspective projection matrix. Camera calibration: intrinsic and extrinsic parameters, lens distortion. Camera calibration based on planar targets and homography estimation (Zhang's algorithm). Image rectification and stereo calibration. Basic notions on image sensing, sampling and quantization.
  3. Image Filtering – Convolution and correlation. Mean and Gaussian filtering. Median filtering. Bilateral filtering. Non-local means.
  4. Image Segmentation and Blob Analysis – Gray-level Histogram. Binarization by global thresholding. Automatic threshold estimation. Spatially adaptive binarization. Colour-based segmentation. Binary Morphology Operators. Connected components labeling and blob analysis.
  5. Local Features – Edge features and image gradient. Smooth derivatives (Sobel). Canny edge detector. Keypoint detectors and descriptors. Harris corners. Scale-invariant features. SIFT features. Efficient feature matching by kd-trees.
  6. Instance Detection – Pattern matching by SSD, SAD, NCC and ZNCC. Shape-based matching. Hough Transform for analytic shapes, Generalized Hough Transform. Object detection by local invariant features: Hough-based voting, least-squares similarity estimation.
  7. Deep Learning for Computer Vision – Review of machine learning basics. Image Classification. Linear Classifiers and Fully Connected Neural Networks. Convolutional Neural Networks (CNN). Successful CNN architectures for image classification: AlexNet, VGG, Inception, ResNet. Transfer Learning. CNN for Object Detection: R-CNN, Feature Pyramid Network (FPN), Faster R-CNN, YOLO.

Readings/Bibliography

  • Rafael Gonzalez, Richard Woods. “Digital Image Processing”, Third Edition, Pearson Prentice-Hall, 2002.
  • Richard Hartley, Andrew Zisserman. “Multiple View Geometry in Computer Vision”, 2nd Edition, Cambridge University Press, 2011.
  • Carsten Steger, Markus Ulrich, Christian Wiedemann. “Machine Vision Algorithms and Applications”, 2nd Edition, Wiley, 2018.
  • Richard Szeliski. “Computer Vision: Algorithms and Applications”, 2nd Edition, Springer, 2021.
  • Ian Goodfellow, Yoshua Bengio, Aaron Courville. “Deep Learning”, MIT Press, 2016.

Teaching methods

Theory taught in lectures is complemented by assisted hands-on lab sessions covering selected topics. Students are provided with the software tools, image/video archives and support needed to implement and test most of the topics discussed in class, so as to significantly deepen their understanding of the subject matter.

Assessment methods

Students are required to carry out and present a software project that addresses a real-world image processing or computer vision problem. The project can either be chosen from a list provided by the teacher through the course website or be proposed by the student.

The exam is oral and comprises both a discussion of the project and an assessment of theoretical knowledge.

Teaching tools

Available on the course website are:

  • All slides related to lectures and lab sessions.
  • A software development framework based on OpenCV, which allows students to implement in practice the algorithms and methods taught in lectures.
  • Images and videos allowing students to easily test their implementations.
  • Links to image processing and computer vision resources freely available on the web (such as software tools and image/video archives).

Office hours

See the website of Luigi Di Stefano