- Lecturer: Luca Benini
- Credits: 6
- SSD: ING-INF/01
- Language: English
- Modules: Luca Benini (Module 1) (Module 2)
- Teaching Mode: Traditional lectures (Module 1) Traditional lectures (Module 2)
- Campus: Bologna
-
Course:
Second cycle degree programme (LM) in Electronic Engineering (cod. 0934)
Also valid for Second cycle degree programme (LM) in Artificial Intelligence (cod. 9063)
-
from Sep 18, 2025 to Oct 31, 2025
-
from Nov 06, 2025 to Dec 19, 2025
Learning outcomes
The main goal of the class is to enable students to specify, configure, program and verify complex embedded electronic systems for the Internet of Things and for Artificial Intelligence. The importance of hardware-software interaction will be emphasized, as all practical IoT and AI systems are programmable. The class will provide working knowledge of state-of-the-art hardware platforms used in embedded AI and IoT applications, spanning a wide range of power and cost versus performance tradeoffs. Detailed coverage will be given of software abstractions and methodologies for developing applications that leverage the capabilities of the above-mentioned platforms. Design automation tools and flows will also be covered.
Course contents
Module 1 (for students of 93398)
From ML to DNNs - a computational perspective
1. Introduction to key computational kernels (dot-product, matrix multiply...)
2. Inference vs. training: workload analysis and characterization
3. The NN computational zoo: DNNs, CNNs, RNNs, attention-based networks, state-based networks
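To make the kernels listed above concrete, here is a minimal C sketch of the two workhorse computations behind virtually every NN layer. The function names (`dot`, `matmul`) are illustrative, not taken from any course material:

```c
#include <stddef.h>

/* Dot product: the innermost kernel of most NN layers. */
float dot(const float *a, const float *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Naive matrix multiply C = A (MxK) * B (KxN): one dot product
   per output element; fully-connected layers and (via im2col)
   convolutions reduce to this pattern. */
void matmul(const float *A, const float *B, float *C,
            size_t M, size_t K, size_t N) {
    for (size_t m = 0; m < M; m++)
        for (size_t n = 0; n < N; n++) {
            float acc = 0.0f;
            for (size_t k = 0; k < K; k++)
                acc += A[m * K + k] * B[k * N + n];
            C[m * N + n] = acc;
        }
}
```

The triple loop makes the arithmetic intensity of these kernels easy to see, which is the starting point of the workload analysis in this module.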
Running ML workloads on programmable processors
1. Recap of processor instruction set architecture (ISA) with focus on data processing
2. Improving processor ISAs for ML: RISC-V and ARM use cases
3. Fundamentals of parallel processor architecture and parallelization of ML workloads
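Item 3 can be illustrated with the simplest parallelization of an ML kernel: splitting independent output rows of a matrix multiply across cores. This is a hedged sketch using OpenMP (compile with `-fopenmp`; without it the pragma is ignored and the code runs sequentially with identical results), not a description of any specific platform used in the course:

```c
#include <stddef.h>

/* Data-parallel matrix multiply: rows of C are independent, so they
   can be distributed across cores -- the same pattern used on
   multi-core embedded RISC-V and ARM systems. */
void matmul_parallel(const float *A, const float *B, float *C,
                     int M, int K, int N) {
    #pragma omp parallel for
    for (int m = 0; m < M; m++)
        for (int n = 0; n < N; n++) {
            float acc = 0.0f;
            for (int k = 0; k < K; k++)
                acc += A[m * K + k] * B[k * N + n];
            C[m * N + n] = acc;
        }
}
```

Because each iteration of the outer loop writes a disjoint row of `C`, no synchronization is needed inside the parallel region.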
Algorithmic optimizations for ML
1. Bottleneck analysis and algorithmic techniques
2. Model distillation and efficient NN models: depthwise convolutions, inverted bottlenecks, optimized attention, introduction to Neural Architecture Search
3. Quantization and sparsity: scalar, block, vector.
Module 2 (for students of 93398)
Representing data in Deep Neural Networks
Recap of canonical DNN loops – a tensor-centric view
Data quantization in Deep Neural Networks
Brief notes on data pruning
From training to software-based deployment
High-performance embedded systems (NVIDIA Xavier, Huawei Ascend)
Microcontroller-based systems (STM32)
From software to hardware acceleration
Principles of DNN acceleration: spatial and temporal data reuse; dataflow loop nests and taxonomy; data tiling
The Neural Engine zoo: convolvers, matrix product accelerators, systolic arrays – examples from the state of the art
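The data reuse and tiling principles listed for Module 2 can be sketched in software before they are mapped to hardware: a tiled matrix multiply reuses each loaded tile many times, cutting traffic to outer memory exactly as accelerator dataflows reuse data held in local SRAM. This is an illustrative sketch (tile size and names are assumptions), not an accelerator model from the course:

```c
#include <stddef.h>

#define TILE 4 /* tile edge; on real hardware sized to fit local SRAM */

/* Tiled square matrix multiply C = A * B (all n x n). Each TILE x TILE
   block of A and B is reused across a whole tile of C once loaded:
   temporal reuse, the key idea behind accelerator dataflow taxonomies. */
void matmul_tiled(const float *A, const float *B, float *C, size_t n) {
    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0f;
    for (size_t i0 = 0; i0 < n; i0 += TILE)
        for (size_t j0 = 0; j0 < n; j0 += TILE)
            for (size_t k0 = 0; k0 < n; k0 += TILE)
                /* inner loops touch only the current tiles, which
                   stay hot in cache / local memory */
                for (size_t i = i0; i < i0 + TILE && i < n; i++)
                    for (size_t j = j0; j < j0 + TILE && j < n; j++)
                        for (size_t k = k0; k < k0 + TILE && k < n; k++)
                            C[i * n + j] += A[i * n + k] * B[k * n + j];
}
```

Reordering the three tile loops (i0, j0, k0) changes which operand is reused most, which is the software analogue of the weight-stationary / output-stationary dataflow taxonomy.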
Readings/Bibliography
Selected parts from the following books
1. Dive into Deep Learning (online: d2l.ai)
2. Efficient Processing of Deep Neural Networks (online: https://link.springer.com/book/10.1007/978-3-031-01766-7 )
3. Machine Learning Systems (online: mlsysbook.ai)
Teaching methods
Traditional lectures for theory.
All modules include hands-on sessions requiring a student laptop.
Assessment methods
Written exam with oral discussion: the written exam is compulsory and consists of solving problems and answering questions. The oral exam is optional and consists of in-depth questions on topics covered in class.
Teaching tools
Lectures using projector for slides provided by the instructors.
Hands-on sessions.
Online resources (Virtuale)
Office hours
See the website of Luca Benini