- Teacher: Cristian Forestan
- Credits: 6
- SSD: BIO/11
- Language: Italian
- Modules: Cristian Forestan (Module 1); Marco Russo (Module 2)
- Teaching Mode: Traditional lectures (Module 1); Traditional lectures (Module 2)
- Campus: Bologna
- Course: Second cycle degree programme (LM) in Plant and Agricultural Biotechnology (cod. 5948)
- from Sep 24, 2025 to Dec 17, 2025
- from Sep 17, 2025 to Dec 16, 2025
Learning outcomes
By the end of the course, students will know how to use the main tools for solving bioinformatics problems and tasks, including the basics of programming in the languages most commonly used in this field (R, Bash/Unix). Students will become familiar with the software and major algorithms for accessing and querying biological databases, performing pairwise and multiple alignments of nucleotide and protein sequences, conducting genomic analyses, and managing genome and transcriptome sequencing data obtained with next-generation sequencing (NGS) technologies, including sequence alignment and assembly.
Course contents
The course aims to illustrate the use of key algorithms for genomic, transcriptomic, and epigenomic analysis using both online tools and locally installed programs. The course will be held entirely in a computer lab, where each student will have access to a PC equipped with a virtual machine running a Linux (Ubuntu) operating system and an R programming environment.
1-Introduction to Bioinformatics and Programming Basics
- Introduction to the Unix environment and bash terminal
- Programming elements for bioinformatics in Bash and R; design of pipelines for bioinformatics analysis
- Introduction to the use of Galaxy
- Handling of sequencing and alignment data formats (FASTA, FASTQ, SAM/BAM, GTF, etc.)
2-Methods for Sequence Alignment and Similarity Search
- Local and global sequence alignments
- Substitution matrices for sequence alignment (PAM, BLOSUM)
- Heuristic approaches for similarity search in sequence databases (BLAST)
- Multiple sequence alignment
- Construction of phylogenetic trees
3-Methods for Mapping and Assembly of Sequenced DNA
- Main algorithms for aligning short and long reads to a reference genome
- Main genome assembly algorithms (greedy graph-based, OLC, and de Bruijn graph-based methods)
- Quality assessment of assemblies and methods for ordering and orienting contigs on a reference genome
- Annotation of repetitive elements and genes in eukaryotic genomes
4-Methods for SNP Calling from WGS and Variability Analysis
- Overview of major formats for collecting and manipulating genomic variants (VCF, HapMap, geno)
- Main algorithms for calling SNP variants from NGS data
- Analysis of genetic variability, population structure, and identification of regions under selective pressure
- Prediction of SNP effects
5-Methods for Gene Expression Analysis
- Methods for transcriptome analysis
- Main steps and algorithms for RNA-seq data analysis
- Methods for gene expression quantification
- Differential gene expression analysis and co-expression networks
- Functional gene annotation and gene ontology
6-Methods for Epigenomic Data Analysis
- Introduction to genome regulatory elements
- Methods for the analysis of genome regulatory elements
- ChIP sequencing and CUT&RUN sequencing
7-Biological Databases
- Biological data repositories (NCBI, Ensembl, Entrez, SRA)
- Methods for accessing biological databases
- Visualization of biological data
- Use of major genome browsers (UCSC, IGV)
Prerequisites
Basic knowledge of plant genetics, molecular biology, and structural and functional genomics.
Readings/Bibliography
M. Helmer Citterich, F. Ferrè, G. Pavesi, C. Romualdi, G. Pesole, Fondamenti di bioinformatica, Zanichelli, 2018.
Slides, lecture notes, and manuals provided by the teachers.
Teaching methods
The course includes lectures and laboratory sessions. During the lectures (36 hours), the main topics of the subject, as outlined above, will be presented. The laboratory sessions consist of 24 hours of hands-on exercises on specific case studies focused on agriculturally relevant plants.
All classes will take place in the computer lab, where each student will have access to a PC with a virtual machine running the Linux (Ubuntu) operating system, allowing them to apply the theoretical concepts learned.
On the Virtuale platform, students will have access to lecture slides and additional educational materials, including scientific articles, manuals, tutorials, videos, and links to web resources.
Assessment methods
The final exam is designed to assess whether the student has achieved the course objectives. In particular, it will evaluate the student’s knowledge of the topics covered in both the lectures and lab sessions.
The evaluation will be based on a written exam at the end of the course, consisting of a combination of multiple-choice and open-ended questions.
There will be 20 multiple-choice questions (1 point for each correct answer, 0 points for incorrect or blank answers) and 4 open-ended questions, each worth up to 3 points, based on accuracy and clarity of language.
The maximum score is 30 with honors (30 e lode). The minimum passing grade is 18/30.
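As a rough illustration of the scoring arithmetic described above, the following sketch adds up the raw points from the two parts of the written exam. The function names are hypothetical, and the rule used to convert raw points into the final grade (in particular, treating a raw total above 30 as "30 e lode") is an assumption that the syllabus does not state explicitly.

```python
# Illustrative sketch only: names and the honors rule are assumptions,
# not part of the official assessment procedure.

MC_QUESTIONS = 20      # multiple-choice questions, 1 point each
OPEN_QUESTIONS = 4     # open-ended questions
OPEN_MAX_POINTS = 3    # maximum points per open-ended question
PASS_MARK = 18         # minimum passing grade out of 30


def raw_score(mc_correct, open_points):
    """Sum the points from both parts of the written exam."""
    if not 0 <= mc_correct <= MC_QUESTIONS:
        raise ValueError("mc_correct must be between 0 and 20")
    if len(open_points) != OPEN_QUESTIONS:
        raise ValueError("exactly 4 open-ended scores are expected")
    if any(not 0 <= p <= OPEN_MAX_POINTS for p in open_points):
        raise ValueError("each open-ended score must be between 0 and 3")
    return mc_correct + sum(open_points)


def final_grade(raw):
    """Map raw points to a grade string.

    ASSUMPTION: a raw total above 30 is recorded as "30 e lode".
    """
    if raw > 30:
        return "30 e lode"
    if raw >= PASS_MARK:
        return f"{raw:g}/30"
    return "fail"


if __name__ == "__main__":
    total = raw_score(16, [3, 2, 2.5, 1])
    print(total, final_grade(total))
```

Under these assumptions, the raw total can reach 20 + 4 × 3 = 32 points, which is why some rule for totals above 30 is needed; the one shown here is only a guess.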
Students with learning disorders and/or temporary or permanent disabilities should contact the responsible office (https://site.unibo.it/studenti-con-disabilita-e-dsa/en/for-students) as soon as possible so that it can propose suitable adjustments. Requests for adjustments must be submitted in advance (15 days before the exam date) to the lecturer, who will assess their appropriateness, taking into account the teaching objectives.
Teaching tools
Computer lab with PCs equipped with virtual machines running Linux (Ubuntu) and an R working environment, and a video projector for displaying slides during lectures.
Office hours
See the website of Cristian Forestan
See the website of Marco Russo