66563 - Laboratory of Bioinformatics 1

Course Unit Page

Academic Year 2016/2017

Learning outcomes

At the end of the course, the student has the basic knowledge for developing and using tools for sequence and structure analysis of biomolecules and, more generally, for annotation problems in the genomic era. In particular, the student will be able to: discuss the theoretical basics of some machine learning tools (Neural Networks, Hidden Markov Models); select programs for problem solving; write programs.

Course contents

Laboratory of Bioinformatics I (first semester, 5 CFU)

Theory and application on:

1) The role of Bioinformatics
2) Archives and Next Generation Sequencing experiments
3) The problem of sequence annotation
4) Protein sequence, structure and function
5) Protein structure comparison: generating rules for sequence comparison
6) Local and global alignment methods; database search with BLAST
7) Extreme value statistics
8) The protein universe and UniProtKB
9) Evolution did it: what can we learn from a pairwise structure comparison over the entire PDB
10) Theoretical foundation of building by homology
11) From sequence to structure and function
12) When a protein is a protein

Best practice on:

1) Handling of the different alignment methods

2) Modeller and statistical validation of computed 3D models

3) Comparison with SWISS-MODEL

Laboratory of Bioinformatics I (second semester, 5 CFU)

Theory and application on:

1) Protein geometrical features

2) Protein 3D, secondary and covalent structure

3) Protein Domains: SCOP and CATH

4) The notion of functional domains/GO terms

5) Functional domains and evolution

6) Protein families

7) Biosequence analysis: a historical perspective

8) Mapping structures into sequences and back

9) Propensity scales and propensity plots

10) The concept of averaging over a sliding window

11) Conditional probability and secondary structure prediction

12) Basics of feed-forward neural networks

13) Training, testing and applications of NN

14) Critical evaluation of machine learning methods: HMM vs NN

15) Protein prediction under 30% sequence identity

Best practice on:

1) How to model a protein domain with an HMM

2) HMMER and statistical validation of a computed protein domain

3) Comparison with Pfam
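
As an illustration of topics 9 and 10 of the second semester (propensity scales and averaging over a sliding window), the following is a minimal sketch in Python. It assumes the Kyte-Doolittle hydropathy scale as the propensity scale and a window of 7 residues; both choices, the function names and the toy sequence are illustrative assumptions and are not prescribed by the course.

```python
# Kyte-Doolittle hydropathy scale: one common propensity scale,
# used here only as an example (an assumption, not the course's choice).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_average(values, window):
    """Return the mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def propensity_profile(sequence, scale=KYTE_DOOLITTLE, window=7):
    """Map each residue to its scale value, then smooth with a sliding window."""
    values = [scale[residue] for residue in sequence.upper()]
    return sliding_window_average(values, window)


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    window = 7
    profile = propensity_profile("MKTLLILAVVAAALA", window=window)
    # Each averaged value is reported at the 1-based central residue of its window.
    for position, score in enumerate(profile, start=window // 2 + 1):
        print(position, round(score, 2))
```

Plotting the averaged values against the central position of each window gives a propensity plot; a wider window smooths the profile more, at the cost of losing values at the two ends of the sequence.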


Readings/Bibliography

Selected articles and reviews, available online through cloud sharing

Teaching methods

Lectures, practicum and tool development

Assessment methods

Students will be evaluated with both written tests and a final oral exam. Both methods assess the learning outcomes of the course and aim at verifying the Bioinformatics skills the student has acquired during the theoretical and practical parts of the program developed over the two semesters.
Since the course spans two semesters, a written test at the end of each semester will determine whether the student is eligible for the final oral exam. Students who do not take the "in itinere" (mid-course) tests will be required to take a general written test before the final oral exam.
Before the final oral exam, the student has to submit two written reports on the two tutored practical sessions carried out in class. During the final oral exam, the student is expected to answer questions on the following topics:

Laboratory of Bioinformatics I (first semester)
Theory and application on:
1) The role of Bioinformatics
2) Archives and Next Generation Sequencing experiments
3) The problem of sequence annotation
4) Protein sequence, structure and function
5) Protein structure comparison: generating rules for sequence comparison
6) Local and global alignment methods; database search with BLAST
7) Extreme value statistics
8) The protein universe and UniProtKB
9) Evolution did it: what can we learn from a pairwise structure comparison over the entire PDB
10) Theoretical foundation of building by homology
11) From sequence to structure and function
12) When a protein is a protein
Best practice on:
1) Handling of the different alignment methods
2) Modeller and statistical validation of computed 3D models
3) Comparison with SWISS-MODEL

Laboratory of Bioinformatics I (second semester)
Theory and application on:
1) Protein geometrical features
2) Protein 3D, secondary and covalent structure
3) Protein Domains: SCOP and CATH
4) The notion of functional domains/GO terms
5) Functional domains and evolution
6) Protein families
7) Biosequence analysis: a historical perspective
8) Mapping structures into sequences and back
9) Propensity scales and propensity plots
10) The concept of averaging over a sliding window
11) Conditional probability and secondary structure prediction
12) Basics of feed-forward neural networks
13) Training, testing and applications of NN
14) Critical evaluation of machine learning methods: HMM vs NN
15) Protein prediction under 30% sequence identity
Best practice on:
1) How to model a protein domain with an HMM
2) HMMER and statistical validation of a computed protein domain
3) Comparison with Pfam

Teaching tools

Online public databases, PubMed, and course materials (PDFs of the lectures and selected articles) available through cloud sharing

Office hours

See the website of Rita Casadio

See the website of Emidio Capriotti

See the website of Allegra Via