37629 - Bioinformatics Applied to Evolutionary Biology Laboratory

Course Unit Page

Academic Year 2018/2019

Learning outcomes

At the end of the course, the student is able to handle biological databases (molecular and morphometric) and to analyse data using the most important biostatistical and phylogenetic methods. All analyses are performed with the statistical software R and its add-on packages for population genetics and phylogenetics.

Course contents

1 - The R software. Basics and data handling.
2 - Probability and distributions. Discrete distributions (binomial). Continuous distributions (normal). Densities, cumulative probability, quantiles, confidence intervals.
3 - Statistical tests. Tests for comparing continuous variables (t-test, Wilcoxon test). Tests for discrete variables/tabular data (chi-squared test, Fisher's exact test).
4 - Simple linear regression and correlation. Analysis of Variance (ANOVA) and Kruskal-Wallis test. 
5 - Multivariate analyses. Principal Component Analysis. Distance matrices. Mantel test. Multi-dimensional Scaling. Cluster analysis. Bootstrap. Applications to molecular data.
6 - Phylogenetic methods. Phylogenetic trees. Newick and Nexus formats. Phylogeny estimation with distance-based methods (UPGMA, Neighbor-Joining). DNA substitution models.
7 - Population genetics with R (packages ape, ade4, adegenet, pegas, phangorn). Hardy-Weinberg equilibrium. Mismatch distribution. Tajima's D test. R2 test. Inbreeding. Fixation indices. Analysis of Molecular Variance (AMOVA).


Readings/Bibliography

P. Dalgaard. Introductory Statistics with R. Second Edition. Springer, 2008.
E. Paradis. Analysis of Phylogenetics and Evolution with R. Second Edition. Springer, 2012.

Teaching methods

Lectures explaining the biostatistical methods. Practical examples in the computer laboratory using the R software.

Assessment methods

The final exam assesses whether the student has achieved the following learning goals:
- thorough knowledge of the statistical/bioinformatic tools introduced during the lectures;
- ability to use these tools to analyse biological datasets;
- ability to interpret the results obtained in light of the biological phenomenon under study.

The final exam includes:
- solution of a biological/bioinformatic problem using the R software (the exercise will be sent by mail to all participants one week before the exam);
- oral discussion of the results obtained;
- oral questions on the course contents.

Teaching tools

Computer laboratory. PDF slides. Example data.

Office hours

See the website of Alessio Boattini