- Teacher: Laura Anderlucci
- Credits: 6
- SSD: SECS-S/01
- Language: English
- Modules: Laura Anderlucci (Module 1); Laura Anderlucci (Module 2)
- Teaching Mode: Traditional lectures (Module 1); Traditional lectures (Module 2)
- Campus: Bologna
- Course: First cycle degree programme (L) in Genomics (code 9211)
Learning outcomes
The course introduces students to current data science methods and techniques, using modern computational tools and software, with an emphasis on rigorous statistical thinking. At the end of the course, students are able to represent and organise knowledge about large-scale data collections and to turn data into actionable knowledge by combining concepts of statistical learning and data mining with data visualization techniques and reproducible data analysis.
Course contents
Part 0: Introduction to Statistical Learning
Part I: Classification
- Naïve Bayes
- Logistic Regression
- Linear Discriminant Analysis
- k-Nearest Neighbors
Part II: Resampling Methods
- Cross-Validation
- The Bootstrap
Part III: Tree-Based Methods
- Classification trees
- Bagging; Random Forests; Boosting
Part IV: Unsupervised Learning
- k-means
- Hierarchical clustering
Part V: Overview of the main machine learning methods
- Support Vector Machines
- Neural Networks
Readings/Bibliography
The primary text for the course:
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning. New York: Springer.
Freely available at: http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Seventh%20Printing.pdf
In addition, we will use:
- T. Hastie, R. Tibshirani, and J. Friedman (2001) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer-Verlag.
Freely available at: https://web.stanford.edu/~hastie/Papers/ESLII.pdf
- J. Han and M. Kamber (2000) Data Mining: Concepts and Techniques. Morgan Kaufmann.
Freely available at: http://myweb.sabanciuniv.edu/rdehkharghani/files/2016/02/The-Morgan-Kaufmann-Series-in-Data-Management-Systems-Jiawei-Han-Micheline-Kamber-Jian-Pei-Data-Mining.-Concepts-and-Techniques-3rd-Edition-Morgan-Kaufmann-2011.pdf
Teaching methods
Lectures and practical sessions.
Assessment methods
Written exam.
Teaching tools
The following material will be provided: lecture slides, exercises with solutions, and a mock exam.
Office hours
See the website of Laura Anderlucci