95662 - INTRODUCTION TO MACHINE LEARNING

Course Unit Page

  • Teacher Giovanni Della Lunga

  • Credits 3

  • SSD SECS-S/06

  • Language English

  • Campus of Bologna

  • Degree Programme Second cycle degree programme (LM) in Quantitative Finance (cod. 8854)

  • Course Timetable from Feb 26, 2022 to Mar 12, 2022

SDGs

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.

  • Quality education
  • Gender equality
  • Industry, innovation and infrastructure

Academic Year 2021/2022

Learning outcomes

The main goal of the course is to present the first elements of Machine Learning, accompanied by a brief review of the most important elements of numerical analysis used in this field. We also present the Python ecosystem for machine learning and the functionality it provides through NumPy, Matplotlib, Pandas and scikit-learn. A general discussion of Supervised and Unsupervised Learning is introduced. After discussing the idea of clustering, the student should learn that this class of algorithms explores input data without being given an explicit output variable. Students should also clearly understand when to use it. We then continue with the definition of Supervised Learning, describing from a very general point of view how this class of algorithms works and when it should be used. Students should learn how to implement some of the simplest and most standard methods for modelling the relationship between independent input variables and dependent output variables. As regards decision trees and how they can be used for prediction, the student should learn the potential advantages of this technique over linear or logistic regression and how to use it in classification problems. A simple introduction to Bayesian learning is presented. Finally, we explain how different machine learning algorithms can be combined to produce composite predictions. An important example of this is a random forest, which is a procedure for generating many different decision trees and combining the results.

Students should be familiar with the following concepts: Vector Spaces, Eigenfunctions and Eigenvectors, Operator and Matrix Calculus, Calculus of Extrema, the concept of Gradient, Conditions for local and global minima, Conditional Probability, Bayes' Rule. Basic experience with Python programming is required.

Course contents

Lesson 1 - Introduction

In this lesson the student should learn the first elements of the subject, accompanied by a brief review of the most important elements of numerical analysis used in the field of machine learning. We also present the Python ecosystem for machine learning: Python and its rising use for machine learning, SciPy and the functionality it provides together with NumPy, Matplotlib and Pandas, and scikit-learn, which provides the machine learning algorithms. The student also learns how to install the Python ecosystem for machine learning on a workstation.

Topics:

  • What is Machine Learning
  • Supervised and Unsupervised Learning
  • Mathematics for Machine Learning (just a quick reminder)
  • Python Ecosystem for ML
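
As a small illustration of the installation step, the following sketch reports which of the packages named above are present in the current environment. It relies only on the standard library; the package list and the helper name `installed_versions` are assumptions made for this example, not part of the course material.

```python
from importlib import metadata

# Distribution names of the packages mentioned in the lesson description.
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "scikit-learn"]


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name:15s} {version or 'not installed'}")
```

Any package reported as not installed would need to be added before running the course notebooks.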

Lesson 2 - Unsupervised Learning

In this class we present a simple example of an Unsupervised Learning algorithm. After discussing the idea of clustering, the student should learn that this class of algorithms explores input data without being given an explicit output variable. The student should also clearly understand when to use it, that is, when we do not know how to classify the data and need an algorithm to find patterns.

Topics:

  • What is Unsupervised Learning
  • When to use it and how it works
  • k-means clustering
  • Portfolio clustering with k-means
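
The clustering idea can be sketched in a few lines. The example below is a minimal, hand-written version of k-means (Lloyd's algorithm) using only the standard library; scikit-learn offers a ready-made implementation, but spelling out the steps shows that no output variable is ever supplied. The function names and the six (return, volatility) pairs are made up for illustration and do not come from the course material.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    for _ in range(n_iter):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:  # centroids stopped moving
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Made-up (annual return, volatility) pairs for six hypothetical assets.
assets = [
    (0.02, 0.05), (0.03, 0.06), (0.025, 0.04),  # low risk, low return
    (0.10, 0.25), (0.12, 0.30), (0.09, 0.28),   # high risk, high return
]
centroids, labels = kmeans(assets, k=2)
print(labels)
```

The algorithm only sees the coordinates of each asset; the two risk groups emerge from the distances alone, and which group receives label 0 or 1 depends on the random starting centroids.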

Lesson 3 - Supervised Learning

We begin with the definition of Supervised Learning, then describe from a very general point of view how this class of algorithms works and when it should be used. The student also learns how to implement two of the simplest and most standard methods for modelling the relationship between independent input variables and dependent output variables. Finally, we present a technique that is typically used for classification but can be adapted to perform regression.

Topics:

  • What is Supervised Learning
  • When to use it and how it works
  • Linear Regression
  • Logistic Regression
  • SVM (Support Vector Machines)
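
To make the relationship between an input variable and an output variable concrete, here is a minimal sketch of linear regression with a single feature, fitted by ordinary least squares in closed form. The function name `fit_line` and the toy data are assumptions for illustration; the course notebooks may well use scikit-learn instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data generated from y = 1 + 2x, so the fit should recover that line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0 + 2.0 * x for x in xs]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
print(intercept + slope * 6.0)  # prediction for an unseen input
```

Unlike the clustering case, the known outputs `ys` are used during fitting, which is what makes the method supervised.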

Lesson 4 - Supervised Learning

In this lesson, we continue our discussion of supervised learning by considering how decision trees can be used for prediction. The student should learn the potential advantages of this technique over linear or logistic regression and how to use it in classification problems. A simple introduction to Bayesian learning is presented, along with a simple application to credit feature selection. After that, we explain how different machine learning algorithms can be combined to produce composite predictions. An important example of this is a random forest, which is a procedure for generating many different decision trees and combining the results.

Topics:

  • Decision Tree
  • Naive Bayes
  • Random Forest
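
The idea of generating many trees and combining their results can be sketched as follows. This is a deliberately simplified stand-in for a random forest: each tree is reduced to a single split (a decision stump), trained on a bootstrap sample of the rows and a random subset of the features, and the composite prediction is a majority vote. All function names and the toy data are assumptions for illustration; a full implementation grows deeper trees and is available in scikit-learn.

```python
import random
from collections import Counter


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, features):
    """One-split tree: pick the (feature, threshold) with fewest training errors."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for t in [(a + b) / 2 for a, b in zip(values, values[1:])]:
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no split possible: always predict the majority class
        m = majority(y)
        return (features[0], float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    """Follow the single split to the left or right label."""
    f, t, l_lab, r_lab = stump
    return l_lab if row[f] <= t else r_lab


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    n_sub = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]       # rows, with replacement
        features = rng.sample(range(n_features), n_sub)  # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest


def predict_forest(forest, row):
    """Composite prediction: majority vote over all stumps."""
    return majority([predict_stump(s, row) for s in forest])


# Made-up two-feature data with two well-separated classes.
X = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8), (8, 8), (9, 9)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
print(predict_forest(forest, (1.5, 1.5)), predict_forest(forest, (8.5, 8.5)))
```

Because every stump sees a different resampling of the data, individual stumps can be wrong while the vote remains stable, which is the motivation for combining them.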

Lesson 5 - Introduction to Deep Learning

In this lesson the student learns what an Artificial Neural Network is. We then move on to provide some applications and explain some extensions of the basic idea.

Topics:

  • What is a Neural Network
  • Single Layer NN
  • Gradient Descent Algorithm
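
A single-layer network and gradient descent fit together in a very small example. The sketch below trains one sigmoid neuron with batch gradient descent on the cross-entropy loss; the learning rate, the number of epochs, the function names and the logical-OR data set are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, x):
    """Single neuron: weighted sum of the inputs followed by a sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(X, y, lr=1.0, epochs=5000):
    """Batch gradient descent on the mean cross-entropy loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        # For a sigmoid output with cross-entropy loss, the gradient with
        # respect to the pre-activation is (prediction - target).
        errors = [forward(weights, bias, x) - t for x, t in zip(X, y)]
        grad_w = [sum(e * x[j] for e, x in zip(errors, X)) / n
                  for j in range(len(weights))]
        grad_b = sum(errors) / n
        # Step against the gradient.
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Logical OR as a tiny, linearly separable training set.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 1]
weights, bias = train(X, y)
predictions = [int(forward(weights, bias, x) > 0.5) for x in X]
print(predictions)
```

A single neuron of this kind can only separate classes with a straight line, which is one reason extensions of the basic idea add further layers.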

Readings/Bibliography

  • John C. Hull, Machine Learning in Business, An Introduction to the World of Data Science, Amazon (2019)
  • Paul Wilmott, Machine Learning, An Applied Mathematics Introduction, Panda Ohana Publishing (2019)

Teaching methods

These introductory lessons assume a basic level of statistical and mathematical knowledge. No previous knowledge of Machine Learning is assumed, but a basic knowledge of the Python language is mandatory.

Lessons are based on slides and Jupyter Notebooks, delivered online in advance.

Teaching tools

  • Slides (PowerPoint/PDF)
  • Selected literature
  • Jupyter Notebooks
  • Python code snippets

Office hours

See the website of Giovanni Della Lunga