30711 - RECORD LINKAGE

Academic Year 2021/2022

  • Teacher: Daniela Cocchi
  • Credits: 6
  • Scientific disciplinary sector (SSD): SECS-S/01
  • Language of instruction: English
  • Teaching mode: Traditional, in-person lectures
  • Campus: Bologna
  • Programme: First-cycle degree (Laurea) in Scienze statistiche (code 8873)

Learning outcomes

At the end of the course, the student will know the methods for linking information that refers to the same statistical unit when that information is held in different archives and the unit is not identified by an error-free code. The student will be able to perform exact matching, by means of deterministic and probabilistic record linkage, and to use the basic tools of statistical matching.
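
As an informal illustration of the difference between deterministic and probabilistic linkage, the following minimal Python sketch compares two records. It is not course material: the field names, the m- and u-probabilities and the function names are all hypothetical, and the weight assumes that field agreements are conditionally independent given match status, in the style of the Fellegi-Sunter model.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not estimates):
# m = P(field agrees | the two records refer to the same unit)
# u = P(field agrees | the two records refer to different units)
FIELD_PARAMS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.90, 0.05),
    "city": (0.80, 0.20),
}


def deterministic_match(rec_a, rec_b, fields=FIELD_PARAMS):
    """Deterministic rule: link only if every compared field agrees exactly."""
    return all(rec_a[f] == rec_b[f] for f in fields)


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Probabilistic rule: sum of log2 likelihood ratios over the fields.

    Summing field by field relies on the conditional independence assumption.
    """
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


a = {"surname": "rossi", "birth_year": 1980, "city": "bologna"}
b = {"surname": "rossi", "birth_year": 1980, "city": "modena"}

print(deterministic_match(a, b))  # the records differ on "city"
print(match_weight(a, b))         # weight to compare against chosen thresholds
```

In this sketch the deterministic rule rejects the pair because one field disagrees, whereas the probabilistic weight stays positive and the decision would depend on the thresholds chosen for links, possible links and non-links.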

Course contents

Improving data quality through editing, imputation and record linkage.

The conditions for using a database for statistical purposes.

Data quality properties and how to measure them.

The question of merging lists.

Conditional independence and capture-recapture methods.

Automatic data editing and imputation.

Deterministic (non-random) and probabilistic record linkage.

Blocking techniques.

The problem of duplication.

The problem of disclosure and access to microdata.

Examples in economics, official statistics and health statistics.
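
Two of the topics listed above, blocking and capture-recapture, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names, the blocking key (first letter of the surname) and the sample records are hypothetical, and the population estimate is the classical Lincoln-Petersen formula, which assumes that inclusion in the two lists is independent.

```python
from collections import defaultdict


def block_pairs(list_a, list_b, key):
    """Return only the candidate pairs that share the same blocking key."""
    index = defaultdict(list)
    for rec in list_b:
        index[key(rec)].append(rec)
    return [(ra, rb) for ra in list_a for rb in index.get(key(ra), [])]


def lincoln_petersen(n_a, n_b, n_both):
    """Estimate the population size from two lists and their overlap.

    Assumes independent inclusion in the two lists and error-free linkage.
    """
    if n_both == 0:
        raise ValueError("no overlap between the lists; estimator undefined")
    return n_a * n_b / n_both


# Hypothetical records from two archives.
list_a = [{"surname": "rossi"}, {"surname": "bianchi"}, {"surname": "russo"}]
list_b = [{"surname": "rosi"}, {"surname": "verdi"}, {"surname": "bianco"}]


def first_letter(rec):
    """Hypothetical blocking key: first letter of the surname."""
    return rec["surname"][0]


pairs = block_pairs(list_a, list_b, first_letter)
print(len(pairs), "candidate pairs instead of", len(list_a) * len(list_b))

# Two lists of 200 and 150 units with 50 linked units in common.
print(lincoln_petersen(200, 150, 50))
```

Blocking reduces the number of comparisons at the risk of missing true matches whose blocking key disagrees, which is why the choice of key matters in practice.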

Readings/Bibliography

T. N. Herzog, F. J. Scheuren, W. E. Winkler (2007) Data Quality and Record Linkage Techniques, Springer ISBN 978-0-387-69502-0

Istat (2002) Metodi statistici per il record linkage (edited by M. Scanu)

P. Christen (2012) Data Matching. Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection, Springer ISBN 978-3-642-43001-5

Further bibliographical references will be given during the course.

Teaching methods

Lectures

Assessment methods

The final exam for this module of the course is a written test that also includes theory questions.

The online test is taken via “Esami on Line” (EOL). Zoom is the platform used for identification and monitoring.

A final overall mark for the two modules of the course will be proposed to each student after the exams for BOTH modules (record linkage and databases) have been taken.

Teaching tools

Slides outlining the content of the lectures will be made available.

Office hours

See the website of Daniela Cocchi.

SDGs

Quality education; Industry, innovation and infrastructure; Reduced inequalities; Partnerships for the goals

This course contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.