Course Unit Page
Teacher Daniela Cocchi
Credits 6
SSD SECS-S/01
Teaching Mode Traditional lectures
Language English
Campus of Bologna
Degree Programme First cycle degree programme (L) in Statistical Sciences (cod. 8873)
SDGs
This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.
Academic Year 2021/2022
Learning outcomes
At the end of the course, the student will know the methods for linking information that refers to the same statistical unit when this information is held in different archives and the unit is not identified by an error-free code. The student will be able to perform exact matching by means of deterministic and probabilistic record linkage, and to use the basic tools of statistical matching.
Course contents
Improving data quality through editing, imputation and record linkage.
The conditions for using a database for statistical purposes.
Data quality properties and how to measure them.
The question of merging lists.
Conditional independence and capture-recapture methods.
Automatic data editing and imputation.
Non-random (deterministic) and probabilistic record linkage.
Blocking techniques.
The problem of duplication.
The problem of disclosure and access to microdata.
Examples in economics, official statistics and health statistics.
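As an informal illustration of how blocking and probabilistic record linkage fit together, the following minimal sketch scores candidate pairs with Fellegi-Sunter-style agreement weights. It is not taken from the course material: the toy records, the field names, the m- and u-probabilities in `M_U`, the blocking key and the threshold are all hypothetical values chosen only for the example.

```python
import math
from itertools import product

# Hypothetical toy records from two archives (invented for illustration).
archive_a = [
    {"id": "a1", "surname": "rossi", "birth_year": 1980, "city": "bologna"},
    {"id": "a2", "surname": "bianchi", "birth_year": 1975, "city": "modena"},
]
archive_b = [
    {"id": "b1", "surname": "rossi", "birth_year": 1980, "city": "bologna"},
    {"id": "b2", "surname": "bianchi", "birth_year": 1976, "city": "modena"},
    {"id": "b3", "surname": "verdi", "birth_year": 1980, "city": "bologna"},
]

# Assumed probabilities per field (not estimated from data):
# m = P(fields agree | same unit), u = P(fields agree | different units).
M_U = {"surname": (0.95, 0.01), "birth_year": (0.90, 0.05)}


def candidate_pairs(a, b, block_key):
    """Blocking: keep only pairs that share the value of the blocking key."""
    return [(ra, rb) for ra, rb in product(a, b) if ra[block_key] == rb[block_key]]


def match_weight(ra, rb):
    """Sum of log2 likelihood ratios over fields, assuming conditional independence."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if ra[field] == rb[field]:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def link(a, b, block_key="city", threshold=0.0):
    """Return (id_a, id_b, weight) for candidate pairs scoring above the threshold."""
    scored = [
        (ra["id"], rb["id"], match_weight(ra, rb))
        for ra, rb in candidate_pairs(a, b, block_key)
    ]
    return [(ia, ib, w) for ia, ib, w in scored if w > threshold]


links = link(archive_a, archive_b)
```

With these assumed values, blocking on `city` reduces the six possible pairs to three, and only the pairs whose total weight exceeds the threshold are declared links. In practice the m- and u-probabilities would be estimated from the data and two thresholds would usually be used, leaving a region of uncertain pairs for clerical review.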
Readings/Bibliography
T. N. Herzog, F. J. Scheuren, W. E. Winkler (2007) Data Quality and Record Linkage Techniques, Springer ISBN 978-0-387-69502-0
Istat (2002) Metodi statistici per il record linkage (a cura di M. Scanu)
P. Christen (2012) Data Matching. Concepts and Techniques for Record Linkage, Entity Resolution, and Duplicate Detection, Springer ISBN 978-3-642-43001-5
Further bibliographical references will be given during the course.
Teaching methods
Lectures
Assessment methods
The final exam for this module of the course is a written test that also includes theory questions.
The online test is taken via “Esami on Line” (EOL). Zoom is the platform used for identification and monitoring.
A final overall mark for the two modules of the course will be proposed to each student after the exams for BOTH modules (record linkage and databases).
Teaching tools
Slides outlining the content of the lessons will be available.
Office hours
See the website of Daniela Cocchi