85173 - Analysis Of Categorical Data

Academic Year 2022/2023

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Degree programme: Second cycle degree programme (LM) in Statistical Sciences (code 9222)

Learning outcomes

By the end of the course, the student has acquired knowledge of descriptive and probabilistic methods for the analysis of contingency tables. The student is also able to choose the most suitable method for the multivariate analysis of a given categorical dataset and to interpret the results obtained.

Course contents

Introduction (2 hours)

  • Data matrices and contingency tables.
  • Descriptive and inferential techniques for the analysis of contingency tables.

Probabilistic methods (11 hours)

  • Probability structure for contingency tables.
  • Loglinear models for contingency tables:

- model specification;

- parameter estimation and interpretation;

- goodness-of-fit measures;

- model selection and comparison;

- diagnostics for checking models.
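
As a small illustration of the goodness-of-fit item above, the following sketch computes the expected frequencies of the independence loglinear model for a two-way contingency table, together with the Pearson X² and likelihood-ratio G² statistics. It is written in Python purely for illustration (the practical lessons of the course use R); the function name `independence_fit` and the example table are hypothetical, and the code assumes that every row and column total is positive.

```python
import math


def independence_fit(table):
    """Fit the independence model to a two-way table of counts.

    `table` is a list of rows of non-negative counts (hypothetical input
    format). Returns the expected frequencies, Pearson X^2, the
    likelihood-ratio statistic G^2 and the degrees of freedom.
    Assumes all row and column totals are strictly positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Under independence, the expected count is (row total * column total) / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    pairs = [
        (o, e)
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    ]

    # Pearson statistic: sum of (observed - expected)^2 / expected.
    x2 = sum((o - e) ** 2 / e for o, e in pairs)

    # Likelihood-ratio statistic; cells with a zero count contribute 0.
    g2 = 2 * sum(o * math.log(o / e) for o, e in pairs if o > 0)

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, x2, g2, df


# Made-up 2 x 2 table, used only to show how the function is called.
expected, x2, g2, df = independence_fit([[10, 0], [0, 10]])
```

Both statistics are compared with a chi-squared distribution with `df` degrees of freedom; large values indicate that the independence model fits poorly.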

Descriptive methods (11 hours)

  • Geometric concepts in multidimensional space.
  • Matrix decompositions (spectral, singular value), low-rank matrix approximation and multidimensional analysis.
  • Theory and algebra of simple correspondence analysis.
  • Canonical correlation analysis of contingency tables.
  • Multiple correspondence analysis.

R functions and packages for the analysis of contingency tables (6 hours)

  • Syntax, usage and output of functions and packages available in the R environment for the analysis of contingency tables.
  • Examples of analyses based on the use of such functions and packages.

The reported number of hours is an estimate that takes account of both theoretical and practical lessons. Practical lessons focus on specific topics and are generally scheduled after the lessons devoted to the theoretical treatment of those topics.

Readings/Bibliography

Compulsory readings

  • A. Agresti. Categorical data analysis, Second edition. Hoboken: John Wiley & Sons, 2002. Chapters 1-3, 8-9.
  • M. Greenacre. Theory and applications of correspondence analysis. London: Academic Press, 1984. Chapters 1-5.
  • O. Nenadic, M. Greenacre. Correspondence analysis in R, with two- and three-dimensional graphics: the ca package. Journal of Statistical Software. May 2007, Volume 20, Issue 3.
  • N. Redfern. Correspondence analysis of genre preferences in UK film audiences. Participations: Journal of audience and reception studies. 2012, Vol. 9, No. 2, pp. 45-55.
  • Additional readings concerning topics not included in the recommended textbooks (to be announced during the lessons).

Additional materials useful for the preparation of the exam will be available on the platform "Virtual learning environment" of the University of Bologna (https://virtuale.unibo.it/) for all enrolled students:

  • slides containing a summary of the topics discussed by the teacher during the theoretical lessons;
  • a list of some exercises useful for the preparation of the exam;
  • materials useful for the practical lessons.

In order to have access to the platform "Virtual learning environment", students must use their username and password.

Slides discussed by the teacher during the theoretical lessons will be made available on the platform "Virtual learning environment" on a weekly basis. Because the slides do not include the teacher's oral explanations, they are not self-explanatory; they only indicate which specific topics have been covered during the lessons.

For the preparation of the exam, students are required to make use of all the compulsory readings.

Teaching methods

Theoretical lessons in a lecture hall and practical lessons in a computer laboratory using the R computing environment. R scripts employed during the practical lessons will be made available on the platform "Virtual learning environment" of the University of Bologna (https://virtuale.unibo.it/) for all enrolled students.

Although attending lessons is not mandatory, it is strongly recommended.

Attending lectures is the first and easiest way to start learning, and taking an active part in all teaching activities is crucial and strongly recommended for all students (https://corsi.unibo.it/2cycle/StatisticalSciences/lecture-attendance). Thus, lessons will not be recorded.

Assessment methods

The exam tests the qualifications of each student on both a theoretical and a practical level. For all students (regardless of whether they have attended lectures), the exam is written, lasts two hours and takes place in a classroom. It consists of a given number of exercises, each of which is composed of open-ended questions concerning either the theoretical aspects of the statistical methods or the ability to use methods for data analysis and to interpret results. The latter questions require solving numerical exercises. Consulting textbooks or notes during the written exam is not allowed. A pocket calculator is necessary. The maximum mark for each exercise is fixed in advance and is visible to the student who takes the exam. The maximum marks of all exercises sum to 32. The overall final grade is the sum of the marks obtained in the exercises, expressed on a scale of 30. If this sum is 31 or 32, the overall final grade will be 30/30 cum laude. The minimum passing grade is 18/30. Marks are given on the basis of the completeness, accuracy and appropriateness of the student's responses.
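
The grading rule described above can be sketched as a short function. This is only an illustration, written in Python: the function name `final_grade`, the assumption that marks are whole numbers, and the label "fail" for totals below 18 are hypothetical and are not stated in the official rules.

```python
def final_grade(exercise_marks):
    """Convert per-exercise marks into the overall final grade.

    Assumes integer marks (an assumption made for this sketch) whose
    total lies between 0 and 32, the stated maximum.
    """
    total = sum(exercise_marks)
    if not 0 <= total <= 32:
        raise ValueError("the total of the marks must lie between 0 and 32")

    # A total of 31 or 32 gives the top grade with honours.
    if total >= 31:
        return "30/30 cum laude"

    # The minimum passing grade is 18/30.
    if total >= 18:
        return f"{total}/30"

    # Hypothetical label for a total below the passing threshold.
    return "fail"
```

For example, marks of 10, 10 and 12 in three exercises total 32 and therefore give 30/30 cum laude, while marks of 6, 6 and 6 total 18 and give the minimum passing grade.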

Further useful information about the exams
  • In order to take the exam, students are required to register for it through the Almaesami platform.
  • Exams can only be taken in the official exam sessions.
  • An identity card (or the UNIBO student card) is required to take part in the exam.
  • Students will not be permitted to use any electronic device (mobile phones, smart watches, electronic data storage devices, etc.). If you have your mobile phone with you during an exam, you should turn it off and place it under your chair in the exam venue. Students found with a mobile phone on their person are in breach of University regulations and will not be allowed to finish the exam.
  • Students who wish to withdraw from the exam must do so within the first 60 minutes of the exam.
  • Students who pass the exam but are not satisfied with their final grade are allowed to retake it at least once but no more than twice.

Teaching tools

Explanations are generally given by using the slides available on the platform "Virtual learning environment" of the University of Bologna (https://virtuale.unibo.it/) for all enrolled students.

As concerns the teaching methods of this course unit, all students who attend the planned in-person practical activities in the computer lab must also complete Modules 1 and 2 of an online training course on Health and Safety in study and research areas (https://www.unibo.it/en/services-and-opportunities/health-and-assistance/health-and-safety/online-course-on-health-and-safety-in-study-and-internship-areas). Please note that the general training (Module 1) and the specific training (Module 2) of this course are compulsory for all students who, for study purposes, work in IT laboratories and other workplaces that present specific risks.

Office hours

See the website of Gabriele Soffritti

SDGs

Quality education

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.