85579 - Laboratory (1) (LM) (G.A)

Academic Year 2023/2024

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Course: Second cycle degree programme (LM) in Digital Humanities and Digital Knowledge (cod. 9224)

Learning outcomes

The laboratory trains students in practical skills and allows them to apply specialized techniques to the management of disciplinary content. Laboratories deal with themes related to one or more of the following areas of learning: computer science; literary, linguistic, historical/cultural and arts-related subjects in the digital context; transversal areas (economics, law and communication).

Course contents

Applied Data Analysis (ADA)

This laboratory offers an introduction to data analysis techniques of practical use in the Humanities and GLAM sector. Topics include: data cleaning and wrangling, the Python data analysis stack (Pandas), how to get from messy to tidy data, basics of data analysis and visualization, advanced topics (modelling) and applications (geo mapping, network analysis), best practices to communicate and share your results (licensing, repositories).

Learning objectives:

  • Learn how to use the main Python libraries for data wrangling and analysis to perform a variety of practical tasks (e.g. exploration, reporting, visualization).
  • Apply the main data analysis tools and techniques in dealing with cultural data.
  • Critically understand the added value and limitations of data analysis from a humanities perspective.

Course overview

Week 1: Introduction to ADA

  • Introduction to the course
  • Example of a data analysis application
  • Introduction to Pandas: data types, data loading, data access
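
As an illustration of the Pandas topics listed for Week 1, the following is a minimal sketch and not course material. The catalogue data, the column names and the variable names are all hypothetical, chosen only to show data loading, data types and data access in a few lines.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a small museum catalogue export.
CSV_TEXT = """object_id,title,year,city
1,Portrait of a Lady,1650,Bologna
2,Landscape with River,1720,Venice
3,Still Life,,Bologna
"""

# Data loading: read_csv accepts a file path or any file-like object.
catalogue = pd.read_csv(io.StringIO(CSV_TEXT))

# Data types: the missing year makes pandas store the column as float64.
column_types = catalogue.dtypes

# Data access: by label (.loc), by position (.iloc) and by boolean mask.
first_title = catalogue.loc[0, "title"]
last_city = catalogue.iloc[-1]["city"]
bologna_items = catalogue[catalogue["city"] == "Bologna"]
```

In a notebook, inspecting `column_types` is a quick way to spot columns that were not parsed as expected, such as the `year` column here.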

Week 2: Tidy data

  • Basic concepts of tidy data modelling
  • Data manipulation and wrangling with Pandas
  • Operations on tidy data frames (set/union/join, select/apply/transform)
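
The Week 2 operations can be sketched as follows, assuming a hypothetical visitor-count table with one column per year. The museum names, cities and figures are invented for illustration; the sketch reshapes the table into tidy form and then applies a join, a transform and a selection.

```python
import pandas as pd

# Hypothetical "messy" table: one column per year (wide format).
visitors_wide = pd.DataFrame({
    "museum": ["A", "B"],
    "2021": [100, 250],
    "2022": [150, 300],
})

# Tidy form: one row per observation (museum, year), one column per variable.
visitors = visitors_wide.melt(
    id_vars="museum", var_name="year", value_name="visitors"
)
visitors["year"] = visitors["year"].astype(int)

# Join: attach a city to every observation.
cities = pd.DataFrame({"museum": ["A", "B"], "city": ["Bologna", "Ravenna"]})
tidy = visitors.merge(cities, on="museum", how="left")

# Transform: compute each museum's mean and broadcast it back to its rows.
tidy["mean_visitors"] = tidy.groupby("museum")["visitors"].transform("mean")

# Select: keep only the rows for a single year.
tidy_2022 = tidy[tidy["year"] == 2022]
```

Once the data is tidy, grouping, joining and filtering all operate on the same row-per-observation layout, which is the main practical argument for reshaping first.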

Week 3: Exploratory data analysis

  • Basic plotting
  • Descriptive statistics
  • Variation, distributions
  • Descriptive statistics and plotting with Pandas, matplotlib and seaborn
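
A minimal sketch of descriptive statistics with Pandas, assuming a hypothetical table of book page counts (publishers and values are invented). It computes an overall summary and a per-group summary, which is a common first step before plotting.

```python
import pandas as pd

# Hypothetical data: page counts of books from two publishers.
books = pd.DataFrame({
    "publisher": ["X", "X", "X", "Y", "Y", "Y"],
    "pages": [120, 200, 340, 90, 100, 110],
})

# Overall summary: count, mean, std, min, quartiles and max.
summary = books["pages"].describe()

# Variation within groups: one row of statistics per publisher.
by_publisher = books.groupby("publisher")["pages"].agg(["mean", "std", "median"])
```

With matplotlib installed, `books["pages"].plot.hist()` would draw the corresponding distribution; seaborn offers higher-level alternatives for the same task.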

Week 4: Applied Data Analysis

  • Exploratory data visualization
  • Primer on good and bad data visualization practices
  • Data analysis: geo mapping and (social) network analysis
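
As a small illustration of the network analysis topic, the sketch below computes node degrees from an edge list using Pandas only. The correspondence network and the names in it are hypothetical, and a dedicated graph library such as NetworkX would be a more typical choice for anything beyond this basic measure.

```python
import pandas as pd

# Hypothetical correspondence network: each row is one exchange between two people.
edges = pd.DataFrame({
    "source": ["Ada", "Ada", "Bruno", "Carla"],
    "target": ["Bruno", "Carla", "Carla", "Dario"],
})

# Degree in an undirected graph: count how often a node appears at either end of an edge.
degree = pd.concat([edges["source"], edges["target"]]).value_counts()

# The node with the highest degree is the most connected correspondent.
most_connected = degree.idxmax()
```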

Week 5: Publishing code/datasets & communicating results

  • Constructing datasets for research
  • Communicating data analysis results
  • Best practices about publishing datasets, licensing issues, reproducibility, data repositories

Readings/Bibliography

General references

Requirements

Due to the practical nature of this course, familiarity with the main concepts of Python programming is required (e.g. main data types, `for` loops, `if/else` statements, using and writing functions).

Teaching methods

The course uses a mix of lectures and interactive Jupyter notebooks, and it gives ample space to hands-on practice and exercises.

Assessment methods

The final exam consists of a presentation of an original project. The students suggest the project, which can consist of any combination of ADA pipeline elements: data cleaning, exploratory data analysis, modelling, static/interactive visualizations. Projects can be completed on an existing dataset or on a new one (made ad hoc or previously unpublished).

The project guidelines will be shared at the beginning of the course. Students will be asked to work in small groups (2-3 max.). Individual projects are allowed when duly justified. Students are strongly encouraged to work on their projects throughout the course and to present their results shortly after it ends, during a dedicated session.

The personal contribution of each group member will be assessed during an individual oral colloquium at a regular exam session. The oral colloquium covers both the project and the course contents. The project and the oral examination each contribute 50% of the final grade. Further information about the assessment is provided during the first class.

Teaching tools

Classes are held in a classroom equipped with personal computers connected to the Intranet and Internet.

A channel in a free messaging application (e.g. Slack) will be set up to facilitate communication among students and with the professor.

Office hours

See the website of Matteo Romanello