- Teacher: Matteo Romanello
- Credits: 6
- Teaching language: English
- Teaching mode: Conventional - In-person lectures
- Campus: Bologna
- Degree programme: Second cycle degree (Laurea Magistrale) in Digital Humanities and Digital Knowledge (code 9224)
- From 20/09/2023 to 26/10/2023
Learning outcomes
The laboratory trains students in practical skills, allowing them to apply specialized techniques to the management of disciplinary content. Laboratories address themes from one or more of the following learning areas: computer science; literary, linguistic, historical-cultural and arts-related subjects in the digital context; and transversal areas (economics, law and communication).
Course contents
Applied Data Analysis (ADA)
This laboratory offers an introduction to data analysis techniques of practical use in the Humanities and the GLAM sector. Topics include: data cleaning and wrangling, the Python data analysis stack (Pandas), how to get from messy to tidy data, basics of data analysis and visualization, advanced topics (modelling) and applications (geo mapping, network analysis), and best practices for communicating and sharing your results (licensing, repositories).
Learning objectives:
- Learn how to use the main Python libraries for data wrangling and analysis to perform a variety of practical tasks (e.g. exploration, reporting, visualization).
- Apply the main data analysis tools and techniques in dealing with cultural data.
- Critically assess the added value and limitations of data analysis from a humanities perspective.
Course overview
Week 1: Introduction to ADA
- Introduction to the course
- Example of a data analysis application
- Introduction to Pandas: data types, data loading, data access
Week 2: Tidy data
- Basic concepts of tidy data modelling
- Data manipulation and wrangling with Pandas
- Operations on tidy data frames (set/union/join, select/apply/transform)
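The Week 2 operations can be illustrated with a minimal sketch. It assumes a hypothetical table of museum visitor counts (the museum names, cities and figures are invented for illustration and are not course material) and shows one way to reshape a wide table into tidy form, join it with a lookup table, and apply a group-wise transform in Pandas.

```python
import pandas as pd

# Hypothetical "messy" table: one row per museum, one column per year.
messy = pd.DataFrame(
    {
        "museum": ["A", "B"],
        "2021": [120, 80],
        "2022": [150, 95],
    }
)

# Reshape to tidy form: one row per (museum, year) observation.
tidy = messy.melt(id_vars="museum", var_name="year", value_name="visitors")
tidy["year"] = tidy["year"].astype(int)

# Hypothetical lookup table, joined onto the tidy frame (left join on "museum").
cities = pd.DataFrame({"museum": ["A", "B"], "city": ["Bologna", "Ravenna"]})
joined = tidy.merge(cities, on="museum", how="left")

# Group-wise transform: each observation's share of its museum's total visitors.
totals = joined.groupby("museum")["visitors"].transform("sum")
joined["share"] = joined["visitors"] / totals

print(joined)
```

In the tidy layout each variable (`museum`, `year`, `visitors`) has its own column, which is what lets the join and the group-wise transform be written as single calls.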
Week 3: Exploratory data analysis
- Basic plotting
- Descriptive statistics
- Variation, distributions
- Descriptive statistics and plotting with Pandas, matplotlib and seaborn
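As a rough illustration of the Week 3 topics, the sketch below computes descriptive statistics with Pandas on a small, invented table of letters (the `author` and `word_count` columns and their values are assumptions made for this example, not course data).

```python
import pandas as pd

# Hypothetical dataset: word counts of letters by two authors.
letters = pd.DataFrame(
    {
        "author": ["X", "X", "Y", "Y", "Y"],
        "word_count": [200, 400, 100, 300, 500],
    }
)

# Overall summary: count, mean, std, min, quartiles, max.
summary = letters["word_count"].describe()

# Per-author statistics, to compare how the values are distributed.
per_author = letters.groupby("author")["word_count"].agg(["count", "mean", "median"])

print(summary)
print(per_author)
```

With matplotlib installed, a call such as `per_author["mean"].plot(kind="bar")` would turn the same table into a basic bar chart.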
Week 4: Applied Data Analysis
- Exploratory data visualization
- Primer on good and bad data visualization practices
- Data analysis: geo mapping and (social) network analysis
Week 5: Publishing code/datasets & communicating results
- Constructing datasets for research
- Communicating data analysis results
- Best practices for publishing datasets: licensing issues, reproducibility, data repositories
Readings/Bibliography
General references
- F. Karsdorp, M. Kestemont and A. Riddell, Humanities Data Analysis: Case Studies with Python, Princeton University Press (2021)
- P. Juola and S. Ramsay, Six Septembers: Mathematics for the Humanist (especially ch. 2)
- M. Walsh, Introduction to Cultural Analytics & Python, Version 1 (2021)
- W.J.B. Mattingly, Introduction to Pandas in Python (2021, work in progress)
Requirements
Due to the practical nature of this course, familiarity with the main concepts of Python programming is required (e.g. main data types, `for` loops, `if/else` statements, using and writing functions).
Teaching methods
The course uses a mix of lectures and interactive Jupyter notebooks, and it gives ample space to hands-on practice and exercises.
Assessment methods
The final exam consists of a presentation of an original project. Students propose the project, which can consist of any combination of ADA pipeline elements: data cleaning, exploratory data analysis, modelling, static/interactive visualizations. Projects can be carried out on an existing dataset or on a new one (created ad hoc or previously unpublished).
The project guidelines will be shared at the beginning of the course. Students will be asked to work in small groups (2-3 members max.). Individual projects are allowed if justified. Students are strongly encouraged to work on their projects throughout the course and to present their results shortly after it ends, during a dedicated session.
The personal contribution of each group member will be assessed during an individual oral colloquium at a regular exam session. The oral colloquium covers both the project and the course contents. The project and the oral examination each contribute 50% of the final grade. Further information about the assessment will be provided during the first class.
Teaching tools
Classes are held in a classroom equipped with personal computers connected to the Intranet and Internet.
A channel in a free messaging application (e.g. Slack) will be set up to facilitate communication among students and with the professor.
Office hours
See the website of Matteo Romanello