87780 - SEM. AN INTRODUCTION TO WEB SCRAPING USING PYTHON

Academic Year 2017/2018

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Course: Second cycle degree programme (LM) in Economics and Economic Policy (code 8420)

Learning outcomes

Students will be able to extract information and data from the web.

Course contents

This seminar introduces the main concepts of the Python programming language and discusses tools for extracting, collecting, and analysing data contained in Web pages. The first part presents the Web architecture and the main Web technologies (e.g. HTML), together with the standard means of navigating the structure of a page: the DOM, which models a page as a tree of nodes, and XPath, a language for selecting nodes in that tree. The second part covers Scrapy, a Python framework for scraping data from websites, and discusses real-world scenarios.
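
As an illustration of the XPath idea mentioned above, here is a minimal sketch that selects nodes from a small page using only the Python standard library. The page fragment, the `extract_links` helper, and the class names are hypothetical and are not taken from the course material. Note that `xml.etree.ElementTree` supports only a limited subset of XPath and requires well-formed markup, so real pages usually call for a dedicated HTML parser or a framework such as Scrapy.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment (an assumption for illustration only).
PAGE = """
<html>
  <body>
    <h1>Courses</h1>
    <ul id="courses">
      <li class="course"><a href="/c/1">Web Scraping</a></li>
      <li class="course"><a href="/c/2">Econometrics</a></li>
      <li class="note">Updated weekly</li>
    </ul>
  </body>
</html>
"""


def extract_links(markup):
    """Return (link text, href) pairs for every <a> inside <li class="course">."""
    # Parse the markup into a tree of elements, similar in spirit to the DOM.
    root = ET.fromstring(markup)
    links = []
    # XPath-style query: any <li> with class "course", then its direct <a> child.
    for anchor in root.findall(".//li[@class='course']/a"):
        links.append((anchor.text, anchor.get("href")))
    return links


if __name__ == "__main__":
    for title, href in extract_links(PAGE):
        print(title, href)
```

The query skips the `note` item because its `class` attribute does not match the predicate, which is the kind of filtering that makes XPath convenient for pulling specific data out of a page.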

Readings/Bibliography

Study material will be provided by the professors. Students will need to download and install Python.

Teaching methods

Lectures, with exercises both in class and at home.

Assessment methods

Final exam in class.

Teaching tools

Slides.

Office hours

See the website of Laura Bottazzi