77968 - Big Data and Cybersecurity

Academic Year 2017/2018

Learning outcomes

By the end of the course, students will be able to: 1) Understand the foundations of big data, including its basis in computing technology and statistics. 2) Understand the social implications of the increased knowledge, surveillance, and behavioral prediction made possible by big data, and the ethical tradeoffs involved. 3) Demonstrate the ability to formulate specific study questions concerning cybersecurity. 4) Understand accepted tools and practices concerning cyberterrorism and cyberwarfare. 5) Demonstrate the ability to communicate complex concepts to multidisciplinary teams, including students from computing and international affairs backgrounds. 6) Be familiar with text mining techniques.

Course contents

The course is devoted to the study of methods, techniques and tools for big data analysis. In particular, basic elements of data mining and text mining will be discussed, as well as graph theory and social network analysis (SNA). Students will be introduced, through examples, to the use of KNIME and RapidMiner for the analysis of big data and of Gephi for the analysis of networks. Relevant databases, such as GDELT, will also be introduced.

A monographic section will focus on the application of big data in the field of cybersecurity. The technical side of this field, which includes web logs for instance, will be touched on only superficially. The focus of this section will instead be the "social" side, made up of e-mails, Tweets, and other texts, which is potentially huge and is relevant for counter-terrorism, law enforcement and cyberdefense purposes.

The course requires previous knowledge of basic statistical methods and techniques for performing univariate and bivariate statistical analyses.

Readings/Bibliography

- Billari F., D’Amuri F. & Marcucci J. (2013) Forecasting births using Google. Paper presented at PAA Annual Meeting, New Orleans.

- Burrows, R. & Savage, M. (2014) “After the crisis? Big data and the methodological challenges of empirical sociology”, Big Data & Society, April-June: 1-6.

- Curini, L., Iacus, S. & Canova, L. (2015) “Measuring idiosyncratic happiness through the analysis of Twitter: An application to the Italian case”, Social Indicators Research, 121(2): 525-542.

- Easley D. & Kleinberg J. (2010) Networks, Crowds, and Markets: Reasoning about a Highly Connected World, Cambridge: Cambridge University Press (chapters 1, 2, 3, 4, 13, 14, 20, 21). [Available online https://www.cs.cornell.edu/home/kleinber/networks-book/networks-book.pdf]

- Giacomello G. (2014) “Introduction: Security in Cyberspace”, in G. Giacomello (ed.) Security in Cyberspace: Targeting Nations, Infrastructures, Individuals, New York: Bloomsbury, pp. 1-19.

- Hanneman, R. A. & Riddle M. (2005) Introduction to Social Network Methods. Riverside, CA: University of California, Riverside (chapters 1, 2, 3, 5, 6, 7, 10, 11). [Available online http://faculty.ucr.edu/~hanneman/]

- Ignatow G. & Mihalcea R. (2016) Text Mining: A Guidebook for the Social Sciences. Los Angeles: Sage.

- Martin K. (2014) Graph Theory and Social Networks. Spring 2014 lecture notes (Introduction & chapters 1, 2). [Available online http://www2.math.ou.edu/~kmartin/graphs/graphs.pdf]

- Leventhal B. (2010) “An introduction to data mining and other techniques for advanced analytics”, Journal of Direct, Data and Digital Marketing Practice, 12(2): 137-153.

- Mayer-Schönberger V. & Cukier K. (2014) Big Data: A Revolution That Will Transform How We Live, Work, and Think, Eamon Dolan/Mariner Books.

- Pentland A. (2012) Reinventing society in the wake of big data. Edge, 30 August 2012. [Available online https://edge.org/conversation/alex_sandy_pentland-reinventing-society-in-the-wake-of-big-data]

- Singh V.K., Freeman L., Lepri, B. & Pentland A. (2013) “Classifying spending behavior using socio-mobile data”, Human 2(2): 99-111.

- State B., Rodriguez M., Helbing D. & Zagheni E. (2014) “Migration of professionals to the US: Evidence from LinkedIn data”, Proceedings of SocInfo 2014, Lecture Notes in Computer Science, Springer: 531-543.

- Sudhahar, S., Veltri G.A. & Cristianini N. (2015) “Automated analysis of US presidential elections using big data and network analysis”, Big Data & Society, January-June: 1-28.

Depending on the students' interests, the instructors may indicate additional readings during the course.

Teaching methods

Lectures and lab exercises with KNIME, Gephi and other software for big data analysis.

Assessment methods

For students attending at least 80% of face-to-face lessons:

50%: one final written exam (exam sessions in December 2017 and January 2018 only).

50%: one course paper (maximum 4,000 words) on a topic to be agreed in advance with the instructors. The paper should be based on original empirical analyses using the software introduced during the course. The code/program used for the analyses will also be evaluated. The paper must be sent by e-mail, in PDF format, by 31 January 2018.

Students who fail to pass the exam in the December or January sessions, or who fail to meet the deadline for delivering the final paper, will have to take the exam as non-attending students.

For students who attend less than 80% of classes, or who fail to pass the exam or to deliver the final paper by the end of January 2018:

A written exam on the topics covered by the compulsory readings and a practical data analysis/information exercise using KNIME or Gephi.

Teaching tools

This course is also supported by a dedicated e-learning module available at https://elearning-cds.unibo.it/

Students are required to register on the e-learning platform.

Office hours

See the website of Marco Albertini

See the website of Giampiero Giacomello