84531 - Infrastructures For Cloud Computing and Big Data M

Academic Year 2021/2022

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Course: Second cycle degree programme (LM) in Computer Engineering (cod. 5826)

Learning outcomes

The course aims to develop the ability to define strategies for a distributed system and to apply them in different related applications. Students study the principles and the main problems of large distributed systems and are exposed to some standard and widely used solutions, both in class and through individual work. By the end, students are expected to know the properties of the most widespread middleware and the evolution that can be expected from that technology, and to master the properties needed to design a real application; the most widespread strategies are presented and discussed.

Course contents

The course covers several topics central to modern global data and processing infrastructures, such as Data Centers, MultiCloud Systems, and Federation of resources, which typically support Industry 5.0 and Smart City applications:

  • Advanced models for large distributed & cloud systems, from C/S to message exchange.
  • Replication, group and many-to-many communication, and systems for QoS
  • Middleware for development and management of large distributed & cloud systems
  • Infrastructures for global data storage and processing

The class explores the following topics:

Advanced models for large distributed & cloud systems

  • Course introduction: general information and presentation of the course (use cases)
  • Goals, Basics, and Models: classifications, C/S vs. Message exchange, service and cloud models, parallelization models
  • Middleware & Cloud Models: definitions, categories, basic organization, and patterns for large distributed and cloud systems, Cloud internals design.

Replication, group and many-to-many communication, and systems for QoS

  • Different degrees of consistency and their impact on service properties (BASE and CAP)
  • Replication: models, strategies and protocols
  • Communication and groups: models, protocols and algorithms
  • Systems and protocols for QoS
  • Multicast and MOM middleware

Middleware for large distributed & cloud systems

  • CORBA: middleware and operating environment
  • MOM: examples of very thin environments
  • OpenStack: an example of a widely adopted cloud IaaS

Novel infrastructures for global data storage and processing

  • Global data storage: non-traditional NoSQL solutions for data storage (Cassandra and MongoDB)
  • Global data processing: batch and stream-based big data processing (MapReduce, Spark, Storm, and S4)
  • Main properties for effective design projects

Readings/Bibliography

G. Coulouris, J. Dollimore, T. Kindberg, "Distributed Systems: Concepts and Design", Addison-Wesley, (fifth edition) 2012.

M. Kleppmann, "Designing Data-Intensive Applications", O'Reilly, 2017.

A.S. Tanenbaum, M. v. Steen, "Distributed Systems: Principles and Paradigms", Prentice-Hall, (second edition) 2006 / 2013.

B. Forouzan, F. Mosharraf: “Computer Networks, a top down approach”, McGraw-Hill, 2011.

M.L. Liu, "Distributed Computing", Addison-Wesley, 2003.

D.L. Galli: "Distributed Operating Systems: Concepts and Practice", Prentice-Hall, 2000.

L. Peterson, B. Davie, "Computer Networks, A Systems Approach", (fifth edition) Morgan Kaufmann Series in Networking, 2011.

V.K. Garg, “Elements of Distributed Computing”, Wiley, 2002.

L. Carlson, “Programming for PaaS, A Practical Guide to Coding for Platform-as-a-Service”, O'Reilly, 2013.

J. Siegel, “Pure CORBA: a code-intensive reference”, (second edition), SAMS Publishing, 2002.

F. Halsall, “Multimedia Communications”, Addison-Wesley, 2001.

D.A. Chappell, T. Jewell, “Java Web Services”, O'Reilly, 2002.

E. Newcomer, “Understanding Web Services”, Addison-Wesley, 2002.

T. Erl et al., “Cloud computing : concepts, technology, & architecture”, Prentice Hall, 2013.

B. Wilder, “Cloud architecture patterns”, O'Reilly, 2013.

A. T. Velte et al., “Cloud computing: a practical approach”, McGraw-Hill, 2010.

J. Rhoton, “Cloud computing explained”, Recursive Press, 2009.

T. Fifield et al., “Openstack operations guide: set up and manage your OpenStack cloud”, O'Reilly, 2014.

S. Holla, “Orchestrating Docker”, Packt Publishing, 2015.

O. Hane, “Build your own PaaS with Docker”, Packt Publishing, 2015.

T.D. Nadeau and K. Gray, “SDN: software defined networks”, O'Reilly, 2013.

T. White, “Hadoop: the definitive guide”, O'Reilly, 2012.

E. Sammer, “Hadoop operations”, O'Reilly, 2012.

K. Rankin, “DevOps troubleshooting”, Addison-Wesley, 2013.

D. Sui et al., “Crowdsourcing geographic knowledge”, Springer, 2013.

Z. Yan et al., “Semantics in mobile sensing”, Morgan & Claypool, 2014.

R. Copeland, “MongoDB applied design patterns”, O'Reilly, 2013.

J. Carpenter, “Cassandra: The Definitive Guide”, 3rd Edition, O'Reilly, 2020.

S. Gai, “Building a Future-Proof Cloud Infrastructure: A Unified Architecture for Network, Security, and Storage Services”, Addison-Wesley, 2020.

B. Gregg, “Systems Performance”, 2nd Edition, Pearson, 2020.

Teaching methods

This course gives importance and leaves room to discussion, in order to develop a critical attitude toward projects: students are encouraged to engage with the topics through classroom interaction and the exchange of perspectives. The main objective is to build a critical understanding of the properties of middleware design and to acquire the ability to orient oneself in analyzing systems and in identifying the properties and trends of global systems.

The course also reserves space for seminars and experience reports on issues relevant to the course topics, usually given by people from outside academia who represent industrial stakeholders, so as to expose students to a 'work' environment.
These seminars are important for the professional background of computer engineering students, offering reflections and insights that support their professional growth.

Assessment methods

The final exam consists of an in-depth oral examination on the most important and basic issues in the distributed systems area; the evaluation depends on the understanding of the issues and on the ability to orient oneself among them more than on memorization.
Again, the objective of the whole course is to develop the ability to orient oneself and to identify critical problems and issues on the way to solutions; that skill is much more important than memorizing specific paths and answers.


Students may also agree on a project assignment on some of these issues, an option recommended for students in the Distributed Systems track.


Teaching tools

All the contents are presented in class and via compulsory discussions. The slides on the course website are the main starting point for studying the topics.

Students are encouraged to discuss the topics and to develop ideas and opinions on the main current areas of debate. Students can explore in depth the areas they intend to focus on.

Course website: http://www.middleware.unibo.it/?page_id=56 (within http://www.middleware.unibo.it/)

Some classes are hands-on lab sessions on specific topics, providing the practical knowledge needed in some areas; these sessions are given by industry partners.

Students are referred to papers and articles from which they can choose, in order to study selected topics in depth individually.

Links to further information

http://www.middleware.unibo.it/?page_id=56

Office hours

See the website of Antonio Corradi

SDGs

  • Quality education
  • Gender equality
  • Industry, innovation and infrastructure
  • Sustainable cities

This teaching activity contributes to the achievement of the Sustainable Development Goals of the UN 2030 Agenda.