87945 - SOFTWARE AND COMPUTING FOR NUCLEAR AND SUBNUCLEAR PHYSICS

Course Unit Page

Academic Year 2020/2021

Learning outcomes

By the end of the course, the student will have learned the basic concepts of programming and modern scientific computing as they are currently used in several fields of physics. He/she will understand the major software development techniques and strategies, as well as the main computational frameworks, databases, and approaches to data collection and maintenance. The student will also be able to solve advanced problems in scientific software design for nuclear and subnuclear physics, which will be developed as small group projects.

Course contents

MODULE 1: Styles and standards of programming. Software development and design. Techniques and paradigms of programming. Programming languages. Interpreted and compiled languages. Object Oriented Programming. Functional Programming. Strengths, weaknesses, domains of application and methodologies of various programming languages commonly used in modern physics. Python as an example, including its use in multi-language environments. How to write, debug, document, share and maintain a software project (software versioning, testing, life-cycle modeling, software maintenance, selected software engineering methods and tools). Computational infrastructures and resources. Relational Databases. Data management: formats, duplication, manipulation and reduction. File systems. From single machines to medium-large computing farms. HTC distributed computing and grid computing. Cloud. Vectorization and parallelization, GPU use and general concepts of HPC. General concepts of: Big data, analytics techniques, machine learning, deep learning, artificial intelligence, computational algebra systems and applications to deep learning.

MODULE 2: Extension of the topics covered only as "general concepts" in Module 1, with the main focus on the needs of the nuclear and subnuclear sector, mainly Grid Computing and Cloud Computing. Scientific computing models: components and interactions. Data management. Storage solutions in HEP. Workflow management. The challenge of efficient job scheduling and resource matching/execution in HEP. High-performance networking. Monitoring. Accounting. Security. How to design a computing model for a next-generation HEP experiment starting from the technical parameters of the experiment. How to operate a computing model successfully and how to evolve it over time. Handling of Big Data in HEP. Data-driven paradigms. Extension of the topics covered only as "general concepts" in Module 1, regarding real-world applications of Machine Learning, Deep Learning and Artificial Intelligence in HEP.

Readings/Bibliography

Reading material is available in a public repository, together with free online resources for exploring specific topics among those presented in class.

Teaching methods

The course will consist of lectures with active participation from the students. During classes the students will be guided in the implementation and practice of the concepts discussed. Optional seminars will be organized to focus on specific topics of interest.

Assessment methods

Students will be evaluated on a programming project. The project must be hosted on a public repository. The student is free to choose the programming language of the project. Accepted version control systems are git and fossil. The evaluation criteria are the following: clarity of the repository commit history (6 points); clarity and completeness of the documentation and source code (12 points); presence and executability of test routines (12 points). Optionally, the evaluation can be improved on the basis of the following: use of innovative technologies and libraries (up to 3 points); contributions to open source projects (up to 6 points).
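As an illustration of what "test routines" might look like in a project of this kind, here is a minimal sketch in Python. It is only an assumption about a possible project, not a required format: the function `invariant_mass` and the test names are hypothetical, and the tests use plain `assert` statements so that they can be run directly or collected by a test runner such as pytest.

```python
import math


def invariant_mass(e1, p1, e2, p2):
    """Invariant mass of a two-particle system in natural units (c = 1).

    Hypothetical example function: e1, e2 are energies and p1, p2 are
    3-momenta given as (px, py, pz) tuples.
    """
    e = e1 + e2
    px, py, pz = (a + b for a, b in zip(p1, p2))
    m2 = e * e - (px * px + py * py + pz * pz)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))


def test_back_to_back_photons():
    # Two photons of energy 1 with opposite momenta: total momentum is zero,
    # so the invariant mass equals the total energy.
    m = invariant_mass(1.0, (0.0, 0.0, 1.0), 1.0, (0.0, 0.0, -1.0))
    assert math.isclose(m, 2.0)


def test_collinear_photons_are_massless():
    # Two photons moving in the same direction form a massless system.
    m = invariant_mass(1.0, (0.0, 0.0, 1.0), 1.0, (0.0, 0.0, 1.0))
    assert math.isclose(m, 0.0, abs_tol=1e-12)


if __name__ == "__main__":
    test_back_to_back_photons()
    test_collinear_photons_are_massless()
```

Each test checks one physically meaningful case with a known answer, which keeps failures easy to interpret.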

Teaching tools

Course notes, available on public repositories, and open source projects used as teaching examples.

Office hours

See the website of Daniele Bonacorsi

See the website of Enrico Giampieri

See the website of Lorenzo Rinaldi