69443 - Algorithms and Data Structures for Computational Biology

Academic Year 2023/2024

  • Teaching Mode: Traditional lectures
  • Campus: Bologna
  • Corso: Second cycle degree programme (LM) in Bioinformatics (cod. 8020)

Learning outcomes

At the end of the course, the student has the basic knowledge in design and analysis of correct and efficient algorithms and data structures. In particular, the student has basic knowledge on algorithms and data structures. The student will be able to: design correct and efficient algorithm for common computational tasks; analyze existing algorithms and data structures; design and analyze new algorithms and data structures.

Course contents

The course is divided into two main parts, each covering essential topics in algorithms and data structures.

Part 1: Foundations of Algorithms and Data Structures

  • Introduction to Algorithmic Complexity Analysis: explore the concept of computational complexity, including asymptotic notation.
  • Sorting Algorithms: dive into various sorting techniques such as insertion sort, merge sort, quick sort, lower bounds of comparison-based sorting. Additionally, learn about sorting in linear time using counting sort and radix sort.
  • Elementary Data Structures: gain a comprehensive understanding of fundamental data structures like arrays, linked lists, stacks, and queues.
  • Advanced Data Structures: delve into more complex data structures such as binary trees, trees, hash tables, and graphs.
  • In-depth Analysis of Python Built-in Data Structures: understand the inner workings of Python's built-in data structures.
  • Algorithmic Techniques: explore essential algorithmic paradigms, including brute force, divide-and-conquer, dynamic programming, and greedy algorithms.
  • Handling Hard Problems: study algorithmic approaches for challenging problems, such as approximation algorithms, branch-and-bound, and heuristics.

Part 2: Algorithms and Data Structures for Computational Biology

  • Exact and Approximate String Matching: learn about techniques for exact string matching strings and approximate string matching.
  • Sequence Alignment: dive into local and global pairwise alignments, as well as multiple sequence alignments.
  • Suffix Data Structures: understand suffix trees, suffix tries, and suffix arrays, which play a crucial role in string processing.
  • Burrows-Wheeler Transform: explore the Burrows-Wheeler transform, a reversible string transformation technique with various applications.
  • Sequence Assembly: study overlap graphs and the greedy algorithm for solving the shortest common supersequence problem in sequence assembly.

Throughout the course, algorithms will be presented using pseudocode, and practical Python implementations will be provided as well.

Readings/Bibliography

It is fundamental to use the electronic slides and material provided during the classes, including live exercises. The slides will be made available on the course web page.

The readings below are suggested, but need not to be necessarily purchased.

Books more specific on computational biology topics:

- Carlos Setubal, Joao Meidanis, Introduction to computational molecular biology, Cengage Learning eds., 1997;
- C. Gibas, P. Jambeck, Developing Bioinformatics Computer Skills, O'Reilly media, 2001

Books more general on computer algorithms and data structures topics:

- Alfred V. Aho, Jeffrey D. Ullman, John E. Hopcroft. Data Structures and Algorithms. Addison Wesley, 1983.
- Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms. McGraw-Hill, 2001.
- Donald E. Knuth. The Art of Computer Programming, Volumes 1-3. Addison-Wesley Professional, 1998.
- S.B. Kishor Data Structures, Edition 3. Das Ganu Prakashan, Nagpur, 2008.

Teaching methods

The teaching methods consist of frontal lectures (supported by slides and whiteboard) and the discussion of exercises to be solved on the paper, using the student laptops or the whiteboard.


Assessment methods

The final exam is composed of a written part (3 hours) and possibly an oral part (30 minutes). During the written part, it is NOT allowed the use of books and notes. The teacher may require a further orals exam if the student’s score in the written part is close to 18/30.

Teaching tools

Electronic slides and projector.
Personal computer.
White board.

Office hours

See the website of Pietro Di Lena