# 69443 - Algorithms and Data Structures for Computational Biology

### Course Unit Page

• Teacher Pietro Di Lena

• Credits 10

• SSD INF/01

• Language English

• Campus of Bologna

• Degree Programme Second cycle degree programme (LM) in Bioinformatics (cod. 8020)

## Learning outcomes

At the end of the course, the student has the basic Knoledge in design and analysis of correct and efficient algorithms and data structures. In particular, the student has basic knowledge on algorithms and data structures. The student will be able to: design correct and efficient algorithm for common computational tasks; analyze existing algorithms and data structures; design and analyze new algorithms and data structures.

## Course contents

Algorithm design: incremental algorithm design, divide-and-conquer algorithm design, greedy algorithm design, dynamic programming algorithm design

Algorithm analysis: runtime analysis, asymptotic behaviour of algorithms, functions in asymptotic analysis, solving recurrence equations for runtime analysis of recursive algorithms

Sorting algorithms: insertion sort, merge sort, heap sort, quick sort, lower bounds of comparison based sorting, sorting in linear time, counting sort, radix sort, shellsort algorithm

Data structures: array, decision tree, binary tree, heap, priority queue, dynamic set, dictionary, stack, queue, linked list, rooted tree as linked list, hash table, balanced trees, graphs, merge-find sets, suffix tries, trees and arrays, Burrows-Wheeler transform

Trees and graphs algorithms: visits, topological sort, shortest path, Kruskal's algorithm for the minimum spanning tree problem

String algorithms: Huffman algorithm, approximate string matching, longest common subsequence, edit distance, global sequence alignment, local sequence alignment, semiglobal sequence alignment

Algorithms for NP-complete problems: backtracking, enumeration algorithms, approximation algorithms, branch-&-bound, heuristics (based on greedy / local search), examples for the traveling salesperson problem

It is fundamental to use the electronic slides and material provided during the classes, including live exercises. The slides will be made available on the course web page.

The readings below are suggested, but need not to be necessarily purchased.

Books more specific on computational biology topics:

- Carlos Setubal, Joao Meidanis, Introduction to computational molecular biology, Cengage Learning eds., 1997;
- C. Gibas, P. Jambeck, Developing Bioinformatics Computer Skills, O'Reilly media, 2001

Books more general on computer algorithms and data structures topics:

- Alfred V. Aho, Jeffrey D. Ullman, John E. Hopcroft. Data Structures and Algorithms. Addison Wesley, 1983.
- Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction to Algorithms. McGraw-Hill, 2001.
- Donald E. Knuth. The Art of Computer Programming, Volumes 1-3. Addison-Wesley Professional, 1998.
- S.B. Kishor Data Structures, Edition 3. Das Ganu Prakashan, Nagpur, 2008.

## Teaching methods

The teaching methods consist of frontal lectures (supported by slides and whiteboard) and the discussion of exercises to be solved on the paper, using the student laptops or the whiteboard.

## Assessment methods

The final exam is composed of a written part (3 hours) and possibly an oral part (30 minutes). During the written part, it is NOT allowed the use of books and notes, calculator and communication devices, including Internet access. Provided that the candidate passes the written exam (with evaluation greater or equal to 18/30), s/he may be scheduled for the oral part in case the written exam evaluation is low. In that case, to pass the course, it is mandatory to receive a sufficient evaluation in both the written and oral parts.

## Teaching tools

Electronic slides and projector.
Personal computer.
White board.

## Office hours

See the website of Pietro Di Lena