Last update: Mon Dec 15, 2014 by prins@cs.unc.edu

COMP 555 - Bioalgorithms
http://www.cs.unc.edu/~prins/Classes/555/

Fall 2014
(Tue Aug 19 - Tue Dec 2)
TTh 2:00 - 3:15 PM, FB 009

Instructor: Jan Prins
FB 334, Brooks Comp Sci Building
Tel: 919-590-6213, Email: prins@cs.unc.edu
Office hours: Wed 10:30 am - noon and by appointment


@ Overview

Computational methods are fueling a revolution in the biological sciences. As a result, two new multidisciplinary fields, bioinformatics and computational biology, have emerged. This course will explore the computational methods and algorithmic principles driving this revolution. It will cover basic topics in molecular biology, genetics, and proteomics. The course also addresses basic computational theory and algorithms including asymptotic notation, recursion, divide-and-conquer approaches, graph algorithms, dynamic programming, and greedy algorithms. These fundamental concepts from computer science will be taught within the context of motivating problems drawn from contemporary biology. Example biological topics include sequence alignment, motif finding, gene rearrangement, DNA sequencing, protein peptide sequencing, phylogeny, and gene expression analysis.

Syllabus

The course syllabus is available here. Please familiarize yourself with the prerequisites and course requirements and the honor code rules in effect for this course.

Text

An Introduction to Bioinformatics Algorithms
by Neil C. Jones and Pavel A. Pevzner
MIT Press © 2004
ISBN: 0262101068.

Prerequisites


@ Announcements

@ Schedule

Date          Lecture Topic                                  Reading Assignment                    Homework
August 19     Introduction                                   Chapter 1, Chapter 3 secns 3.1 - 3.7
August 21     High Throughput Biology                        Chapter 3
August 26     Algorithms and Complexity                      Chapter 2
August 28     Restriction Mappings                           Chapter 4 secns 4.1 - 4.3             Assignment 1
September 2   Finding Regulatory Motifs                      Chapter 4 secns 4.4 - 4.9
September 4   Greedy Algorithms                              Chapter 5 secns 5.1 - 5.2
September 9   Genome Rearrangements                          Chapter 5 secns 5.3 - 5.5
September 11  Dynamic Programming Preliminaries              Chapter 6 secns 6.1 - 6.3
September 16  Sequence Alignments                            Chapter 6 secns 6.4 - 6.8             Assignment 1 due
September 18  Local Sequence Alignment                       Chapter 6 secns 6.8 - 6.10
September 23  Gapped and Multiple Alignment                  Chapter 6 secns 6.8 - 6.10
              (see additional slides in Lecture 10)
September 25  Gene Prediction                                Chapter 6 secns 6.11 - 6.14           Assignment 2
September 30  Divide and Conquer Algorithms                  Chapter 7 secns 7.1 - 7.4
October 2     Graph Algorithms                               Chapter 8 secns 8.1 - 8.8
October 7     Homework solutions and midterm review                                                Assignment 2 due
October 9     Midterm Exam                                   Chapters 1 - 7, Lecs 1 - 12
October 14    Midterm solutions
October 16    Fall Break (no class)
October 21    DNA Sequencing and Assembly                    Chapter 8 secn 8.9                    Assignment 3
October 23    Protein Sequencing and Identification          Chapter 8 secns 8.10 - 8.15
October 28    Combinatorial Pattern Matching                 Chapter 9 secns 9.1 - 9.5
October 30    Suffix Arrays and BWT                          (not in text)
November 4    Approximate Pattern Matching                   Chapter 9 secns 9.6 - 9.8             Assignment 3 due
November 6    Clustering                                     Chapter 10 secns 10.1 - 10.3
November 11   Clustering and Evolution                       Chapter 10 secns 10.4 - 10.8          Assignment 4 (due Nov 25)
November 13   Imperfect Tree Construction                    Chapter 10 secns 10.9 - 10.11
November 18   Hidden Markov Models                           Chapter 11 secns 11.1 - 11.3
November 20   Randomized Algorithms                          Chapter 12
November 25   Complete material from last class              Chapter 12                            Assignment 4 due; Assignment 5 (due Wed Dec 3)
December 2    Special topic on RNA-seq alignment techniques  (no reading)


@ Submission instructions for programming assignment 4 (due Nov 25)

  1. The data for this assignment consists of 255 points in 4-space. Download the file cdata.txt to the machine where you are working. The file is ASCII, but its contents are "pickled" so that they can be unpacked directly as a Python value of the right form (i.e., an array of 4-tuples). To read the file as a Python value, add the following import
              import pickle
    
    at the head of your program, and use the following code in your program
              datafile = open("cdata.txt", "r")
              vals = pickle.load(datafile)
    
    Under Python 3, open the file in binary mode ("rb") instead, since pickle.load expects a binary file there. Prepend a path to the filename if the file is not in Python's working directory.
  2. Please upload your Python code hw4.py to your hw4 submission directory. Be sure this file contains the Python function euler.
  3. To transfer your file for submission use one of the following two techniques. Here yourlogin stands for your computer science login id.

    • (on any linux system, or on a mac with an open shell window)
            scp path-to-your-file-hw4.py   yourlogin@classroom.cs.unc.edu:/afs/unc/project/courses/comp555/Submit/yourlogin/hw4
      You will be prompted for your CS password to complete the transfer.

    • (on any Windows PC with SSH secure file transfer)
      Start the SSH sftp program and connect to classroom.cs.unc.edu. You will be prompted for your CS login and password. Change the remote directory to
            /afs/unc/project/courses/comp555/Submit/yourlogin/hw4
      Locate your file hw4.py on your machine in the sftp window and drag it to the remote side.

  4. You can update (replace) your submission whenever you wish up to the deadline. Hence you may wish to make a trial upload.
  5. If you are unable to copy hw4.py to the submission directory, email it to me and be sure to include comp555 in the subject line.
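
Before starting on the assignment itself, it can help to confirm that the data loaded as described in item 1. The following is a minimal sketch, assuming Python 3 and assuming the unpickled value is a sequence of 255 points with 4 coordinates each. The helper names load_points and write_synthetic_file are hypothetical and not part of the assignment, and the demo uses a made-up stand-in file rather than the real cdata.txt.

```python
import os
import pickle
import tempfile


def load_points(path, expected_count=255, dim=4):
    """Unpickle a sequence of points and check its shape (hypothetical helper)."""
    # Binary mode is required by pickle under Python 3, even for ASCII pickles.
    with open(path, "rb") as datafile:
        vals = pickle.load(datafile)
    if len(vals) != expected_count:
        raise ValueError("expected %d points, got %d" % (expected_count, len(vals)))
    for point in vals:
        if len(point) != dim:
            raise ValueError("expected %d coordinates, got %d" % (dim, len(point)))
    return vals


def write_synthetic_file(path, count=255, dim=4):
    """Write made-up points in an ASCII (protocol 0) pickle, for testing only."""
    made_up = [tuple(float(i + j) for j in range(dim)) for i in range(count)]
    with open(path, "wb") as out:
        pickle.dump(made_up, out, protocol=0)
    return made_up


# Demo on a synthetic stand-in file; replace demo_path with the path to
# cdata.txt to check the real data.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "cdata_demo.txt")
    expected = write_synthetic_file(demo_path)
    points = load_points(demo_path)

print(len(points), len(points[0]))
```

Note that pickle.load can execute arbitrary code while unpacking, so it should only be used on files from a source you trust, such as the course data file.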


@ Examples


@ Reference Section


This page is maintained by prins@cs.unc.edu. Send mail if you find problems.