CSCI 4490-6490 Algorithms for Computational Biology Spring 2016
Homework Assignment 2
This homework is an exercise in dynamic programming. Code the LCS algorithm and its associated function PrintLCS in a programming language.
Some requirements are:
- The input to your program is two DNA sequences. They can be read from the keyboard or (preferably) from a text file in FASTA format;
- The program first calls LCS to accomplish the major task;
- The LCS algorithm should be modified slightly to output the dynamic programming table of s_{i,j} values once the table has been computed;
- The program should then call PrintLCS to output the longest common subsequence found for the two input sequences;
- Outputs can be to the screen or (preferably) to a text file.
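The requirements above can be sketched roughly as follows in Python. This is a minimal illustration, not a reference solution. The names read_fasta, lcs, print_lcs and format_table are hypothetical, the backtracking pointers ("diag", "up", "left") are one possible encoding, and the FASTA reader assumes each record starts with a ">" header line.

```python
def read_fasta(text):
    """Return one sequence string per FASTA record found in text."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([])          # a header line starts a new record
        else:
            if not records:             # tolerate a missing header (assumption)
                records.append([])
            records[-1].append(line.upper())
    return ["".join(parts) for parts in records]


def lcs(v, w):
    """Fill the table s[i][j] = LCS length of v[:i] and w[:j], plus pointers."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    b = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if v[i - 1] == w[j - 1]:
                s[i][j] = s[i - 1][j - 1] + 1
                b[i][j] = "diag"
            elif s[i - 1][j] >= s[i][j - 1]:
                s[i][j] = s[i - 1][j]
                b[i][j] = "up"
            else:
                s[i][j] = s[i][j - 1]
                b[i][j] = "left"
    return s, b


def print_lcs(b, v, i, j):
    """Follow the pointers back from cell (i, j) and return one LCS."""
    out = []
    while i > 0 and j > 0:
        if b[i][j] == "diag":
            out.append(v[i - 1])
            i, j = i - 1, j - 1
        elif b[i][j] == "up":
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def format_table(s, v, w):
    """Render the s[i][j] table as text, labelled with the two sequences."""
    lines = ["    " + " ".join(f"{c:>2}" for c in " " + w)]
    for i, row in enumerate(s):
        label = v[i - 1] if i > 0 else " "
        lines.append(f"{label:>2}  " + " ".join(f"{x:>2}" for x in row))
    return "\n".join(lines)
```

For example, with v = "ATCTGAT" and w = "TGCATA", calling s, b = lcs(v, w) and then print_lcs(b, v, len(v), len(w)) yields a common subsequence of length 4, and print(format_table(s, v, w)) writes the table to the screen. Writing the same strings to a text file instead covers the preferred output option.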
Alternatively, you may choose to code the algorithm GlobalPairwiseAlignment instead. For this task, the input remains the same, but the output is an alignment between the two input sequences. There are also several options for the scoring scheme, which covers substitutions (including matches) and gap penalties: (a) the scheme given in Eddy's short paper; (b) some biologically more meaningful scheme that you may find; (c) a scheme as simple as '1 for a match', '0 for everything else', as in the LCS algorithm. Preferably, do not hard-code the scoring scheme (matrix) in the program; instead, try to have your program read the scoring matrix from a text file.
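A possible shape for the alternative task is sketched below in Python, under several assumptions that the assignment does not fix. The matrix file layout (a header row of symbols, then one row per symbol), the linear gap penalty passed as a separate parameter, and the function names parse_scoring_matrix and global_align are all hypothetical choices for illustration.

```python
def parse_scoring_matrix(text):
    """Parse a whitespace-separated matrix into a dict keyed by (row, column).

    Assumed layout: first non-comment line lists the column symbols,
    each later line is a row symbol followed by integer scores.
    """
    rows = [ln.split() for ln in text.splitlines()
            if ln.strip() and not ln.lstrip().startswith("#")]
    cols = rows[0]
    score = {}
    for row in rows[1:]:
        for col, val in zip(cols, row[1:]):
            score[(row[0], col)] = int(val)
    return score


def global_align(v, w, score, gap):
    """Global alignment with a linear gap penalty; returns (score, top, bottom)."""
    n, m = len(v), len(w)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        s[i][0] = i * gap               # v[:i] aligned against gaps only
    for j in range(1, m + 1):
        s[0][j] = j * gap               # w[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(s[i - 1][j - 1] + score[(v[i - 1], w[j - 1])],
                          s[i - 1][j] + gap,
                          s[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and s[i][j] == s[i - 1][j - 1] + score[(v[i - 1], w[j - 1])]):
            top.append(v[i - 1])
            bottom.append(w[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and s[i][j] == s[i - 1][j] + gap:
            top.append(v[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(w[j - 1])
            j -= 1
    return s[n][m], "".join(reversed(top)), "".join(reversed(bottom))
```

With option (c), that is a matrix holding 1 on the diagonal and 0 elsewhere together with gap = 0, the optimal score equals the LCS length of the two sequences. Keeping the matrix and the gap penalty as inputs means the other scoring options can be tried by editing the text file rather than the program.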
This homework is due on Tuesday, February 16, 2016.