Jonathan Myers

"Computation of Overlap Graph for DNA fragments"

Abstract:

The generation of an overlap graph is the first phase in a three-phase sequence assembly method for shotgun DNA sequencing. An overlap graph has one vertex for each DNA fragment produced by shotgun fragmentation and electrophoresis sequencing. Each directed edge, either a dovetail or a containment edge, carries a set of values describing the degree to which the two fragments it joins overlapped in the original sequence. The values for an edge can be determined by comparing all possible overlaps of the two fragments and selecting the overlap that is least likely to have occurred by chance. This paper presents the incremental string comparison algorithm to compute the edit distance of all overlaps between all fragments and to estimate the probability of an overlap occurring by chance by counting, using common subsequences or random sampling.
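To make the idea of "comparing all possible overlaps" concrete, the following is a minimal sketch of scoring every dovetail overlap between two fragments by edit distance. It is not the incremental string comparison algorithm the paper presents; it is a plain quadratic dynamic program, shown only as an illustration. The function names `dovetail_distances` and `best_dovetail`, the `min_len` parameter, and the choice of lowest error rate as the selection rule are all assumptions made for this example; the paper instead selects the overlap least likely to have occurred by chance.

```python
def dovetail_distances(a, b):
    """Return a list d where d[j] is the minimum edit distance between
    any suffix of fragment a and the prefix b[:j] of fragment b."""
    # Row 0: the empty suffix of a against b[:j] costs j insertions.
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        # Column 0 is always 0: the suffix of a may start anywhere for free.
        cur = [0]
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur.append(min(prev[j - 1] + cost,  # match or substitution
                           prev[j] + 1,         # delete a[i-1]
                           cur[j - 1] + 1))     # insert b[j-1]
        prev = cur
    # The last row covers suffixes that run to the end of a.
    return prev


def best_dovetail(a, b, min_len=3):
    """Pick the prefix length j >= min_len of b with the lowest error rate
    d[j] / j (an illustrative criterion, not the paper's probability test).
    Ties go to the longer overlap. Returns (j, d[j]), or None if b is
    shorter than min_len."""
    d = dovetail_distances(a, b)
    best = None
    for j in range(min_len, len(b) + 1):
        if best is None or d[j] / j <= d[best] / best:
            best = j
    return None if best is None else (best, d[best])


if __name__ == "__main__":
    # The last four characters of the first fragment match the first
    # four characters of the second fragment exactly.
    print(best_dovetail("ACGTTGCA", "TGCAGGT"))
```

In an overlap graph built this way, the returned pair (overlap length, edit distance) would be among the values attached to the dovetail edge from the first fragment to the second. A containment edge would need a variant of the same recurrence in which both ends of the shorter fragment are matched inside the longer one.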