Department of Computer Science, University of Western Ontario
Plenary Talk 4: Exploring the Space of DNA Signatures
Approaches to understanding and utilizing the fundamental properties of bioinformation take many forms, from using DNA to build lattices and polyhedra, to studies of algorithmic DNA self-assembly, to harnessing DNA strand displacement for computations. This talk presents a view of the structural properties of DNA sequences from yet another angle, by proposing the use of graphic signatures of DNA sequences to measure and visualize their interrelationships.
The method starts by computing an “image distance” for each pair of graphic Chaos Game Representations of DNA sequences, and then uses multidimensional scaling to visualize the resulting interrelationships in a two- or three-dimensional space. The result of applying this method to a collection of DNA sequences is an easily interpretable Molecular Distance Map wherein sequences are represented by points in a common Euclidean space, and the spatial distance between any two points reflects the differences in their subsequence composition.
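The pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation behind the talk: the corner assignment for the four bases, the grid resolution `k`, and the function names `cgr_image`, `distance_matrix`, and `classical_mds` are all hypothetical, and a plain Euclidean distance between normalized CGR grids stands in for the unspecified “image distance”. The scaling step uses classical (Torgerson) multidimensional scaling, which may differ from the variant used in the talk.

```python
import numpy as np

# Assumed corner assignment (a common CGR convention, not taken from the talk).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_image(seq, k=4):
    """Return a 2^k x 2^k grid of normalized Chaos Game point counts."""
    size = 2 ** k
    grid = np.zeros((size, size))
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        row = min(int(y * size), size - 1)
        col = min(int(x * size), size - 1)
        grid[row, col] += 1
    total = grid.sum()
    return grid / total if total else grid


def distance_matrix(images):
    """Pairwise Euclidean distance between images (a stand-in image distance)."""
    n = len(images)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = np.linalg.norm(images[i] - images[j])
    return d


def classical_mds(d, dims=2):
    """Embed a distance matrix into `dims` dimensions by classical MDS."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering  # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(b)
    order = np.argsort(vals)[::-1][:dims]  # largest eigenvalues first
    top = np.clip(vals[order], 0.0, None)  # guard against tiny negatives
    return vecs[:, order] * np.sqrt(top)


# Toy input for illustration only; real use would load full sequences.
seqs = ["ACGTACGTACGTACGT", "AAAAAAAACCCCCCCC", "GGGGTTTTGGGGTTTT"]
images = [cgr_image(s) for s in seqs]
dist = distance_matrix(images)
coords = classical_mds(dist, dims=2)  # one 2-D point per sequence
```

Each row of `coords` is the position of one sequence in the map. Because each grid is normalized to sum to one, sequences of different lengths can be compared directly, which matches the alignment-free character of the method.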
This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. Our analysis of a dataset of 3,176 complete mitochondrial genomes confirms that these graphic signatures, which reflect the oligomer composition of the originating DNA sequences, can be a source of taxonomic information. The method also correctly identifies the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and finds that the sequence most different from it in this dataset belongs to a cucumber.