
    Genome Dynamics: Evolution, Organization and Function

    Principal Investigator: Wei Wang
    Funding Agency: Jackson Lab
    Agency Number: 00000062

    Abstract
    In 2009-2010, the UNC computer science group (Co-PIs McMillan and Wang) plans to focus on two projects supporting the Center for Genome Dynamics (CGD).

    Data Management: A key component of our informatics effort will be the construction of a database (Figure 1) to host the genotyping, transcript, copy-number variation, and methylation data for CGD projects and for external users of the AFFY-MouseDiv array. This database will serve as the primary data repository, warehousing the center’s experimental results. In the future, it will also serve as a portal providing public access to the results of expensive analysis and mining tasks. In this proposal we will exploit underutilized capabilities of modern database management systems (DBMS), such as active database and temporal database technologies.
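
    As a rough illustration of what active and temporal database features could mean in practice, the sketch below uses a trigger (the "active" part) to keep a timestamped history of superseded genotype calls (a simple "temporal" record). It uses Python's built-in sqlite3 module only for convenience. The table names, column names, and helper functions are hypothetical and are not taken from the proposed CGD schema.

```python
import sqlite3

# Hypothetical schema for illustration only; not the CGD design.
SCHEMA = """
CREATE TABLE genotype (
    sample_id   TEXT NOT NULL,
    snp_id      TEXT NOT NULL,
    allele_call TEXT NOT NULL,
    PRIMARY KEY (sample_id, snp_id)
);
CREATE TABLE genotype_history (
    sample_id   TEXT NOT NULL,
    snp_id      TEXT NOT NULL,
    allele_call TEXT NOT NULL,
    valid_until TEXT NOT NULL
);
-- Active rule: whenever a call is revised, archive the old value
-- together with the time at which it stopped being current.
CREATE TRIGGER genotype_keep_history
AFTER UPDATE OF allele_call ON genotype
BEGIN
    INSERT INTO genotype_history
    VALUES (OLD.sample_id, OLD.snp_id, OLD.allele_call, datetime('now'));
END;
"""


def open_demo_db():
    """Create an in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def superseded_calls(conn, sample_id, snp_id):
    """Return earlier calls for one sample/SNP pair, oldest first."""
    rows = conn.execute(
        "SELECT allele_call FROM genotype_history "
        "WHERE sample_id = ? AND snp_id = ? ORDER BY rowid",
        (sample_id, snp_id),
    )
    return [row[0] for row in rows]
```

    With this arrangement, an application only issues ordinary UPDATE statements; the database itself records what each value used to be and when it was replaced.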

    Genome Compatibility and Tree-based Association Analysis: We propose to continue our development of efficient methods for partitioning a genome into parsimonious sets of compatible intervals. We define SNP compatibility in terms of the four-gamete test: a pair of SNPs is compatible if and only if at most three of the four possible allele combinations occur. If all four allele combinations occur, there must have been either a recombination or a repeated mutation between these two sites. The four-gamete test is of particular interest because of its close relation to perfect phylogeny. Specifically, a necessary and sufficient condition for constructing a perfect phylogeny is that all pairs of SNPs satisfy the four-gamete test. The perfect phylogeny model is commonly used for short genomic regions, under the assumption that they have not undergone any apparent recombination. We have developed efficient methods for partitioning genomes into a set of potentially overlapping, maximal compatible intervals, each of which admits a perfect phylogeny, and whose union covers the entire genome. We address the question of the minimum number of such intervals required to cover the genome, and we identify suspect SNPs whose removal would reduce the overall complexity of the haplotype-block structure (suspect in the sense that they might indicate genotyping errors, homoplasy, or a recent gene conversion event).
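
    The definitions above can be sketched in a few lines of Python. This is a minimal, quadratic-time illustration and not the efficient method referred to in the abstract. It assumes biallelic SNPs encoded as equal-length strings of '0' and '1' (one character per sample), and the function names are hypothetical. The covering step uses the standard greedy algorithm for covering points with intervals, chosen here only because it is simple; the abstract does not say how the authors compute the minimum.

```python
def compatible(snp_a, snp_b):
    """Four-gamete test: at most three distinct allele pairs occur."""
    return len(set(zip(snp_a, snp_b))) <= 3


def maximal_compatible_intervals(snps):
    """Return maximal runs (start, end), inclusive, of pairwise compatible SNPs.

    Invariant: snps[start..end-1] are pairwise compatible and start is the
    smallest index for which that holds.
    """
    intervals = []
    start = 0
    for end in range(len(snps)):
        new_start = start
        # Look for the nearest SNP in the window that conflicts with snps[end].
        for j in range(end - 1, start - 1, -1):
            if not compatible(snps[j], snps[end]):
                new_start = j + 1
                break
        if new_start > start:
            # The window cannot grow to the right, so it is maximal.
            intervals.append((start, end - 1))
            start = new_start
    if snps:
        intervals.append((start, len(snps) - 1))
    return intervals


def min_interval_cover(intervals, n):
    """Greedy minimum cover of SNP indices 0..n-1 by the given intervals."""
    ordered = sorted(intervals)
    cover = []
    pos = 0  # first SNP index not yet covered
    i = 0
    while pos < n:
        best = None
        # Among intervals starting at or before pos, keep the one reaching furthest.
        while i < len(ordered) and ordered[i][0] <= pos:
            if best is None or ordered[i][1] > best[1]:
                best = ordered[i]
            i += 1
        if best is None or best[1] < pos:
            raise ValueError("intervals do not cover every SNP")
        cover.append(best)
        pos = best[1] + 1
    return cover


# Toy data: three SNPs typed in four samples.
snps = ["0011", "0001", "0101"]
intervals = maximal_compatible_intervals(snps)
cover = min_interval_cover(intervals, len(snps))
```

    In the toy data, the first and third SNPs show all four allele combinations and therefore fail the test, while the second SNP is compatible with both. The result is two overlapping maximal intervals, (0, 1) and (1, 2), and both are needed to cover the three SNPs.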
