Title: CG dinucleotide clustering is a species-specific property of the genome.
Publication Type: Journal Article
Year of Publication: 2007
Authors: Glass, Jacob L., Thompson Reid F., Khulan Batbayar, Figueroa Maria E., Olivier Emmanuel N., Oakley Erin J., Van Zant Gary, Bouhassira Eric E., Melnick Ari, Golden Aaron, Fazzari Melissa J., and Greally John M.
Journal: Nucleic Acids Res
Volume: 35
Issue: 20
Pagination: 6798-807
Date Published: 2007
ISSN: 1362-4962
Keywords: Animals, CpG Islands, Dinucleoside Phosphates, DNA Methylation, Genome, Humans, Mice, Species Specificity, Takifugu
Abstract

Cytosines at cytosine-guanine (CG) dinucleotides are the near-exclusive target of DNA methyltransferases in mammalian genomes. Spontaneous deamination of methylcytosine to thymine makes methylated cytosines unusually susceptible to mutation and consequent depletion. The loci where CG dinucleotides remain relatively enriched, presumably due to their unmethylated status during the germ cell cycle, have been referred to as CpG islands. Currently, CpG islands are solely defined by base compositional criteria, allowing annotation of any sequenced genome. Using a novel bioinformatic approach, we show that CG clusters can be identified as an inherent property of genomic sequence without imposing a base compositional a priori assumption. We also show that the CG clusters co-localize in the human genome with hypomethylated loci and annotated transcription start sites to a greater extent than annotations produced by prior CpG island definitions. Moreover, this new approach allows CG clusters to be identified in a species-specific manner, revealing a degree of orthologous conservation that is not revealed by current base compositional approaches. Finally, our approach is able to identify methylating genomes (such as Takifugu rubripes) that lack CG clustering entirely, in which it is inappropriate to annotate CpG islands or CG clusters.
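
For orientation, the "base compositional criteria" that the abstract contrasts with its own approach can be sketched as follows. This is a minimal illustration of the conventional style of CpG island test, not the method introduced in the paper. The thresholds (length of at least 200 bp, G+C fraction of at least 0.5, observed/expected CG ratio of at least 0.6) are commonly cited defaults assumed here for illustration, and the function names `cpg_stats` and `meets_composition_criteria` are hypothetical.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CG ratio) for a DNA string."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cg = s.count("CG")  # "CG" cannot overlap itself, so this counts every CG
    gc_fraction = (c + g) / n
    # Expected CG count under independence is (c * g) / n,
    # so observed/expected is (cg * n) / (c * g).
    obs_exp = (cg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def meets_composition_criteria(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply assumed, conventional thresholds; not the paper's CG-cluster method."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_obs_exp
```

Because every threshold here is fixed in advance and applied identically to any genome, a test of this kind cannot adapt to species-specific CG distributions, which is the limitation the abstract highlights.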

DOI: 10.1093/nar/gkm489
Alternate Journal: Nucleic Acids Res
PubMed ID: 17932072
PubMed Central ID: PMC2175314
Grant List: GM007288 / GM / NIGMS NIH HHS / United States
R01 AG022859 / AG / NIA NIH HHS / United States
R01 AG024950 / AG / NIA NIH HHS / United States
T32 GM007288 / GM / NIGMS NIH HHS / United States
R01 HD044078 / HD / NICHD NIH HHS / United States