Conserved Non-Coding Elements Data

Data from the analysis of conserved non-coding elements (CNCs) by Kim and Pritchard, 2007.
See the text of the paper for further details.  Please contact
Su Yeon Kim (skim@galton.uchicago.edu) with questions or comments.

Variable Names:

    CNCID Chrom Start End Size Sizec Group

    GeneID GeneName GeneStart GeneEnd DistClass

    SRT_H SRT_C SRT_M SRT_R SRT_D SRT_P SRT_T SRT

    BestModel NumParas Sign IsChimpQualLow

Datafiles:

   Mammalian.txt

   Amniotic.txt
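
A minimal sketch of reading either data file into a list of records keyed by the variable names above. The file layout is not specified on this page, so several things here are assumptions: that fields are whitespace-delimited, that the first non-blank line is a header row, that the columns appear in the order listed under "Variable Names", and that missing values (for example in Sign) are written as a placeholder rather than left empty. The function name `read_cnc_rows` and its parameters are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

# Column names in the order listed under "Variable Names".
COLUMNS = [
    "CNCID", "Chrom", "Start", "End", "Size", "Sizec", "Group",
    "GeneID", "GeneName", "GeneStart", "GeneEnd", "DistClass",
    "SRT_H", "SRT_C", "SRT_M", "SRT_R", "SRT_D", "SRT_P", "SRT_T", "SRT",
    "BestModel", "NumParas", "Sign", "IsChimpQualLow",
]


def read_cnc_rows(
    lines: Iterable[str],
    delimiter: Optional[str] = None,   # None splits on any whitespace (assumption)
    has_header: bool = True,           # assumption: first non-blank line is a header
) -> List[Dict[str, str]]:
    """Return one dict per data line, mapping column name to the raw string value."""
    rows: List[Dict[str, str]] = []
    skip_header = has_header
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        if skip_header:
            skip_header = False
            continue
        fields = line.rstrip("\r\n").split(delimiter)
        if len(fields) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} fields, got {len(fields)}: {line!r}"
            )
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


if __name__ == "__main__":
    # Usage: adjust delimiter/has_header if the files are laid out differently.
    with open("Mammalian.txt") as handle:
        records = read_cnc_rows(handle)
    print(len(records), "records read")
```

Values are kept as strings on purpose, since the numeric formats used in the files are not documented here; convert individual columns once the layout has been checked against the actual files.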

Variable Description:

   CNCID: CNC ID

   Chrom: Chromosome

   Start: CNC start position

   End: CNC end position

   Size: number of bases in the corresponding "most conserved" region

   Sizec: number of bases used for statistical inferences

   Group: Amniotic (A) or Mammalian (M)

   GeneID: Entrez gene ID of the nearest gene

   GeneName: name of the nearest gene

   GeneStart: transcription start site of the nearest gene

   GeneEnd: transcription end site of the nearest gene

   DistClass: distance classes defined based on the distance to the nearest gene

   **SYMBOL**: lineage symbols used in the SRT_* and BestModel variables: Human (H); Chimpanzee (C); Mouse (M); Rat (R); Dog (D); Primate (P); Rodent (T)

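   SRT_H: signed version of the SRT statistic testing for rate changes on the Human (H) lineage; SRT_C, SRT_M, SRT_R, SRT_D, SRT_P and SRT_T are the corresponding statistics for the other lineages

   SRT: SRT statistic with the (extreme) alternative hypothesis in which each of the seven lineages evolves at its own rate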

   BestModel: best-fitting model according to the modified AIC (for details, see text); branches that share a substitution rate are separated by '|', and the empty space following the last '|' covers the remaining branches.  (e.g., "M|T|" corresponds to the model in which the mouse and rodent lineages each evolve at their own rate, and the remaining lineages share a background substitution rate.)

   NumParas: number of parameters in the best-fitting model

   Sign: direction of acceleration, available only for the two-parameter models (fitted by AIC)

   IsChimpQualLow: indicator of a low-quality chimpanzee sequence
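
The BestModel encoding described above can be unpacked with a few lines of code. The sketch below splits a model string on '|' and reports which lineages have their own rate and which fall into the background group. It assumes each lineage is written as one of the single-letter symbols H, C, M, R, D, P, T, and that a segment with several letters (for example "HC") means those lineages share one rate; the second point is an assumption, not something stated on this page. The function name `parse_best_model` is hypothetical.

```python
from typing import List, Set, Tuple

# Single-letter lineage symbols from the SYMBOL entry.
LINEAGES = "HCMRDPT"


def parse_best_model(model: str) -> Tuple[List[Set[str]], Set[str]]:
    """Split a BestModel string into explicit rate groups and the background group.

    Returns (groups, background): `groups` holds one set of lineage symbols per
    non-empty '|'-separated segment, and `background` holds the lineages not
    named in any segment (they share the background rate).
    """
    groups: List[Set[str]] = []
    seen: Set[str] = set()
    for segment in model.split("|"):
        if not segment:
            continue  # the empty piece after the last '|' stands for the rest
        group = set(segment)
        unknown = group - set(LINEAGES)
        if unknown:
            raise ValueError(f"unknown lineage symbol(s): {sorted(unknown)}")
        if group & seen:
            raise ValueError(f"lineage listed twice: {sorted(group & seen)}")
        seen |= group
        groups.append(group)
    background = set(LINEAGES) - seen
    return groups, background


if __name__ == "__main__":
    # The example from the BestModel description: mouse and rodent each have
    # their own rate, and the other five lineages share the background rate.
    groups, background = parse_best_model("M|T|")
    print(groups, sorted(background))
```

Counting the explicit groups plus the background group gives the number of distinct rate classes in the model; whether that count always equals NumParas should be checked against the data files rather than assumed.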