CLENCH: A program for calculating cluster enrichment using the Gene Ontology
Analysis of microarray data most often produces lists of genes with similar expression patterns, which are then subdivided into functional categories for biological interpretation. Such functional categorization is most commonly accomplished using Gene Ontology (GO) categories. Although several programs identify and analyze functional categories for human, mouse and yeast genes, none of them accepts Arabidopsis thaliana data. To address this need for the A. thaliana community, we have developed a program that retrieves GO annotations for A. thaliana genes and performs functional category analysis for lists of genes selected by the user.
Cluster enrichment analysis and visualization of expression, annotation and transcription factor binding site data.
Download Clench2 code
See a PowerPoint presentation on Clench2.0
The end result of analyzing microarray datasets is a list of differentially expressed genes. Such gene lists are grouped by the functions of the gene products and common transcription factor binding sites contained in their promoters. Functional categorization is most commonly accomplished using Gene Ontology (GO) categories, and promoters are analyzed for the presence and enrichment of binding sites for transcription factors known to be involved in the process under study. Although there are several programs that identify and analyze functional categories, few of them analyze both promoter sequences and functional categories. Moreover, the integrated visualization of the three data types (expression, annotation and transcription factor binding sites in the promoters) is important for drawing meaningful inferences.
To address this need for the A. thaliana community, we have radically modified and extended our CLENCH tool, which performed functional category analysis. Clench2.0 performs functional analysis, searches promoter sequences for known TF binding sites, and visualizes the expression, annotation and transcription factor binding site data for lists of genes provided by the user. Although developed for A. thaliana, Clench2.0 can be easily adapted to work for other model organisms.
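As an illustration of what "enrichment" of a GO category in a gene list means, the following is a minimal sketch, not Clench2's own code. It assumes a one-sided hypergeometric test, which is a common choice for this kind of analysis; this page does not state which test Clench2 applies. The function name and the counts in the usage lines are hypothetical.

```python
from math import comb


def enrichment_p_value(population_size, population_hits, sample_size, sample_hits):
    """Probability of seeing at least `sample_hits` annotated genes in the list.

    population_size: genes in the reference set (e.g. all genes on the array)
    population_hits: reference genes annotated to the GO term
    sample_size:     genes in the user's list (the cluster)
    sample_hits:     genes in the list annotated to the GO term

    Assumption: a one-sided hypergeometric test; Clench2 may use another test.
    """
    total = comb(population_size, sample_size)
    upper = min(population_hits, sample_size)
    tail = sum(
        comb(population_hits, k)
        * comb(population_size - population_hits, sample_size - k)
        for k in range(sample_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 reference genes, 4 annotated to the term,
# a cluster of 3 genes of which 2 carry the annotation.
p = enrichment_p_value(10, 4, 3, 2)
print(f"p = {p:.4f}")
```

A small p-value suggests that the term occurs in the list more often than expected by chance, given how often it occurs in the reference set.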
New features in Clench2
- Can process one file or a file list.
- For a list of files, an 'easyview' page is written that allows fast navigation between results from different clusters.
- A graph showing the relationships among GO terms is provided for each category where:
holding/hovering the mouse on a node shows its label in text
clicking on a node shows its row in the result
- For genes assigned to a term in one top-level GO category, the term assignments in the remaining two GO categories are shown as a matrix
- Promoters for the genes assigned to a term are now analyzed for enrichment of user-provided TF binding sites, and the results are presented along with the term enrichment results
holding/hovering the mouse on the promoter image shows the p-values for the enriched promoters
- Expression profiles for genes assigned to a term are shown in the results
The user can download the expression data for genes assigned to each term by clicking the 'Get Data' link above the expression image in the result row
- Two ways of Slimterm mapping are now allowed:
Using a custom list of terms
Using slimterms provided in the TAIR output
- Choice to perform a 'hard' or 'complete' slim mapping. Hard mapping chooses one slim term if there are multiple parents and assigns a gene to one and only one slim term.
- Clench2 can run using local GO association files (available from www.geneontology.org) and hence can be used for ANY organism.
- Clench2 can run simulations to estimate the FDR for enriched GO terms and change the p-value cutoff to reduce the FDR.
- Clench2 can be set up to run as a web-based tool for small user groups as shown at http://wartik19.biotec.psu.edu/RunClench.html
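The difference between 'hard' and 'complete' slim mapping in the feature list above can be sketched as follows. This is an illustrative sketch under stated assumptions, not Clench2's implementation: the function names, the toy term identifiers and the data layout are hypothetical, and the rule used to pick the single slim term in hard mode (alphabetically first) is an assumption, since this page does not say how Clench2 chooses.

```python
def slim_ancestors(term, parents, slim_terms):
    """Collect every slim term reachable from `term` via parent links.

    The term itself counts if it is a slim term.
    """
    found, seen, stack = set(), set(), [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim_terms:
            found.add(current)
        stack.extend(parents.get(current, ()))
    return found


def map_to_slim(gene_terms, parents, slim_terms, hard=False):
    """Map each gene to slim terms.

    complete (hard=False): a gene is assigned to every slim ancestor.
    hard (hard=True): a gene is assigned to exactly one slim term.
    Assumption: ties are broken by taking the alphabetically first term.
    """
    mapping = {}
    for gene, terms in gene_terms.items():
        slims = set()
        for term in terms:
            slims |= slim_ancestors(term, parents, slim_terms)
        if hard and slims:
            slims = {min(slims)}
        mapping[gene] = slims
    return mapping


# Toy ontology with made-up identifiers: term "t3" has two slim parents.
parents = {"t3": ["slimA", "slimB"], "t4": ["slimA"]}
slim_terms = {"slimA", "slimB"}
gene_terms = {"gene1": ["t3"], "gene2": ["t4"]}

print(map_to_slim(gene_terms, parents, slim_terms))
print(map_to_slim(gene_terms, parents, slim_terms, hard=True))
```

In complete mode a gene can be counted under several slim terms, so category totals may exceed the number of genes; hard mode keeps the totals equal to the number of mapped genes at the cost of discarding alternative assignments.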