Genome Data Browser
Caulobacter crescentus is a premier model organism for studying the molecular basis of cellular asymmetry. The Caulobacter community has generated a wealth of high-throughput spatiotemporal databases including data from gene expression profiling experiments (microarrays, RNA-seq, ChIP-seq, ribosome profiling, LC-MS proteomics), gene essentiality studies (Tn-seq), genome-wide protein localization studies, and global chromosome methylation (SMRT sequencing). A major challenge involves the integration of these diverse datasets into one comprehensive community resource. To address this need, we have generated CauloBrowser (www.cauloprism.org), an online resource for Caulobacter studies. This site provides a user-friendly interface for quickly searching genes of interest and downloading genome-wide results. Search results about individual genes are displayed as tables, graphs of time-resolved expression profiles, and schematics of protein localization throughout the cell cycle. In addition, the site provides a genome data browser that enables customizable visualization of all published high-throughput genomic data. The depth and diversity of datasets collected by the Caulobacter community make CauloBrowser a unique and valuable systems biology resource.
CauloBrowser is a community resource.
We encourage users to send us general questions and suggestions, updates or corrections.
After loading CauloBrowser you should see something like this:
You can search for genes either by name or by genomic coordinates. If a name is entered, the name is used for the search and the genomic coordinates are ignored.
Time resolved gene expression in wild type cells
Gene expression collection
Time resolved gene expression collection
The overview section provides a table listing curated information per gene.
Provides the standard name of the gene, the CCNA locus ID (based on the genome of the NA1000 laboratory strain), and the CC locus ID, as well as links to central databases.
Provides a description of the NCBI-predicted function of the gene product and a link to the UniProt entry. In addition, if the protein product was identified by LC-MS, the product carries the LC-MS Verified tag, indicating that the gene product is present during normal laboratory growth conditions.
Denotes the protein coding nucleotides of the gene. The ORF coordinates present in the database reflect a recent major reannotation (NCBI accession number CP001340.1) based on results from ribosome profiling, LC-MS proteomics, and RNA-seq experiments. Clicking on the genome coordinate links to our dedicated genome data browser.
Summarizes the community knowledge on the subcellular localization of the protein.
Lists the experimentally identified transcriptional start sites for the gene of interest as well as any known cell cycle-regulated transcription factor binding sites as determined in Zhou et al. PLoS Genet. 2015. The numbers indicate the genomic coordinate of the TSS and the strand.
The sixth row indicates whether the gene was found to be disruptable (i.e. dispensable) or non-disruptable (i.e. essential) for growth in a recent saturating transposon mutagenesis experiment (Christen et al. Mol Syst Biol 2011).
This section provides up to eight graphs, each of which displays time-resolved gene expression data. The data include individual promoter activity, RNA-seq, DNA microarray, ribosome profiling, or proteomics data across the cell cycle for each gene of interest. The top three datasets have been collected and analyzed using the new annotation of the Caulobacter genome. The RNA-seq, tiling array, and ribosome profiling datasets have been normalized such that levels can be compared between genes. The CauloBrowser user interface is suited to comparing expression data between multiple genes. All genes searched by the user are displayed color-coded on the same graph, making it easy to perform gene-by-gene comparisons. Clicking on a gene in the legend toggles that gene between hidden and shown.
|Experiment||x axis||y axis||reference|
|Transcription Start Site||Time in minutes||Levels, arbitrary units||Zhou et al. PLOS Genetics 2015|
|RNAseq||Time in minutes||Levels, arbitrary units|
|Ribosome profiling||Time in minutes||Levels, arbitrary units|
|RNAseq (ABI Solid)||Time in minutes||Levels, arbitrary units||Fang et al. BMC Genomics 2013|
|Tiling Arrays||Time in minutes||Intensity, arbitrary units||McGrath et al. Nat Biotech 2007|
|Oligo Arrays||Time in minutes||Log2 ratio||Laub et al. Science 2000|
|PCR Arrays||Time in minutes||Log2 ratio||Hottes et al. Mol Micro 2005|
|S35||Time in minutes||?||Grunenfelder et al. PNAS 2001|
This section lists a collection of 35 microarrays and 4 proteomics experiments from various growth conditions and/or strains containing mutations. The table reports the log2 ratio of the mRNA or protein levels between the indicated conditions. Two of the datasets show proteins identified as substrates of the ClpP protease or substrates of the tmRNA ribosomal rescue system.
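As a reading aid for the log2 ratios in this table, the short sketch below computes a log2 ratio from two expression levels. The function name and the numbers are hypothetical and are not taken from the CauloBrowser database. Which condition is treated as the numerator is also an assumption here, so check the column label of each experiment.

```python
import math


def log2_ratio(level_condition: float, level_reference: float) -> float:
    """Return log2(condition / reference) for two positive expression levels.

    Assumption: the tested condition is the numerator and the reference
    condition is the denominator.
    """
    if level_condition <= 0 or level_reference <= 0:
        raise ValueError("expression levels must be positive")
    return math.log2(level_condition / level_reference)


# Hypothetical levels: a four-fold increase and a two-fold decrease.
print(log2_ratio(400.0, 100.0))  # 2.0
print(log2_ratio(50.0, 100.0))   # -1.0
```

A value of 0 means no change, each step of +1 is a doubling, and each step of -1 is a halving.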
This section provides graphs for a collection of additional cell cycle microarrays performed on mutant strains and/or in various growth conditions. These experiments track mRNA levels at the indicated time points across the cell cycle.
Some graphs show a broken line, for example the dnaA depletion graph for the ctrA gene. A break in the line indicates missing time points.
Time points for divKD90G (temperature sensitive) microarrays start at (R)estrictive temperature following (P)ermissive temperature.
Time points for dnaA depletion microarrays start with dnaA (D)epletion following (I)nduction.
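To illustrate why missing time points appear as breaks in a graph, here is a minimal sketch that splits a time series into contiguous segments wherever a value is missing. It is not the plotting code used by CauloBrowser. The function name, the use of `None` for a missing value, and the example numbers are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[int, float]


def split_at_gaps(times: Sequence[int],
                  values: Sequence[Optional[float]]) -> List[List[Point]]:
    """Group (time, value) pairs into runs separated by missing values (None)."""
    segments: List[List[Point]] = []
    current: List[Point] = []
    for t, v in zip(times, values):
        if v is None:
            # A missing time point closes the current line segment.
            if current:
                segments.append(current)
                current = []
        else:
            current.append((t, v))
    if current:
        segments.append(current)
    return segments


# Hypothetical series with the 40-minute time point missing.
times = [0, 20, 40, 60, 80]
values = [1.0, 1.2, None, 0.8, 0.7]
print(split_at_gaps(times, values))
# [[(0, 1.0), (20, 1.2)], [(60, 0.8), (80, 0.7)]]
```

Each inner list would be drawn as its own line, which leaves a visible break at the missing time point.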
The reference list was gathered from the gene-centered information resource at NCBI.
We are in the process of completing this list by manually adding relevant papers to the CauloBrowser database.
We invite the community to participate in this effort by emailing us regarding missing citations.
The genome data browser provides a useful platform for visualizing high-throughput genomic datasets on the Caulobacter chromosome. The start page of the genome data browser looks something like this:
Locate the control buttons on the top right of the browser, and press the '+' button to add tracks.
Select the 'Defaults' tab (it should be selected by default).
A list of available pre-loaded tracks will be shown in the drop down menu.
View desired tracks by putting a ✓ next to their descriptions.
The tracks selected will be remembered by the browser using cookies, for your browsing convenience.
There are three main ways of navigating around the genome:
To view a specific genomic region on the genome, enter its genomic coordinate into the textbox near the top of the browser (red box) and press enter.
You should see something similar to this:
The genomic position should be entered in the format Chromosome:[start]..[end] or Chromosome:[start]-[end].
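The two accepted formats can be checked with a small parser. The sketch below is not the browser's own parsing code. The names `Region` and `parse_region` are hypothetical, and the sketch assumes plain integer coordinates with no thousands separators.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chromosome: str
    start: int
    end: int


# Matches "Chromosome:[start]..[end]" and "Chromosome:[start]-[end]".
_REGION_PATTERN = re.compile(r"^\s*([^:\s]+):(\d+)(?:\.\.|-)(\d+)\s*$")


def parse_region(text: str) -> Region:
    """Parse a genomic position string into (chromosome, start, end)."""
    match = _REGION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a valid genomic position: {text!r}")
    chromosome = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not be greater than end")
    return Region(chromosome, start, end)


print(parse_region("Chromosome:1000..2000"))
print(parse_region("Chromosome:1000-2000"))
```

Both calls return the same region, so either separator can be used interchangeably.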
To zoom to the region around a gene of interest, search for the gene in the same textbox: enter the gene name and press the enter key.
You should obtain the following view after a couple of seconds:
Notice that the browser has automatically adjusted the coordinates to the regions surrounding the gene of interest. The gene of interest is highlighted in a faint pink box. To clear this highlight, press the eraser button (boxed in red) to the right of the coordinate text box.
This is useful for quickly navigating to the next called peak or gene.
First, select the track that will be used as the reference for jumping. Only tracks that have a sensible definition of features can be used. Examples of suitable tracks are the gene track and peak/summit tracks (in general, tracks that are backed by a BED or VF file work). Select a track by clicking on the grey box with its name. The selected track has a blue shadow around it. In this case, we have selected the CtrAU peaks track.
To jump to the next feature (or peak) in the track, press ctrl + right simultaneously. If you are using a Mac and this combination does not work, try command + right. Alternatively, press the jump button at the top right of the browser.
If you have navigated to the next feature successfully, you should see this:
You can navigate to the previous feature by pressing the left key instead of the right.
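The idea behind jumping can be sketched as a search over the feature start positions of the selected track. This is an illustration only and not the browser's actual implementation. The function names, the example peak coordinates, and the choice of feature start as the reference point are all assumptions.

```python
from bisect import bisect_left, bisect_right
from typing import List, Optional, Tuple

Feature = Tuple[int, int]  # (start, end), as in a BED-style interval


def next_feature(features: List[Feature], position: int) -> Optional[Feature]:
    """Return the first feature starting after `position`, or None.

    `features` must be sorted by start coordinate.
    """
    starts = [start for start, _ in features]
    index = bisect_right(starts, position)
    return features[index] if index < len(features) else None


def previous_feature(features: List[Feature], position: int) -> Optional[Feature]:
    """Return the last feature starting before `position`, or None."""
    starts = [start for start, _ in features]
    index = bisect_left(starts, position)
    return features[index - 1] if index > 0 else None


# Hypothetical peaks, sorted by start coordinate.
peaks = [(100, 200), (1500, 1650), (4000, 4200)]
print(next_feature(peaks, 150))       # (1500, 1650)
print(previous_feature(peaks, 1500))  # (100, 200)
print(next_feature(peaks, 4000))      # None
```

This also shows why a track needs well-defined features for jumping: a continuous signal such as a pileup has no discrete intervals to search over.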
Zoom levels can be changed using the magnifying glasses near the top of the browser (red boxes). Pressing the + and - keys on the keyboard will achieve the same effect.
Notice the 2 circles along the axis between the 2 magnifying glasses. They are the 2 'saved favorite' zoom levels for the browser; the one currently in use is highlighted in blue. The numbers below the axis show how wide the genomic region in view is. In this example, we expect to see a width between 2kb and 50kb (in fact, it is about a 5kb region).
We can toggle to the other favorite saved zoom level by pressing the space bar. This should be the view you get:
To change the favorite saved zoom level, drag the circles along the axis.
Try dragging the blue circle all the way to the left (most zoomed-in).
Notice now that the individual bases of the genome are visible:
Because the browser is interactive, you can also navigate around by clicking and dragging on a track, like this:
A track can be reordered by clicking on its name and dragging it up or down.
First add multiple ChIP-seq pileup tracks from the default list.
The pileups are automatically and dynamically scaled to maximize screen real estate. However, different pileup tracks may be scaled differently. Look closely at the ruler at the middle of the screen:
In this case, the top track has a maximum value of 1070, while the bottom track has a maximum value of 245. Although they look equally tall on the screen, the peaks have very different actual depths in the sequencing data.
Sometimes, however, ensuring that the different tracks have the same scaling is important. To do this, use the Scaling option located at the top of the page. Fill in a desired max value, and click Scale:
Now the tracks have been scaled to the same maximal level, and the relative depth can be easily inferred and compared. To go back to the default dynamic scaling, click Use Default.
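The difference between dynamic and shared scaling can be worked through with the two maxima from the example above (1070 and 245). The sketch below is an illustration, not the browser's rendering code. The function name, the 100-pixel track height, and the clipping of values above the axis maximum are assumptions.

```python
def bar_height(depth: float, track_max: float, pixel_height: float = 100.0) -> float:
    """Pixel height of a pileup value, given the maximum of the track's y axis.

    Values above `track_max` are clipped to the full track height (assumption).
    """
    if track_max <= 0:
        raise ValueError("track_max must be positive")
    return pixel_height * min(depth, track_max) / track_max


top_peak, bottom_peak = 1070.0, 245.0

# Dynamic scaling: each track uses its own maximum, so both peaks fill the track.
print(bar_height(top_peak, track_max=top_peak))        # 100.0
print(bar_height(bottom_peak, track_max=bottom_peak))  # 100.0

# Shared scaling: both tracks use the same maximum, so heights are comparable.
shared_max = max(top_peak, bottom_peak)
print(bar_height(top_peak, track_max=shared_max))      # 100.0
print(bar_height(bottom_peak, track_max=shared_max))   # roughly 22.9
```

With a shared maximum the bottom peak is drawn at less than a quarter of the height of the top peak, which reflects the actual ratio of sequencing depths.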
Q: I get a spinning wheel (right) next to the track name, and the information in the track never seems to load.
A: The spinning wheel indicates that the browser is fetching data from remote files (to speed up loading, the browser fetches a small portion of each file at a time instead of the entire file). If this problem persists beyond a few seconds, check that you have a stable high-speed internet connection. If you are very zoomed out (viewing >50kb) or have many tracks open, loading of files may be slow; please be patient or close tracks that are not currently in use. Tracks can be closed by unchecking them in the add track window, or by pressing the X next to the track name.
Q: I want to customize this browser more, is it developer friendly?
A: Yes. The browser is built on the Dalliance genome browser; developer information can be found here. You can view the page source of the NA1000 browser to see the configuration used for the default tracks.