Getting Started
These simple steps will help you integrate SpliceMap into your transcriptomics analysis pipeline.
- Read the requirements for running SpliceMap.
- Download and set up the SpliceMap package.
- Follow the tutorial to see how SpliceMap works on some example data.
- Check the manual if anything is unclear.
- You're ready. Happy SpliceMapping!
Publication
Kin Fai Au, Hui Jiang, Lan Lin, Yi Xing, and Wing Hung Wong. Detection of splice junctions from paired-end RNA-seq data by SpliceMap.
Nucleic Acids Research, Advance access published on April 5, 2010. [preprint]
Latest News
10-23-2010: SpliceMap 3.3.5.2 Released
This release fixes a bug regarding multi-hits.
- (another) Multi-hit bug fix
10-1-2010: SpliceMap 3.3.5.1 Released
This release fixes a bug regarding multi-hits and has some new features.
- Multi-hit bug fix
- New option "max_multi_hit" specifies the maximum number of multi-hits allowed in the seeding (not the full reads).
- Small formatting issue in SAM/coverage file fixed
9-20-2010: SpliceMap 3.3.5 Released
This release has a large part re-implemented for increased reliability and speed. There are also new options to improve the alignment. Future versions will focus on improving the alignment sensitivity by rescuing some pair reads. In the next point release we will try to associate junctions with an annotated gene name.
- Able to set the maximum number of mismatches in each read
- SAM output shows the number of mismatches in each read
- SAM output shows where the mismatches lie
- SAM output shows the number of soft clipped bases in each read
- Able to set the maximum number of bases clipped in each read
- SAM output shows the number of multiple hits for each read
- SAM output indicates the inferred insert size
- SAM output correctly indicates if a pair is not mapped
- Memory usage much less dependent on total number of reads
- Non-unique coverage is now output by default; this is no longer an option and requires no extra memory
- Includes latest version of Bowtie (0.12.7)
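The SAM additions listed above (mismatch counts, clipping, multiple hits) would normally appear as optional fields after the eleven mandatory SAM columns. The following is a minimal sketch of how such fields could be read, assuming they follow the standard SAM `TAG:TYPE:VALUE` layout. The helper name `optional_tags` and the example tags are hypothetical; the exact tags SpliceMap writes are not listed here, so check the manual before relying on any particular tag name.

```python
def optional_tags(sam_line):
    """Parse the optional TAG:TYPE:VALUE fields of one SAM alignment line.

    Assumes the standard SAM layout: 11 mandatory tab-separated columns
    followed by zero or more optional fields. Integer-typed values
    (TYPE "i") are converted to int; everything else is kept as a string.
    """
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:
        tag, value_type, value = field.split(":", 2)
        tags[tag] = int(value) if value_type == "i" else value
    return tags
```

With a line carrying `NM:i:1`, for instance, `optional_tags(line)["NM"]` would give the integer `1`.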
9-5-2010: SpliceMap 3.3.3.2 Released
This release reduces the memory requirement significantly for large datasets, improves stability and introduces some new features. The next version will further reduce memory consumption.
- Support for almost unlimited number of chromosomes and read files
- Able to read concatenated genome files (you can even mix and match multiple FASTA reference files)
- Checks bowtie index to make sure it is valid
- Able to map 'bad' reads with residual adaptor sequences
- Stability improvements
- Reduced memory consumption
8-23-2010: Memory usage issue
Note that SpliceMap 3.3.3.1 uses over 50GB for 200 million reads. In the next version we will reduce this requirement significantly.
8-19-2010: SpliceMap 3.3.3.1 Released
This release fixed a bug in the SAM file output of quality scores.
8-18-2010: SpliceMap 3.3.3 Released
SpliceMap 3.3.3 brings a number of new features. Most notable is the support for uneven read lengths in both pairs. For example, if reads in your pairs range from 50 to 73 bp due to trimming, SpliceMap will happily align them. Upcoming features in future releases include an "NM" tag for each read in the SAM file, support for concatenated genome files and options for extra specificity from long reads. The changes in 3.3.3 are listed below:
- Support for trimmed reads with uneven length
- Option to run multiple chromosomes at the same time (at the cost of memory) [num_chromosome_together]
- Read quality is now copied to the SAM file
- Read names are copied to the SAM file
- Option to change the location of "temp" and "output" folders
- Non-unique hits are marked with MAPQ = 0 in the SAM file
- Faster alignment by ignoring redundant reads
- Bowtie index is automatically built, if it is not found
- SAM file sorted properly for cufflinks
- Less verbose, output logs redirected to "debug_logs" folder
- More checking of reads/genome
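Because non-unique hits are marked with MAPQ = 0, downstream scripts can separate them from unique hits by looking at the fifth SAM column. The following is a minimal sketch under that assumption. The function names are hypothetical and not part of SpliceMap, and other records may also carry MAPQ 0 (unmapped reads, for example), so treat this as an illustration rather than a definitive filter.

```python
def is_unique_hit(sam_line):
    """Return True if the alignment line has a non-zero MAPQ (column 5)."""
    fields = sam_line.rstrip("\n").split("\t")
    return int(fields[4]) != 0


def unique_alignments(lines):
    """Yield header lines unchanged, plus alignment lines with MAPQ != 0."""
    for line in lines:
        if line.startswith("@") or is_unique_hit(line):
            yield line
```

The generator works on any iterable of lines, so it can be applied directly to an open SAM file handle.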
7-6-2010: SpliceMap 3.3.1.3 Released -- Single-read bug fixed
SpliceMap 3.3.1.3 fixes the single-read bug. Now, you may use single-read data just like pair-reads. The next update of SpliceMap will be quite significant (so may take a while).
- Also fixed a bug with the display of "max_intron" and "min_intron"
7-2-2010: Single-read bug
There has been a bug discovered regarding the use of single-reads. When the new code was added for improved long pair-reads, it was not updated for single-reads by me (John Mu). Very sorry about that! There have been a number of problems as a result of this for people who use single-reads, including not discovering many junctions.
A fixed version will be out in a few days, with an announcement.
Please note: If you use pair-reads, there is nothing to be worried about. It is fine.
6-30-2010: SpliceMap 3.3.1.2 Released
SpliceMap 3.3.1.2 fixes some bugs and addresses compatibility issues. If you have an older system and had problems running previous versions of SpliceMap, this version might help. SAMtools now works properly with the SAM output.
- Removed trailing tab from the SAM output. If you have existing SAM files and do not want to re-run SpliceMap, simply strip off the last tab character and the file will work with SAMtools.
- Addressed a file handling issue with older versions of g++
- Binaries are now compiled by default with an older version of g++ for maximum compatibility. If you have a newer version of Linux, I suggest you recompile the binaries with "install.sh" and "install-bowtie.sh" for maximum performance.
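For existing SAM files written by earlier versions, the trailing-tab fix described above can be scripted. The following is a minimal sketch; the function names and file paths are hypothetical, and it assumes at most one trailing tab needs to be removed from each line.

```python
def strip_trailing_tab(line):
    """Remove one trailing tab from a SAM line, preserving the newline."""
    newline = "\n" if line.endswith("\n") else ""
    body = line.rstrip("\r\n")
    if body.endswith("\t"):
        body = body[:-1]
    return body + newline


def fix_sam(src_path, dst_path):
    """Copy src_path to dst_path, dropping the trailing tab on each line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(strip_trailing_tab(line))
```

Writing to a new file rather than editing in place keeps the original output available in case something goes wrong.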
6-17-2010: SpliceMap 3.3.1 Released
SpliceMap 3.3.1 is a bug fix release. There were some problems with the new wild-card expression for chromosome files so we have reverted to the old one and added an option in the .cfg file.
- Reverted to using "chr*.fa" to detect chromosome by default.
- Added "chromosome_wildcard" option to the .cfg file for people whose chromosome files are named in a different format
For detailed information about this release, please see the release notes.
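To preview which files a wildcard such as "chr*.fa" would select, a shell-style glob match is a reasonable approximation. The following is a minimal sketch; it assumes the "chromosome_wildcard" value behaves like a shell glob, which is not confirmed here, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase


def matching_chromosome_files(filenames, wildcard="chr*.fa"):
    """Return the sorted file names that match a shell-style wildcard.

    The default mirrors the "chr*.fa" pattern mentioned above; matching
    is case-sensitive and applies to the whole file name.
    """
    return sorted(name for name in filenames if fnmatchcase(name, wildcard))
```

Passing the result of `os.listdir()` on your genome directory shows which files would be picked up before you change the .cfg file.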
6-14-2010: SpliceMap 3.3 Released
SpliceMap 3.3 brings major improvements to sensitivity when aligning longer reads. The 3.3.x strain will next add more reliable handling of file formats as well as handling of read quality. In future versions we will add an option to use the extra information in the long reads to improve specificity rather than sensitivity.
- Up to 40%+ improved sensitivity when using long reads (100bp+) compared to SpliceMap 3.2.2, see the effect in features
- Added a section describing the optional filters.
- Added options in the .cfg file:
- Intron size
- clipping of bases from the front (as well as end) of read
- Some minor fixes to the SAM output
For detailed information about this release, please see the release notes.
6-5-2010: SpliceMap 3.2.2 Released
SpliceMap 3.2.2 is another minor release. It fixes some issues people had with compiling on the latest version of g++ (4.4.3). The next version will include some fixes to improve sensitivity with long reads (100bp+).
- Added some libraries
- Added better instructions on how to build from source (it seems that this is necessary for many non-mac users)
For detailed information about this release, please see the release notes.
6-1-2010: SpliceMap 3.2.1 Released
SpliceMap 3.2.1 fixes some minor problems. If you are not currently experiencing issues you can skip this release.
- Misc changes
For detailed information about this release, please see the release notes.
5-31-2010: SpliceMap 3.2 Released
SpliceMap 3.2 includes Bowtie support and usability improvements. The next version will continue to add support for more read formats and mappers. Major changes for this version include:
- Fixed bug in SAM output flags
- Bowtie support for read mapping, which is much faster.
- Mac OS X support. (This is experimental, but considering development was done on a Mac, you should be safe)
- Support for FASTQ and FASTA file formats. However, read quality is not preserved. This will be added in the next version.
- Uses configuration file, instead of command line input. Please see tutorial.
- Updated manual to remove sections that end-users won't really need
- Removed support for running SpliceMap on a single chromosome. However, you can still do this by making your own Bowtie-index and genome directory for that chromosome.
For detailed information about this release, please see the release notes.
5-18-2010: SpliceMap 3.1.1 Released
SpliceMap 3.1.1 brings SAM output support and many general improvements. The next version will add support for more read mappers such as Bowtie and BWA. Major changes for this version include:
- SAM format support, you may now select an option to output the read alignments in SAM format as well as .bed format.
- Optional filtering of the output junctions, as outlined in the paper, is included.
- More accurate coverage output
- .bed output is now colored for easy viewing
- Executable name changed from "SpliceMapCC" to "SpliceMap"
- Fixed bugs relating to the handling of long reads. If your data did not run properly before, please try again.