unifyMappings {AnnBuilder} | R Documentation |
Given a base file and mappings from different sources, this function resolves the differences among sources in mapping results using a voting scheme and derives unified mapping results for targets in the base file.
unifyMappings(base, ll, ug, otherSrc, fromWeb)
base |
base a matrix with two columns. The first column
contains the target items (genes) to be mapped and the second the
known mappings of the target to GenBank accession numbers or UniGene IDs |
ll |
ll an object of class LL |
ug |
ug an object of class UG |
otherSrc |
otherSrc a vector of character strings for
names of files that also contain mappings of the target genes in
base. The files are assumed to have two columns with the first one
being target genes and second one being the desired mappings |
fromWeb |
fromWeb a boolean to indicate whether the source
data will be read from the web or a local file |
ll and ug have methods to parse the data from LocusLink and UniGene to obtain the desired mappings to target genes in base. Correct source URLs and parsers are needed to obtain the desired mappings.
The function returns a matrix with four columns. The first two are the same as the columns of base, the third contains the unified mappings, and the fourth contains statistics of the agreement among sources.
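This page does not spell out the voting rule or the exact agreement statistic, so the following is only a minimal sketch of one plausible scheme: for each target, take the most frequent non-missing mapping across sources and report how many sources agreed with it. The helper voteMappings, the source column names, and the mapping values are hypothetical and are not part of AnnBuilder.

```r
# Hypothetical helper (not an AnnBuilder function): majority vote per target.
# srcMat is a character matrix with one row per target and one column per source.
voteMappings <- function(srcMat) {
    unified <- character(nrow(srcMat))
    agreement <- integer(nrow(srcMat))
    for (i in seq_len(nrow(srcMat))) {
        votes <- srcMat[i, ]
        votes <- votes[!is.na(votes)]          # sources without a mapping do not vote
        if (length(votes) == 0) {
            unified[i] <- NA                   # no source offered a mapping
            agreement[i] <- 0L
        } else {
            counts <- table(votes)
            # which.max returns the first maximum, so ties go to the
            # value that sorts first (an arbitrary choice for this sketch)
            unified[i] <- names(counts)[which.max(counts)]
            agreement[i] <- as.integer(max(counts))
        }
    }
    data.frame(UNIFIED = unified, AGREEMENT = agreement,
               stringsAsFactors = FALSE)
}

# Made-up mappings from three sources for three probes
srcMat <- matrix(c("2",  "2",  NA,
                   "10", "11", "10",
                   NA,   NA,   NA),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(c("32469_at", "38912_at", "38936_at"),
                                 c("ll", "ug", "srcone")))
votes <- voteMappings(srcMat)
votes
```

In this sketch the agreement column is simply the number of sources that voted for the winning mapping; the statistic reported by unifyMappings may be defined differently.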
This function is part of the Bioconductor project at Dana-Farber Cancer Institute to provide Bioinformatics functionalities through R.
Jianhua Zhang
myDir <- file.path(.path.package("AnnBuilder"), "temp")

# Base mappings: probe ids and their GenBank accession numbers
geneNMap <- matrix(c("32468_f_at", "D90278", "32469_at", "L00693",
                     "32481_at", "AL031663", "33825_at", " X68733",
                     "35730_at", "X03350", "36512_at", "L32179",
                     "38912_at", "D90042", "38936_at", "M16652",
                     "39368_at", "AL031668"), ncol = 2, byrow = TRUE)
colnames(geneNMap) <- c("PROBE", "ACCNUM")
write.table(geneNMap, file = file.path(myDir, "geneNMap"), sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# First additional source of mappings
temp <- matrix(c("32468_f_at", NA, "32469_at", "2", "32481_at", NA,
                 "33825_at", " 9", "35730_at", "1576", "36512_at", NA,
                 "38912_at", "10", "38936_at", NA, "39368_at", NA),
               ncol = 2, byrow = TRUE)
temp
write.table(temp, file = file.path(myDir, "srcone"), sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Second additional source of mappings
temp <- matrix(c("32468_f_at", NA, "32469_at", NA, "32481_at", "7051",
                 "33825_at", NA, "35730_at", NA, "36512_at", "1084",
                 "38912_at", NA, "38936_at", NA, "39368_at", "89"),
               ncol = 2, byrow = TRUE)
temp
write.table(temp, file = file.path(myDir, "srctwo"), sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

otherMapping <- c(srcone = file.path(myDir, "srcone"),
                  srctwo = file.path(myDir, "srctwo"))
baseFile <- file.path(myDir, "geneNMap")
llParser <- file.path(.path.package("AnnBuilder"), "data", "gbLLParser")
ugParser <- file.path(.path.package("AnnBuilder"), "data", "gbUGParser")

# Read the source data from the web on unix, from local files otherwise
if(.Platform$OS.type == "unix"){
    llUrl <- "http://www.bioconductor.org/datafiles/wwwsources/Tll_tmpl.gz"
    ugUrl <- "http://www.bioconductor.org/datafiles/wwwsources/Ths.data.gz"
    fromWeb = TRUE
}else{
    llUrl <- file.path(.path.package("AnnBuilder"), "data", "Tll_tmpl")
    ugUrl <- file.path(.path.package("AnnBuilder"), "data", "Ths.data")
    fromWeb = FALSE
}
ll <- LL(srcUrl = llUrl, parser = llParser, baseFile = baseFile)
ug <- UG(srcUrl = ugUrl, parser = ugParser, baseFile = baseFile,
         organism = "human")

# Only works interactively
if(interactive()){
    unified <- unifyMappings(base = geneNMap, ll = ll, ug = ug,
                             otherSrc = otherMapping, fromWeb = fromWeb)
    read.table(unified, sep = "\t", header = FALSE)
    unlink(c(file.path(myDir, "geneNMap"), file.path(myDir, "srcone"),
             file.path(myDir, "srctwo"), unified))
}