


Site by site output for linkage model.
When the SITEBYSITE option is chosen, the output file contains posterior
population of origin assignments for each allele copy at each locus for
each individual. For large datasets, this information can require several
megabytes to store. Each line shows the assignment probabilities for one
locus for one individual. The first two columns of the line indicate the
number of the individual (ranging from 1 to NUMINDS) and the number of
the locus (ranging from 1 to NUMLOCI) in the order that they occur in the
data file. If the data file contains labels for each individual or marker
names for each locus, these are given in subsequent columns. The format
of the posterior assignment probabilities depends on the parameter combinations.
If LINKAGE=0 or PHASED=1 then the first K columns of output give the probability
that the first allele copy at the locus comes from populations 1..K. For
diploid or polyploid data, analogous probabilities for subsequent allele
copies are shown in further columns. If the linkage model is used (LINKAGE=1)
and the data is not fully phased (PHASED=0), the posterior assignment probabilities
for the allele copies at each locus can be strongly co-dependent. Structure
therefore outputs joint assignment probabilities for the two allele copies,
implying K^2 entries for each locus (note that this option is not available for
PLOIDY > 2).
If MARKOVPHASE=1 then the first K columns give the probabilities that the
first allele copy in the datafile is in population 1 and the second allele
copy is in population 1..K, with subsequent columns relating to probabilities
with the first allele copy in populations 2..K. If MARKOVPHASE=0, then
instead of referring to the first and second listed allele copies in the
data file, the probabilities refer to the population of origin of maternal
and paternal strands. If there is no phase information (PHASEINFO=0), then
the posterior probability matrix should theoretically be symmetric, such
that the probability that the maternal allele is in population i
and the paternal allele is in population j
will be equal to the probability that the maternal allele is in population j
and the paternal allele is in population i.
In practice, because MCMC is used to estimate the matrix, there will be
noticeable deviations from symmetry if NUMREPS is small. For example, suppose
that the below is site-by-site output for two loci for a diploid individual
with no phase information, with MARKOVPHASE=0.
| 1 | 1 | locX1 | Ind1 | 5.65E-4 | 2.18E-6 | 7.95E-3 | 2.16E-6 | 1.22E-5 | 7.07E-4 | 7.95E-3 | 7.89E-4 | 9.82E-1 |
| 1 | 2 | locX2 | Ind1 | 5.20E-4 | 1.47E-6 | 7.93E-3 | 3.25E-6 | 1.33E-5 | 6.91E-4 | 8.01E-3 | 7.97E-4 | 9.82E-1 |
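The layout of one such line can be unpacked with a short script. The following is a minimal sketch, not part of Structure itself: the function name parse_joint_line and its n_labels parameter are hypothetical, and it assumes that the fields are separated by whitespace, that the number of label columns is known in advance, and that the joint format (LINKAGE=1, PHASED=0) is in use, so that each line ends with K^2 probabilities.

```python
def parse_joint_line(line, k, n_labels=2):
    """Split one site-by-site line into its parts (illustrative helper).

    Assumes whitespace-separated fields: individual number, locus number,
    n_labels label columns, then k*k joint probabilities.
    """
    fields = line.split()
    individual = int(fields[0])
    locus = int(fields[1])
    labels = fields[2:2 + n_labels]
    values = [float(x) for x in fields[2 + n_labels:]]
    if len(values) != k * k:
        raise ValueError("expected %d probabilities, found %d" % (k * k, len(values)))
    # Row r holds the k entries in which the first allele copy (or the
    # maternal strand when MARKOVPHASE=0) is in population r+1.
    joint = [values[r * k:(r + 1) * k] for r in range(k)]
    return individual, locus, labels, joint


line = ("1 1 locX1 Ind1 5.65E-4 2.18E-6 7.95E-3 "
        "2.16E-6 1.22E-5 7.07E-4 7.95E-3 7.89E-4 9.82E-1")
individual, locus, labels, joint = parse_joint_line(line, k=3)
for row in joint:
    print(row)
```

With K=3, the nine values of the first line fill a 3 x 3 matrix whose rows are indexed by the population of the first (here maternal) copy and whose columns are indexed by the population of the second (paternal) copy.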
To calculate the assignment probabilities of the maternal and paternal
allele copies at the first locus, the numbers are summed as follows:
| locus 1                                 | pop1     | pop2     | pop3     | origin of maternal (X) chromosome |
| pop1                                    | 5.65E-04 | 2.18E-06 | 7.95E-03 | 8.52E-03                          |
| pop2                                    | 2.16E-06 | 1.22E-05 | 7.07E-04 | 7.21E-04                          |
| pop3                                    | 7.95E-03 | 7.89E-04 | 9.82E-01 | 9.91E-01                          |
| origin of paternal chromosome (missing) | 8.52E-03 | 8.03E-04 | 9.91E-01 |                                   |
In this example, the data is from an X chromosome of a male, so in fact
the second allele copy is missing.
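The summation for the first locus, together with the symmetry property mentioned earlier, can be reproduced in a few lines. This is only an illustrative sketch: the function names marginals and max_asymmetry are hypothetical, and the matrix is typed in by hand from the example, with rows taken as the maternal population and columns as the paternal population.

```python
# Joint assignment probabilities for locus 1 of the example (K = 3).
# Rows: population of the maternal copy; columns: population of the paternal copy.
joint = [
    [5.65e-4, 2.18e-6, 7.95e-3],
    [2.16e-6, 1.22e-5, 7.07e-4],
    [7.95e-3, 7.89e-4, 9.82e-1],
]


def marginals(matrix):
    """Return (row sums, column sums) of a square joint matrix."""
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    return row_sums, col_sums


def max_asymmetry(matrix):
    """Largest absolute difference between entry (a, b) and entry (b, a)."""
    k = len(matrix)
    return max(abs(matrix[a][b] - matrix[b][a])
               for a in range(k) for b in range(k))


maternal, paternal = marginals(joint)
print(["%.2E" % p for p in maternal])   # ['8.52E-03', '7.21E-04', '9.91E-01']
print(["%.2E" % p for p in paternal])   # ['8.52E-03', '8.03E-04', '9.91E-01']
print("%.2E" % max_asymmetry(joint))    # 8.20E-05
```

The row sums give the maternal assignment probabilities and the column sums the paternal ones. The small asymmetry between the pop2/pop3 entries illustrates the deviations that arise because the matrix is estimated by MCMC.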
William Wen 2004-07-13