GAIN: Efficient genome ancestry inference in complex pedigrees with inbreeding

[ Paper in ISMB 2010 ]

[ Download the software ]


Usage: GAIN.exe <FounderFile> <StrainFile> <FunnelFile> <#Generation> [options]

        -cm: file containing genetic locations of SNP markers (in cM)
        -bp: file containing physical locations of SNP markers
        -s: m/f to indicate gender
        -e: estimated error rate
        -x: X chromosome
        -of: output format (0/1)
        Either -cm or -bp must be used.


( The program models only chromosomes 1-19 accurately. The calculation on
the X chromosome is a crude approximation. )


Example:
GAIN.exe Founders.txt Strain.txt Funnel.txt 11 -bp Bp_location.txt -e 0.01
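
When GAIN is run on many strains, it can help to assemble the command line in a script. The following is a minimal Python sketch that uses only the arguments and options listed in the usage text above. The function name build_gain_command and its parameter names are hypothetical, and it assumes "either one of -cm or -bp" means that exactly one of the two is given.

```python
def build_gain_command(founder_file, strain_file, funnel_file, generations,
                       *, cm_file=None, bp_file=None, error_rate=None,
                       executable="GAIN.exe"):
    """Return the GAIN argument list for one strain (hypothetical helper)."""
    # Assumption: exactly one of -cm / -bp is supplied.
    if (cm_file is None) == (bp_file is None):
        raise ValueError("supply exactly one of cm_file or bp_file")

    # Positional arguments in the order given by the usage line.
    cmd = [executable, founder_file, strain_file, funnel_file, str(generations)]

    if bp_file is not None:
        cmd += ["-bp", bp_file]
    if cm_file is not None:
        cmd += ["-cm", cm_file]
    if error_rate is not None:
        cmd += ["-e", str(error_rate)]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Python standard library to launch the program.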


Input Format:
All SNP values must be one of 0, 1, 2, N (not A/T/C/G).

Founder File: The genotype sequences of the 8 founders are to be provided in 8 comma-separated columns.
No missing values are allowed in the founder file. Only 0 and 1 can appear in the founder file
(since all founders are assumed to be fully inbred).
0: allele 1
1: allele 2
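
The founder file rules above can be checked before running GAIN. The following is a minimal Python sketch, not part of the GAIN distribution. It assumes each line holds one SNP with one value per founder and that blank lines may be skipped. The function name parse_founder_lines is hypothetical.

```python
def parse_founder_lines(lines, n_founders=8):
    """Validate founder rows and return them as lists of ints (0 or 1)."""
    rows = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        values = [v.strip() for v in line.split(",")]
        if len(values) != n_founders:
            raise ValueError(
                f"line {lineno}: expected {n_founders} columns, got {len(values)}"
            )
        # Founders are fully inbred, so only 0 and 1 are allowed (no 2, no N).
        if any(v not in ("0", "1") for v in values):
            raise ValueError(f"line {lineno}: founder values must be 0 or 1")
        rows.append([int(v) for v in values])
    return rows
```

An open file object can be passed as lines, for example parse_founder_lines(open("Founders.txt")).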

Example:


Strain Genotype File:
Each genotype value of the strain to be analyzed must be one of 0, 1, 2, N.
Each line consists of one value:
0: both allele 1
1: allele 1 + allele 2
2: both allele 2
N: unknown
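
The strain genotype coding can be validated the same way. The sketch below is an illustration only. It assumes one value per line with blank lines skipped, and the names parse_strain_lines and summarize_genotypes are hypothetical.

```python
from collections import Counter

VALID_GENOTYPES = {"0", "1", "2", "N"}

def parse_strain_lines(lines):
    """Return the genotype codes, one per SNP, after validating each line."""
    genotypes = []
    for lineno, line in enumerate(lines, start=1):
        value = line.strip()
        if not value:
            continue  # assumption: blank lines are ignored
        if value not in VALID_GENOTYPES:
            raise ValueError(f"line {lineno}: invalid genotype {value!r}")
        genotypes.append(value)
    return genotypes

def summarize_genotypes(genotypes):
    """Count how often each code occurs, e.g. to see how many calls are N."""
    return Counter(genotypes)
```

Counting the N entries gives a quick view of how much of the strain is missing before the file is analyzed.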

Example:


Funnel Configuration File:
The file contains the funnel order of the 8 founders.
Example:


Genetic/Physical location file:
The file contains the genetic or physical locations of all SNP sites.
Example:



Output:
The result is stored in "<StrainFile>.prob".

Format 0:
Each line consists of segments in the pattern of: "(Founder X, Founder Y), probability"

Format 1:
Each line consists of 36 probability values. They represent the probabilities of the founder pairs, in this order:
AA,BA,BB,CA,CB,CC,DA,DB,DC,DD,EA,EB,EC,ED,EE,FA,FB,FC,FD,FE,FF,GA,GB,GC,GD,GE,GF,GG,HA,HB,HC,HD,HE,HF,HG,HH
where ABCDEFGH are the 8 founders in the founder file.
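
The 36 pairs in Format 1 follow a lower-triangular ordering: each founder is paired with every founder up to and including itself. The following Python sketch rebuilds that order and labels one line of output. It assumes the values on a line are separated by whitespace or commas, which the description above does not specify. The function names are hypothetical.

```python
FOUNDERS = "ABCDEFGH"

# AA, BA, BB, CA, CB, CC, ... HH: founder i paired with founders 0..i.
PAIR_LABELS = [FOUNDERS[i] + FOUNDERS[j] for i in range(8) for j in range(i + 1)]

def parse_format1_line(line):
    """Map one Format 1 line to {pair label: probability}."""
    # Assumption: fields are separated by whitespace and/or commas.
    fields = line.replace(",", " ").split()
    if len(fields) != len(PAIR_LABELS):
        raise ValueError(f"expected {len(PAIR_LABELS)} values, got {len(fields)}")
    return dict(zip(PAIR_LABELS, (float(f) for f in fields)))

def most_likely_pair(probabilities):
    """Return the founder pair label with the highest probability."""
    return max(probabilities, key=probabilities.get)
```

Applying most_likely_pair to each parsed line of "<StrainFile>.prob" gives a simple hard call of the founder pair per line.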