PED


The “ped” file format is the widely used format for linkage pedigree data and serves as input for the program PLINK. PLINK is a free, open-source whole-genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.


Program information

  • written in C/C++
  • Mac
  • Windows
  • Unix


Data type handled

  • diploid
  • AFLP
  • MICROSAT
  • Standard


Input Files

  • whitespace-separated (spaces and/or tabs) text file, *.ped
  • each line corresponds to one individual
  • the first 6 columns are mandatory (the IDs are alphanumeric):
    • Family ID
    • Individual ID
    • Paternal ID
    • Maternal ID
    • Sex (1=male; 2=female; any other character=unknown)
    • Phenotype (only one phenotype column is allowed. It can hold either a quantitative trait or an affection status; PLINK detects the type automatically, based on whether a value other than 0, 1, 2 or the missing genotype code is observed)
  • Comment lines start with #
  • Affection status, by default, should be coded:
    • -9 missing
    • 0 missing
    • 1 unaffected
    • 2 affected
  • column 7 onwards: Genotypes
    • any character (e.g.: 1,2,3,4 or A,C,G,T or anything else)
    • missing genotype: 0
    • all markers must be biallelic, with each diploid genotype given as two alleles, one after the other. Either both alleles of a genotype must be missing or neither. Haploid data should be encoded as homozygous diploid genotypes.
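
The column layout above can be sketched as a small parser. This is only an illustrative sketch, not PLINK's own code: the names PedRecord, parse_ped_line and phenotype_is_quantitative are hypothetical, the sample line in the usage note is made up, and the phenotype check compares plain strings, which is a simplification.

```python
from typing import List, NamedTuple, Optional, Tuple

MISSING_ALLELE = "0"  # default missing genotype code described above


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str        # "1" = male, "2" = female, anything else = unknown
    phenotype: str
    genotypes: List[Tuple[str, str]]  # one (allele1, allele2) pair per marker


def parse_ped_line(line: str) -> Optional[PedRecord]:
    """Parse one .ped line; return None for blank lines and # comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()  # splits on any run of spaces and/or tabs
    if len(fields) < 6:
        raise ValueError("expected at least 6 mandatory columns")
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("genotype columns must come in pairs")
    pairs = list(zip(alleles[0::2], alleles[1::2]))
    for a1, a2 in pairs:
        # Either both alleles of a genotype are missing or neither is.
        if (a1 == MISSING_ALLELE) != (a2 == MISSING_ALLELE):
            raise ValueError(f"half-missing genotype: {a1} {a2}")
    return PedRecord(*fields[:6], genotypes=pairs)


def phenotype_is_quantitative(values: List[str]) -> bool:
    """True if any value falls outside the affection codes 0, 1, 2, -9."""
    affection_codes = {"0", "1", "2", "-9"}
    return any(v not in affection_codes for v in values)
```

For example, parse_ped_line("FAM1 IND1 0 0 1 2 A A G T 0 0") yields a record with three genotype pairs, the last one missing, while a line such as "FAM1 IND1 0 0 1 2 A 0" is rejected because only one of the two alleles is missing.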


If explicitly specified, the following columns can be omitted:

  • Family ID
  • Paternal ID and Maternal ID
  • Sex
  • Phenotype
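
A reader for files with omitted columns might look like the sketch below. The function name split_ped_fields and its has_* parameters are hypothetical, and the fill-in values (family ID copied from the individual ID, parents set to 0, sex set to 0, phenotype set to -9) are assumptions made for illustration; check the PLINK documentation for the defaults it actually applies.

```python
from typing import Dict, List


def split_ped_fields(
    fields: List[str],
    has_fid: bool = True,
    has_parents: bool = True,
    has_sex: bool = True,
    has_pheno: bool = True,
) -> Dict[str, object]:
    """Map whitespace-split .ped fields to named columns, filling omitted ones."""
    # Individual ID is always present; the other leading columns are optional.
    n_leading = 1 + has_fid + 2 * has_parents + has_sex + has_pheno
    if len(fields) < n_leading:
        raise ValueError(f"expected at least {n_leading} leading columns")
    it = iter(fields)
    fid = next(it) if has_fid else None
    iid = next(it)
    record: Dict[str, object] = {
        # Assumption: reuse the individual ID when there is no family ID column.
        "family_id": fid if fid is not None else iid,
        "individual_id": iid,
    }
    # Assumption: "0" marks an unknown parent when the columns are omitted.
    record["paternal_id"] = next(it) if has_parents else "0"
    record["maternal_id"] = next(it) if has_parents else "0"
    # Any code other than 1 or 2 means unknown sex; "0" is used here.
    record["sex"] = next(it) if has_sex else "0"
    # -9 is one of the missing codes for affection status listed above.
    record["phenotype"] = next(it) if has_pheno else "-9"
    record["alleles"] = list(it)  # everything left is genotype data
    return record
```

With has_fid=False and has_parents=False, the fields of the line "IND1 1 2 A A G T" are read as individual ID, sex, phenotype and two genotypes.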


Example

  • Lumped data file:
NumIndivs 2
NumLoci 6
Digits 1
Format Lumped
LocusNames sAAT1 sAAT2 sAAT3 ADA1 ADA2 ADH
1 11 11 11 0 11 32
2 21 11 21 11 11 12 
  • NonLumped data file:
NumIndivs 2
NumLoci 6
Digits 1
Format NonLumped
LocusNames sAAT1 sAAT2 sAAT3 ADA1 ADA2 ADH
1 123 143 -1 -1 144 144 120 122 157 158 144 144 
2 135 135 134 140 144 144 120 122 161 161 144 144 
  • AFLP data file (4 Microsat loci, 5 AFLP loci):
NumIndivs 2
NumLoci 9
Digits 1
Format Lumped
LocusNames m1 m2 m3 m4 A1 A2 A3 A4 A5
1 11 12 13 11 + + + - +
2 22 33 11 22 - - 0 - -
3 12 13 13 11 + - - - +

How to cite

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81(3): 559-575.

ped.1307459927.txt.gz · Last modified: 2011/06/07 17:18 by heidi