PED


The “ped” file format is the widely used text format for linkage pedigree data and serves as input to the program PLINK. PLINK is a free, open-source whole-genome association analysis toolset designed to perform a range of basic, large-scale analyses in a computationally efficient manner.


Program information

  • written in C/C++
  • Mac
  • Windows
  • Unix


Data type handled

  • diploid
  • SNP


Input Files

  • whitespace-separated (spaces and/or tabs) text file, *.ped
  • each line corresponds to one individual
  • the first 6 columns are mandatory (the IDs are alphanumeric):
    • Family ID
    • Individual ID
    • Paternal ID
    • Maternal ID
    • Sex (1=male; 2=female; any other character=unknown)
    • Phenotype (exactly one phenotype column. It can be either a quantitative trait or an affection status; PLINK detects the type automatically, based on whether a value other than 0, 1, 2 or the missing phenotype code is observed)
  • Comments: lines starting with # are ignored
  • Affection status, by default, should be coded:
    • -9 missing
    • 0 missing
    • 1 unaffected
    • 2 affected
  • column 7 onwards: Genotypes
    • any character (e.g. 1,2,3,4 or A,C,G,T or anything else) except 0
    • missing genotype: 0
    • all markers must be biallelic, and each genotype is written as two alleles, one after the other. Either both alleles are missing or neither is. Haploid data: encode as homozygous diploid genotypes.
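
The column layout above can be sketched in a few lines of Python. This is a minimal illustration, not PLINK's own parser: the names `PedRecord`, `parse_ped_line` and `phenotype_kind` are made up for this page, and the phenotype check compares the raw text of each value against the default codes -9, 0, 1 and 2, which is a simplification.

```python
from typing import NamedTuple

MISSING_ALLELE = "0"  # default missing-genotype code described above


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str          # "male", "female" or "unknown"
    phenotype: str    # kept as text; its type depends on the whole file
    genotypes: list   # list of (allele1, allele2) tuples


def parse_ped_line(line):
    """Parse one PED line; return None for blank lines and '#' comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()  # splits on any run of spaces and/or tabs
    if len(fields) < 6:
        raise ValueError("expected at least 6 columns")
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("genotype columns must come in pairs of alleles")
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    for a1, a2 in genotypes:
        # Either both alleles are missing or neither is.
        if (a1 == MISSING_ALLELE) != (a2 == MISSING_ALLELE):
            raise ValueError(f"half-missing genotype: {a1} {a2}")
    sex = {"1": "male", "2": "female"}.get(fields[4], "unknown")
    return PedRecord(fields[0], fields[1], fields[2], fields[3],
                     sex, fields[5], genotypes)


def phenotype_kind(values):
    """Guess 'affection' or 'quantitative' from all phenotype values of a file."""
    affection_codes = {"-9", "0", "1", "2"}  # default coding assumed
    if all(value in affection_codes for value in values):
        return "affection"
    return "quantitative"


record = parse_ped_line("FAM001  1  0 0  1  2  A A  G G  A C")
print(record.sex, record.genotypes)
# male [('A', 'A'), ('G', 'G'), ('A', 'C')]
```

Keeping the phenotype as text in the record reflects that its type can only be decided after all individuals have been read.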


If explicitly specified, the following columns can be omitted:

  • Family ID
  • Individual ID
  • Paternal ID and Maternal ID
  • Sex
  • Phenotype
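
A reader for files with omitted columns has to know which leading columns are actually present. The sketch below shows one way to handle that in Python; the function `split_ped_fields` and its `absent` parameter are hypothetical names used only for illustration, and how the omission is declared to PLINK itself is not covered here.

```python
# All six leading columns, in file order.
ALL_COLUMNS = ["family_id", "individual_id", "paternal_id",
               "maternal_id", "sex", "phenotype"]


def split_ped_fields(line, absent=()):
    """Split a PED line whose leading columns exclude the names in `absent`.

    Returns (header, alleles): a dict of the leading columns that are
    present, and the remaining allele fields.
    """
    present = [name for name in ALL_COLUMNS if name not in absent]
    fields = line.split()
    if len(fields) < len(present):
        raise ValueError(f"expected at least {len(present)} leading columns")
    header = dict(zip(present, fields))
    return header, fields[len(present):]


# Example: a line written without the Family ID column.
header, alleles = split_ped_fields("1 0 0 1 2 A A G G", absent=("family_id",))
print(header["individual_id"], alleles)
# 1 ['A', 'A', 'G', 'G']
```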


MAP files

  • Each line of the MAP file describes a single marker and must contain exactly 4 columns:
    • chromosome (1-22, X, Y, XY, MT or 0 if unplaced)
    • rs# or SNP identifier
    • Genetic distance (morgans) (missing: 0)
    • Base-pair position (bp units) (Base-pair positions are expected to correspond to positive integers within the range of typical human chromosome sizes)
  • The MAP file must contain as many markers as are in the PED file.
  • The markers in the PED file do not need to be in genomic order, but the order of the lines in the MAP file must match the order of the markers in the PED file.
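
The four-column rule and the marker-count rule can be checked with a short Python sketch. The names `MapRecord`, `parse_map_line` and `check_marker_count` are illustrative and not part of PLINK; the code only assumes the column layout listed above.

```python
from typing import NamedTuple


class MapRecord(NamedTuple):
    chromosome: str          # kept as text because of codes such as X or MT
    marker_id: str
    genetic_distance: float  # morgans; 0 means missing
    bp_position: int


def parse_map_line(line):
    """Parse one MAP line, which must contain exactly 4 columns."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"MAP line must have exactly 4 columns, got {len(fields)}")
    chromosome, marker_id, distance, position = fields
    return MapRecord(chromosome, marker_id, float(distance), int(position))


def check_marker_count(map_records, ped_genotype_count):
    """Raise if the MAP file and a PED line disagree on the number of markers."""
    if len(map_records) != ped_genotype_count:
        raise ValueError(
            f"MAP lists {len(map_records)} markers, "
            f"PED line has {ped_genotype_count} genotypes"
        )


markers = [parse_map_line("1  rs123456  0  1234555"),
           parse_map_line("1  rs234567  0  1237793")]
check_marker_count(markers, 2)  # passes silently
print(markers[0].marker_id, markers[0].bp_position)
# rs123456 1234555
```

Because markers are matched by position rather than by name, the i-th genotype of every PED line belongs to the i-th line of the MAP file.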


Example

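  • PED files (two independent examples, with 3 and 5 markers respectively):
FAM001  1  0 0  1  2  A A  G G  A C
FAM001  2  0 0  1  2  A A  A G  0 0

1 1 0 0 1   1   A A    A A    A A    A A    A A
2 1 0 0 1   1   A C    A C    A C    A C    A C
3 1 0 0 2   1   A A    A A    A A    A A    A A
4 1 0 0 2   1   A C    A C    A C    A C    A C
  • MAP files (two independent examples; the second lists the 5 markers of the second PED example). In PLINK, a negative base-pair position, as for rs224534, marks a SNP to be excluded from analysis:
1  rs123456  0  1234555
1  rs234567  0  1237793
1  rs224534  0  -1237697
1  rs233556  0  1337456

1    snp1   0   1000
X    snp2   0   1000
Y    snp3   0   1000
XY   snp4   0   1000
MT   snp5   0   1000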
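
As a worked illustration, the five-marker PED lines and the snp1 to snp5 MAP lines from the example above can be joined by position in Python. This is only a sketch: the variable names are made up, and individuals are keyed by the pair (Family ID, Individual ID) on the assumption that this pair identifies a person.

```python
ped_text = """\
1 1 0 0 1   1   A A    A A    A A    A A    A A
2 1 0 0 1   1   A C    A C    A C    A C    A C
3 1 0 0 2   1   A A    A A    A A    A A    A A
4 1 0 0 2   1   A C    A C    A C    A C    A C
"""

map_text = """\
1    snp1   0   1000
X    snp2   0   1000
Y    snp3   0   1000
XY   snp4   0   1000
MT   snp5   0   1000
"""

# Second column of each MAP line is the marker identifier.
marker_ids = [line.split()[1] for line in map_text.splitlines() if line.strip()]

genotypes_by_individual = {}
for line in ped_text.splitlines():
    fields = line.split()
    if not fields:
        continue
    alleles = fields[6:]  # everything after the 6 mandatory columns
    pairs = list(zip(alleles[0::2], alleles[1::2]))
    if len(pairs) != len(marker_ids):
        raise ValueError("PED genotype count does not match MAP marker count")
    # The i-th genotype of a PED line belongs to the i-th MAP line.
    genotypes_by_individual[(fields[0], fields[1])] = dict(zip(marker_ids, pairs))

print(genotypes_by_individual[("2", "1")]["snp2"])
# ('A', 'C')
```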


How to cite

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81.

ped.txt · Last modified: 2011/06/08 09:42 by heidi