PED
The “ped” file format is the widely used format for linkage pedigree data and serves as input for the program PLINK. PLINK is a free, open-source whole-genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.
Program information
- written in C/C++
- Mac
- Windows
- Unix
Data type handled
- diploid
- SNP
Input Files
- whitespace-separated (spaces and/or tabs) text file *.ped
- each line corresponds to one individual
- the first 6 columns are mandatory (the IDs are alphanumeric):
Family ID
Individual ID
Paternal ID
Maternal ID
Sex
(1=male; 2=female; any other character=unknown)
Phenotype
(only 1 phenotype! The phenotype can be either a quantitative trait or an affection status column: PLINK will automatically detect which type (i.e. based on whether a value other than 0, 1, 2 or the missing genotype code is observed))
- Comments: line starts with #
- Affection status, by default, should be coded:
- -9 missing
- 0 missing
- 1 unaffected
- 2 affected
- column 7 onwards: Genotypes
- any character (e.g.: 1,2,3,4 or A,C,G,T or anything else)
- missing genotype: 0
- all markers must be biallelic, with two alleles per genotype (diploid). Either both alleles of a genotype should be missing or neither. Haploid data: encode as diploid homozygotes. The two alleles of each marker are listed one after the other.
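The record layout described above can be sketched in Python. This is a minimal illustration under stated assumptions, not PLINK's own parser: the function name parse_ped_line and the dictionary keys are hypothetical, all six leading columns are assumed to be present, and 0 is assumed to be the missing-genotype code.

```python
def parse_ped_line(line, missing="0"):
    """Split one PED record into its six leading fields and allele pairs.

    Returns None for blank lines and for comment lines starting with '#'.
    (Hypothetical helper for illustration; not part of PLINK.)
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None

    fields = stripped.split()  # splits on any run of spaces and/or tabs
    if len(fields) < 6:
        raise ValueError("a PED record needs at least 6 columns")

    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("genotype columns must come in pairs")

    genotypes = []
    for i in range(0, len(alleles), 2):
        a1, a2 = alleles[i], alleles[i + 1]
        # Either both alleles are missing or neither is.
        if (a1 == missing) != (a2 == missing):
            raise ValueError(f"marker {i // 2 + 1}: half-missing genotype {a1} {a2}")
        genotypes.append((a1, a2))

    fid, iid, pat, mat, sex, pheno = fields[:6]
    return {
        "fid": fid, "iid": iid, "pat": pat, "mat": mat,
        "sex": sex, "pheno": pheno, "genotypes": genotypes,
    }
```

Calling parse_ped_line on a record with three markers yields three allele pairs, and a genotype such as "A 0" is rejected because only one of its alleles is missing.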
If explicitly specified, the following columns can be missing:
Family ID
Individual ID
Paternal ID and Maternal ID
Sex
Phenotype
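The phenotype auto-detection rule and the default affection coding described above can be illustrated with a short Python sketch. The names phenotype_kind and AFFECTION are hypothetical, -9 is assumed to be the missing code, and the comparison works on the raw text tokens, which is a simplification of whatever PLINK does internally.

```python
# Default affection coding from the list above: -9 and 0 are missing,
# 1 is unaffected, 2 is affected. (Hypothetical lookup table.)
AFFECTION = {"-9": None, "0": None, "1": "unaffected", "2": "affected"}


def phenotype_kind(values, missing_code="-9"):
    """Guess whether a phenotype column is an affection status or quantitative.

    The column is treated as affection status unless a value other than
    0, 1, 2 or the missing code is observed.
    """
    affection_codes = {"0", "1", "2", missing_code}
    if all(v in affection_codes for v in values):
        return "affection"
    return "quantitative"
```

A column holding only 1, 2 and -9 is classified as affection status, whereas a single value such as 3.7 switches the whole column to quantitative.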
MAP files
- Each line of the MAP file describes a single marker and must contain exactly 4 columns:
- Chromosome (1-22, X, Y, MT or 0 if unplaced)
- rs# or SNP identifier
- Genetic distance (morgans) (missing: 0)
- Base-pair position (bp units) (Base-pair positions are expected to correspond to positive integers within the range of typical human chromosome sizes)
- The MAP file must contain as many markers as are in the PED file.
- The markers in the PED file do not need to be in genomic order, but the order of the markers in the MAP file must match the order of the genotype columns in the PED file.
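The four-column MAP layout described above can be read with a few lines of Python. This is a minimal sketch: parse_map_line and its dictionary keys are hypothetical names chosen for illustration, and the function performs no range check on the chromosome code or the position.

```python
def parse_map_line(line):
    """Parse one MAP record: chromosome, marker ID, genetic distance, position.

    (Hypothetical helper for illustration; not part of PLINK.)
    """
    fields = line.split()  # whitespace-separated, like the PED file
    if len(fields) != 4:
        raise ValueError(f"expected exactly 4 columns, got {len(fields)}")

    chrom, snp_id, morgans, bp = fields
    return {
        "chrom": chrom,             # kept as text: 1-22, X, Y, MT or 0
        "snp_id": snp_id,           # rs# or other SNP identifier
        "morgans": float(morgans),  # genetic distance, 0 if missing
        "bp": int(bp),              # base-pair position
    }
```

Keeping the chromosome as text avoids a failed integer conversion on codes such as X or MT.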
Example
- PED files:
FAM001 1 0 0 1 2 A A G G A C
FAM001 2 0 0 1 2 A A A G 0 0

1 1 0 0 1 1 A A A A A A A A A A
2 1 0 0 1 1 A C A C A C A C A C
3 1 0 0 2 1 A A A A A A A A A A
4 1 0 0 2 1 A C A C A C A C A C
- MAP files:
1 rs123456 0 1234555
1 rs234567 0 1237793
1 rs224534 0 -1237697
1 rs233556 0 1337456

1 snp1 0 1000
X snp2 0 1000
Y snp3 0 1000
XY snp4 0 1000
MT snp5 0 1000
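As a rough consistency check, the five-marker MAP example and the four-individual PED example can be paired up in Python. The variable names are hypothetical and the data are copied from the examples above. The sketch assumes that each PED record has 6 leading columns plus 2 allele columns per MAP marker.

```python
ped_lines = [
    "1 1 0 0 1 1 A A A A A A A A A A",
    "2 1 0 0 1 1 A C A C A C A C A C",
    "3 1 0 0 2 1 A A A A A A A A A A",
    "4 1 0 0 2 1 A C A C A C A C A C",
]
map_lines = [
    "1 snp1 0 1000",
    "X snp2 0 1000",
    "Y snp3 0 1000",
    "XY snp4 0 1000",
    "MT snp5 0 1000",
]

# Marker IDs in MAP order; the PED genotype columns follow the same order.
marker_ids = [line.split()[1] for line in map_lines]

# 6 mandatory columns plus two alleles per marker.
expected_columns = 6 + 2 * len(marker_ids)
columns_ok = all(len(line.split()) == expected_columns for line in ped_lines)

# Label the genotypes of the second PED record with their marker IDs.
fields = ped_lines[1].split()
genotypes = {
    marker_ids[i]: (fields[6 + 2 * i], fields[7 + 2 * i])
    for i in range(len(marker_ids))
}
```

With five markers every PED record is expected to have 16 columns, which holds for all four records here.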
How to cite
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81.
ped.txt · Last modified: 2011/06/08 09:42 by heidi