PLINK is a free, widely used, open-source toolset for whole-genome association analysis, designed by Shaun Purcell. The software is designed to perform a wide range of basic, large-scale genetic analyses flexibly. PLINK currently supports the following functionalities:
* data management;
* basic statistics (FST, missing data, tests of Hardy–Weinberg equilibrium, inbreeding coefficient, etc.);
* linkage disequilibrium (LD) calculation;
* identity by descent (IBD) and identity by state (IBS) matrix calculation;
* analysis of population stratification, such as principal component analysis (PCA);
* association analysis, such as genome-wide association studies (GWAS), for both basic case/control studies and quantitative traits;
* tests for epistasis.
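As an illustration of how several of the analyses listed above are typically requested, the following Python sketch assembles PLINK 1.9 command lines without running them. The helper `plink_command`, the dataset prefix `study`, the output prefixes, and the numeric parameters are hypothetical choices made for this example; the flags themselves (`--bfile`, `--out`, `--missing`, `--hardy`, `--indep-pairwise`, `--genome`, `--pca`, `--assoc`, `--epistasis`) are taken from the PLINK 1.9 command-line interface and should be checked against the documentation for the installed version.

```python
import shlex


def plink_command(bfile, out, *analysis_flags, executable="plink"):
    """Build an argument list for one PLINK run on a binary fileset.

    `bfile` is the shared prefix of the .bed/.bim/.fam files and `out`
    is the prefix PLINK uses for its output files.
    """
    return [executable, "--bfile", bfile, *analysis_flags, "--out", out]


# Hypothetical dataset prefix: expects study.bed, study.bim and study.fam.
DATASET = "study"

commands = {
    # Per-sample and per-variant missing-data rates.
    "missingness": plink_command(DATASET, "qc_missing", "--missing"),
    # Hardy-Weinberg equilibrium test for each variant.
    "hardy_weinberg": plink_command(DATASET, "qc_hwe", "--hardy"),
    # LD-based pruning: window of 50 variants, step 5, r^2 threshold 0.2
    # (illustrative values, not a recommendation).
    "ld_pruning": plink_command(
        DATASET, "pruned", "--indep-pairwise", "50", "5", "0.2"
    ),
    # Pairwise IBS/IBD estimation.
    "ibs_ibd": plink_command(DATASET, "related", "--genome"),
    # First 10 principal components for population stratification.
    "pca": plink_command(DATASET, "pcs", "--pca", "10"),
    # Basic case/control (or quantitative trait) association test.
    "association": plink_command(DATASET, "gwas", "--assoc"),
    # Pairwise epistasis tests.
    "epistasis": plink_command(DATASET, "epi", "--epistasis"),
}

for name, argv in commands.items():
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(f"{name}: {shlex.join(argv)}")
```

Each argument list can be passed to `subprocess.run(argv, check=True)` on a machine where the `plink` executable is installed. Building the list separately keeps the example runnable without PLINK itself.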


Input and output files

PLINK has its own formats for text files (.ped) and binary files (.bed), which serve as input for most analyses. A .map file accompanies a .ped file and provides information about variants, while .bim and .fam files accompany .bed files as part of the binary dataset. Additionally, PLINK accepts VCF, BCF, Oxford-format, and 23andMe files as input, which are typically converted into the binary format before the desired analyses are performed. With certain formats such as VCF, some information, such as phase and dosage, is discarded.

PLINK produces a variety of output files depending on the analysis. It can output files for BEAGLE and can recode a dataset into VCF for analysis in other programs. PLINK is also designed to work in conjunction with R and can output files to be processed by certain R packages.
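To make the binary dataset more concrete, the sketch below decodes the genotype block of a .bed file in Python. It assumes the layout described in the PLINK 1.9 file format reference: a three-byte header (0x6c, 0x1b, 0x01, where the last byte marks variant-major order), followed by two bits per genotype packed four samples to a byte, lowest-order bits first. The function name `decode_bed` and the in-memory example bytes are hypothetical. In practice the sample and variant counts would come from the number of lines in the accompanying .fam and .bim files.

```python
import math

# Header of a variant-major .bed file (assumed from the PLINK 1.9 format reference).
BED_MAGIC = bytes([0x6C, 0x1B, 0x01])

# Two-bit genotype code -> number of copies of the second allele in the .bim file.
# 00 = homozygous first allele, 01 = missing, 10 = heterozygous,
# 11 = homozygous second allele. Missing genotypes are returned as None.
CODE_TO_A2_COUNT = {0b00: 0, 0b01: None, 0b10: 1, 0b11: 2}


def decode_bed(data, n_samples, n_variants):
    """Return a list of per-variant genotype lists from raw .bed bytes."""
    if data[:3] != BED_MAGIC:
        raise ValueError("not a variant-major PLINK .bed file")

    # Each variant occupies a whole number of bytes; the last byte is padded.
    bytes_per_variant = math.ceil(n_samples / 4)
    expected_size = 3 + bytes_per_variant * n_variants
    if len(data) != expected_size:
        raise ValueError(f"expected {expected_size} bytes, got {len(data)}")

    genotypes = []
    for variant in range(n_variants):
        start = 3 + variant * bytes_per_variant
        block = data[start:start + bytes_per_variant]
        row = []
        for sample in range(n_samples):
            # Samples are packed starting from the low-order bits of each byte.
            code = (block[sample // 4] >> (2 * (sample % 4))) & 0b11
            row.append(CODE_TO_A2_COUNT[code])
        genotypes.append(row)
    return genotypes


# Hypothetical dataset: one variant, five samples.
# Byte 0x78 = 0b01111000 holds samples 1-4; byte 0x03 holds sample 5.
example = BED_MAGIC + bytes([0x78, 0x03])
print(decode_bed(example, n_samples=5, n_variants=1))
# -> [[0, 1, 2, None, 2]]
```

The sketch reads the whole file into memory and is meant only to show the encoding. For real datasets, PLINK itself or an established reader library is the more practical choice.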


Extensions and current developments

* PLINK 2.0 is a comprehensive update to PLINK, developed by Christopher Chang, that improves the speed of various genome-wide association (GWA) calculations, including identity-by-state (IBS) matrix calculation, LD-based pruning, and association analysis.
* PLINK/SEQ is an open-source C/C++ library designed for analyzing large-scale whole-genome and whole-exome studies.
* MQFAM is a multivariate test of association that can be applied efficiently to large population-based samples and is implemented in PLINK.


External links


* PLINK 1.07 homepage
* PLINK 1.9 homepage