The Chimpanzee Genome Project was an effort to determine the
DNA
sequence of the
chimpanzee
genome
. Sequencing began in 2005 and by 2013 twenty-four individual chimpanzees had been sequenced. This project was folded into the
Great Ape Genome Project.

In 2013 high resolution sequences were published from each of the four recognized
chimpanzee subspecies:
Central chimpanzee, ''Pan troglodytes troglodytes'', 10 sequences;
Western chimpanzee
, ''Pan troglodytes verus'', 6 sequences;
Nigeria-Cameroon chimpanzee, ''Pan troglodytes ellioti'', 4 sequences; and
Eastern chimpanzee
, ''Pan troglodytes schweinfurthii'', 4 sequences. They were all sequenced to a mean of 25-fold coverage per individual.
The research showed considerable genome diversity in chimpanzees with many population-specific traits. The central chimpanzees retain the highest diversity in the chimpanzee lineage, whereas the other subspecies demonstrate signs of
population bottleneck
s.
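As a quick consistency check, the per-subspecies counts listed above can be tallied against the stated total of twenty-four individuals. The following is a minimal Python sketch; the variable names are chosen only for illustration and are not part of the project's own tooling.

```python
# Per-subspecies counts of sequenced individuals, as listed in the text above.
samples = {
    "Pan troglodytes troglodytes": 10,     # central chimpanzee
    "Pan troglodytes verus": 6,            # western chimpanzee
    "Pan troglodytes ellioti": 4,          # Nigeria-Cameroon chimpanzee
    "Pan troglodytes schweinfurthii": 4,   # eastern chimpanzee
}

# Total number of individuals and each subspecies' share of that total.
total = sum(samples.values())
shares = {name: count / total for name, count in samples.items()}

for name, count in samples.items():
    print(f"{name}: {count} individuals ({shares[name]:.0%} of the sample)")
print("total individuals:", total)
```

The total matches the twenty-four individuals reported for 2013, with central chimpanzees making up the largest share of the sample.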
Background
Human
and
chimpanzee
chromosome
s are very alike. The primary difference is that humans have one fewer pair of chromosomes than do other
great apes
. Humans have 23 pairs of chromosomes and other great apes have 24 pairs of chromosomes. In the human evolutionary lineage, two ancestral ape chromosomes fused at their
telomere
s, producing human
chromosome 2
. There are nine other major chromosomal differences between chimpanzees and humans: chromosome segment inversions on human chromosomes 1, 4, 5, 9, 12, 15, 16, 17, and 18. After the completion of the
Human genome project
, a
common chimpanzee
genome project was initiated. In December 2003, a preliminary analysis of 7600 genes shared between the two genomes confirmed that certain genes such as the
forkhead-box P2 transcription factor
, which is involved in speech development, are different in the human lineage. Several genes involved in hearing were also found to have changed during human evolution, suggesting selection involving human
language
-related behavior. Differences between individual humans and common chimpanzees are estimated to be about 10 times the typical difference between pairs of humans.
Another study showed that patterns of DNA methylation, which are a known regulation mechanism for gene expression, differ in the prefrontal cortex of humans versus chimpanzees, and implicated this difference in the evolutionary divergence of the two species.
Draft genome sequence of the common chimpanzee
An analysis of the chimpanzee genome sequence was published in ''
Nature
'' on September 1, 2005, in an article produced by the
Chimpanzee Sequencing and Analysis Consortium, a group of scientists which is supported in part by the
National Human Genome Research Institute
, one of the
National Institutes of Health
. The article marked the completion of the draft genome sequence.
A database now exists containing the genetic differences between human and chimpanzee genes, with about thirty-five million
single-nucleotide changes, five million
insertion/deletion events, and various
chromosomal
rearrangements.
Gene duplication
s account for most of the sequence differences between humans and chimps. Single-base-pair substitutions account for about half as much genetic change as does gene duplication.
Typical human and chimpanzee
homologs
of
protein
s differ in only an average of two
amino acid
s. About 30 percent of all human proteins are identical in sequence to the corresponding chimpanzee protein. As mentioned above, gene duplications are a major source of differences between human and chimpanzee genetic material: about 2.7 percent of the genome now differs as a result of gene duplications or deletions during the approximately 6 million years since humans and chimpanzees diverged from their common evolutionary ancestor. The comparable variation within human populations is 0.5 percent.
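The figures quoted above can be related to one another with simple arithmetic. The sketch below is illustrative only: the counts and percentages come from the text, but the genome length of 3 billion base pairs is an assumption that the article does not state, and the function and variable names are hypothetical.

```python
# Figures taken from the text above.
SNV_COUNT = 35_000_000            # single-nucleotide changes between the genomes
DUP_DEL_FRACTION = 0.027          # 2.7 percent: duplications or deletions
HUMAN_DUP_DEL_FRACTION = 0.005    # 0.5 percent: comparable variation within humans

# ASSUMPTION: a round genome length of 3 billion base pairs. This value is not
# given in the article; changing it changes the results below proportionally.
ASSUMED_GENOME_BP = 3.0e9


def substitution_fraction(snv_count: int, genome_bp: float) -> float:
    """Fraction of genome positions that differ by a single-nucleotide change."""
    return snv_count / genome_bp


sub_frac = substitution_fraction(SNV_COUNT, ASSUMED_GENOME_BP)

# How substitutions compare with duplication/deletion differences.
ratio = sub_frac / DUP_DEL_FRACTION

# How between-species duplication/deletion differences compare with the
# comparable variation within human populations.
between_vs_within = DUP_DEL_FRACTION / HUMAN_DUP_DEL_FRACTION

print(f"substitution fraction: {sub_frac:.2%}")
print(f"substitutions relative to duplications/deletions: {ratio:.2f}")
print(f"between-species vs within-human duplication differences: {between_vs_within:.1f}x")
```

Under the assumed genome length, single-nucleotide changes affect a little over 1 percent of positions, somewhat under half of the 2.7 percent attributed to duplications and deletions. This is in line with the statement that substitutions account for about half as much change as gene duplication.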
About 600 genes were identified that may have been undergoing strong positive selection in the human and chimpanzee lineages; many of these genes are involved in
immune system
defense against microbial disease (example:
granulysin is protective against ''
Mycobacterium tuberculosis
''
) or are targeted receptors of pathogenic microorganisms (example:
Glycophorin C
and ''
Plasmodium falciparum
''). By comparing human and chimpanzee genes to the genes of other mammals, it has been found that genes coding for
transcription factors
, such as forkhead-box P2 (
FOXP2
), have often evolved faster in humans than in chimpanzees; relatively small changes in these genes may account for the morphological differences between humans and chimpanzees. A set of 348 transcription factor genes codes for proteins with an average of about 50 percent more amino acid changes in the human lineage than in the chimpanzee lineage.
Six human chromosomal regions were found that may have been under particularly strong and coordinated selection during the past 250,000 years. These regions contain at least one marker
allele
that seems unique to the human lineage while the entire chromosomal region shows lower than normal genetic variation. This pattern suggests that one or a few strongly selected genes in the chromosome region may have been preventing the random accumulation of neutral changes in other nearby genes. One such region on chromosome 7 contains the
FOXP2
gene (mentioned above) and this region also includes the
Cystic fibrosis transmembrane conductance regulator
(CFTR) gene, which is important for ion transport in tissues such as the salt-secreting epithelium of sweat glands. Human mutations in the CFTR gene might be selected for as a way to survive
cholera
.
Another such region on chromosome 4 may contain elements regulating the expression of a nearby
protocadherin
gene that may be important for
brain
development and function. Although genes expressed in the brain have, on average, changed less in expression than genes expressed in other organs (such as the liver), gene expression changes in the brain have been more dramatic in the human lineage than in the chimpanzee lineage.
This is consistent with the dramatic divergence of human brain development from the ancestral great ape pattern. The protocadherin-beta gene cluster on chromosome 5 also shows evidence of possible positive selection.
Results from the human and chimpanzee genome analyses should help in understanding some human diseases. Humans appear to have lost a functional
Caspase 12 gene, which in other primates codes for an enzyme that may protect against
Alzheimer's disease
.
Genes of the chromosome 2 fusion site
The results of the chimpanzee genome project suggest that when ancestral chromosomes 2A and 2B fused to produce human chromosome 2, no genes were lost from the fused ends of 2A and 2B. At the site of fusion, there are approximately 150,000 base pairs of sequence not found in chimpanzee chromosomes 2A and 2B. Additional linked copies of the PGML/FOXD/CBWD genes exist elsewhere in the human genome, particularly near the p end of
chromosome 9
. This suggests that a copy of these genes may have been added to the end of the ancestral 2A or 2B prior to the fusion event. It remains to be determined if these inserted genes confer a selective advantage.
*''PGM5P4''. The
phosphoglucomutase
pseudogene of human chromosome 2. This gene is incomplete and does not produce a functional transcript.
*''FOXD4L1''. The
forkhead box D4-like gene is an example of an intronless gene. The function of this gene is not known, but it may code for a transcription control protein.
*''CBWD2''. Cobalamin synthetase is a bacterial enzyme that makes
vitamin B12. In the distant past, a common ancestor to mice and apes incorporated a copy of a cobalamin synthetase gene (see:
Horizontal gene transfer
). Humans are unusual in that they have several copies of cobalamin synthetase-like genes, including the one on chromosome 2. It remains to be determined what the function of these human cobalamin synthetase-like genes is. If these genes are involved in vitamin B12 metabolism, this could be relevant to human evolution. A major change in human development is greater post-natal brain growth than is observed in other apes. Vitamin B12 is important for brain development, and vitamin B12 deficiency during brain development results in severe neurological defects in human children.
*''WASH2P''. Several
transcripts of unknown function corresponding to this region have been isolated. This region is also present in the closely related chromosome 9p terminal region that contains copies of the PGML/FOXD/CBWD genes.
*''RPL23AP7''. Many
ribosomal protein L23a pseudogene
s are scattered through the human genome.
See also
*
Human evolutionary genetics
*
Human Genome Project