A zinc finger is a small protein structural motif that is characterized by the coordination of one or more zinc ions (Zn2+), which stabilize the fold. The term ''zinc finger'' was originally coined to describe the finger-like appearance of a hypothesized structure from the African clawed frog (''Xenopus laevis'') transcription factor IIIA. However, it has since been found to encompass a wide variety of differing protein structures in eukaryotic cells.
''Xenopus laevis'' TFIIIA was originally demonstrated to contain zinc and to require the metal for function in 1983, the first such reported zinc requirement for a gene regulatory protein, followed soon thereafter by the Krüppel factor in ''Drosophila''.
The zinc finger often appears as a metal-binding domain in multi-domain proteins.
Proteins that contain zinc fingers (zinc finger proteins) are classified into several different structural families. Unlike many other clearly defined supersecondary structures such as Greek keys or β hairpins, there are a number of types of zinc fingers, each with a unique three-dimensional architecture. A particular zinc finger protein's class is determined by its three-dimensional structure, but it can also be recognized based on the primary structure of the protein or the identity of the ligands coordinating the zinc ion. In spite of the large variety of these proteins, however, the vast majority function as interaction modules that bind DNA, RNA, proteins, or other small molecules, and variations in structure serve primarily to alter the binding specificity of a particular protein.
Since their original discovery and the elucidation of their structure, these interaction modules have proven ubiquitous in the biological world and may be found in 3% of the genes of the human genome.
In addition, zinc fingers have become extremely useful in various therapeutic and research capacities. Engineering zinc fingers to have an affinity for a specific sequence is an area of active research, and zinc finger nucleases and zinc finger transcription factors are two of the most important applications of this to be realized to date.
History
Zinc fingers were first identified in a study of transcription in the African clawed frog (''Xenopus laevis'') in the laboratory of Aaron Klug. A study of the transcription of a particular RNA sequence revealed that the binding strength of a small transcription factor (transcription factor IIIA; TFIIIA) was due to the presence of zinc-coordinating finger-like structures.
Amino acid sequencing of TFIIIA revealed nine tandem sequences of 30 amino acids, including two invariant pairs of cysteine and histidine residues.
Extended X-ray absorption fine structure confirmed the identity of the zinc ligands: two cysteines and two histidines.
The DNA-binding loops formed by the coordination of these ligands by zinc were thought to resemble fingers, hence the name.
This was followed soon thereafter by the discovery of the Krüppel factor in ''Drosophila'' by the Schuh team in 1986.
More recent work in the characterization of proteins in various organisms has revealed the importance of zinc ions in polypeptide stabilization.
The crystal structures of zinc finger-DNA complexes solved in 1991 and 1993 revealed the canonical pattern of interactions of zinc fingers with DNA.
The binding of zinc fingers was found to be distinct from that of many other DNA-binding proteins, which bind DNA through the 2-fold symmetry of the double helix; instead, zinc fingers are linked linearly in tandem to bind nucleic acid sequences of varying lengths.
Zinc fingers often bind to a sequence of DNA known as the GC box. The modular nature of the zinc finger motif allows a large number of combinations of DNA and RNA sequences to be bound with a high degree of affinity and specificity, and the motif is therefore well suited to engineering proteins that can be targeted to, and bind, specific DNA sequences. In 1994, it was shown that an artificially constructed three-finger protein can block the expression of an oncogene in a mouse cell line. Zinc fingers fused to various other effector domains, some with therapeutic significance, have since been constructed.
Such was the importance of this work that "the zinc-finger motif" was cited in the Scientific Background to the 2024 Nobel Prize in Chemistry (awarded to David Baker, Demis Hassabis, and John M. Jumper for computational protein design and protein structure prediction).
Domain
Zinc finger (Znf) domains are relatively small protein motifs that contain multiple finger-like protrusions that make tandem contacts with their target molecule. Some of these domains bind zinc, but many do not, instead binding other metals such as iron, or no metal at all. For example, some family members form salt bridges to stabilise the finger-like folds. They were first identified as a DNA-binding motif in transcription factor TFIIIA from ''Xenopus laevis'' (African clawed frog); however, they are now recognised to bind DNA, RNA, protein, and/or lipid substrates.
Their binding properties depend on the amino acid sequence of the finger domains and on the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. Znf motifs occur in several unrelated protein superfamilies, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g., some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organization, epithelial development, cell adhesion, protein folding, chromatin remodeling, and zinc sensing, to name but a few.
Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.
Classes
Initially, the term zinc finger was used solely to describe the DNA-binding motif found in ''Xenopus laevis''; however, it is now used to refer to any number of structures related by their coordination of a zinc ion. In general, zinc fingers coordinate zinc ions with a combination of cysteine and histidine residues. Originally, the number and order of these residues were used to classify different types of zinc fingers (e.g., Cys2His2, Cys4, and Cys6). More recently, a more systematic method has been used instead: it classifies zinc finger proteins into "fold groups" based on the overall shape of the protein backbone in the folded domain. The most common fold groups of zinc fingers are the Cys2His2-like (the "classic zinc finger"), treble clef, and zinc ribbon.
The following table shows the different structures and their key features:
Cys2His2
The Cys2His2-like fold group (C2H2) is by far the best-characterized class of zinc fingers, and is common in mammalian transcription factors. Such domains adopt a simple ββα fold and have the amino acid sequence motif:
:X2-Cys-X2,4-Cys-X12-His-X3,4,5-His
This class of zinc fingers can have a variety of functions such as binding RNA and mediating protein-protein interactions, but is best known for its role in sequence-specific DNA-binding proteins such as Zif268 (Egr1). In such proteins, individual zinc finger domains typically occur as tandem repeats with two, three, or more fingers comprising the DNA-binding domain of the protein. These tandem arrays can bind in the major groove of DNA and are typically spaced at 3-bp intervals. The α-helix of each domain (often called the "recognition helix") can make sequence-specific contacts to DNA bases; residues from a single recognition helix can contact four or more bases to yield an overlapping pattern of contacts with adjacent zinc fingers.
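The sequence motif given above can be read as a pattern over one-letter amino acid codes. The following is a minimal sketch of such a scan, assuming that "X2,4" means "2 or 4 residues" and "X3,4,5" means "3, 4, or 5 residues"; the function name and the test peptide are illustrative and not taken from any database. A pattern match only flags a candidate; as noted in the text, class membership is ultimately determined by three-dimensional structure.

```python
import re

# Pattern transcribed from the motif X2-Cys-X2,4-Cys-X12-His-X3,4,5-His.
# Assumption: "X2,4" is read as "2 or 4 residues", "X3,4,5" as "3 to 5".
C2H2_PATTERN = re.compile(r".{2}C(?:.{2}|.{4})C.{12}H.{3,5}H")


def find_c2h2_candidates(sequence):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(sequence.upper())]


# Illustrative peptide laid out to fit the pattern (hypothetical input).
example = "RPYACPVESCDRRFSRSDELTRHIRIHTGQKP"
hits = find_c2h2_candidates(example)
for start, text in hits:
    print(start, text)
```

Because the scan is non-overlapping and purely sequence-based, it may miss or over-report fingers in real proteins; profile-based tools are generally preferred for annotation.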
Gag-knuckle
This fold group is defined by two short β-strands connected by a turn (zinc knuckle) followed by a short helix or loop and resembles the classical Cys2His2 motif with a large portion of the helix and β-hairpin truncated.
The retroviral nucleocapsid (NC) proteins from HIV and other related retroviruses are examples of proteins possessing these motifs. The gag-knuckle zinc finger in the HIV NC protein is the target of a class of drugs known as zinc finger inhibitors.
Treble-clef
The treble-clef motif consists of a β-hairpin at the N-terminus and an α-helix at the C-terminus that each contribute two ligands for zinc binding, although a loop and a second β-hairpin of varying length and conformation can be present between the N-terminal β-hairpin and the C-terminal α-helix. These fingers are present in a diverse group of proteins that frequently do not share sequence or functional similarity with each other. The best-characterized proteins containing treble-clef zinc fingers are the nuclear hormone receptors.
Zinc ribbon
The zinc ribbon fold is characterised by two beta-hairpins forming two structurally similar zinc-binding sub-sites.
Zn2/Cys6
The canonical members of this class contain a binuclear zinc cluster in which two zinc ions are bound by six cysteine residues. These zinc fingers can be found in several transcription factors including the yeast Gal4 protein.
Miscellaneous
The ''zinc finger antiviral protein'' binds to CpG sites. It is used in mammals for antiviral defense. [Xuhua Xia, "Extreme genomic CpG deficiency in SARS-CoV-2 and evasion of host antiviral defense", Molecular Biology and Evolution, April 14, 2020, doi:10.1093/molbev/msaa094; "Evidence of Stray Dogs as Possible Origin of COVID-19 Pandemic", SciTechDaily, April 14, 2020, source: University of Ottawa]
Applications
Various protein engineering techniques can be used to alter the DNA-binding specificity of zinc fingers, and tandem repeats of such engineered zinc fingers can be used to target desired genomic DNA sequences.
Fusing a second protein domain such as a transcriptional activator or repressor to an array of engineered zinc fingers that bind near the promoter of a given gene can be used to alter the transcription of that gene.
Fusions between engineered zinc finger arrays and protein domains that cleave or otherwise modify DNA can also be used to target those activities to desired genomic loci. The most common applications for engineered zinc finger arrays include zinc finger transcription factors and zinc finger nucleases, but other applications have also been described. Typical engineered zinc finger arrays have between 3 and 6 individual zinc finger motifs and bind target sites ranging from 9 basepairs to 18 basepairs in length. Arrays with 6 zinc finger motifs are particularly attractive because they bind a target site that is long enough to have a good chance of being unique in a mammalian genome.
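The uniqueness argument for longer arrays can be illustrated with a back-of-the-envelope calculation. The sketch below assumes 3 basepairs per finger (as stated above), a genome of roughly 3 billion basepairs (an assumed round figure), and a random sequence with equally likely bases, which real genomes are not; the function names are illustrative.

```python
def target_site_length(num_fingers, bp_per_finger=3):
    """Length in basepairs of the site bound by an array of tandem fingers."""
    return num_fingers * bp_per_finger


def expected_random_occurrences(site_length, genome_size_bp, both_strands=True):
    """Expected count of one specific site in a random, equiprobable-base genome."""
    strands = 2 if both_strands else 1
    return strands * genome_size_bp / 4 ** site_length


GENOME_SIZE_BP = 3_000_000_000  # assumption: round figure for a mammalian genome

for fingers in range(3, 7):
    length = target_site_length(fingers)
    expected = expected_random_occurrences(length, GENOME_SIZE_BP)
    print(f"{fingers} fingers, {length} bp: about {expected:.3g} expected random matches")
```

Under these simplifying assumptions a 9-bp site is expected to occur by chance many thousands of times, whereas an 18-bp site is expected less than once, which is consistent with the preference for 6-finger arrays.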
Zinc finger nucleases
Engineered zinc finger arrays are often fused to a DNA cleavage domain (usually the cleavage domain of FokI) to generate zinc finger nucleases. Such zinc finger-FokI fusions have become useful reagents for manipulating genomes of many higher organisms including ''Drosophila melanogaster'', ''Caenorhabditis elegans'', tobacco, corn, zebrafish, various types of mammalian cells, and rats. Targeting a double-strand break to a desired genomic locus can be used to introduce frame-shift mutations into the coding sequence of a gene due to the error-prone nature of the non-homologous DNA repair pathway. If a homologous DNA "donor sequence" is also used then the genomic locus can be converted to a defined sequence via the homology directed repair pathway. An ongoing clinical trial is evaluating zinc finger nucleases that disrupt the CCR5 gene in CD4+ human T-cells as a potential treatment for HIV/AIDS.
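Why error-prone repair can disrupt a gene follows from the triplet reading frame: an insertion or deletion whose net length is not a multiple of three shifts every downstream codon. The sketch below checks only that net length change; the sequences are hypothetical and the function name is illustrative. It does not model the repair process itself.

```python
def causes_frameshift(reference_cds, edited_cds):
    """True if the net length change between the two coding sequences
    is not a multiple of three (the downstream reading frame shifts)."""
    return (len(edited_cds) - len(reference_cds)) % 3 != 0


# Hypothetical coding sequence: ATG GCT GAC CGT TAA
reference = "ATGGCTGACCGTTAA"
one_bp_deletion = "ATGGCTGCCGTTAA"   # one base removed from the third codon
three_bp_deletion = "ATGGCTCGTTAA"   # the whole third codon removed

print(causes_frameshift(reference, one_bp_deletion))
print(causes_frameshift(reference, three_bp_deletion))
```

An in-frame change such as the three-base deletion can still alter protein function; the check only distinguishes frame-shifting edits from frame-preserving ones.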
Methods of engineering zinc finger arrays
The majority of engineered zinc finger arrays are based on the zinc finger domain of the murine transcription factor Zif268, although some groups have used zinc finger arrays based on the human transcription factor SP1. Zif268 has three individual zinc finger motifs that collectively bind a 9 bp sequence with high affinity. The structure of this protein bound to DNA was solved in 1991 and stimulated a great deal of research into engineered zinc finger arrays. In 1994 and 1995, a number of groups used phage display to alter the specificity of a single zinc finger of Zif268. Two main methods are currently used to generate engineered zinc finger arrays: modular assembly and a bacterial selection system. There is some debate about which method is best suited for most applications.
The most straightforward method to generate new zinc finger arrays is to combine smaller zinc finger "modules" of known specificity. The structure of the zinc finger protein Zif268 bound to DNA described by Pavletich and Pabo in their 1991 publication has been key to much of this work and describes the concept of obtaining fingers for each of the 64 possible base pair triplets and then mixing and matching these fingers to design proteins with any desired sequence specificity.
The most common modular assembly process involves combining separate zinc fingers that can each recognize a 3-basepair DNA sequence to generate 3-, 4-, 5-, or 6-finger arrays that recognize target sites ranging from 9 basepairs to 18 basepairs in length. Another method uses 2-finger modules to generate zinc finger arrays with up to six individual zinc fingers. The Barbas Laboratory of The Scripps Research Institute used phage display to develop and characterize zinc finger domains that recognize most DNA triplet sequences, while another group isolated and characterized individual fingers from the human genome. A potential drawback of modular assembly in general is that the specificities of individual zinc fingers can overlap and can depend on the context of the surrounding zinc fingers and DNA. One study demonstrated that a high proportion of 3-finger zinc finger arrays generated by modular assembly fail to bind their intended target with sufficient affinity in a bacterial two-hybrid assay and fail to function as zinc finger nucleases, although the success rate was somewhat higher when sites of the form GNNGNNGNN were targeted.
A subsequent study used modular assembly to generate zinc finger nucleases with both 3-finger arrays and 4-finger arrays and observed a much higher success rate with 4-finger arrays. A variant of modular assembly that takes the context of neighboring fingers into account has also been reported and this method tends to yield proteins with improved performance relative to standard modular assembly.
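The bookkeeping behind modular assembly can be sketched in a few lines: enumerate the 64 possible triplets, split a target site into consecutive 3-bp subsites, and look up one module per subsite. The module library below is a hypothetical dictionary with made-up module names, and the sketch deliberately ignores finger orientation relative to the DNA strand as well as the context effects between neighboring fingers discussed above.

```python
from itertools import product

# All 4^3 = 64 possible DNA triplets.
ALL_TRIPLETS = ["".join(bases) for bases in product("ACGT", repeat=3)]


def split_into_triplets(target_site):
    """Split a target site into consecutive 3-bp subsites."""
    site = target_site.upper()
    if len(site) % 3 != 0 or set(site) - set("ACGT"):
        raise ValueError("target site must be A/C/G/T with length a multiple of 3")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


def is_gnn_site(target_site):
    """True if every subsite has the form GNN (e.g. GNNGNNGNN for 9 bp)."""
    return all(t.startswith("G") for t in split_into_triplets(target_site))


def assemble_array(target_site, module_library):
    """Pick one module per subsite from a triplet -> module-name mapping."""
    triplets = split_into_triplets(target_site)
    missing = [t for t in triplets if t not in module_library]
    if missing:
        raise KeyError(f"no module available for: {missing}")
    return [module_library[t] for t in triplets]


# Hypothetical library covering only a few triplets, for illustration.
library = {"GAA": "module_GAA", "GCT": "module_GCT", "GGG": "module_GGG"}
print(len(ALL_TRIPLETS))
print(is_gnn_site("GAAGCTGGG"), assemble_array("GAAGCTGGG", library))
```

A lookup of this kind treats each finger as independent, which is exactly the simplification that the reported failure rates call into question; context-aware variants would need a richer library keyed on neighboring fingers.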
Numerous selection methods have been used to generate zinc finger arrays capable of targeting desired sequences. Initial selection efforts utilized phage display to select proteins that bound a given DNA target from a large pool of partially randomized zinc finger arrays. This technique is difficult to use on more than a single zinc finger at a time, so a multi-step process that generated a completely optimized 3-finger array by adding and optimizing a single zinc finger at a time was developed. More recent efforts have utilized yeast one-hybrid systems, bacterial one-hybrid and two-hybrid systems, and mammalian cells. A promising new method to select novel 3-finger zinc finger arrays utilizes a bacterial two-hybrid system and has been dubbed "OPEN" by its creators. This system combines pre-selected pools of individual zinc fingers that were each selected to bind a given triplet and then utilizes a second round of selection to obtain 3-finger arrays capable of binding a desired 9-bp sequence. This system was developed by the Zinc Finger Consortium as an alternative to commercial sources of engineered zinc finger arrays. It is somewhat difficult to directly compare the binding properties of proteins generated with this method to proteins generated by modular assembly as the specificity profiles of proteins generated by the OPEN method have never been reported.
Examples
This entry represents the CysCysHisCys (C2HC) type zinc finger domain found in eukaryotes. Proteins containing these domains include:
*MYST family histone acetyltransferases
*Myelin transcription factor Myt1
*Suppressor of tumourigenicity protein 18 (ST18)
See also
* B-box zinc finger
* DNA-binding protein
* FPG IleRS zinc finger
* Krüppel associated box
* RING finger domain
* Sequence motif
* Steroid hormone receptor
* Structural motif
* TAL effector
* Transcription Activator-Like Effector Nuclease
* Zinc finger inhibitor
* Zinc finger nuclease
* Zinc Finger Transcription Factor
External links
C2H2 family at PlantTFDB: Plant Transcription Factor Database
Zinc Finger Tools design and information site
Human KZNF Gene Catalog
Zinc finger C2H2-type domain in PROSITE
Entry for zinc finger class C2H2 in the SMART database
The Zinc Finger Consortium
ZiFiT- Zinc Finger Design Tool
Zinc Finger Consortium Materials from Addgene
Predicting DNA-binding Specificities for C2H2 Zinc Finger Proteins