homeobox gene
   HOME

TheInfoList



OR:

A homeobox is a
DNA sequence DNA sequencing is the process of determining the nucleic acid sequence – the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. T ...
, around 180
base pair A base pair (bp) is a fundamental unit of double-stranded nucleic acids consisting of two nucleobases bound to each other by hydrogen bonds. They form the building blocks of the DNA double helix and contribute to the folded structure of both D ...
s long, that regulates large-scale anatomical features in the early stages of embryonic development. For instance, mutations in a homeobox may change large-scale anatomical features of the full-grown organism. Homeoboxes are found within
gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''birth'' or ''gender'') can have several different meanings. The Mendelian gene is a b ...
s that are involved in the regulation of patterns of anatomical development (
morphogenesis Morphogenesis (from the Greek ''morphê'' shape and ''genesis'' creation, literally "the generation of form") is the biological process that causes a cell, tissue or organism to develop its shape. It is one of three fundamental aspects of deve ...
) in
animal Animals are multicellular, eukaryotic organisms in the biological kingdom Animalia. With few exceptions, animals consume organic material, breathe oxygen, are able to move, can reproduce sexually, and go through an ontogenetic stage ...
s,
fungi A fungus ( : fungi or funguses) is any member of the group of eukaryotic organisms that includes microorganisms such as yeasts and molds, as well as the more familiar mushrooms. These organisms are classified as a kingdom, separately fr ...
,
plant Plants are predominantly photosynthetic eukaryotes of the kingdom Plantae. Historically, the plant kingdom encompassed all living things that were not animals, and included algae and fungi; however, all current definitions of Plantae excl ...
s, and numerous single cell
eukaryote Eukaryotes () are organisms whose cells have a nucleus. All animals, plants, fungi, and many unicellular organisms, are Eukaryotes. They belong to the group of organisms Eukaryota or Eukarya, which is one of the three domains of life. Bacter ...
s. Homeobox genes encode homeodomain
protein Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, res ...
products that are
transcription factor In molecular biology, a transcription factor (TF) (or sequence-specific DNA-binding factor) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. The f ...
s sharing a characteristic
protein fold A protein superfamily is the largest grouping (clade) of proteins for which common ancestry can be inferred (see homology). Usually this common ancestry is inferred from structural alignment and mechanistic similarity, even if no sequence similari ...
structure that binds DNA to regulate expression of target genes. Homeodomain proteins regulate gene expression and cell differentiation during early embryonic development, thus mutations in homeobox genes can cause developmental disorders.
Homeosis In evolutionary developmental biology, homeosis is the transformation of one organ into another, arising from mutation in or misexpression of certain developmentally critical genes, specifically homeotic genes. In animals, these developmental ...
is a term coined by
William Bateson William Bateson (8 August 1861 – 8 February 1926) was an English biologist who was the first person to use the term genetics to describe the study of heredity, and the chief populariser of the ideas of Gregor Mendel following their rediscove ...
to describe the outright replacement of a discrete body part with another body part, e.g.
antennapedia ''Antennapedia'' (abbreviated ''Antp'') is a Hox gene first discovered in ''Drosophila'' which controls the formation of legs during development. Loss-of-function mutations in the regulatory region of this gene result in the development of ...
—replacement of the antenna on the head of a fruit fly with legs. The "homeo-" prefix in the words "homeobox" and "homeodomain" stems from this mutational phenotype, which is observed when some of these genes are mutated in
animals Animals are multicellular, eukaryotic organisms in the biological kingdom Animalia. With few exceptions, animals consume organic material, breathe oxygen, are able to move, can reproduce sexually, and go through an ontogenetic stage in ...
. The homeobox domain was first identified in a number of ''Drosophila''
homeotic In evolutionary developmental biology, homeosis is the transformation of one organ into another, arising from mutation in or misexpression of certain developmentally critical genes, specifically homeotic genes. In animals, these developmental gen ...
and segmentation proteins, but is now known to be well-conserved in many other animals, including
vertebrate Vertebrates () comprise all animal taxa within the subphylum Vertebrata () ( chordates with backbones), including all mammals, birds, reptiles, amphibians, and fish. Vertebrates represent the overwhelming majority of the phylum Chordata, with ...
s.


Discovery

The existence of homeobox genes was first discovered in ''
Drosophila ''Drosophila'' () is a genus of flies, belonging to the family Drosophilidae, whose members are often called "small fruit flies" or (less frequently) pomace flies, vinegar flies, or wine flies, a reference to the characteristic of many speci ...
'' by isolating the gene responsible for a homeotic transformation where legs grow from the head instead of the expected antennae. Walter Gehring identified a gene called ''
antennapedia ''Antennapedia'' (abbreviated ''Antp'') is a Hox gene first discovered in ''Drosophila'' which controls the formation of legs during development. Loss-of-function mutations in the regulatory region of this gene result in the development of ...
'' that caused this homeotic phenotype. Analysis of ''antennapedia'' revealed that this gene contained a 180 base pair sequence that encoded a DNA binding domain, which William McGinnis termed the "homeobox". The existence of additional ''Drosophila'' genes containing the ''antennapedia'' homeobox sequence was independently reported by Ernst Hafen, Michael Levine, William McGinnis, and Walter Jakob Gehring of the
University of Basel The University of Basel (Latin: ''Universitas Basiliensis'', German: ''Universität Basel'') is a university in Basel, Switzerland. Founded on 4 April 1460, it is Switzerland's oldest university and among the world's oldest surviving universiti ...
in
Switzerland ). Swiss law does not designate a ''capital'' as such, but the federal parliament and government are installed in Bern, while other federal institutions, such as the federal courts, are in other cities (Bellinzona, Lausanne, Luzern, Neuchâtel ...
and
Matthew P. Scott Matthew P. Scott is an American biologist who was the tenth president of the Carnegie Institution for Science. While at Stanford University, Scott studied how embryonic and later development is governed by proteins that control gene activity and c ...
and Amy Weiner of
Indiana University Indiana University (IU) is a system of public universities in the U.S. state of Indiana. Campuses Indiana University has two core campuses, five regional campuses, and two regional centers under the administration of IUPUI. *Indiana Universi ...
in Bloomington in 1984. Isolation of homologous genes by Edward de Robertis and William McGinnis revealed that numerous genes from a variety of species contained the homeobox. Subsequent
phylogenetic In biology, phylogenetics (; from Greek φυλή/ φῦλον [] "tribe, clan, race", and wikt:γενετικός, γενετικός [] "origin, source, birth") is the study of the evolutionary history and relationships among or within groups ...
studies detailing the evolutionary relationship between homeobox-containing genes showed that these genes are present in all
bilaterian The Bilateria or bilaterians are animals with bilateral symmetry as an embryo, i.e. having a left and a right side that are mirror images of each other. This also means they have a head and a tail (anterior-posterior axis) as well as a belly an ...
animals.


Homeodomain structure

The characteristic homeodomain
protein fold A protein superfamily is the largest grouping (clade) of proteins for which common ancestry can be inferred (see homology). Usually this common ancestry is inferred from structural alignment and mechanistic similarity, even if no sequence similari ...
consists of a 60-
amino acid Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although hundreds of amino acids exist in nature, by far the most important are the alpha-amino acids, which comprise proteins. Only 22 alpha ...
long domain composed of three alpha helixes. The following shows the consensus homeodomain (~60 amino acid chain):
            Helix 1          Helix 2         Helix 3/4
         ______________    __________    _________________
RRRKRTAYTRYQLLELEKEFHFNRYLTRRRRIELAHSLNLTERHIKIWFQNRRMKWKKEN
...., ...., ...., ...., ...., ...., ...., ...., ...., ...., ...., ...., 
         10        20        30        40        50        60
Helix 2 and helix 3 form a so-called
helix-turn-helix Helix-turn-helix is a DNA-binding protein (DBP). The helix-turn-helix (HTH) is a major structural motif capable of binding DNA. Each monomer incorporates two α helices, joined by a short strand of amino acids, that bind to the major groove of ...
(HTH) structure, where the two
alpha helices The alpha helix (α-helix) is a common motif in the secondary structure of proteins and is a right hand-helix conformation in which every backbone N−H group hydrogen bonds to the backbone C=O group of the amino acid located four residues ear ...
are connected by a short loop region. The
N-terminal The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) is the start of a protein or polypeptide, referring to the free amine group (-NH2) located at the end of a polypeptide. Within a peptide, the ami ...
two helices of the homeodomain are antiparallel and the longer
C-terminal The C-terminus (also known as the carboxyl-terminus, carboxy-terminus, C-terminal tail, C-terminal end, or COOH-terminus) is the end of an amino acid chain (protein or polypeptide), terminated by a free carboxyl group (-COOH). When the protein is ...
helix is roughly perpendicular to the axes established by the first two. It is this third helix that interacts directly with DNA via a number of hydrogen bonds and hydrophobic interactions, as well as indirect interactions via water molecules, which occur between specific
side chain In organic chemistry and biochemistry, a side chain is a chemical group that is attached to a core part of the molecule called the "main chain" or backbone. The side chain is a hydrocarbon branching element of a molecule that is attached to a ...
s and the exposed bases within the
major groove Major ( commandant in certain jurisdictions) is a military rank of commissioned officer status, with corresponding ranks existing in many military forces throughout the world. When used unhyphenated and in conjunction with no other indicat ...
of the DNA. Homeodomain proteins are found in
eukaryote Eukaryotes () are organisms whose cells have a nucleus. All animals, plants, fungi, and many unicellular organisms, are Eukaryotes. They belong to the group of organisms Eukaryota or Eukarya, which is one of the three domains of life. Bacter ...
s. Through the HTH motif, they share limited sequence similarity and structural similarity to prokaryotic transcription factors, such as
lambda phage ''Enterobacteria phage λ'' (lambda phage, coliphage λ, officially ''Escherichia virus Lambda'') is a bacterial virus, or bacteriophage, that infects the bacterial species ''Escherichia coli'' (''E. coli''). It was discovered by Esther Leder ...
proteins that alter the expression of genes in
prokaryote A prokaryote () is a single-celled organism that lacks a nucleus and other membrane-bound organelles. The word ''prokaryote'' comes from the Greek πρό (, 'before') and κάρυον (, 'nut' or 'kernel').Campbell, N. "Biology:Concepts & Con ...
s. The HTH motif shows some sequence similarity but a similar structure in a wide range of DNA-binding proteins (e.g., cro and repressor proteins, homeodomain proteins, etc.). One of the principal differences between HTH motifs in these different proteins arises from the stereochemical requirement for
glycine Glycine (symbol Gly or G; ) is an amino acid that has a single hydrogen atom as its side chain. It is the simplest stable amino acid ( carbamic acid is unstable), with the chemical formula NH2‐ CH2‐ COOH. Glycine is one of the proteinog ...
in the turn which is needed to avoid
steric Steric effects arise from the spatial arrangement of atoms. When atoms come close together there is a rise in the energy of the molecule. Steric effects are nonbonding interactions that influence the shape ( conformation) and reactivity of ions ...
interference of the beta-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, whereas for many of the homeotic and other DNA-binding proteins the requirement is relaxed.


Sequence specificity

Homeodomains can bind both specifically and nonspecifically to B-DNA with the C-terminal recognition helix aligning in the DNA's major groove and the unstructured peptide "tail" at the N-terminus aligning in the minor groove. The recognition helix and the inter-helix loops are rich in
arginine Arginine is the amino acid with the formula (H2N)(HN)CN(H)(CH2)3CH(NH2)CO2H. The molecule features a guanidino group appended to a standard amino acid framework. At physiological pH, the carboxylic acid is deprotonated (−CO2−) and both the am ...
and
lysine Lysine (symbol Lys or K) is an α-amino acid that is a precursor to many proteins. It contains an α-amino group (which is in the protonated form under biological conditions), an α-carboxylic acid group (which is in the deprotonated − ...
residues, which form
hydrogen bond In chemistry, a hydrogen bond (or H-bond) is a primarily electrostatic force of attraction between a hydrogen (H) atom which is covalently bound to a more electronegative "donor" atom or group (Dn), and another electronegative atom bearing a l ...
s to the DNA backbone. Conserved
hydrophobic In chemistry, hydrophobicity is the physical property of a molecule that is seemingly repelled from a mass of water (known as a hydrophobe). In contrast, hydrophiles are attracted to water. Hydrophobic molecules tend to be nonpolar and, ...
residues in the center of the recognition helix aid in stabilizing the helix packing. Homeodomain proteins show a preference for the DNA sequence 5'-TAAT-3'; sequence-independent binding occurs with significantly lower affinity. The specificity of a single homeodomain protein is usually not enough to recognize specific target gene promoters, making cofactor binding an important mechanism for controlling binding sequence specificity and target gene expression. To achieve higher target specificity, homeodomain proteins form complexes with other transcription factors to recognize the
promoter region In genetics, a promoter is a sequence of DNA to which proteins bind to initiate transcription of a single RNA transcript from the DNA downstream of the promoter. The RNA transcript may encode a protein ( mRNA), or can have a function in and ...
of a specific target gene.


Biological function

Homeodomain proteins function as
transcription factors In molecular biology, a transcription factor (TF) (or sequence-specific DNA-binding factor) is a protein that controls the rate of transcription of genetic information from DNA to messenger RNA, by binding to a specific DNA sequence. The fun ...
due to the DNA binding properties of the conserved HTH motif. Homeodomain proteins are considered to be master control genes, meaning that a single protein can regulate expression of many target genes. Homeodomain proteins direct the formation of the body axes and body structures during early embryonic development. Many homeodomain proteins induce
cellular differentiation Cellular differentiation is the process in which a stem cell alters from one type to a differentiated one. Usually, the cell changes to a more specialized type. Differentiation happens multiple times during the development of a multicellular ...
by initiating the cascades of coregulated genes required to produce individual tissues and organs. Other proteins in the family, such as NANOG are involved in maintaining
pluripotency Pluripotency: These are the cells that can generate into any of the three Germ layers which imply Endodermal, Mesodermal, and Ectodermal cells except tissues like the placenta. According to Latin terms, Pluripotentia means the ability for many thin ...
and preventing cell differentiation.


Regulation

Hox genes and their associated microRNAs are highly conserved developmental master regulators with tight tissue-specific, spatiotemporal control. These genes are known to be dysregulated in several cancers and are often controlled by DNA methylation. The regulation of Hox genes is highly complex and involves reciprocal interactions, mostly inhibitory.
Drosophila ''Drosophila'' () is a genus of flies, belonging to the family Drosophilidae, whose members are often called "small fruit flies" or (less frequently) pomace flies, vinegar flies, or wine flies, a reference to the characteristic of many speci ...
is known to use the
polycomb Polycomb-group proteins (PcG proteins) are a family of protein complexes first discovered in fruit flies that can remodel chromatin such that epigenetic silencing of genes takes place. Polycomb-group proteins are well known for silencing Hox genes ...
and trithorax complexes to maintain the expression of Hox genes after the down-regulation of the pair-rule and gap genes that occurs during larval development.
Polycomb-group proteins Polycomb-group proteins (PcG proteins) are a family of protein complexes first discovered in fruit flies that can remodel chromatin such that epigenetic silencing of genes takes place. Polycomb-group proteins are well known for silencing Hox genes ...
can silence the HOX genes by modulation of
chromatin Chromatin is a complex of DNA and protein found in eukaryote, eukaryotic cells. The primary function is to package long DNA molecules into more compact, denser structures. This prevents the strands from becoming tangled and also plays important ...
structure.


Mutations

Mutations to homeobox genes can produce easily visible
phenotypic In genetics, the phenotype () is the set of observable characteristics or traits of an organism. The term covers the organism's morphology or physical form and structure, its developmental processes, its biochemical and physiological pr ...
changes in body segment identity, such as the Antennapedia and Bithorax mutant phenotypes in ''Drosophila''. Duplication of homeobox genes can produce new body segments, and such duplications are likely to have been important in the
evolution Evolution is change in the heritable characteristics of biological populations over successive generations. These characteristics are the expressions of genes, which are passed on from parent to offspring during reproduction. Variation ...
of segmented animals.


Evolution

The homeobox itself may have evolved from a non-DNA-binding transmembrane domain at the C-terminus of the MraY enzyme. This is based on metagenomic data acquired from the transitional archaeon, ''
Lokiarchaeum Lokiarchaeota is a proposed phylum of the Archaea. The phylum includes all members of the group previously named Deep Sea Archaeal Group (DSAG), also known as Marine Benthic Group B (MBG-B). Lokiarchaeota is part of the superphylum Asgard contai ...
'', that is regarded as the prokaryote closest to the ancestor of all eukaryotes. Phylogenetic analysis of homeobox gene sequences and homeodomain protein structures suggests that the last common ancestor of plants, fungi, and animals had at least two homeobox genes. Molecular evidence shows that some limited number of Hox genes have existed in the
Cnidaria Cnidaria () is a phylum under kingdom Animalia containing over 11,000 species of aquatic animals found both in freshwater and marine environments, predominantly the latter. Their distinguishing feature is cnidocytes, specialized cells that ...
since before the earliest true
Bilatera The Bilateria or bilaterians are animals with bilateral symmetry as an embryo, i.e. having a left and a right side that are mirror images of each other. This also means they have a head and a tail (anterior-posterior axis) as well as a belly and ...
, making these genes pre-
Paleozoic The Paleozoic (or Palaeozoic) Era is the earliest of three geologic eras of the Phanerozoic Eon. The name ''Paleozoic'' ( ;) was coined by the British geologist Adam Sedgwick in 1838 by combining the Greek words ''palaiós'' (, "old") and ...
. It is accepted that the three major animal ANTP-class clusters, Hox, ParaHox, and NK (MetaHox), are the result of segmental duplications. A first duplication created MetaHox and ProtoHox, the latter of which later duplicated into Hox and ParaHox. The clusters themselves were created by tandem duplications of a single ANTP-class homeobox gene. Gene duplication followed by
neofunctionalization Neofunctionalization, one of the possible outcomes of functional divergence, occurs when one gene copy, or paralog, takes on a totally new function after a gene duplication event. Neofunctionalization is an adaptive mutation process; meaning one ...
is responsible for the many homeobox genes found in eukaryotes. Comparison of homeobox genes and gene clusters has been used to understand the evolution of genome structure and body morphology throughout metazoans.


Types of homeobox genes


Hox genes

Hox genes are the most commonly known subset of homeobox genes. They are essential
metazoan Animals are multicellular, eukaryotic organisms in the biological kingdom Animalia. With few exceptions, animals consume organic material, breathe oxygen, are able to move, can reproduce sexually, and go through an ontogenetic stage in ...
genes that determine the identity of embryonic regions along the anterior-posterior axis. The first vertebrate Hox gene was isolated in ''
Xenopus ''Xenopus'' () (Gk., ξενος, ''xenos''=strange, πους, ''pous''=foot, commonly known as the clawed frog) is a genus of highly aquatic frogs native to sub-Saharan Africa. Twenty species are currently described within it. The two best-know ...
'' by Edward De Robertis and colleagues in 1984. The main interest in this set of genes stems from their unique behavior and arrangement in the genome. Hox genes are typically found in an organized cluster. The linear order of Hox genes within a cluster is directly correlated to the order they are expressed in both time and space during development. This phenomenon is called colinearity. Mutations in these
homeotic genes In evolutionary developmental biology, homeotic genes are genes which regulate the development of anatomical structures in various organisms such as echinoderms, insects, mammals, and plants. Homeotic genes often encode transcription factor protein ...
cause displacement of body segments during embryonic development. This is called
ectopia Ectopia, ectopic, or ectopy may refer to: *Ectopia (medicine), including a list of medical uses of ectopia or ectopic **Ectopic pregnancy, a pregnancy that occurs outside the uterus ** Ectopic beat, or cardiac ectopy, a disturbance in cardiac rhy ...
. For example, when one gene is lost the segment develops into a more anterior one, while a mutation that leads to a gain of function causes a segment to develop into a more posterior one. Famous examples are ''
Antennapedia ''Antennapedia'' (abbreviated ''Antp'') is a Hox gene first discovered in ''Drosophila'' which controls the formation of legs during development. Loss-of-function mutations in the regulatory region of this gene result in the development of ...
'' and bithorax in ''Drosophila'', which can cause the development of legs instead of antennae and the development of a duplicated thorax, respectively. In vertebrates, the four
paralog Sequence homology is the biological homology between DNA, RNA, or protein sequences, defined in terms of shared ancestry in the evolutionary history of life. Two segments of DNA can have shared ancestry because of three phenomena: either a sp ...
clusters are partially redundant in function, but have also acquired several derived functions. For example, HoxA and HoxD specify segment identity along the
limb Limb may refer to: Science and technology * Limb (anatomy), an appendage of a human or animal *Limb, a large or main branch of a tree *Limb, in astronomy, the curved edge of the apparent disk of a celestial body, e.g. lunar limb *Limb, in botany, ...
axis. Specific members of the Hox family have been implicated in vascular remodeling,
angiogenesis Angiogenesis is the physiological process through which new blood vessels form from pre-existing vessels, formed in the earlier stage of vasculogenesis. Angiogenesis continues the growth of the vasculature by processes of sprouting and splittin ...
, and disease by orchestrating changes in matrix degradation, integrins, and components of the ECM. HoxA5 is implicated in atherosclerosis. HoxD3 and HoxB3 are proinvasive, angiogenic genes that upregulate b3 and a5 integrins and Efna1 in ECs, respectively. HoxA3 induces
endothelial The endothelium is a single layer of squamous endothelial cells that line the interior surface of blood vessels and lymphatic vessels. The endothelium forms an interface between circulating blood or lymph in the lumen and the rest of the ve ...
cell (EC) migration by upregulating MMP14 and uPAR. Conversely, HoxD10 and HoxA5 have the opposite effect of suppressing EC migration and angiogenesis, and stabilizing adherens junctions by upregulating TIMP1/downregulating uPAR and MMP14, and by upregulating Tsp2/downregulating VEGFR2, Efna1, Hif1alpha and COX-2, respectively. HoxA5 also upregulates the tumor suppressor p53 and Akt1 by downregulation of PTEN. Suppression of HoxA5 has been shown to attenuate
hemangioma A hemangioma or haemangioma is a usually benign vascular tumor derived from blood vessel cell types. The most common form, seen in infants, is an infantile hemangioma, known colloquially as a "strawberry mark", most commonly presenting on the ski ...
growth. HoxA5 has far-reaching effects on gene expression, causing ~300 genes to become upregulated upon its induction in breast cancer cell lines. HoxA5 protein transduction domain overexpression prevents inflammation shown by inhibition of TNFalpha-inducible monocyte binding to HUVECs.


LIM genes

LIM genes (named after the initial letters of the names of three proteins where the characteristic domain was first identified) encode two 60 amino acid cysteine and histidine-rich LIM domains and a homeodomain. The LIM domains function in protein-protein interactions and can bind zinc molecules. LIM domain proteins are found in both the cytosol and the nucleus. They function in cytoskeletal remodeling, at focal adhesion sites, as scaffolds for protein complexes, and as transcription factors.


Pax genes

Most Pax genes contain a homeobox and a paired domain that also binds DNA to increase binding specificity, though some Pax genes have lost all or part of the homeobox sequence. Pax genes function in embryo segmentation,
nervous system In biology, the nervous system is the highly complex part of an animal that coordinates its actions and sensory information by transmitting signals to and from different parts of its body. The nervous system detects environmental changes ...
development, generation of the frontal eye fields,
skeletal A skeleton is the structural frame that supports the body of an animal. There are several types of skeletons, including the exoskeleton, which is the stable outer shell of an organism, the endoskeleton, which forms the support structure inside ...
development, and formation of face structures. Pax 6 is a master regulator of eye development, such that the gene is necessary for development of the optic vesicle and subsequent eye structures.


POU genes

Proteins containing a POU region consist of a homeodomain and a separate, structurally homologous POU domain that contains two
helix-turn-helix Helix-turn-helix is a DNA-binding protein (DBP). The helix-turn-helix (HTH) is a major structural motif capable of binding DNA. Each monomer incorporates two α helices, joined by a short strand of amino acids, that bind to the major groove of ...
motifs and also binds DNA. The two domains are linked by a flexible loop that is long enough to stretch around the DNA helix, allowing the two domains to bind on opposite sides of the target DNA, collectively covering an eight-base segment with
consensus sequence In molecular biology and bioinformatics, the consensus sequence (or canonical sequence) is the calculated order of most frequent residues, either nucleotide or amino acid, found at each position in a sequence alignment. It serves as a simplified r ...
5'-ATGCAAAT-3'. The individual domains of POU proteins bind DNA only weakly, but have strong sequence-specific affinity when linked. The POU domain itself has significant structural similarity with repressors expressed in
bacteriophage A bacteriophage (), also known informally as a ''phage'' (), is a duplodnaviria virus that infects and replicates within bacteria and archaea. The term was derived from "bacteria" and the Greek φαγεῖν ('), meaning "to devour". Bac ...
s, particularly
lambda phage ''Enterobacteria phage λ'' (lambda phage, coliphage λ, officially ''Escherichia virus Lambda'') is a bacterial virus, or bacteriophage, that infects the bacterial species ''Escherichia coli'' (''E. coli''). It was discovered by Esther Leder ...
.


Plant homeobox genes

As in animals, the plant homeobox genes code for the typical 60 amino acid long DNA-binding homeodomain or in case of the TALE (three amino acid loop extension) homeobox genes for an atypical homeodomain consisting of 63 amino acids. According to their conserved intron–exon structure and to unique codomain architectures they have been grouped into 14 distinct classes: HD-ZIP I to IV, BEL, KNOX, PLINC, WOX, PHD, DDT, NDX, LD, SAWADEE and PINTOX. Conservation of codomains suggests a common eukaryotic ancestry for TALE and non-TALE homeodomain proteins.


Human homeobox genes

The Hox genes in humans are organized in four chromosomal clusters:
ParaHox The ParaHox gene cluster is an array of homeobox genes (involved in morphogenesis, the regulation of patterns of anatomical development) from the Gsx, Xlox ( Pdx) and Cdx gene families. Regulatory gene cluster These genes were first shown to b ...
genes are analogously found in four areas. They include
CDX1 Homeobox protein CDX-1 is a protein in humans that is encoded by the ''CDX1'' gene. CDX1 is expressed in the developing endoderm and its expression persists in the intestine throughout adulthood. CDX1 protein expression varies along the intestine, ...
, CDX2, CDX4; GSX1,
GSX2 GSX may refer to: *Gsx (gene family) * GSX Techedu, a large Chinese online tutoring company *Asurada GSX, a fictional racing car from ''Future GPX Cyber Formula'' * Buick GSX, a muscle car * Gandhi Smarak Road railway station, in Maharashtra, India ...
; and
PDX1 PDX1 (pancreatic and duodenal homeobox 1), also known as insulin promoter factor 1, is a transcription factor in the ParaHox gene cluster.Brooke, N. M., Garcia-Fernàndez, J., & Holland, P. W. (1998). The ParaHox gene cluster is an evolutionary si ...
. Other genes considered Hox-like include
EVX1 Evx1 is a mammalian gene located downstream of the HoxA cluster, which encodes for a homeobox transcription factor. Evx1 is a homolog of even-skipped (eve), which is a pair-rule gene that regulates body segmentation in Drosophila. The expression ...
, EVX2; GBX1, GBX2; MEOX1, MEOX2; and
MNX1 Motor neuron and pancreas homeobox 1 (MNX1), also known as Homeobox HB9 (HLXB9), is a human protein encoded by the ''MNX1'' gene In biology, the word gene (from , ; "... Wilhelm Johannsen coined the word gene to describe the Mendelian uni ...
. The NK-like (NKL) genes, some of which are considered "MetaHox", are grouped with Hox-like genes into a large ANTP-like group. Humans have a "distal-less homeobox" family:
DLX1 Homeobox protein DLX-1 is a protein that in humans is encoded by the ''DLX1'' gene. Function This gene encodes a member of a homeobox transcription factor gene family similar to the ''Drosophila'' distal-less gene. The encoded protein is locali ...
,
DLX2 Homeobox protein DLX-2 is a protein that in humans is encoded by the ''DLX2'' gene. Many vertebrate homeo box-containing genes have been identified on the basis of their sequence similarity with Drosophila developmental genes. Members of the Dl ...
,
DLX3 Homeobox protein DLX-3 is a protein that in humans is encoded by the ''DLX3'' gene. Function Dlx3 is a crucial regulator of hair follicle differentiation and cycling. Dlx3 transcription is mediated through Wnt, and colocalization of Dlx3 wit ...
,
DLX4 Homeobox protein DLX-4 is a protein that in humans is encoded by the ''DLX4'' gene. Function Many vertebrate homeobox-containing genes have been identified on the basis of their sequence similarity with ''Drosophila'' developmental genes. Mem ...
,
DLX5 Homeobox protein DLX-5 is a protein that in humans is encoded by the distal-less homeobox 5 gene, or ''DLX5'' gene. DLX5 is a member of DLX gene family. Function This gene encodes a member of a homeobox transcription factor gene family similar ...
, and
DLX6 Homeobox protein DLX-6 is a protein that in humans is encoded by the ''DLX6'' gene. This gene encodes a member of a homeobox transcription factor gene family similar to the Drosophila ''Drosophila'' () is a genus of flies, belonging to the ...
. Dlx genes are involved in the development of the nervous system and of limbs. They are considered a subset of the NK-like genes. Human TALE (Three Amino acid Loop Extension) homeobox genes for an "atypical" homeodomain consist of 63 rather than 60 amino acids:
IRX1 Iroquois-class homeodomain protein IRX-1, also known as Iroquois homeobox protein 1, is a protein that in humans is encoded by the ''IRX1'' gene. All members of the Iroquois (IRO) family of proteins share two highly conserved features, encoding ...
, IRX2, IRX3, IRX4,
IRX5 Iroquois-class homeodomain protein IRX-5, also known as Iroquois homeobox protein 5, is a protein that in humans is encoded by the ''IRX5'' gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the ...
, IRX6;
MEIS1 Homeobox protein Meis1 is a protein that in humans is encoded by the MEIS1 gene. Function Homeobox genes, of which the most well-characterized category is represented by the HOX genes, play a crucial role in normal development. In addition, s ...
,
MEIS2 Homeobox protein Meis2 is a protein that in humans is encoded by the ''MEIS2'' gene. This gene encodes a homeobox protein belonging to the TALE ('three amino acid loop extension') family of homeodomain-containing proteins. TALE homeobox protein ...
, MEIS3;
MKX Homeobox protein Mohawk, also known as iroquois homeobox protein-like 1, is a protein that in humans is encoded by the ''MKX'' (mohawk homeobox) gene. MKX is a member of an Iroquois (IRX) family-related class of 'three-amino acid loop extension' ...
;
PBX1 Pre-B-cell leukemia transcription factor 1 is a protein that in humans is encoded by the ''PBX1'' gene. Interactions PBX1 has been shown to interact Advocates for Informed Choice, dba interACT or interACT Advocates for Intersex Youth, i ...
, PBX2,
PBX3 Pre-B-cell leukemia transcription factor 3 is a protein that in humans is encoded by the ''PBX3'' gene. References Further reading * * * * * * * * * * External links * * Transcription factors {{gene-9-stub ...
, PBX4; PKNOX1, PKNOX2; TGIF1, TGIF2, TGIF2LX, TGIF2LY. In addition, humans have the following homeobox genes and proteins: * LIM-class: ISL1, ISL2;
LHX1 LIM homeobox 1 is a protein that in humans is encoded by the LHX1 gene. This gene encodes a member of a large protein family which contains the LIM domain, a unique cysteine-rich zinc-binding domain. The encoded protein is a transcription factor ...
,
LHX2 LIM/homeobox protein Lhx2 is a protein that in humans is encoded by the ''LHX2'' gene. This gene encodes a protein belonging to a large protein family, members of which carry the LIM domain, a unique cysteine-rich zinc-binding domain. The encod ...
, LHX3, LHX4, LHX5, LHX6, LHX8, LHX9;
LMX1A LIM homeobox transcription factor 1, alpha, also known as LMX1A, is a protein which in humans is encoded by the ''LMX1A'' gene. Function Insulin is produced exclusively by the beta cells in the islets of Langerhans in the pancreas. The level a ...
, LMX1B * POU-class: HDX; POU1F1; POU2F1;
POU2F2 POU or pou may refer to: People * Pou (surname), a surname * Chu Pou (303–350), Chinese general and politician * Pou Temara (born 1948), New Zealand Māori academic Codes * POU, IATA airport code and FAA location identifier for Hudson Valley Re ...
;
POU2F3 POU domain, class 2, transcription factor 3 is a protein that in humans is encoded by the ''POU2F3'' gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meani ...
;
POU3F1 POU domain, class 3, transcription factor 1 (also known as Oct-6) is a protein that in humans is encoded by the ''POU3F1'' gene. See also * Octamer transcription factor Octamer transcription factors are a protein family, family of transcriptio ...
;
POU3F2 POU domain, class 3, transcription factor 2 is a protein that in humans is encoded by the ''POU3F2'' gene. Function N-Oct-3 is a protein belonging to a large family of transcription factors that bind to the octameric DNA sequence ATGCAAAT. Mos ...
;
POU3F3 POU class 3 homeobox 3 is a protein that in humans is encoded by the POU3F3 gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''bi ...
;
POU3F4 POU domain, class 3, transcription factor 4 is a protein that in humans is encoded by the ''POU3F4'' gene found on the X chromosome. POU3F4 is involved in the patterning of the neural tube and both the paraventricular and supraoptic nuclei of ...
;
POU4F1 POU domain, class 4, transcription factor 1 (POU4F1) also known as brain-specific homeobox/POU domain protein 3A (BRN3A), homeobox/POU domain protein RDC-1 or Oct-T1 is a protein that in humans is encoded by the ''POU4F1'' gene. BRN3A (POU4F1) ...
;
POU4F2 POU domain, class 4, transcription factor 2 is a protein that in humans is encoded by the ''POU4F2'' gene. Function POU4F2 is a member of the POU-domain family of transcription factors. POU-domain proteins have been observed to play important ...
;
POU4F3 POU domain, class 4, transcription factor 3 is a protein that in humans is encoded by the ''POU4F3'' gene. It's a member of BRN-3 group, also known as POU family class 4. Nomenclature DFNA15 refers to a type of nonsyndromic deafness, with auto ...
; POU5F1; POU5F1P1; POU5F1P4; POU5F2; POU6F1; and POU6F2 * CERS-class: LASS2, LASS3, LASS4, LASS5, LASS6; * HNF-class:
HMBOX1 Homeobox containing 1, also known as homeobox telomere-binding protein 1 (HOT1), is a protein that in humans is encoded by the HMBOX1 gene. HMBOX1 directly binds to the double-stranded repeat sequence of telomeres. HMBOX1 has originally been ide ...
;
HNF1A HNF1 homeobox A (hepatocyte nuclear factor 1 homeobox A), also known as HNF1A, is a human gene on chromosome 12. It is ubiquitously expressed in many tissues and cell types. The protein encoded by this gene is a transcription factor that is high ...
,
HNF1B HNF1 homeobox B (hepatocyte nuclear factor 1 homeobox B), also known as HNF1B or transcription factor 2 (TCF2), is a human gene. Function HNF1B encodes hepatocyte nuclear factor 1-beta, a protein of the homeobox-containing basic helix-turn-hel ...
; * SINE-class:
SIX1 Homeobox protein SIX1 (Sine oculis homeobox homolog 1) is a protein that in humans is encoded by the ''SIX1'' gene. Function The vertebrate SIX genes are homologs of the Drosophila 'sine oculis' (so) gene, which is expressed primarily in the ...
, SIX2,
SIX3 Homeobox protein SIX3 is a protein that in humans is encoded by the ''SIX3'' gene. Function The SIX homeobox 3 (SIX3) gene is crucial in embryonic development by providing necessary instructions for the formation of the forebrain and eye deve ...
, SIX4,
SIX5 Homeobox protein SIX5 is a protein that in humans is encoded by the ''SIX5'' gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''b ...
, SIX6 * CUT-class:
ONECUT1 One cut homeobox 1 is a protein that in humans is encoded by the ONECUT1 gene. Function This gene encodes a member of the Cut homeobox family of transcription factors. Expression of the encoded protein is enriched in the liver, where it sti ...
, ONECUT2, ONECUT3;
CUX1 CUX1 is an animal gene. The name stands for Cut like homeobox 1. The term "cut" derives from the "cut wing" phenotype observed in a mutant of '' Drosophila melanogaster''. In mammals, a CCAAT-displacement activity was originally described in DNA bin ...
, CUX2; SATB1, SATB2; * ZF-class: ADNP, ADNP2; TSHZ1, TSHZ2,
TSHZ3 Teashirt homolog 3 is a protein that in humans is encoded by the ''TSHZ3'' gene. In mice, it is a necessary part of the neural circuitry that controls breathing. The gene is also a homolog of the Drosophila melanogaster teashirt gene, which encod ...
;
ZEB1 Zinc finger E-box-binding homeobox 1 is a protein that in humans is encoded by the ''ZEB1'' gene. ZEB1 (previously known as TCF8) encodes a zinc finger and homeodomain transcription factor that represses T-lymphocyte-specific IL2 gene expression ...
,
ZEB2 Zinc finger E-box-binding homeobox 2 is a protein that in humans is encoded by the ''ZEB2'' gene. The ZEB2 protein is a transcription factor that plays a role in the transforming growth factor β (TGFβ) signaling pathways that are essential durin ...
; ZFHX2, ZFHX3, ZFHX4;
ZHX1 Zinc fingers and homeoboxes protein 1 is a protein that in humans is encoded by the ''ZHX1'' gene. The members of the zinc fingers and homeoboxes gene family are nuclear homodimeric transcriptional repressors that interact with the A subunit o ...
, HOMEZ; * PRD-class:
ALX1 ALX homeobox protein 1 is a protein that in humans is encoded by the ''ALX1'' gene. Function The specific function of this gene has yet to be determined in humans; however, in rodents, it is necessary for survival of the forebrain mesenchyme ...
(CART1), ALX3, ALX4; ARGFX;
ARX Arx, ARX, or ArX may refer to: * ARX (Algorithmic Research Ltd.), a digital security company *ARX (gene), Aristaless related homeobox *ARX (operating system), an operating system * ArX (revision control), revision control software *Arx (Roman), a ...
; DMBX1; DPRX; DRGX; DUXA, DUXB, DUX ( 1, 2, 3, 4, 4c, 5); ESX1; GSC, GSC2;
HESX1 Homeobox expressed in ES cells 1, also known as homeobox protein ANF, is a homeobox protein that in humans is encoded by the ''HESX1'' gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Men ...
; HOPX; ISX; LEUTX; MIXL1; NOBOX; OTP;
OTX1 Homeobox protein OTX1 is a protein that in humans is encoded by the ''OTX1'' gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' or ''b ...
,
OTX2 Homeobox protein OTX2 is a protein that in humans is encoded by the ''OTX2'' gene. Function This gene encodes a member of the bicoid sub-family of homeodomain-containing transcription factors. The encoded protein acts as a transcription facto ...
, CRX;
PAX2 Paired box gene 2, also known as Pax-2, is a protein which in humans is encoded by the ''PAX2'' gene. Function The Pax Genes, or Paired-Box Containing Genes, play important roles in the development and proliferation of multiple cell lines, d ...
,
PAX3 The PAX3 (paired box gene 3) gene encodes a member of the paired box or PAX family of transcription factors. The PAX family consists of nine human (PAX1-PAX9) and nine mouse (Pax1-Pax9) members arranged into four subfamilies. Human PAX3 and mouse ...
, PAX4, PAX5,
PAX6 Paired box protein Pax-6, also known as aniridia type II protein (AN2) or oculorhombin, is a protein that in humans is encoded by the ''PAX6'' gene. Function PAX6 is a member of the Pax gene family which is responsible for carrying the gene ...
,
PAX7 Paired box protein Pax-7 is a protein that in humans is encoded by the ''PAX7'' gene. Function Pax-7 plays a role in neural crest development and gastrulation, and it is an important factor in the expression of neural crest markers such as Slug ...
,
PAX8 Paired box gene 8, also known as PAX8, is a protein which in humans is encoded by the ''PAX8'' gene. Function This gene is a member of the paired box ( PAX) family of transcription factors. Members of this gene family typically encode proteins ...
;
PHOX2A Paired mesoderm homeobox protein 2A is a protein that in humans is encoded by the ''PHOX2A'' gene. Function The protein encoded by this gene contains a paired-like homeodomain most similar to that of the Drosophila aristaless gene product. T ...
, PHOX2B;
PITX1 Paired-like homeodomain 1 is a protein that in humans is encoded by the ''PITX1'' gene. Function This gene encodes a member of the RIEG/PITX homeobox family, which is in the bicoid class of homeodomain proteins. Members of this family are inv ...
, PITX2, PITX3;
PROP1 Homeobox protein prophet of PIT-1 is a protein that in humans is encoded by the ''PROP1'' gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''genera ...
; PRRX1, PRRX2;
RAX The Rax is a mountain range in the Northern Limestone Alps on the border of the Austrian federal provinces of Lower Austria and Styria. Its highest peak is the ''Heukuppe'' (2,007 m). The Rax, together with the nearby Schneeberg, are a traditio ...
, RAX2; RHOXF1, RHOXF2/ 2B; SEBOX;
SHOX The short-stature homeobox gene (SHOX), also known as short-stature-homeobox-containing gene, is a gene located on both the X and Y chromosomes, which is associated with short stature in humans if mutated or present in only one copy ( haploinsuff ...
, SHOX2; TPRX1; UNCX; VSX1, VSX2 * NKL-class: BARHL1, BARHL2; BARX1, BARX2; BSX;
DBX1 Homeobox protein DBX1, also known as developing brain homeobox protein 1, is a protein that in humans is encoded by the ''DBX1'' gene In biology, the word gene (from , ; "...Wilhelm Johannsen coined the word gene to describe the Mendelian ...
, DBX2; EMX1, EMX2; EN1, EN2;
HHEX Hematopoietically-expressed homeobox protein HHEX is a protein that in humans is encoded by the ''HHEX'' gene and also known as Proline Rich Homeodomain protein PRH. This gene encodes a member of the homeobox family of transcription factors, many ...
; HLX1;
LBX1 Transcription factor LBX1 is a protein that in humans is encoded by the ''LBX1'' gene. This gene and the orthologous mouse gene were found by their homology to the Drosophila lady bird early and late homeobox genes. In the mouse, this gene is ...
,
LBX2 In computing, LBX, or Low Bandwidth X, is a protocol to use the X Window System over network links with low bandwidth and high latency. It was introduced in X11R6.3 ("Broadway") in 1996, but never achieved wide use. It was disabled by default as ...
;
MSX1 Homeobox protein MSX-1, is a protein that in humans is encoded by the ''MSX1'' gene. MSX1 transcripts are not only found in thyrotrope-derived TSH cells, but also in the TtT97 thyrotropic tumor, which is a well differentiated hyperplastic tissue ...
,
MSX2 MSX is a standardized home computer architecture, announced by Microsoft and ASCII Corporation on June 16, 1983. It was initially conceived by Microsoft as a product for the Eastern sector, and jointly marketed by Kazuhiko Nishi, then vice- ...
; NANOG;
NOTO Noto ( scn, Notu; la, Netum) is a city and in the Province of Syracuse, Sicily, Italy. It is southwest of the city of Syracuse at the foot of the Iblean Mountains. It lends its name to the surrounding area Val di Noto. In 2002 Noto and i ...
; TLX1,
TLX2 T-cell leukemia homeobox protein 2 is a protein that in humans is encoded by the ''TLX2'' gene. Interactions TLX2 has been shown to interact Advocates for Informed Choice, dba interACT or interACT Advocates for Intersex Youth, is a 501( ...
,
TLX3 T-cell leukemia homeobox protein 3 is a protein that in humans is encoded by the ''TLX3'' gene. RNX (HOX11L2, TLX3) belongs to a family of orphan homeobox genes that encode DNA-binding nuclear transcription factors. Members of the HOX11 gene f ...
; TSHZ1, TSHZ2,
TSHZ3 Teashirt homolog 3 is a protein that in humans is encoded by the ''TSHZ3'' gene. In mice, it is a necessary part of the neural circuitry that controls breathing. The gene is also a homolog of the Drosophila melanogaster teashirt gene, which encod ...
; VAX1, VAX2, VENTX; ** Nkx:
NKX2-1 NK2 homeobox 1 (NKX2-1), also known as thyroid transcription factor 1 (TTF-1), is a protein which in humans is encoded by the ''NKX2-1'' gene. Function Thyroid transcription factor-1 (TTF-1) is a protein that regulates transcription (genetics ...
,
NKX2-4 Marine Corps Air Station Miramar (MCAS Miramar) , formerly Naval Auxiliary Air Station (NAAS) Miramar and Naval Air Station (NAS) Miramar, is a United States Marine Corps installation that is home to the 3rd Marine Aircraft Wing, which is the av ...
; NKX2-2,
NKX2-8 Marine Corps Air Station Miramar (MCAS Miramar) , formerly Naval Auxiliary Air Station (NAAS) Miramar and Naval Air Station (NAS) Miramar, is a United States Marine Corps installation that is home to the 3rd Marine Aircraft Wing, which is the av ...
;
NKX3-1 Homeobox protein Nkx-3.1, also known as NKX3-1, NKX3, BAPX2, NKX3A and NKX3.1 is a protein that in humans is encoded by the ''NKX3-1'' gene located on chromosome 8p. ''NKX3-1'' is a prostatic tumor suppressor gene. ''NKX3-1'' is an androgen-reg ...
,
NKX3-2 NK3 homeobox 2 also known as NKX3-2 is a human gene. It is a homolog of ''bagpipe (bap)'' in ''Drosophila'' and therefore also known as Bapx1 (bagpipe homeobox homolog 1). The protein encoded by this gene is a homeodomain containing transcriptio ...
; NKX2-3,
NKX2-5 Homeobox protein Nkx-2.5 is a protein that in humans is encoded by the ''NKX2-5'' gene. Function Homeobox-containing genes play critical roles in regulating tissue-specific gene expression essential for tissue differentiation, as well as det ...
,
NKX2-6 Marine Corps Air Station Miramar (MCAS Miramar) , formerly Naval Auxiliary Air Station (NAAS) Miramar and Naval Air Station (NAS) Miramar, is a United States Marine Corps installation that is home to the 3rd Marine Aircraft Wing, which is the avi ...
; HMX1, HMX2, HMX3;
NKX6-1 Homeobox protein Nkx-6.1 is a protein that in humans is encoded by the ''NKX6-1'' gene. Function In the pancreas, NKX6.1 is required for the development of beta cell Beta cells (β-cells) are a type of cell found in pancreatic islets that s ...
;
NKX6-2 Homeobox protein Nkx-6.2 is a protein that in humans is encoded by the ''NKX6-2'' gene In biology, the word gene (from , ; "... Wilhelm Johannsen coined the word gene to describe the Mendelian units of heredity..." meaning ''generation'' ...
; NKX6-3;


See also

*
Evolutionary developmental biology Evolutionary developmental biology (informally, evo-devo) is a field of biological research that compares the developmental processes of different organisms to infer how developmental processes evolved. The field grew from 19th-century beginn ...
*
Body plan A body plan, ( ), or ground plan is a set of morphological features common to many members of a phylum of animals. The vertebrates share one body plan, while invertebrates have many. This term, usually applied to animals, envisages a "blueprin ...


References


Further reading

* * *


External links


The Homeodomain Resource (National Human Genome Research Institute, National Institutes of Health)

HomeoDB: a database of homeobox genes diversity. Zhong YF, Butts T, Holland PWH, since 2008.
* * {{genarch Genes Developmental genetics Protein domains Transcription factors Evolutionary developmental biology