A homeobox is a DNA sequence, around 180 base pairs long, that regulates large-scale anatomical features in the early stages of embryonic development. Mutations in a homeobox may change large-scale anatomical features of the full-grown organism. Homeoboxes are found within genes that are involved in the regulation of patterns of anatomical development (morphogenesis) in animals, fungi, plants, and numerous single-celled eukaryotes. Homeobox genes encode homeodomain protein products that are transcription factors sharing a characteristic protein fold structure that binds DNA to regulate expression of target genes. Homeodomain proteins regulate gene expression and cell differentiation during early embryonic development; thus, mutations in homeobox genes can cause developmental disorders.

Homeosis is a term coined by William Bateson to describe the outright replacement of a discrete body part with another body part, e.g. antennapedia, the replacement of the antenna on the head of a fruit fly with legs. The "homeo-" prefix in the words "homeobox" and "homeodomain" stems from this mutational phenotype, which is observed when some of these genes are mutated in animals. The homeobox domain was first identified in a number of ''Drosophila'' homeotic and segmentation proteins, but is now known to be well-conserved in many other animals, including vertebrates.


Discovery

The existence of homeobox genes was first discovered in ''Drosophila'' by isolating the gene responsible for a homeotic transformation where legs grow from the head instead of the expected antennae. Walter Gehring identified a gene called ''antennapedia'' that caused this homeotic phenotype. Analysis of ''antennapedia'' revealed that this gene contained a 180 base pair sequence that encoded a DNA binding domain, which William McGinnis termed the "homeobox". The existence of additional ''Drosophila'' genes containing the ''antennapedia'' homeobox sequence was independently reported by Ernst Hafen, Michael Levine, William McGinnis, and Walter Jakob Gehring of the University of Basel in Switzerland and Matthew P. Scott and Amy Weiner of Indiana University in Bloomington in 1984. Isolation of homologous genes by Edward De Robertis and William McGinnis revealed that numerous genes from a variety of species contained the homeobox. Subsequent phylogenetic studies detailing the evolutionary relationship between homeobox-containing genes showed that these genes are present in all bilaterian animals.


Homeodomain structure

The characteristic homeodomain protein fold consists of a 60-amino-acid-long domain composed of three alpha helices. The following shows the consensus homeodomain (~60 amino acid chain):
            Helix 1          Helix 2         Helix 3/4
         ______________    __________    _________________
RRRKRTAYTRYQLLELEKEFHFNRYLTRRRRIELAHSLNLTERHIKIWFQNRRMKWKKEN
....|....|....|....|....|....|....|....|....|....|....|....|
         10        20        30        40        50        60
Helix 2 and helix 3 form a so-called helix-turn-helix (HTH) structure, where the two alpha helices are connected by a short loop region. The N-terminal two helices of the homeodomain are antiparallel and the longer C-terminal helix is roughly perpendicular to the axes of the first two. It is this third helix that interacts directly with DNA via a number of hydrogen bonds and hydrophobic interactions, as well as indirect interactions via water molecules, which occur between specific side chains and the exposed bases within the major groove of the DNA.

Homeodomain proteins are found in eukaryotes. Through the HTH motif, they share limited sequence similarity and structural similarity to prokaryotic transcription factors, such as lambda phage proteins that alter the expression of genes in prokaryotes. The HTH motif shows only limited sequence similarity but a similar structure across a wide range of DNA-binding proteins (e.g., cro and repressor proteins, homeodomain proteins, etc.). One of the principal differences between HTH motifs in these different proteins arises from the stereochemical requirement for glycine in the turn, which is needed to avoid steric interference of the beta-carbon with the main chain: for cro and repressor proteins the glycine appears to be mandatory, whereas for many of the homeotic and other DNA-binding proteins the requirement is relaxed.
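
As a rough illustration of the numbers in this section, the following Python sketch checks that the consensus sequence shown in the diagram is 60 residues long, that this corresponds to a 180 base pair coding sequence (three bases per codon), and slices out the three helices. The helix boundaries (residues 10-23, 28-37, and 42-58) are read off the underline diagram and are approximate; the names `CONSENSUS`, `HELICES`, and `helix_segments` are illustrative only.

```python
# Consensus homeodomain sequence copied from the diagram in this section.
CONSENSUS = "RRRKRTAYTRYQLLELEKEFHFNRYLTRRRRIELAHSLNLTERHIKIWFQNRRMKWKKEN"

# Approximate 1-based, inclusive helix boundaries read off the underline
# diagram (an assumption of this sketch, not measured structural data).
HELICES = {"helix 1": (10, 23), "helix 2": (28, 37), "helix 3": (42, 58)}

BASES_PER_CODON = 3


def helix_segments(sequence, helices):
    """Return the residues covered by each helix as a dict of strings."""
    return {
        name: sequence[start - 1:end]  # convert 1-based inclusive to a slice
        for name, (start, end) in helices.items()
    }


if __name__ == "__main__":
    print(len(CONSENSUS), "residues")
    print(len(CONSENSUS) * BASES_PER_CODON, "base pairs of coding sequence")
    for name, residues in helix_segments(CONSENSUS, HELICES).items():
        print(name, residues)
```

The third segment is the recognition helix described above; note how many of its residues are arginine (R) and lysine (K).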


Sequence specificity

Homeodomains can bind both specifically and nonspecifically to B-DNA, with the C-terminal recognition helix aligning in the DNA's major groove and the unstructured peptide "tail" at the N-terminus aligning in the minor groove. The recognition helix and the inter-helix loops are rich in arginine and lysine residues, which form hydrogen bonds to the DNA backbone. Conserved hydrophobic residues in the center of the recognition helix aid in stabilizing the helix packing. Homeodomain proteins show a preference for the DNA sequence 5'-TAAT-3'; sequence-independent binding occurs with significantly lower affinity. The specificity of a single homeodomain protein is usually not enough to recognize specific target gene promoters, making cofactor binding an important mechanism for controlling binding sequence specificity and target gene expression. To achieve higher target specificity, homeodomain proteins form complexes with other transcription factors to recognize the promoter region of a specific target gene.
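
The preference for a 5'-TAAT-3' core can be illustrated with a short Python sketch that scans a DNA string for the core on both strands. The function names and the example sequence are hypothetical, and a realistic binding-site search would weigh flanking bases rather than rely on an exact four-base match.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_core_sites(seq, core="TAAT"):
    """List (position, strand) for every exact match of the core.

    Positions are 0-based offsets into `seq`; '-' means the core lies on
    the opposite strand, i.e. its reverse complement appears in `seq`.
    """
    seq = seq.upper()
    motifs = {"+": core, "-": reverse_complement(core)}
    sites = []
    for i in range(len(seq) - len(core) + 1):
        window = seq[i:i + len(core)]
        for strand, motif in motifs.items():
            if window == motif:
                sites.append((i, strand))
    return sites


if __name__ == "__main__":
    toy = "GGCTAATTGCATTACC"  # made-up sequence for illustration only
    print(find_core_sites(toy))
```

In random sequence with equal base frequencies, a given four-base core is expected about once every 4^4 = 256 positions on each strand, which gives a feel for why a single homeodomain is rarely specific enough on its own.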


Biological function

Homeodomain proteins function as transcription factors due to the DNA binding properties of the conserved HTH motif. Homeobox genes are considered to be master control genes, meaning that a single homeodomain protein can regulate expression of many target genes. Homeodomain proteins direct the formation of the body axes and body structures during early embryonic development. Many homeodomain proteins induce cellular differentiation by initiating the cascades of coregulated genes required to produce individual tissues and organs. Other proteins in the family, such as NANOG, are involved in maintaining pluripotency and preventing cell differentiation.


Regulation

Hox genes and their associated microRNAs are highly conserved developmental master regulators with tight tissue-specific, spatiotemporal control. These genes are known to be dysregulated in several cancers and are often controlled by DNA methylation. The regulation of Hox genes is highly complex and involves reciprocal interactions, mostly inhibitory. ''Drosophila'' is known to use the polycomb and trithorax complexes to maintain the expression of Hox genes after the down-regulation of the pair-rule and gap genes that occurs during larval development. Polycomb-group proteins can silence the Hox genes by modulation of chromatin structure.


Mutations

Mutations to homeobox genes can produce easily visible phenotypic changes in body segment identity, such as the Antennapedia and Bithorax mutant phenotypes in ''Drosophila''. Duplication of homeobox genes can produce new body segments, and such duplications are likely to have been important in the evolution of segmented animals.


Evolution

Phylogenetic analysis of homeobox gene sequences and homeodomain protein structures suggests that the last common ancestor of plants, fungi, and animals had at least two homeobox genes. Molecular evidence shows that some limited number of Hox genes have existed in the Cnidaria since before the earliest true Bilateria, making these genes pre-Paleozoic. It is accepted that the three major animal ANTP-class clusters, Hox, ParaHox, and NK (MetaHox), are the result of segmental duplications. A first duplication created MetaHox and ProtoHox, the latter of which later duplicated into Hox and ParaHox. The clusters themselves were created by tandem duplications of a single ANTP-class homeobox gene. Gene duplication followed by neofunctionalization is responsible for the many homeobox genes found in eukaryotes. Comparison of homeobox genes and gene clusters has been used to understand the evolution of genome structure and body morphology throughout metazoans.


Types of homeobox genes


Hox genes

Hox genes are the most commonly known subset of homeobox genes. They are essential metazoan genes that determine the identity of embryonic regions along the anterior-posterior axis. The first vertebrate Hox gene was isolated in ''Xenopus'' by Edward De Robertis and colleagues in 1984. The main interest in this set of genes stems from their unique behavior and arrangement in the genome. Hox genes are typically found in an organized cluster. The linear order of Hox genes within a cluster is directly correlated to the order in which they are expressed in both time and space during development. This phenomenon is called colinearity.

Mutations in these homeotic genes cause displacement of body segments during embryonic development. This is called ectopia. For example, when one gene is lost the segment develops into a more anterior one, while a mutation that leads to a gain of function causes a segment to develop into a more posterior one. Famous examples are ''Antennapedia'' and bithorax in ''Drosophila'', which can cause the development of legs instead of antennae and the development of a duplicated thorax, respectively.

In vertebrates, the four paralog clusters are partially redundant in function, but have also acquired several derived functions. For example, HoxA and HoxD specify segment identity along the limb axis. Specific members of the Hox family have been implicated in vascular remodeling, angiogenesis, and disease by orchestrating changes in matrix degradation, integrins, and components of the extracellular matrix (ECM). HoxA5 is implicated in atherosclerosis. HoxD3 and HoxB3 are proinvasive, angiogenic genes that upregulate b3 and a5 integrins and Efna1 in endothelial cells (ECs), respectively. HoxA3 induces EC migration by upregulating MMP14 and uPAR. Conversely, HoxD10 and HoxA5 have the opposite effect of suppressing EC migration and angiogenesis, and stabilizing adherens junctions by upregulating TIMP1/downregulating uPAR and MMP14, and by upregulating Tsp2/downregulating VEGFR2, Efna1, Hif1alpha and COX-2, respectively. HoxA5 also upregulates the tumor suppressor p53 and Akt1 by downregulation of PTEN. Suppression of HoxA5 has been shown to attenuate hemangioma growth. HoxA5 has far-reaching effects on gene expression, causing ~300 genes to become upregulated upon its induction in breast cancer cell lines. Overexpression of the HoxA5 protein transduction domain prevents inflammation, as shown by inhibition of TNFalpha-inducible monocyte binding to HUVECs.


LIM genes

LIM genes (named after the initial letters of the names of three proteins where the characteristic domain was first identified) encode two 60-amino-acid, cysteine- and histidine-rich LIM domains and a homeodomain. The LIM domains function in protein-protein interactions and can bind zinc ions. LIM domain proteins are found in both the cytosol and the nucleus. They function in cytoskeletal remodeling, at focal adhesion sites, as scaffolds for protein complexes, and as transcription factors.


Pax genes

Most Pax genes contain a homeobox and a paired domain that also binds DNA to increase binding specificity, though some Pax genes have lost all or part of the homeobox sequence. Pax genes function in embryo segmentation, nervous system development, generation of the frontal eye fields, skeletal development, and formation of face structures. Pax6 is a master regulator of eye development: the gene is necessary for development of the optic vesicle and subsequent eye structures.


POU genes

Proteins containing a POU region consist of a homeodomain and a separate, structurally homologous POU domain that contains two helix-turn-helix motifs and also binds DNA. The two domains are linked by a flexible loop that is long enough to stretch around the DNA helix, allowing the two domains to bind on opposite sides of the target DNA, collectively covering an eight-base segment with consensus sequence 5'-ATGCAAAT-3'. The individual domains of POU proteins bind DNA only weakly, but have strong sequence-specific affinity when linked. The POU domain itself has significant structural similarity with repressors expressed in bacteriophages, particularly lambda phage.
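
A minimal sketch of matching against the octamer consensus: the Python function below slides an eight-base window along a sequence and reports windows within a chosen number of mismatches of 5'-ATGCAAAT-3'. The mismatch threshold, the function names, and the example sequence are assumptions made for illustration; they are not taken from any POU protein binding data.

```python
OCTAMER = "ATGCAAAT"  # consensus sequence quoted in the text


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_near_consensus(seq, consensus=OCTAMER, max_mismatches=1):
    """Return (position, mismatch_count) for windows close to the consensus.

    Positions are 0-based; only the given strand is scanned.
    """
    seq = seq.upper()
    width = len(consensus)
    hits = []
    for i in range(len(seq) - width + 1):
        d = mismatches(seq[i:i + width], consensus)
        if d <= max_mismatches:
            hits.append((i, d))
    return hits


if __name__ == "__main__":
    toy = "CCATGCAAATGGATGCTAATCC"  # made-up sequence for illustration only
    print(find_near_consensus(toy))
```

Allowing a mismatch reflects the idea that a consensus sequence summarizes the most frequent base at each position rather than a strict requirement.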


Plant homeobox genes

As in animals, the plant homeobox genes code for the typical 60-amino-acid-long DNA-binding homeodomain or, in the case of the TALE (three amino acid loop extension) homeobox genes, for an atypical homeodomain consisting of 63 amino acids. According to their conserved intron–exon structure and to unique codomain architectures they have been grouped into 14 distinct classes: HD-ZIP I to IV, BEL, KNOX, PLINC, WOX, PHD, DDT, NDX, LD, SAWADEE and PINTOX. Conservation of codomains suggests a common eukaryotic ancestry for TALE and non-TALE homeodomain proteins.


Human homeobox genes

The Hox genes in humans are organized in four chromosomal clusters:
ParaHox genes are analogously found in four areas. They include CDX1, CDX2, CDX4; GSX1, GSX2; and PDX1. Other genes considered Hox-like include EVX1, EVX2; GBX1, GBX2; MEOX1, MEOX2; and MNX1. The NK-like (NKL) genes, some of which are considered "MetaHox", are grouped with Hox-like genes into a large ANTP-like group.

Humans have a "distal-less homeobox" family: DLX1, DLX2, DLX3, DLX4, DLX5, and DLX6. Dlx genes are involved in the development of the nervous system and of limbs. They are considered a subset of the NK-like genes.

Human TALE (three amino acid loop extension) homeobox genes, which encode an "atypical" homeodomain of 63 rather than 60 amino acids, are: IRX1, IRX2, IRX3, IRX4, IRX5, IRX6; MEIS1, MEIS2, MEIS3; MKX; PBX1, PBX2, PBX3, PBX4; PKNOX1, PKNOX2; TGIF1, TGIF2, TGIF2LX, TGIF2LY.

In addition, humans have the following homeobox genes and proteins:
* LIM-class: ISL1, ISL2; LHX1, LHX2, LHX3, LHX4, LHX5, LHX6, LHX8, LHX9; LMX1A, LMX1B
* POU-class: HDX; POU1F1; POU2F1; POU2F2; POU2F3; POU3F1; POU3F2; POU3F3; POU3F4; POU4F1; POU4F2; POU4F3; POU5F1; POU5F1P1; POU5F1P4; POU5F2; POU6F1; and POU6F2
* CERS-class: LASS2, LASS3, LASS4, LASS5, LASS6
* HNF-class: HMBOX1; HNF1A, HNF1B
* SINE-class: SIX1, SIX2, SIX3, SIX4, SIX5, SIX6
* CUT-class: ONECUT1, ONECUT2, ONECUT3; CUX1, CUX2; SATB1, SATB2
* ZF-class: ADNP, ADNP2; TSHZ1, TSHZ2, TSHZ3; ZEB1, ZEB2; ZFHX2, ZFHX3, ZFHX4; ZHX1, HOMEZ
* PRD-class: ALX1 (CART1), ALX3, ALX4; ARGFX; ARX; DMBX1; DPRX; DRGX; DUXA, DUXB, DUX (1, 2, 3, 4, 4c, 5); ESX1; GSC, GSC2; HESX1; HOPX; ISX; LEUTX; MIXL1; NOBOX; OTP; OTX1, OTX2, CRX; PAX2, PAX3, PAX4, PAX5, PAX6, PAX7, PAX8; PHOX2A, PHOX2B; PITX1, PITX2, PITX3; PROP1; PRRX1, PRRX2; RAX, RAX2; RHOXF1, RHOXF2/2B; SEBOX; SHOX, SHOX2; TPRX1; UNCX; VSX1, VSX2
* NKL-class: BARHL1, BARHL2; BARX1, BARX2; BSX; DBX1, DBX2; EMX1, EMX2; EN1, EN2; HHEX; HLX1; LBX1, LBX2; MSX1, MSX2; NANOG; NOTO; TLX1, TLX2, TLX3; TSHZ1, TSHZ2, TSHZ3; VAX1, VAX2, VENTX
** Nkx: NKX2-1, NKX2-4; NKX2-2, NKX2-8; NKX3-1, NKX3-2; NKX2-3, NKX2-5, NKX2-6; HMX1, HMX2, HMX3; NKX6-1; NKX6-2; NKX6-3


See also

* Evolutionary developmental biology
* Body plan




External links


The Homeodomain Resource (National Human Genome Research Institute, National Institutes of Health)

HomeoDB: a database of homeobox genes diversity. Zhong YF, Butts T, Holland PWH, since 2008.