BEND2 is a protein that in humans is encoded by the ''BEND2'' gene.
It is also found in other vertebrates, including mammals, birds, and reptiles.
The expression of BEND2 in ''Homo sapiens'' is regulated and occurs at high levels in the testis and in the bone marrow.
The presence of the BEN domains in the BEND2 protein indicates that this protein may be involved in chromatin modification and regulation.
Gene
Common aliases
BEND2 stands for BEN domain containing 2 and is also known as CXorf20 (HGNC ID 28509).
Locus and size
The locus for BEND2 is on the minus strand of the X chromosome at Xp22.13. The gene is approximately 58 kilobases in length.
mRNA
Alternative splicing
BEND2 contains 14 exons, which undergo alternative splicing to create five transcript variants that range from 2,144 to 4,720 base pairs (bp) in the mature mRNA.
The longest and most complete transcript of the gene, variant 1, encodes isoform 1 of the BEND2 protein (NP_699177.2).
5' and 3'UTR
The untranslated regions (UTRs) flanking the coding sequence of BEND2 at the 5' and 3' ends of the mature mRNA molecule contain sites for RNA-binding proteins, including RBMX, PUM2, and EIF4B, as well as microRNA binding sites. The 5' UTR also contains an upstream in-frame stop codon, and the 3' UTR contains a polyadenylation signal sequence.
Protein (Isoform 1)
Molecular weight and internal composition
The predicted molecular weight is 87.9 kDa.
The predicted isoelectric point is pH 5.07.
The internal composition is enriched for serine residues.
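Predicted values such as these are usually derived from the amino acid sequence alone. The following is a minimal sketch of how a molecular weight and a residue composition could be estimated, assuming standard average residue masses; the function names and the toy sequence are hypothetical and are not the BEND2 sequence or the tool used for the figures above.

```python
from collections import Counter

# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def molecular_weight(seq):
    """Estimate the average molecular weight (Da) of an unmodified peptide."""
    seq = seq.upper()
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def composition(seq):
    """Return the fraction of each residue type in the sequence."""
    seq = seq.upper()
    if not seq:
        return {}
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


# Hypothetical serine-rich toy peptide, for illustration only.
toy = "MSSPSKSRSE"
print(round(molecular_weight(toy), 1), "Da")
print("serine fraction:", composition(toy)["S"])
```

As a rough sanity check, a common rule of thumb of about 110 Da per residue gives 799 × 110 ≈ 87,900 Da for isoform 1, which is in line with the predicted 87.9 kDa.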
Isoforms
Corresponding to the five alternative transcripts of BEND2, the protein encoded by this gene is found in two isoforms (1 and 2) as well as three predicted structures (X1, X2, and X3). These isoforms range from 645 to 813 amino acids in length.
Isoform 1 is 799 amino acids in length.
Subcellular location
The presence of nuclear localization signals within the amino acid sequence, or primary structure, of the BEND2 protein leads to a prediction of subcellular localization in the nucleus.
The pat7 signal, P-X(1-3)-(3-4K/R), and a bipartite nuclear localization signal are both found near the N-terminus of the protein.
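The pat7 notation can be read as a proline, followed by one to three residues of any type, followed by a four-residue stretch in which at least three residues are lysine (K) or arginine (R). The sketch below scans a sequence under that reading; the function name, the parameters, and the toy sequence are hypothetical, and the exact rule applied by the original prediction tool may differ.

```python
BASIC = {"K", "R"}


def find_pat7_like(seq, min_basic=3, window=4):
    """Return 0-based positions of prolines that start a pat7-like motif.

    Assumed reading of P-X(1-3)-(3-4K/R): a P, then a gap of 1 to 3
    residues, then a window of 4 residues with at least 3 K/R.
    """
    seq = seq.upper()
    hits = []
    for i, aa in enumerate(seq):
        if aa != "P":
            continue
        for gap in (1, 2, 3):
            start = i + 1 + gap
            segment = seq[start:start + window]
            # Only full-length windows count as a basic segment.
            if len(segment) == window and sum(c in BASIC for c in segment) >= min_basic:
                hits.append(i)
                break
    return hits


# Hypothetical toy sequence, not taken from BEND2.
print(find_pat7_like("MAPAKRKRLS"))
```

A plain scan is used instead of a regular expression because the "at least three of four" condition is awkward to express as a single pattern.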
Structure
The secondary structure of BEND2 is unclear, in particular at the N-terminus, which is poorly conserved between orthologs. The C-terminus contains two BEN domains, which are predicted to form a series of alpha helices.
Post-translational modifications
Based on its primary structure, BEND2 is predicted to undergo N-terminal acetylation, glycation of several lysine residues, SUMOylation, a SUMO interaction at the N-terminus, S-palmitoylation, and extensive phosphorylation.
Interacting Proteins
BEND2 has been found to interact with other proteins in experimental yeast two-hybrid screens or pull-down assays.
BEN Domains (protein feature)
BEND2 has two BEN domains at its C-terminus.
BEN domains are found in a diverse array of proteins and are predicted to be important for chromatin remodeling as well as for the recruitment of chromatin-modifying factors during the transcriptional regulation of gene expression.
BEN domains are predicted to form four alpha helices that allow this domain to interact with its DNA target.
Dai et al. (2013) showed that the ''Drosophila melanogaster'' Insensitive (Insv) gene encodes a protein that has no domains of known chemical function, yet contains a single BEN domain. They illustrated the activity of the Insv protein in transcriptional regulation of genes and obtained a crystal structure of two Insv BEN domains interacting with their DNA target site.
Expression
Tissue expression pattern
The expression of the BEND2 gene is regulated and it is therefore not ubiquitously expressed in the human body. High expression occurs in the testis and in the bone marrow. The NCBI EST profile for this gene shows expression only in the testis and in the muscle.
Transcriptional regulation of expression
The promoter regulating expression of BEND2 (GXP_2567556) is 1,255 base pairs in length and is located directly upstream of the BEND2 gene. It regulates transcription of all five transcript variants of BEND2. Genomatix's MatInspector program predicted 418 transcription factor binding sites within the BEND2 promoter, including sites for SRY, neurogenin, interferon regulatory factor-3 (IRF-3), Ikaros2, and TCF/LEF-1.
Homology
Paralogs
The BEND2 protein has no known paralogs within the human genome.
BEN-domain containing gene family
The BEND2 gene belongs to a family of human genes known as "BEN-domain containing". This family includes BANP (BEND1), BEND3, BEND4, BEND5, BEND6, BEND7, NACC1 (BEND8), and NACC2 (BEND9). The loci for these genes are spread throughout the human genome. Each of these genes encodes between one and four BEN domains. Apart from these motifs, the genes of the BEN family do not have similar sequences.
Orthologs
The BEND2 gene is conserved across evolutionary time: it has 114 known orthologs in a wide range of vertebrate species, including mammals, birds, crocodilians, and amphibians. The BEND2 protein has 42 known orthologs. The C-terminus of the protein, the location of its BEN domains, is highly conserved; however, the N-terminus is not well conserved, even within the order Primates.
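Regional conservation of this kind is often quantified as percent identity over aligned sequences. The sketch below compares two halves of a pairwise alignment, assuming gaps are written as "-" and skipping gapped columns; the function name and the toy aligned sequences are hypothetical and do not come from BEND2 orthologs.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a, aln_b):
        if a == "-" or b == "-":
            continue  # gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy alignment: a divergent N-terminal half and an
# identical C-terminal half.
seq_a = "MSTK-PLAVKELLN"
seq_b = "MAQRGPLAVKELLN"
half = len(seq_a) // 2
print("N-terminal half:", percent_identity(seq_a[:half], seq_b[:half]))
print("C-terminal half:", percent_identity(seq_a[half:], seq_b[half:]))
```

Other conventions exist, such as dividing by the full alignment length or by the shorter sequence length, so reported identity values depend on the definition used.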
Function
BEND2 is predicted to be a DNA-binding protein due to the presence of BEN domains at its C-terminus, a hypothesis supported by its predicted localization to the nucleus, the transcription factors predicted to bind its promoter region, and the nature of the proteins it interacts with. Though the precise function of the BEND2 protein is not yet well understood, BEN domains have been found to be important regulators of transcription.
Clinical significance
The diseases that have been linked to BEND2 are related to the central nervous system, though the gene is not highly expressed in these tissues.
* BEND2 was identified as one of the genes that causes a central nervous system primitive neuroectodermal tumor when fused with the MN1 gene, which is located on chromosome 22.
* A rare primary central nervous system lymphoma was found to have a high mutation ratio for BEND2; however, the authors do not describe this gene as primarily responsible for the tumor.
* A young girl diagnosed with severe epileptic encephalopathy was found to have a 300-kb deletion in a region that included BEND2, an extremely rare mutation not found in her parents’ genomes.
* An individual with adult autism was identified to have a copy-number variant of unknown significance in a region containing only BEND2.