In molecular biology, the ARID domain (AT-rich interaction domain; also known as the BRIGHT domain, for B-cell Regulator of Ig Heavy chain Transcription) is a protein domain that binds to DNA. ARID domain-containing proteins are found in fungi, plants [Zheng B, He H, Zheng Y, Wu W, McCormick S (2014) An ARID Domain-Containing Protein within Nuclear Bodies Is Required for Sperm Cell Formation in ''Arabidopsis thaliana''. PLoS Genet 10(7): e1004421. doi: 10.1371/journal.pgen.1004421], and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes, including embryonic development, cell lineage gene regulation, and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure.
The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, ARID proteins can be partitioned into three structural classes: Minimal ARID proteins, which consist of a core domain formed by six alpha-helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.
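The three classes can be summarized as a small lookup on which flanking helices accompany the six-helix core. The Python sketch below is illustrative only: the names `AridClass` and `classify_arid` are hypothetical, and it assumes the presence of flanking helices is already known, whereas the classification described above is derived from primary sequence homology. A domain with only a C-terminal extra helix is not one of the three classes named in the text, so the sketch returns `None` for it.

```python
from enum import Enum
from typing import Optional


class AridClass(Enum):
    """The three structural classes named in the text (labels are illustrative)."""
    MINIMAL = "Minimal ARID"
    N_TERMINAL = "ARID with N-terminal helix"
    EXTENDED = "Extended-ARID"


def classify_arid(n_terminal_helix: bool, c_terminal_helix: bool) -> Optional[AridClass]:
    """Map flanking-helix presence onto a structural class.

    Assumes the six-helix core is present. Returns None for the
    C-terminal-only case, which the text does not assign to a class.
    """
    if n_terminal_helix and c_terminal_helix:
        return AridClass.EXTENDED
    if n_terminal_helix:
        return AridClass.N_TERMINAL
    if not c_terminal_helix:
        return AridClass.MINIMAL
    return None
```

For example, `classify_arid(True, False)` yields the class with a single additional N-terminal helix.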
The human SWI-SNF complex protein ARID1A is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both ARID1A and yeast SWI1, suggesting that ARID1A is a human counterpart of SWI1.
The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date only Bright (a regulator of B-cell-specific gene expression), dead ringer (a ''Drosophila melanogaster'' gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly with regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, ARID1A shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.
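To make "AT-rich site" concrete, the following Python sketch scans a DNA string for windows with a high proportion of A and T bases. It is a minimal illustration, not a method taken from the studies cited here: the function names, the window length of 10, and the 0.8 threshold are all assumptions, since the text gives no quantitative definition of AT-richness.

```python
from typing import Iterator, Tuple


def at_fraction(seq: str) -> float:
    """Return the fraction of positions in seq that are A or T.

    Any other character (G, C, N, ...) counts as non-AT.
    """
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def at_rich_windows(seq: str, window: int = 10,
                    threshold: float = 0.8) -> Iterator[Tuple[int, float]]:
    """Yield (start, fraction) for each window at or above the AT threshold.

    The window size and threshold are arbitrary illustrative defaults.
    """
    for start in range(len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            yield start, frac
```

Calling `list(at_rich_windows(seq))` returns the start offsets of candidate AT-rich stretches. A protein without sequence preference, as described for ARID1A, would not be expected to favor these positions over others.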
References
{{InterPro content|IPR001606}}