Sal-like protein 4 (SALL4) is a transcription factor encoded by ''SALL4'', a member of the ''Spalt-like'' (''SALL'') gene family.
The ''SALL'' genes were identified based on their sequence homology to ''Spalt'', a homeotic gene originally cloned in ''Drosophila melanogaster'' that is important for terminal trunk structure formation in embryogenesis and imaginal disc development in the larval stages. There are four human SALL proteins (SALL1, 2, 3, and 4), which share structural homology and play diverse roles in embryonic development, kidney function, and cancer. The ''SALL4'' gene encodes at least three isoforms, termed A, B, and C, through alternative splicing, with the A and B forms being the most studied. SALL4 can alter gene expression through its interaction with many co-factors and epigenetic complexes. It is also known as a key embryonic stem cell (ESC) factor.
Structure, interaction partners, and DNA binding activity
SALL4 contains one zinc finger in its amino (N-) terminus and three clusters of zinc fingers, each of which coordinates zinc with two cysteines and two histidines (Cys2His2-type) and potentially confers nucleic acid binding activity. SALL4B lacks two of the zinc finger clusters found in the A isoform. It remains unclear which zinc finger cluster is responsible for SALL4’s DNA-binding property.
Different SALL family members can form hetero- or homodimers via their conserved glutamine (Q)-rich region. SALL4 has at least one canonical nuclear localization signal (NLS) with the K-K/R-X-K/R motif in the N-terminal portion of the protein, shared by both the A and B isoforms (residues 64–67).
One report has suggested that with a mutated NLS sequence, SALL4 cannot localize to the nucleus.
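The K-K/R-X-K/R motif can be read as a four-residue pattern: a lysine, then a lysine or arginine, then any residue, then a lysine or arginine. A minimal sketch of scanning a protein sequence for this pattern is given here, assuming one-letter amino acid codes and treating X as any residue. The function name `find_nls_candidates` and the toy peptide are hypothetical illustrations; the toy peptide is not the SALL4 sequence.

```python
import re

# K-K/R-X-K/R: lysine, then lysine or arginine, then any residue (X),
# then lysine or arginine. The lookahead lets overlapping hits be reported.
NLS_MOTIF = re.compile(r"(?=(K[KR][A-Z][KR]))")


def find_nls_candidates(sequence):
    """Return (1-based start position, matched residues) for each motif hit."""
    return [(m.start() + 1, m.group(1)) for m in NLS_MOTIF.finditer(sequence.upper())]


# Hypothetical toy peptide for illustration only (not the real SALL4 sequence).
toy_peptide = "MSAGKRPKLLAKKTRE"
print(find_nls_candidates(toy_peptide))  # [(5, 'KRPK'), (12, 'KKTR')]
```

A pattern match of this kind only flags candidate sites; whether a given site acts as a functional NLS has to be tested experimentally, for example by mutating it as in the report mentioned above.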
Through a 12-amino acid sequence in its N-terminus (N-12a.a.), SALL4 binds to retinoblastoma binding protein 4 (RBBP4), a subunit of the nucleosome remodeling and histone deacetylation (NuRD) complex, which also contains chromodomain-helicase-DNA binding proteins (CHD3/4 or Mi-2α/β), metastasis-associated proteins (MTA), methyl-CpG-binding domain proteins (MBD2 or MBD3), and histone deacetylases (HDAC1 and HDAC2).
This association allows SALL4 to act as a transcriptional repressor. Accordingly, SALL4 has been shown to localize to heterochromatin regions in cells, for which its last zinc finger cluster (shared between SALL4A and B) is necessary.
Besides the NuRD complex, SALL4 is reportedly able to bind to other epigenetic modifiers such as histone lysine-specific demethylase 1 (LSD1), which is frequently associated with the NuRD complex and, in turn, with gene repression. In addition, SALL4 can also activate gene expression via the recruitment of the mixed lineage leukemia (MLL) protein, which is a homolog of the Drosophila Trithorax and yeast Set1 proteins and has histone 3 lysine 4 (H3K4) trimethylation activity.
This interaction is best characterized in the co-regulation of the ''HOXA9'' gene by SALL4 and MLL in leukemic cells.
In mouse ESCs, Sall4 was found to bind the essential stem cell factor octamer-binding transcription factor 4 (Oct4) in two separate unbiased mass spectrometry (mass-spec) screens.
Sall4 can also bind other important pluripotency proteins such as Nanog and sex determining region Y (SRY)-box 2 protein (Sox2). Together these proteins can affect each other’s expression patterns as well as their own, thus forming a mouse ESC-specific transcriptional regulatory circuit. SALL4 has also been reported to bind T-box 5 protein (Tbx5) in cardiac tissues and to interact genetically with Tbx5 in mouse limb development.
Other binding partners of SALL4 include promyelocytic leukemia zinc finger protein (PLZF) in sperm precursor cells, Rad50 during DNA damage repair, and β-catenin downstream of the Wnt signaling pathway.
Since most of these interactions were identified by mass-spec or co-immunoprecipitation, whether they are direct is unknown. Through chromatin immunoprecipitation (ChIP) followed by next-generation sequencing or microarray, some SALL4 targets have been identified. A key verified target gene encodes the enzyme phosphatidylinositol-3,4,5-trisphosphate 3-phosphatase (PTEN). PTEN is a tumor suppressor that keeps uncontrolled cell growth in check by inducing programmed cell death, or apoptosis. SALL4 binds the ''PTEN'' promoter and recruits the NuRD complex to mediate its repression, thereby promoting cell proliferation.
Expression and role in stem cells and development
In mouse embryos, SALL4 expression is detectable as early as the two-cell stage. Its expression persists through the 8- and 16-cell stages to the blastocyst, where it is found in some cells of the trophectoderm and inner cell mass (ICM), from which mouse ESCs are derived.
SALL4 is an important factor for maintaining the “stemness” of ESCs of both mouse and human origin, since loss of Sall4 leads to differentiation of these pluripotent cells down the trophectoderm lineage.
This is possibly due to down-regulation of ''Pou5f1'' (encoding Oct4) expression and up-regulation of caudal-type homeobox 2 (''Cdx2'') gene expression.
Sall4 is part of the transcriptional regulatory network that includes other pluripotency factors such as Oct4, Nanog, and Sox2. Because of its important role in early development, mice lacking functional SALL4 die early, at the peri-implantation stage, while heterozygous mice have neural, kidney, and heart defects and limb abnormalities.
Clinical significance
The various SALL4-null mouse models mimic human mutations in the ''SALL4'' gene, which were shown to cause developmental problems in patients with Okihiro/Duane-Radial-ray syndrome.
These individuals frequently have a family history of hand malformation and eye movement disorders.
''SALL4'' expression is low to undetectable in most adult tissues with the exception of germ cells and human blood progenitor cells.
However, ''SALL4'' is re-activated and mis-regulated in various cancers such as acute myeloid leukemia (AML), B-cell acute lymphocytic leukemia (B-ALL), germ cell tumors, gastric cancer, breast cancer, hepatocellular carcinoma (HCC), lung cancer, and glioma. In many of these cancers, ''SALL4'' expression in tumor cells has been compared with that in the corresponding normal tissue; for example, it is expressed in nearly half of primary human endometrial cancer samples but not in normal or hyperplastic endometrial tissue samples.
Often, ''SALL4'' expression is correlated with worse survival and poor prognosis, such as in HCC, or with metastasis, such as in endometrial cancer, colorectal carcinoma, and esophageal squamous cell carcinoma. It is unclear how SALL4 expression is de-regulated in malignant cells, but DNA hypomethylation in its intron 1 region has been observed in B-ALL.
In breast cancer, signal transducer and activator of transcription 3 (STAT3) has been reported to directly activate ''SALL4'' expression. Furthermore, canonical Wnt signaling has been proposed to activate ''SALL4'' gene expression in both development and cancer.
In leukemia, the mechanism of SALL4 function is better characterized; mice over-expressing human ''SALL4'' develop myelodysplastic syndrome (MDS)-like symptoms and eventually AML.
This is consistent with the correlation between high levels of ''SALL4'' expression and high-risk MDS in patients. Further supporting its role in tumorigenesis, both knocking down ''SALL4'' expression with short hairpin RNA in leukemic cells and treating these cells with a peptide that mimics the N-12a.a. of SALL4, to inhibit its interaction with the NuRD complex, result in cell death.
These findings suggest that the primary cancer-maintaining property of SALL4 is mediated through its transcriptional repressing function. They have also led to growing interest in SALL4 as both a diagnostic tool and a target in cancer therapy. For example, in solid tumors such as germ cell tumors, SALL4 protein expression has become a standard diagnostic biomarker.
External links
GeneReviews/NCBI/NIH/UW entry on SALL4-Related Disorders