C5orf34 (chromosome 5 open reading frame 34) is a protein that in humans is encoded by the ''C5orf34'' gene (5p12).
C5orf34 is conserved in mammals, birds and reptiles, with the most distant ortholog found in the Burmese python, ''Python bivittatus''. The C5orf34 protein contains two mammalian conserved domains: DUF 4520 and DUF 4524. The protein is also predicted to have a polo-box domain (PBD) of polo-like kinase 4 (plk4), which has predicted conservation in distant orthologs from the clade Aves.
Gene

''C5orf34'' is located on the negative DNA strand of the short arm of chromosome 5 at locus 12 (5p12). The gene is 28,744 base pairs long and spans from base pair 43,486,701 to base pair 43,515,445. It produces a single transcript 2,540 base pairs long that encodes a protein of 638 amino acids.
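The gene length quoted above follows directly from the two coordinates. The short sketch below is only an illustration: it assumes 1-based coordinates, and the helper name `span_length` is hypothetical. It shows that the quoted figure equals the plain difference between the end points, while counting both end positions gives one base pair more.

```python
def span_length(start: int, end: int, inclusive: bool = True) -> int:
    """Length in base pairs of a genomic interval given by two coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


# Coordinates quoted in the text for C5orf34 on chromosome 5.
GENE_START, GENE_END = 43_486_701, 43_515_445

difference = span_length(GENE_START, GENE_END, inclusive=False)
inclusive_count = span_length(GENE_START, GENE_END)

print(difference)       # 28744, the figure quoted in the text
print(inclusive_count)  # 28745, if both end positions are counted
```

Which convention applies depends on the genome browser or annotation file the coordinates were taken from, which the text does not state.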
Gene neighborhood
The gene ''PAIP1'' is found on the negative strand just upstream of ''C5orf34'' and is a member of the polyadenylate-binding family. ''PAIP1'' extends from base pair 43,526,267 to base pair 43,557,419. ''CCL28'' is found downstream on the negative strand and extends from base pair 43,378,052 to base pair 43,413,837.
Gene expression
Multiple sources indicate that, in humans, C5orf34 is expressed non-ubiquitously, in select tissues and at low to moderate levels, with the most abundant expression in the stomach, small intestine, testis, skeletal muscle and heart muscle. A study of the effect of a Rho kinase inhibitor on primary cell lines also showed that C5orf34 is expressed in dermal fibroblasts of normal human tissue samples.
Promoter
The promoter region for ''C5orf34'' is predicted to lie between base pairs 43,515,079 and 43,515,773, a span of 695 base pairs.
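The promoter coordinates can be compared with the gene coordinates given in the Gene section. The sketch below is an illustration only: it assumes both coordinate pairs refer to the same genome assembly and are 1-based and inclusive, and the helper `overlap` is a hypothetical name. Because the gene lies on the negative strand, its 5' end is the higher coordinate (43,515,445), so the predicted promoter straddles that end.

```python
def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Inclusive overlap in base pairs between two closed intervals (0 if none)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


# Coordinates quoted in the text (assumed to share one assembly).
PROMOTER = (43_515_079, 43_515_773)
GENE = (43_486_701, 43_515_445)

promoter_length = PROMOTER[1] - PROMOTER[0] + 1  # inclusive count
shared = overlap(*PROMOTER, *GENE)               # promoter bases inside the gene span
beyond = PROMOTER[1] - GENE[1]                   # promoter bases past the gene's 5' end

print(promoter_length)  # 695
print(shared)           # 367
print(beyond)           # 328
```

Under these assumptions the 695 base pairs split into 367 that overlap the annotated gene and 328 that lie beyond its 5' end.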
Protein
In humans, C5orf34 consists of 638 amino acids and has a molecular weight of 72.7 kDa and an isoelectric point of 7.77.
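As a rough plausibility check on the stated molecular weight, the residue count can be multiplied by an average residue mass. The value of about 110 Da per residue is a common rule of thumb and an assumption here, not a figure from the source. An exact weight would need the actual sequence.

```python
# Rule-of-thumb average mass of one amino acid residue (assumption, not from the source).
AVERAGE_RESIDUE_MASS_DA = 110.0

RESIDUES = 638  # protein length quoted in the text

estimate_kda = RESIDUES * AVERAGE_RESIDUE_MASS_DA / 1000
print(f"{estimate_kda:.1f} kDa")  # 70.2 kDa
```

The estimate is within a few percent of the quoted 72.7 kDa, which is what one would expect from such a coarse approximation.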
Function
Although the precise function of C5orf34 in humans remains unknown, structural evidence suggests that it is involved in kinase-related cellular functions.
In addition, C5orf34 is predicted to be nuclear, so it may be involved in gene regulation and cell proliferation, two primary signal transduction pathways that involve nuclear kinase proteins.
Structure
In humans, C5orf34 contains two domains of unknown function, DUF 4520 (pfam 15016) and DUF 4524 (pfam 150125), found between residues 6–153 and 444–539, respectively. The protein is rich in serine and threonine. Charge is evenly distributed across the protein, with no positive or negative charge clusters.
The predicted secondary structures of the human protein were assessed with multiple bioinformatic tools. All of the programs predicted the protein's structure to consist of alpha helices, extended strands, random coils and beta turns. The Phyre2 server provided a predicted structure of the human protein that indicated a polo-box domain of the serine/threonine-protein kinase plk4. The server modeled 20% of the protein (130 residues) with 96.8% confidence. The covered region included residues of the conserved polo-box domain and the two DUF domains. The protein is predicted to be predominantly soluble, with an average hydrophobicity of -0.478.
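The coverage and hydrophobicity figures above can be illustrated with a short sketch. Two assumptions are made: that the average hydrophobicity is a GRAVY-style score (the mean Kyte-Doolittle hydropathy over all residues), which the text does not state, and that a hypothetical toy peptide is an acceptable stand-in, since the real C5orf34 sequence is not given here. The function name `gravy` is also illustrative.

```python
# Kyte-Doolittle hydropathy values per residue (assumed scale; the text does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean hydropathy of a protein sequence; negative values suggest a hydrophilic protein."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # A KeyError is raised for any letter outside the 20 standard residues.
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


# Coverage: residues modeled by the server divided by the protein length.
coverage = 130 / 638
print(f"{coverage:.1%}")  # 20.4%

# Hypothetical toy peptide, NOT the C5orf34 sequence.
toy_peptide = "MSTSEK"
print(round(gravy(toy_peptide), 3))
```

A negative mean, as reported for C5orf34, is consistent with a soluble protein under this scale. The coverage ratio also shows that the quoted 20% is a rounded value.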
Post-translational modifications
C5orf34 is predicted to be extensively phosphorylated, with 32 phosphoserines and 7 phosphothreonines conserved in orthologs of the human C5orf34 protein. This analysis indicates that C5orf34 is a phosphoprotein and supports the structural predictions that it is a kinase protein. The protein contains only one predicted nuclear export signal (NES) residue, found at 481-L; however, its NES score was low, at 0.515. Structural analysis of the protein indicated that it is sequestered in the nucleus with 87% probability.
Interacting proteins
The protein interaction databases MINT, STRING, IntAct and BioGRID have not identified any interactions with C5orf34.
Homology and evolution
''C5orf34'' is highly conserved in primates and other mammals and moderately conserved in reptiles. The most distant ortholog is found in the Burmese python, ''Python bivittatus''. Below is a selected list of orthologs that shows the homology of this gene relative to the reference sequence in ''Homo sapiens''.
Orthologous space
Orthologs of ''C5orf34'' have been predicted in 151 organisms.
The most distant ortholog is found in the Burmese python, whose lineage diverged from that of humans 296 million years ago, indicating that ''C5orf34'' developed in reptiles and birds.
Table of ''C5orf34'' orthologs
Paralogous space
There are no predicted paralogs of ''C5orf34'' in either humans or mice.
Conserved regions
Multiple sequence alignments indicated amino acid residue conservation throughout the C5orf34 protein in an array of orthologs, with the most highly conserved regions at both the N-terminus and the C-terminus, where the DUFs are located. DUF 4520 (pfam 15016) was found to be conserved in the N-terminus and DUF 4524 (pfam 150125) was found to be conserved in the C-terminus, consistent with their positions at residues 6–153 and 444–539. Also, the polo-box domain of plk4 was found to be conserved in the C-terminus in a multiple sequence alignment of both strict and distant orthologs.