C1orf27

Uncharacterized protein Chromosome 1 Open Reading Frame 27 is a protein in humans, encoded by the C1orf27 gene. The mRNA accession number is NM_017847. This is a membrane protein; its transcript is 3926 base pairs long, and the longest encoded stretch of amino acids is 454 aa. C1orf27 exhibits cytoplasmic expression in epidermal tissues. Predicted associated biological processes of the gene include cell fate specification and development.


Gene


Locus

This gene is located on chromosome 1 at 1q31.1. It is encoded on the plus strand of DNA, spanning from 186,344,406 to 186,390,514.
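
The genomic span implied by these coordinates can be worked out directly. The short sketch below assumes the coordinates are 1-based and inclusive, which is a common convention but is not stated in the text; the function and constant names are illustrative only.

```python
# Assumption: coordinates are 1-based and inclusive on both ends.
GENE_START = 186_344_406
GENE_END = 186_390_514


def span_length(start, end):
    """Return the number of bases covered by an inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


gene_length = span_length(GENE_START, GENE_END)
print(f"Genomic span: {gene_length:,} bp")
```

Under that assumption the locus covers roughly 46 kb, far more than the 3926 bp transcript, which is what one would expect for a gene whose exons are separated by introns.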


mRNA


Alternative splicing

There appear to be four isoforms due to splicing. Two of these are truncated at the C-terminal end of the protein (the 3' end of the transcript), at 266 aa and 396 aa. Additional alternative splice sites are located from 79 aa to 102 aa and from 246 aa to 260 aa.


Protein


General properties

The primary encoded protein of C1orf27 consists of 454 amino acid residues, and its transcript is 3926 base pairs long. The gene consists of 14 total exons. The predicted molecular weight of the primary, unmodified protein is approximately 51.1 kDa.
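
As a rough plausibility check on these figures, the sketch below relates protein length to coding length and to an approximate mass. It assumes the rule-of-thumb average of about 110 Da per residue and one stop codon; neither value comes from the source, and the estimate is not a substitute for a sequence-based calculation.

```python
PROTEIN_LENGTH_AA = 454
TRANSCRIPT_LENGTH_NT = 3926
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass


def coding_length_nt(n_residues):
    """Nucleotides needed to encode the protein, including one stop codon."""
    return 3 * n_residues + 3


def rough_mass_kda(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Very rough molecular weight estimate in kDa."""
    return n_residues * avg_residue_mass / 1000.0


cds_nt = coding_length_nt(PROTEIN_LENGTH_AA)
untranslated_nt = TRANSCRIPT_LENGTH_NT - cds_nt
mass_kda = rough_mass_kda(PROTEIN_LENGTH_AA)

print(f"Coding sequence: {cds_nt} nt; remaining transcript: {untranslated_nt} nt")
print(f"Rough mass estimate: {mass_kda:.1f} kDa")
```

The rough estimate of about 50 kDa is in the same range as the predicted 51.1 kDa, and the coding sequence accounts for only about a third of the transcript, with the remainder presumably untranslated.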


Aliases

As with many other genes, there are some common aliases for this gene. Those aliases are Lymphocyte-Activation Gene-1 (LAG1) Interacting Protein, Transparent Testa Glabra 1 (TTG1), and Odorant Response Abnormal 4 (ODR4). The most common alias for C1orf27 is ODR4, and this is the name that most readily appears when searching for the gene.


Composition

Computational analysis revealed the most abundant amino acid to be leucine, at 10.1% of the total protein. The second most abundant was serine, which contributes 8.6% of the total protein. Glutamic acid was the third most abundant and contributes 7.7% of the protein. This analysis also revealed that the protein appears to be deficient in tryptophan, which contributes only 1.1% of the protein. Based on the distribution of other amino acid types, there were five high-scoring hydrophobic segments. There were also two transmembrane domains, located at 82-98 aa and 432-449 aa.
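
A composition analysis of this kind amounts to counting residues and dividing by the sequence length. The following is a minimal sketch of that idea; the toy sequence is invented for illustration and is not the C1orf27 sequence, and the helper names are hypothetical. The second helper converts the reported percentages into approximate residue counts for a 454-residue protein.

```python
from collections import Counter


def composition_percent(sequence):
    """Return each residue's share of the sequence as a percentage."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def approx_residue_count(percent, length=454):
    """Convert a reported percentage into an approximate residue count."""
    return round(percent * length / 100.0)


# Toy sequence for illustration only (NOT the real protein sequence).
toy_sequence = "MLLSESLLWE"
toy_percent = composition_percent(toy_sequence)

# Percentages as reported in the text above.
reported = {"Leu": 10.1, "Ser": 8.6, "Glu": 7.7, "Trp": 1.1}
approx_counts = {aa: approx_residue_count(p) for aa, p in reported.items()}
print(approx_counts)
```

Read this way, the reported figures correspond to roughly 46 leucines but only about 5 tryptophans in the full-length protein.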


Post-translational modifications

C1orf27 is predicted to undergo multiple post-translational modifications, such as glycosylation, myristoylation, and phosphorylation.


Interactions

There were eight interactions identified by Mentha. The first was UFSP2, which hydrolyzes the peptide bond at the C-terminal glycine of UFM1, a ubiquitin-like modifier protein bound to a number of target proteins. The second was HSCB, which acts as a co-chaperone in iron-sulfur cluster assembly in mitochondria. The third was GRB2, an adapter protein that provides a critical link between cell surface growth factor receptors and the Ras signaling pathway. The fourth was CYLD, a protease that cleaves Lys-63-linked polyubiquitin chains, regulates cell survival, proliferation, and differentiation, and is required for normal cell cycle progression. The fifth was ATM, which activates checkpoint signaling in response to double-strand breaks, apoptosis, and genotoxic stress. The sixth was FAM177A1, the function of which is unknown. The last two were THID2 and Q81kP6, both of which are from Bacillus anthracis.


Subcellular localization

The C1orf27 protein is likely cytoplasmic. This was predicted with a reliability of 55.5. The k-NN prediction was ''k'' = 9/23, and the protein was predicted to be 55.6% cytoplasmic, 11.1% mitochondrial, 11.1% vacuolar, 11.1% cytoskeletal, and 11.1% Golgi.
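
The reported percentages are what a simple vote among nine nearest neighbors would produce if five neighbors were cytoplasmic and one each fell into the other four compartments. The sketch below illustrates that reading; the neighbor labels are an assumption reconstructed from the percentages, not output from the prediction tool.

```python
from collections import Counter

# Assumption: nine neighbors, reconstructed from the reported percentages.
neighbor_labels = (
    ["cytoplasmic"] * 5
    + ["mitochondrial", "vacuolar", "cytoskeletal", "Golgi"]
)


def vote_percentages(labels):
    """Return the share of neighbors voting for each compartment, in percent."""
    counts = Counter(labels)
    k = len(labels)
    return {label: round(100.0 * n / k, 1) for label, n in counts.items()}


votes = vote_percentages(neighbor_labels)
for compartment, percent in votes.items():
    print(f"{compartment}: {percent}%")
```

Five of nine gives 55.6% and one of nine gives 11.1%, matching the figures above, so the cytoplasmic call rests on a modest majority rather than a strong consensus.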


Structure

In the predicted structure of the C1orf27 protein, alpha helices are colored blue, beta sheets are depicted as red arrows, and random coils appear as the purple strands between these structures.


Expression

Overall, expression of C1orf27 seems to be ubiquitous. The body sites with the highest expression (>50 TPM) were bladder, bone marrow, kidney, liver, pancreas, parathyroid, and vascular tissue. The health states with the highest expression (>50 TPM) were adrenal tumors, cervical tumors, and liver tumors. While both of these observations had relatively high TPM scores, there was still relatively low occurrence, which is consistent with the assumption that expression is ubiquitous. There was moderate expression (>25 TPM) in the human fetus, and expression increased with age. Expression was completely absent in the ears, esophagus, lymph, nerve, salivary glands, thyroid, tonsils, and umbilical cord. There was no expression in bladder carcinoma despite expression being elevated in the bladder itself. There was high expression in endothelial cells and neuronal cells, but expression was undetectable in glial cells and neuropil. Expression was also localized to the nucleoplasm and plasma membrane in humans but to the cytosol in mice.
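
The thresholds used above (>50 TPM for the highest expression, >25 TPM for moderate expression, and no detectable expression) can be expressed as a simple classification rule. The sketch below is illustrative only: the "low" category is an assumption for values between 0 and 25 TPM, and the sample values are hypothetical placeholders, not measurements for C1orf27.

```python
def classify_tpm(tpm):
    """Bin an expression value (transcripts per million) by the thresholds in the text."""
    if tpm < 0:
        raise ValueError("TPM cannot be negative")
    if tpm > 50:
        return "high"
    if tpm > 25:
        return "moderate"
    if tpm > 0:
        return "low"  # assumption: the text does not name this range
    return "absent"


# Hypothetical placeholder values, NOT real measurements.
example_values = {"sample_a": 62.0, "sample_b": 30.5, "sample_c": 4.2, "sample_d": 0.0}
labels = {name: classify_tpm(value) for name, value in example_values.items()}
print(labels)
```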


Homology


Paralogs

There were no paralogs of C1orf27 identified in the human genome.


Orthologs

Orthologs were identified in most animals for which complete genome data were available. The most distant, yet still relevant, orthologs identified were invertebrates from the phylum Cnidaria.


Molecular evolution

The ''m'' value, or number of corrected amino acid changes per 100 residues, for the C1orf27 gene was graphed against species divergence in millions of years. When compared with the divergence graphs of fibrinogen and cytochrome C, the gene closely resembles the evolutionary pattern observed in fibrinogen, suggesting a relatively rapid rate of evolution. The ''m'' values for C1orf27 were calculated from the percentage of identity, relative to humans, observed in the mRNA sequences of the orthologs, using the formula derived from the Molecular Clock Hypothesis.
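
The text does not state which formula was used. A widely used correction of this kind is the Poisson correction, m = -100 ln(1 - n/100), where n is the observed number of differences per 100 residues (100 minus the percent identity). The sketch below assumes that formula; the function name and the example identity values are illustrative and are not data from the C1orf27 orthologs.

```python
import math


def corrected_changes(percent_identity):
    """Estimate corrected changes per 100 residues (m) from percent identity.

    Assumption: m = -100 * ln(1 - n/100), with n = 100 - percent identity.
    """
    if not 0.0 < percent_identity <= 100.0:
        raise ValueError("percent identity must be in (0, 100]")
    observed_differences = 100.0 - percent_identity
    return -100.0 * math.log(1.0 - observed_differences / 100.0)


# Illustrative identity values only, not measured ortholog identities.
for identity in (100.0, 90.0, 50.0):
    print(f"{identity:.0f}% identity -> m = {corrected_changes(identity):.1f}")
```

The correction matters most for distant orthologs: at 90% identity the corrected value is close to the observed 10 differences, while at 50% identity it is noticeably larger than 50 because repeated changes at the same site are accounted for.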

