Biology:C1orf185

Short description: Protein-coding gene in the species Homo sapiens



Chromosome 1 open reading frame 185, also known as C1orf185, is a protein that in humans is encoded by the C1orf185 gene. It is expressed at low levels and has occasionally been detected in the circulatory system.[1][2]

Gene

C1orf185 is located on the positive strand of human chromosome 1, between bases 51,102,221 and 51,148,086.[3] The main splice isoform has 5 exons; the number and selection of exons vary by isoform.[3]
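
The coordinates above imply the size of the locus. A minimal sketch of that arithmetic, assuming the two coordinates are 1-based and inclusive (a common convention for gene records, but an assumption here); the function name is illustrative only:

```python
def locus_span(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1

# Coordinates quoted in the text for C1orf185 on chromosome 1.
span = locus_span(51_102_221, 51_148_086)
print(f"C1orf185 locus spans {span:,} bp")
```

Under a 0-based, half-open convention the result would be one base shorter, so the convention should be confirmed against the source record before relying on the number.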

C1orf185 locus within the human genome. Diagrams from NCBI Genome Viewer[4] (top) and the Integrative Genomics Viewer[5] (bottom).

mRNA and Protein Isoforms

C1orf185 has 5 different splice isoforms in humans.[3]

C1orf185 Transcripts

| Isoform | mRNA Accession | Protein Accession | Transcript Length (bp) | Protein Length (aa) |
|---|---|---|---|---|
| uncharacterized protein C1orf185 | NM_001136508.2 | NP_001129980.1 | 921 | 199 |
| uncharacterized protein C1orf185 isoform X1 | XM_011541282.2 | XP_011539584.1 | 787 | 195 |
| uncharacterized protein C1orf185 isoform X2 | XM_024446525.1 | XP_024302293.1 | 586 | 116 |
| uncharacterized protein C1orf185 isoform X3 | XM_024446528.1 | XP_024302296.1 | 420 | 116 |
| uncharacterized protein C1orf185 isoform X4 | XM_024446529.1 | XP_024302297.1 | 367 | 107 |
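
The transcript and protein lengths in the table can be cross-checked against each other. The sketch below assumes each transcript carries a complete coding sequence and that the listed protein length counts every translated residue, so the coding sequence is three bases per residue plus one stop codon. The isoform labels and function names are illustrative.

```python
# (label, transcript length in bp, protein length in aa), copied from the table.
ISOFORMS = [
    ("main", 921, 199),
    ("X1", 787, 195),
    ("X2", 586, 116),
    ("X3", 420, 116),
    ("X4", 367, 107),
]

def cds_length(protein_length_aa: int) -> int:
    """Coding sequence length in nt: one codon per residue plus a stop codon."""
    return 3 * (protein_length_aa + 1)

def untranslated_length(transcript_bp: int, protein_length_aa: int) -> int:
    """Transcript bases left over for the 5' and 3' UTRs combined."""
    return transcript_bp - cds_length(protein_length_aa)

for label, transcript_bp, protein_aa in ISOFORMS:
    cds = cds_length(protein_aa)
    utr = untranslated_length(transcript_bp, protein_aa)
    print(f"{label}: CDS {cds} nt, UTRs {utr} nt")
```

A negative UTR length would indicate an inconsistent row; under these assumptions every row in the table leaves a non-negative remainder.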

Protein

C1orf185 is a member of the pfam15842 protein family, containing a domain of unknown function, DUF4718.[6] Proteins in this family are between 130 and 224 amino acids long and are found only in eukaryotes.

The main splice isoform of C1orf185 has a molecular weight of 22.4 kDa[7] and an isoelectric point of 7.67.[8] It contains a transmembrane domain spanning positions 15 to 37.[3] There is also a conserved serine-rich region from S123 to S142, which may indicate a function as a "splicing activator".[9]
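
As a rough sanity check on the reported molecular weight, protein mass can be approximated from length alone. The sketch below uses the common rule of thumb of about 110 Da per residue, which is a general approximation and not the method used by the cited tools; an exact value requires summing the masses of the actual residues.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, not a measured value

def approx_mass_kda(length_aa: int,
                    residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from its length in residues."""
    return length_aa * residue_mass_da / 1000.0

estimate = approx_mass_kda(199)  # main isoform length from the text
print(f"Approximate mass: {estimate:.1f} kDa (reported: 22.4 kDa)")
```

The estimate lands within about 1 kDa of the reported 22.4 kDa, which is the level of agreement this approximation can be expected to give.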

C1orf185 contains 3 primary subcellular domains: an extracellular domain spanning positions 1–14, a transmembrane domain spanning positions 15–37, and a large intracellular domain spanning positions 38–199.[10]
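
The predicted topology can be expressed as a simple lookup from residue position to domain. This is a minimal sketch using only the boundaries quoted above; the names `REGIONS` and `region_of` are illustrative.

```python
# (domain name, first residue, last residue), 1-based and inclusive.
REGIONS = [
    ("extracellular", 1, 14),
    ("transmembrane", 15, 37),
    ("intracellular", 38, 199),
]

def region_of(position: int) -> str:
    """Return the predicted domain containing a 1-based residue position."""
    for name, start, end in REGIONS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside the 199-residue protein")

for pos in (5, 15, 37, 130):
    print(pos, region_of(pos))
```

Such a lookup makes it easy to see, for example, that the serine-rich region (S123 to S142) falls entirely within the predicted intracellular domain.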

Below are predicted secondary and tertiary structures of C1orf185, modeled using the Chou-Fasman[11] secondary structure prediction tool and the I-TASSER[12] protein structure and function prediction tool. Chou-Fasman predicts a mixture of α-helices, β-sheets, and other structural turns and coils, which can be seen modeled on the I-TASSER prediction.

Chou-Fasman Secondary Structure Prediction[11] (left) and I-TASSER Tertiary Structure Prediction[12] (right) for C1orf185.

Regulation of Expression

Gene Level Regulation

Below is a diagram showing the locations of predicted transcription factor binding sites in the C1orf185 promoter, along with a table describing the attributes of each individual binding site. The transcription factors were found and analyzed using the ElDorado tool from Genomatix.[13]

Diagram of the C1orf185 promoter with transcription factor binding sites annotated.


Transcription Factor Binding Sites within the C1orf185 Promoter

| Transcription Factor | Detailed matrix info | Matrix similarity | Sequence | Strand (+/-) |
|---|---|---|---|---|
| VTATA.02 | Mammalian C-type LTR TATA box | 0.91 | tgtcaTAAAaacattcc | + |
| NKX25.05 | Homeodomain factor Nkx-2.5/Csx | 0.986 | tttttTGAGtgaagtcttg | - |
| CDX1.01 | Intestine specific homeodomain factor CDX-1 | 0.988 | ttgccctTTTAtgaaaaaa | + |
| VTATA.02 | Mammalian C-type LTR TATA box | 0.914 | tacttTAAAaataagca | - |
| ERG.02 | v-ets erythroblastosis virus E26 oncogene homolog | 0.942 | gtctcaaaGGAAaataaaaag | - |
| SPI1.02 | SPI-1 proto-oncogene; hematopoietic transcription factor PU.1 | 0.992 | attaaagaGGAAgtctcaaag | - |
| FHXB.01 | Fork head homologous X binds DNA with a dual sequence specificity (FHXA and FHXB) | 0.831 | ttctaaATAAcacattt | - |
| TGIF.01 | TG-interacting factor belonging to TALE class of homeodomain factors | 1 | tctataaatGTCAatta | + |
| ZNF219.01 | Kruppel-like zinc finger protein 219 | 0.913 | ctccaCCCCcgtcagcccaaagg | + |
| ZBP89.01 | Zinc finger transcription factor ZBP-89 | 0.956 | catctccaCCCCcgtcagcccaa | + |
| CREB.02 | cAMP-responsive element binding protein | 0.922 | cctttgggcTGACgggggtgg | - |
| FOXP1_ES.01 | Alternative splicing variant of FOXP1, activated in ESCs | 1 | tcataaaAACAttccag | - |
| VTATA.02 | Mammalian C-type LTR TATA box | 0.895 | tgtcaTAAAaacattcc | - |
| CREB1.02 | cAMP-responsive element binding protein 1 | 0.949 | tggaaGTGAtgtcataaaaac | - |
| SPI1.02 | SPI-1 proto-oncogene; hematopoietic transcription factor PU.1 | 0.979 | atttgagtGGAAgtgatgtca | - |
| NKX25.05 | Homeodomain factor Nkx-2.5/Csx | 0.994 | gaattTGAGtggaagtgat | - |
| MESP1_2.01 | Mesoderm posterior 1 and 2 | 0.917 | cagtCATAtggct | + |
| MESP1_2.01 | Mesoderm posterior 1 and 2 | 0.929 | aagcCATAtgact | - |
| DELTAEF1.01 | deltaEF1 | 0.99 | gcttcACCTaaag | + |
| ERG.02 | v-ets erythroblastosis virus E26 oncogene homolog | 0.93 | gaagaagaGGAAaatatattt | + |

Matrix similarity reflects the confidence in the prediction for each individual binding site. The +/- column indicates whether the site lies on the positive or negative strand. The transcription factors are listed in order of appearance from the beginning to the end of the promoter.
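
Because sites are reported on both strands, a site on the negative strand can cover the same bases as a site on the positive strand. The sketch below illustrates this with two sequences copied from the table, on the assumption (not stated by the tool output shown here) that each sequence is written 5' to 3' on its own strand. The function name is illustrative.

```python
# Translation table for DNA complementation; case is preserved.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

plus_site = "ctccaCCCCcgtcagcccaaagg"  # ZNF219.01, + strand
minus_site = "cctttgggcTGACgggggtgg"   # CREB.02, - strand

rc = reverse_complement(plus_site).lower()
overlaps = rc.startswith(minus_site.lower())
print(rc)
print("CREB.02 site lies within the ZNF219.01 site:", overlaps)
```

Under that assumption, the reverse complement of the ZNF219.01 site begins with the CREB.02 sequence, so the two predictions occupy overlapping positions in the promoter on opposite strands.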

C1orf185 is expressed at very low levels, and the circulatory system is the only site in the body showing any sign of expression. Two NCBI GEO profiles have shown that C1orf185 was consistently overexpressed in whole blood samples from a group of postmenopausal women[14] and somewhat overexpressed in the peripheral blood of Parkinson's patients compared to controls.[15]

Transcript Level Regulation

Below is a figure produced by mfold[16] showing predicted mRNA structure of the 3' UTR of C1orf185.

Possible mRNA secondary structure of C1orf185 made by mfold.[16] There are 3 main branches that end in 1–2 stem loops each. The stem loop near the end of the sequence contains the poly-A signal, which marks where the transcript is cleaved and polyadenylated.

C1orf185 has one miRNA binding site of type 7mer-A1 that is conserved among several orthologs.[17] The presence of a 7mer-A1 binding site suggests that C1orf185 may be post-transcriptionally repressed.[18]
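
A 7mer-A1 site is a match in the mRNA to miRNA nucleotides 2–7 (the seed), followed by an A opposite miRNA position 1. The sketch below derives that motif from a miRNA sequence and scans a UTR for it. The source does not name the miRNA predicted to bind C1orf185, so the let-7a sequence and the toy UTR are used purely as illustrations, and the function names are hypothetical. The scan also does not exclude matches that extend to miRNA position 8, which would be classed as 8mer sites.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seven_mer_a1_site(mirna: str) -> str:
    """mRNA motif (5'->3') pairing with miRNA positions 2-7, followed by an A."""
    mirna = mirna.upper().replace("T", "U")
    seed = mirna[1:7]  # miRNA nucleotides 2-7 (1-based)
    return seed.translate(_RNA_COMPLEMENT)[::-1] + "A"

def find_sites(utr: str, mirna: str) -> list[int]:
    """0-based start offsets of the 7mer-A1 motif within a 3' UTR sequence."""
    utr = utr.upper().replace("T", "U")
    motif = seven_mer_a1_site(mirna)
    return [i for i in range(len(utr) - len(motif) + 1)
            if utr[i:i + len(motif)] == motif]

LET_7A = "UGAGGUAGUAGGUUGUAUAGUU"  # illustrative miRNA, not from the source
TOY_UTR = "GGGUACCUCAGG"           # invented sequence for demonstration

print(seven_mer_a1_site(LET_7A))
print(find_sites(TOY_UTR, LET_7A))
```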

Possible conserved C1orf185 miRNA binding site details found using TargetScan.[17]

Protein Level Regulation

Below is a figure and table showing predicted post-translational modification sites for C1orf185.

Sequence showing predicted post-translational modifications on the C1orf185 protein.
Table of Post-Translational Modifications for C1orf185

| Type of Modification | Tool | Positions in Homo sapiens |
|---|---|---|
| Phosphorylation | NetPhos[19] | S61, S69, S104, S130, S142, S147, S165, S186 |
| Glycation | NetGlycate,[20] NetNGlyc[21] | K5, K50, K98, K113 |
| O-GlcNAc | YinOYang[22] | T121, S122, S130 |

The presence of multiple lysine glycation sites suggests that the function of the protein could be impaired by glycation, as glycation has been associated with the loss of protein function in blood vessels.[23]

Clinical Significance

C1orf185 may play a role in the circulatory system, likely a more reactive one, as it is expressed at low levels across many species. It appears in studies of atrial fibrillation[2] and abnormal QRS duration,[1] which suggests it may be involved in those circulatory conditions.

Homology

Below is a table showing C1orf185 orthologs across a variety of species. Orthologs were found using NCBI BLAST,[24] the dates of divergence were found using TimeTree,[25] and the global sequence identities and similarities were found using the Clustal Omega multiple sequence alignment tool.[26]


Ortholog Table for C1orf185

| Genus and Species | Common Name | Taxonomic Group | Date of Divergence (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity (Global) | Sequence Similarity (Global) |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Human | Primates | 0 | NP_001129980.1 | 199 | 100% | 100% |
| Pongo abelii | Sumatran orangutan | Primates | 15.76 | PNJ53823.1 | 195 | 93.50% | 95.50% |
| Cebus capucinus imitator | Capuchin | Primates | 43.2 | XP_017404303.1 | 229 | 77.00% | 79.60% |
| Galeopterus variegatus | Sunda flying lemur | Dermoptera | 76 | XP_008578352.1 | 203 | 73.70% | 77.90% |
| Oryctolagus cuniculus | Rabbit | Lagomorpha | 90 | XP_008263491.1 | 225 | 69.90% | 76.40% |
| Dipodomys ordii | Ord's kangaroo rat | Rodentia | 90 | XP_012877642.1 | 188 | 52.20% | 59.40% |
| Mastomys coucha | Southern multimammate mouse | Rodentia | 90 | XP_031234037 | 263 | 51.50% | 61.50% |
| Mus musculus | House mouse | Rodentia | 90 | NP_001186019.1 | 226 | 47.40% | 59.50% |
| Peromyscus leucopus | White-footed mouse | Rodentia | 90 | XP_028745885.1 | 295 | 41% | 48.20% |
| Phyllostomus discolor | Pale spear-nosed bat | Chiroptera | 96 | XP_028367083.1 | 191 | 73.40% | 80.40% |
| Myotis davidii | David's myotis | Chiroptera | 96 | XP_006768446.1 | 196 | 71.40% | 78.40% |
| Equus caballus | Horse | Perissodactyla | 96 | XP_023485921.1 | 243 | 63.80% | 68.30% |
| Muntiacus muntjak | Indian muntjac | Artiodactyla | 96 | KAB0362285.1 | 200 | 59.40% | 65.90% |
| Hipposideros armiger | Great roundleaf bat | Chiroptera | 96 | XP_019487867.1 | 157 | 54.90% | 59.20% |
| Tursiops truncatus | Bottlenose dolphin | Artiodactyla | 96 | XP_033708766.1 | 189 | 54.10% | 59.00% |
| Sarcophilus harrisii | Tasmanian devil | Dasyuromorphia | 159 | XP_031825005.1 | 333 | 18.20% | 27.70% |
| Ornithorhynchus anatinus | Platypus | Monotremata | 180 | XP_028902271 | 309 | 26.80% | 37.40% |
| Pelodiscus sinensis | Chinese softshell turtle | Reptilia | 312 | XP_025042106.1 | 890 | 7.40% | 11.40% |
| Gopherus evgoodei | Sinaloan thornscrub tortoise | Reptilia | 312 | XP_030429802.1 | 777 | 4.00% | 6.30% |
| Chrysemys picta bellii | Western painted turtle | Reptilia | 312 | XP_023960730.1 | 748 | 3.70% | 5.80% |

Compared to other genes, C1orf185 appears to be evolving relatively quickly: it is conserved only in mammals and a few turtles, and more distantly related mammals show low similarity to the human sequence. Primates are the only taxonomic group that heavily conserves this gene relative to the human sequence, while other mammals and turtles heavily conserve only the transmembrane domain (positions 15–37). As mammals are warm-blooded, this may further support a possible role in the circulatory system.
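
The global identity values in the ortholog table come from pairwise comparison of aligned sequences. A minimal sketch of how such a percentage can be computed from two already-aligned sequences follows. It assumes identity is counted over all alignment columns, including gapped ones; alignment tools differ on this denominator, so the numbers it produces need not match the table. The toy sequences are invented for demonstration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be rows of the same alignment, with '-' marking gaps.
    The denominator is the full alignment length (an assumption; tools vary).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)

# Invented toy alignment: 5 identical columns out of 7.
print(f"{percent_identity('MKT-LLS', 'MRTALLS'):.1f}%")
```

Counting gapped columns in the denominator penalizes length differences, which is one reason much longer orthologs, such as the turtle sequences, can score very low global identity even when a short region is well conserved.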

References

  1. "Common variants in 22 loci are associated with QRS duration and cardiac ventricular conduction". Nature Genetics 42 (12): 1068–76. December 2010. doi:10.1038/ng.716. PMID 21076409.
  2. "Multi-ethnic genome-wide association study for atrial fibrillation". Nature Genetics 50 (9): 1225–1233. June 2018. doi:10.1038/s41588-018-0133-9. PMID 29892015.
  3. "C1orf185 chromosome 1 open reading frame 185 [Homo sapiens (human)] - Gene - NCBI". https://www.ncbi.nlm.nih.gov/gene/284546.
  4. "Genome Data Viewer". https://www.ncbi.nlm.nih.gov/genome/gdv/browser/gene/?id=284546. 
  5. "Home | Integrative Genomics Viewer". http://software.broadinstitute.org/software/igv/. 
  6. "CDD Conserved Protein Domain Family: DUF4718". https://www.ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi?uid=374163. 
  7. "SAPS < Sequence Statistics < EMBL-EBI". https://www.ebi.ac.uk/Tools/seqstats/saps/. 
  8. "ExPASy - Compute pI/Mw tool". https://web.expasy.org/compute_pi/. 
  9. "Arginine/serine-rich domains of SR proteins can function as activators of pre-mRNA splicing". Molecular Cell 1 (5): 765–71. April 1998. doi:10.1016/s1097-2765(00)80076-3. PMID 9660960. 
  10. "TMHMM Server, v. 2.0". http://www.cbs.dtu.dk/services/TMHMM/. 
  11. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". http://www.biogem.org/tool/chou-fasman/index.php.
  12. "I-TASSER server for protein structure and function prediction". https://zhanglab.ccmb.med.umich.edu/I-TASSER/.
  13. "Genomatix - NGS Data Analysis & Personalized Medicine". https://www.genomatix.de/?s=3cb24792b9b50b12d07447330301e42a. 
  14. "13889230 - GEO Profiles - NCBI". https://www.ncbi.nlm.nih.gov/geoprofiles/13889230. 
  15. "129780050 - GEO Profiles - NCBI". https://www.ncbi.nlm.nih.gov/geoprofiles/129780050. 
  16. "The Mfold Web Server | mfold.rit.albany.edu". http://unafold.rna.albany.edu/?q=mfold.
  17. "TargetScanHuman 7.2". http://www.targetscan.org/vert_72/.
  18. "MicroRNA targeting specificity in mammals: determinants beyond seed pairing". Molecular Cell 27 (1): 91–105. July 2007. doi:10.1016/j.molcel.2007.06.017. PMID 17612493. 
  19. "NetPhos 3.1 Server". http://www.cbs.dtu.dk/services/NetPhos/. 
  20. "NetGlycate 1.0 Server" (in en). http://www.cbs.dtu.dk/services/NetGlycate. 
  21. "NetNGlyc 1.0 Server". http://www.cbs.dtu.dk/services/NetNGlyc/. 
  22. "YinOYang 1.2 Server". http://www.cbs.dtu.dk/services/YinOYang/. 
  23. "The role of glycation in the pathogenesis of aging and its prevention through herbal products and physical exercise". Journal of Exercise Nutrition & Biochemistry 21 (3): 55–61. September 2017. doi:10.20463/jenb.2017.0027. PMID 29036767. 
  24. "BLAST: Basic Local Alignment Search Tool". https://blast.ncbi.nlm.nih.gov/Blast.cgi. 
  25. "TimeTree :: The Timescale of Life". http://www.timetree.org/. 
  26. "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". https://www.ebi.ac.uk/Tools/msa/clustalo/.