Biology:UPF0602

UPF0602 Protein C4orf47
AlphaFold prediction of UPF0602 protein c4orf47's tertiary structure[1]
Identifiers
Symbol: c4orf47
Alt. symbols: LOC441054
HGNC: 34346
RefSeq: NM_00107829
UniProt: A7E2U8
Other data
Locus: Chr. 4 q35.1

UPF0602 is a protein in humans that is encoded by the chromosome 4 open reading frame 47 (c4orf47) gene.[2]

Gene

The c4orf47 gene is located at 4q35.1 on the plus strand and spans 44,602 base pairs (185,405,227 to 185,449,828). It comprises 12 exons and 11 introns.[3]
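The quoted span can be checked against the coordinates with a short calculation. This is a minimal sketch that assumes the coordinates are 1-based and inclusive at both ends; the function name `span_length` is illustrative and does not come from any genomics library.

```python
def span_length(start: int, end: int) -> int:
    """Length of a closed, 1-based genomic interval [start, end]."""
    if end < start:
        raise ValueError("end must not precede start")
    # Both endpoints are counted, hence the +1.
    return end - start + 1


# Coordinates quoted in the text for c4orf47 on chromosome 4.
C4ORF47_START = 185_405_227
C4ORF47_END = 185_449_828

print(span_length(C4ORF47_START, C4ORF47_END))  # 44602
```

Under that assumption the result agrees with the 44,602 base pairs stated above.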

It overlaps two genes on the minus strand: UFM1 specific peptidase 2 (UFSP2) and coiled-coil domain containing 110 (CCDC110).

Another alias for the c4orf47 gene is LOC441054.

Transcript

Transcript variant 1 is the longest experimentally validated variant of c4orf47 mRNA and encodes UPF0602 protein isoform 1. It contains 8 exons, with an upstream in-frame stop codon in the first exon, and its protein product contains a disordered region and a domain of unknown function. The mRNA is 1,333 nucleotides long and encodes a 309-amino-acid polypeptide.[4]

Transcript variant 2 differs in the 5' UTR, uses an alternate translation start site, and lacks two alternate exons in the 5' coding region compared to variant 1. The encoded protein, isoform 2, is shorter and has a distinct N-terminus compared to isoform 1. The mRNA is 1,037 nucleotides long and encodes a 183-amino-acid polypeptide.[5]
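The mRNA and polypeptide lengths quoted for the two variants imply how much of each transcript is untranslated. The sketch below assumes that each coding sequence is a single open reading frame ending in one stop codon and that the quoted mRNA lengths exclude any poly(A) tail. The helper names are illustrative, and the result is the combined 5' and 3' UTR length, not a breakdown of either.

```python
STOP_CODON_NT = 3  # one stop codon terminates the reading frame


def cds_length(protein_aa: int) -> int:
    """Coding sequence length in nucleotides, including the stop codon."""
    return protein_aa * 3 + STOP_CODON_NT


def utr_total(mrna_nt: int, protein_aa: int) -> int:
    """Combined 5' and 3' UTR length implied by the mRNA and protein sizes."""
    return mrna_nt - cds_length(protein_aa)


# (mRNA length in nt, polypeptide length in aa) as quoted in the text.
variants = {"variant 1": (1333, 309), "variant 2": (1037, 183)}

for name, (mrna_nt, protein_aa) in variants.items():
    print(name, cds_length(protein_aa), utr_total(mrna_nt, protein_aa))
```

Under these assumptions, variant 1 has a 930-nucleotide coding sequence and 403 untranslated nucleotides, and variant 2 has a 552-nucleotide coding sequence and 485 untranslated nucleotides.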

C4orf47 mRNA is expressed across all tissues, with higher expression in the choroid plexus, retina, fallopian tubes, and testis.[6]

Protein

UPF0602 protein isoform 1 has a molecular weight of 34.4 kDa and a predicted isoelectric point (pI) of 9.64.[7] It contains a domain of unknown function, DUF4586.[8] This domain corresponds to pfam15239, the only member of protein superfamily cl21099.[9]

The protein contains a higher-than-average proportion of basic amino acids and two short sequence repeats, each occurring twice.[7]

| Location | Amino Acids |
|---|---|
| 145 - 148 | PGKK |
| 235 - 238 | PGKK |
| 164 - 168 | SHSAD |
| 252 - 256 | SHSAD |
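Repeat positions of this kind are conventionally reported as 1-based, inclusive residue ranges. The sketch below shows how such ranges can be recovered from a sequence string. The real UPF0602 sequence is not reproduced here: `placeholder` is a synthetic 309-residue string of alanine filler with the two motifs placed at the positions listed above, built only to demonstrate the coordinate convention.

```python
from typing import List, Tuple


def find_motif(sequence: str, motif: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of every motif occurrence."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        # str.find is 0-based, so shift the start by one; the end is inclusive.
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)
    return hits


# Synthetic stand-in, NOT the real protein sequence: alanine filler with the
# motifs inserted at the residue positions given in the table.
placeholder = (
    "A" * 144 + "PGKK" + "A" * 15 + "SHSAD"
    + "A" * 66 + "PGKK" + "A" * 13 + "SHSAD" + "A" * 53
)

for motif in ("PGKK", "SHSAD"):
    print(motif, find_motif(placeholder, motif))
```

Run against the actual isoform 1 sequence (UniProt A7E2U8), the same function would be expected to report the ranges in the table, provided the table uses this convention.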

Localization

The protein has no signal peptide and has been shown to localize to cytoplasmic microtubules, centrosomes, and non-motile cilia.[10][3][11][12]

Expression

UPF0602 protein is expressed across all tissues, with higher expression in the lungs, fallopian tubes, and testis. In the lungs and fallopian tubes, protein abundance is greatest in ciliated cells, specifically at the ciliary tip and in the axoneme. Within the testis, abundance is highest in elongated or late spermatids.[6][10]

Homology

UPF0602 protein has no paralogs. Homologs, however, are found in most ciliated eukaryotes. The exceptions are reptiles other than turtles; salamanders; and lobe-finned fishes other than the West Indian Ocean coelacanth. A UPF0602 protein homolog is also found in Chytridiomycetes, a class of fungi.

The following table represents a small selection of homologs found using BLAST.[13]

| Genus and Species | Common Name | Taxonomic Group | Estimated Divergence (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%) |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Human | Primates | 0 | NP_001107829.1 | 309 | 100 | 100 |
| Gallus gallus | Domestic chicken | Aves | 312 | XP_004936032.2 | 311 | 67.2 | 78.8 |
| Chrysemys picta bellii | Painted turtle | Reptilia | 312 | XP_005282053.1 | 311 | 66.6 | 79.7 |
| Rhinatrema bivittatum | Two-lined caecilian | Amphibia | 351.8 | XP_029442782.1 | 308 | 61.1 | 74.3 |
| Xenopus tropicalis | Western clawed frog | Amphibia | 351.8 | XP_002934310.1 | 307 | 58.3 | 75.1 |
| Latimeria chalumnae | West Indian Ocean coelacanth | Coelacanthiformes | 413 | XP_014353738.1 | 311 | 55 | 70.4 |
| Danio rerio | Zebrafish | Actinopterygii | 435 | NP_001038879.1 | 312 | 54.5 | 70.2 |
| Rhincodon typus | Whale shark | Chondrichthyes | 473 | XP_020367910.1 | 319 | 50.2 | 63.3 |
| Lytechinus variegatus | Green sea urchin | Temnopleuroida | 684 | XP_041485424.1 | 316 | 52.8 | 67.1 |
| Pomacea canaliculata | Channeled applesnail | Mollusca | 797 | XP_025090509.1 | 321 | 51.1 | 67 |
| Amphibalanus amphitrite | Acorn barnacle | Arthropoda | 797 | KAF0292396.1 | 334 | 30.8 | 47.9 |
| Powellomyces hirtus | Chytrids | Chytridiomycetes | 1017 | TPX58729.1 | 353 | 32.3 | 45.3 |

Evolution

The c4orf47 gene has evolved at a relatively slow rate compared with the evolutionary rates of fibrinogen alpha and cytochrome c, which suggests that the encoded protein has a conserved function.

Graph showing UPF0602 protein c4orf47's evolutionary history
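The article does not state how the evolutionary-rate graph was computed. One common way to relate sequence identity to divergence time is the Poisson correction, d = -ln(1 - p), where p is the fraction of differing sites. The sketch below applies it to three rows of the homolog table. The choice of method and the function name `poisson_distance` are assumptions made for illustration, not the procedure used for the graph.

```python
import math


def poisson_distance(identity_percent: float) -> float:
    """Poisson-corrected distance d = -ln(1 - p), with p the fraction of differing sites."""
    p = 1.0 - identity_percent / 100.0
    if not 0.0 <= p < 1.0:
        raise ValueError("identity must be greater than 0 and at most 100")
    return -math.log(1.0 - p)


# (estimated divergence in MYA, sequence identity in %) from the homolog table.
homologs = {
    "Gallus gallus": (312, 67.2),
    "Danio rerio": (435, 54.5),
    "Powellomyces hirtus": (1017, 32.3),
}

for species, (divergence_mya, identity) in homologs.items():
    distance = poisson_distance(identity)
    print(f"{species}: {divergence_mya} MYA, corrected distance {distance:.3f}")
```

Plotting corrected distance against divergence time for several proteins is one way to compare their rates of change; a shallower slope indicates slower evolution.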

Function

The function of this protein within the cell is not well understood; however, evidence suggests it is related to cilia and flagella assembly.[10][14]

Interacting proteins

High throughput evidence supports physical interaction between UPF0602 protein and nucleophosmin (NPM1),[15] as well as with ubiquitin-specific peptidase 9, Y-linked (USP9Y).[14]

Clinical significance

Single nucleotide polymorphisms (SNPs) within regions of the UFSP2 gene that overlap c4orf47 have been linked to Beukes hip dysplasia; spondyloepimetaphyseal dysplasia, Di Rocco type; microcephaly; and other developmental anomalies.[16][17][18]

References

  1. "AlphaFold Protein Structure Database". https://alphafold.ebi.ac.uk/entry/A7E2U8. 
  2. "UPF0602 protein C4orf47 isoform 1 [Homo sapiens] - Protein". https://www.ncbi.nlm.nih.gov/protein/NP_001107829.1. 
  3. "C4orf47 chromosome 4 open reading frame 47 [Homo sapiens (human)] - Gene". https://www.ncbi.nlm.nih.gov/gene/441054. 
  4. Homo sapiens chromosome 4 open reading frame 47 (C4orf47), transcript variant 1, mRNA. 2 July 2021. https://www.ncbi.nlm.nih.gov/nuccore/NM_001114357.3. 
  5. Homo sapiens chromosome 4 open reading frame 47 (C4orf47), transcript variant 2, mRNA. 2020-12-18. http://www.ncbi.nlm.nih.gov/nuccore/NM_001346007.2. 
  6. "Tissue expression of C4orf47 - Summary". https://www.proteinatlas.org/ENSG00000205129-C4orf47/tissue. 
  7. "SAPS < Sequence Statistics". https://www.ebi.ac.uk/Tools/seqstats/saps/. 
  8. "UPF0602 protein C4orf47 isoform 1 [Homo sapiens] - Protein". https://www.ncbi.nlm.nih.gov/protein/NP_001107829.1. 
  9. "CDD Conserved Protein Domain Family: DUF4586". https://www.ncbi.nlm.nih.gov/Structure/cdd/pfam15239. 
  10. "Kappa-opioid receptor regulates human sperm functions via SPANX-A/D protein family". Reproductive Biology 20 (3): 300–306. September 2020. doi:10.1016/j.repbio.2020.07.003. PMID 32684427. 
  11. "Proteomic analysis of mammalian sperm cells identifies new components of the centrosome". Journal of Cell Science 127 (Pt 19): 4128–4133. October 2014. doi:10.1242/jcs.157008. PMID 25074808. 
  12. "Evolutionary Proteomics Uncovers Ancient Associations of Cilia with Signaling Pathways". Developmental Cell 43 (6): 744–762.e11. December 2017. doi:10.1016/j.devcel.2017.11.014. PMID 29257953. 
  13. "BLAST: Basic Local Alignment Search Tool". National Center for Biotechnology Information. https://blast.ncbi.nlm.nih.gov/Blast.cgi. 
  14. "Dual proteome-scale networks reveal cell-specific remodeling of the human interactome". Cell 184 (11): 3022–3040.e28. May 2021. doi:10.1016/j.cell.2021.04.011. PMID 33961781. 
  15. "Histone Interaction Landscapes Visualized by Crosslinking Mass Spectrometry in Intact Cell Nuclei". Molecular & Cellular Proteomics 17 (10): 2018–2033. October 2018. doi:10.1074/mcp.RA118.000924. PMID 30021884. 
  16. "Identification of a mutation in the ubiquitin-fold modifier 1-specific peptidase 2 gene, UFSP2, in an extended South African family with Beukes hip dysplasia". South African Medical Journal = Suid-Afrikaanse Tydskrif vir Geneeskunde 105 (7): 558–563. September 2015. doi:10.7196/SAMJnew.7917. PMID 26428751. 
  17. "Novel spondyloepimetaphyseal dysplasia due to UFSP2 gene mutation". Clinical Genetics 93 (3): 671–674. March 2018. doi:10.1111/cge.13134. PMID 28892125. 
  18. "A pathogenic UFSP2 variant in an autosomal recessive form of pediatric neurodevelopmental anomalies and epilepsy". Genetics in Medicine 23 (5): 900–908. May 2021. doi:10.1038/s41436-020-01071-z. PMID 33473208.