Biology:Zinc finger protein 800

Short description: Protein found in humans

Zinc finger protein 800 (ZNF800) is a protein that in humans is encoded by the ZNF800 gene. Its specific function is not yet well understood.

ZNF800 is found at locus 7q31.33.

Gene

This conceptual translation highlights the zinc finger domains, phosphorylation sites, sumoylation sites, SNPs, and important DNA sequences of human ZNF800.

The ZNF800 gene lies between 127373344 bp and 127391557 bp on the reverse strand of chromosome 7, at locus 7q31.33, spanning a total of 18214 base pairs. The encoded protein has a predicted molecular weight of 75,235.79 g/mol and a predicted isoelectric point of 9.54.[1] The gene contains 7 exons and is translated into 664 amino acids. Only 4 exons are in the coding region of the gene and make up the ZNF800 protein, but there is a total of 6. Another name for ZNF800 is PP902. ZNF800 contains 6 functioning C2H2 zinc finger protein domains, a number of SNPs, and multiple experimentally shown phosphorylation[2][3][4] and SUMOylation sites.[5][6][7][8][9]

Neighborhood

ZNF800 is in the neighborhood of PAX4,[10] which plays an important role in the differentiation and development of pancreatic islet beta cells.[11]

Zinc finger protein family

ZNF800 is part of a large zinc finger protein family, several other members of which have previously been described. Zinc finger proteins have both DNA-binding and metal (zinc) binding properties, and they are found in the cell nucleus. ZNF800 shares only 34-48% similarity with its closest zinc finger protein relatives. However, several zinc finger regions and possible nucleotide binding sites are conserved.

Transcript

The protein is made in small amounts, potentially because its Kozak sequence is less favorable than those of more efficiently translated proteins.
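How favorable a Kozak context is can be judged roughly from two positions around the start codon. The sketch below is a minimal illustration, assuming the commonly used rule that a purine at position -3 and a G at position +4 (counting the A of ATG as +1) mark a strong context. The function name `kozak_strength` and both input sequences are hypothetical; neither sequence is taken from the ZNF800 transcript.

```python
PURINES = {"A", "G"}


def kozak_strength(seq: str, atg_index: int) -> str:
    """Classify the Kozak context of the start codon at atg_index.

    Positions are counted with the A of ATG as +1, so -3 is three bases
    upstream of that A and +4 is the base right after the G of ATG.
    """
    seq = seq.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("need 3 bases upstream and 1 base downstream")
    minus3_ok = seq[atg_index - 3] in PURINES   # purine at -3
    plus4_ok = seq[atg_index + 3] == "G"        # G at +4
    if minus3_ok and plus4_ok:
        return "strong"
    if minus3_ok or plus4_ok:
        return "adequate"
    return "weak"


# Hypothetical example sequences, NOT the ZNF800 sequence.
print(kozak_strength("GCCACCATGG", 6))  # strong
print(kozak_strength("TTTTTTATGC", 6))  # weak
```

A real assessment would use the actual ZNF800 mRNA sequence around its annotated start codon, which is not reproduced in this article.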

Protein

Potential layout of the zinc finger protein 800 with significant regions highlighted, such as conserved areas of unknown function in grey, zinc finger regions in green, and unstructured region upstream of the zinc finger in blue.

ZNF800 is ubiquitously expressed in at least 27 different tissues, with the largest amounts in bone marrow, testis, and lymph nodes.[12] The ZNF800 protein contains 6 C2H2 zinc finger regions, 4 of which are conserved in paralogs, and 5 of which are conserved in orthologs of ZNF800. It also has two larger regions of unknown function, which are conserved in orthologs, and several key amino acids, the function of which in this particular protein has not yet been discovered.

Gene level regulation

Graph of tissue specific expression of ZNF800 in humans from The Human Protein Atlas.[13]
This graph shows levels of ZNF800 in human tissues from experimental values on GEO Profile[14] accessed with "GDS 424".
This graph shows experimental tissue specific expression of ZNF800 in humans found on GEO Profiles[14] when searching "GDS 424".
Allen Brain Atlas[15] image of ZNF800 expression in the mouse brain, showing ubiquitous expression at low levels.

The promoter for ZNF800 was predicted to lie between 127391456 bp and 127393803 bp, spanning 2348 base pairs.[16] This is located directly before the gene sequence. Several transcription factor binding sites were identified in this area. ZNF800 is expressed ubiquitously throughout the body at low levels. There does seem to be increased expression in the blood and thymus, but the evidence for this is somewhat contradictory. NCBI GEO Profiles across all tissues showed the highest expression of ZNF800 in bone marrow, liver, spleen, and thymus. In situ hybridization data for ZNF800 in the mouse brain from the Allen Brain Atlas showed ubiquitous expression at low levels (Figure: "Mouse Brain ZNF800 Expression").
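The span figures quoted for the gene (18214 bp) and the predicted promoter (2348 bp) can be reproduced from the coordinates. The sketch below assumes inclusive, 1-based coordinates, which is the convention that matches both stated lengths. The helper names `span_bp` and `overlap_bp` are illustrative only.

```python
def span_bp(start: int, end: int) -> int:
    """Length of the closed interval [start, end] in base pairs."""
    return end - start + 1


def overlap_bp(a: tuple, b: tuple) -> int:
    """Number of positions shared by two closed intervals (0 if disjoint)."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return max(0, hi - lo + 1)


# Coordinates as given in the article (chromosome 7, reverse strand).
GENE = (127373344, 127391557)
PROMOTER = (127391456, 127393803)

print(span_bp(*GENE))               # 18214
print(span_bp(*PROMOTER))           # 2348
print(overlap_bp(GENE, PROMOTER))   # 102
```

Taken at face value, the two intervals share 102 positions at the high-coordinate end of the gene. Because the gene is on the reverse strand, that end is where transcription begins, so the predicted promoter appears to extend slightly into the start of the gene.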

Transcript level regulation

Comparing the intron sequences flanking the beginning and end of each exon with favorable splice-site sequences showed that all of the splice regions in ZNF800 are favorable. This may explain why there are no shorter ZNF800 isoforms.

Protein level regulation

Several tools on ExPASy Proteomics were used to analyze ZNF800 for likely protein modification sites, unusual composition, and localization. The composition of human ZNF800 was compared to that of mummichog ZNF800, which has 46% identity to the human protein. Several programs showed that ZNF800 is likely localized to the nucleus and does not contain any transmembrane domains. Human ZNF800 is rich in lysine and poor in glycine and tryptophan, while mummichog ZNF800 is rich in serine.[17]

| Species | Homo sapiens (human) ZNF800 | Fundulus heteroclitus (mummichog) ZNF800 |
|---|---|---|
| Identity | 100% | 46% |
| Number of amino acids | 664 | 914 |
| Theoretical isoelectric point | 9.54 | 9.60 |
| Theoretical molecular weight | 75,235.79 g/mol | 100,454.56 g/mol |
| Comparative compositional analysis extremes | G−: 24 (3.6%); K+: 81 (12.2%); S: 64 (9.6%); W−: 0 (0.0%) | G: 39 (4.3%); K: 63 (6.9%); S+: 128 (14.0%); W: 5 (0.5%) |
| Hydrophobic segments | None | None |
| Transmembrane segments | None | None |
| Important cysteine (C) spacings | Multiple zinc finger binding regions | Multiple zinc finger binding regions |
| Negative charge clusters (cmin = 10/30 or 13/45 or 16/60) | 1) From 185 to 225: DTEVETVEPPPVEIVTDEVAPTSDEQPQESQADLETSDNSD | 1) From 555 to 577: EEAVEASDDDDDIDSSPAPSPAE |
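The percentages in the compositional analysis follow directly from the residue counts and the protein lengths. The short check below reproduces them from the table values. It is only an arithmetic illustration, with `percent` as a hypothetical helper name, and it does not reimplement the SAPS test that assigns the + and − flags.

```python
def percent(count: int, length: int) -> float:
    """Share of a residue in a protein, as a percentage with one decimal."""
    return round(100 * count / length, 1)


HUMAN_LEN = 664        # amino acids in human ZNF800
MUMMICHOG_LEN = 914    # amino acids in mummichog ZNF800

# Residue counts copied from the comparison table.
human = {"G": 24, "K": 81, "S": 64, "W": 0}
mummichog = {"G": 39, "K": 63, "S": 128, "W": 5}

for aa in "GKSW":
    print(aa, percent(human[aa], HUMAN_LEN), percent(mummichog[aa], MUMMICHOG_LEN))
```

The lysine share of the human protein (12.2%) is nearly double that of the mummichog protein (6.9%), which agrees with the description of human ZNF800 as lysine-rich.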

Multiple programs showed that ZNF800 is likely to be a nuclear protein. NetPhos[18] was used to predict the most likely phosphorylation sites (score > 0.99). These were compared to the experimentally shown phosphorylation[2][3][4] sites found on NCBI Nucleotide,[19] and matching phosphorylation sites were highlighted. Experimentally shown sumoylation[5][6][7][8][9] sites were also included. The likely sites of phosphorylation, O-glycosylation, sumoylation, and O-β-GlcNAc attachment are detailed below:

| Type of post-translational modification | Program name (all programs found at ExPASy Proteomics[20]) | Program mission | Human ZNF800 results | Degree of certainty (when applicable) | Conserved region? (N/A when applied to whole protein) |
|---|---|---|---|---|---|
| Protein sorting | DAS-TMfilter server | Find transmembrane protein domain | ZNF800 is not a transmembrane protein | | N/A |
| | PSORT II | Determine where in the cell the protein sorts to | ZNF800 sorts to the nucleus | 100% | N/A |
| | NNCN: Reinhardt's method for cytoplasmic/nuclear discrimination | Discriminate between cytoplasmic and nuclear proteins | ZNF800 is a nuclear protein | 94.1% | N/A |
| | k-NN | Determine where in the cell the protein sorts to | ZNF800 is a nuclear protein | 100% | N/A |
| | NUCDISC: discrimination of nuclear localization signals | Locate nuclear localization signal | pat4: HKKK (3) at 253; pat4: RKPK (4) at 473; pat4: RRKR (5) at 529; pat7: none; bipartite: RRGVRRHIRKVHKKKME at 242; bipartite: KRDVIRHITVVHKKSSR at 531; content of basic residues: 17.8%; NLS score: 1.27 | | Yes |
| | Checking 63 PROSITE DNA binding motifs | Find PROSITE DNA binding motifs | Zinc finger, C2H2 type, domains: 1) CCLCRKEFNS RRGVRRHIRKVH at 232; 2) CPVC1CKSFAT KANVRRHFDEVH at 289; 3) CKCLLCKRKY SSQIMLKRHMQIVH at 359; 4) CKLCKRQFT SKQNLTKHIELH at 488; 5) CNKCGKAFAK KTYLEHHKKTH at 620 | | Yes |
| Phosphorylation | NetPhos | Serines and threonines that are likely to be phosphorylated | 214 S, 317 S, 336 S, 348 S, 349 S, 415 S, 457 S, 460 S, 462 S, 578 S, 585 S, 587 S, 597 S, 647 S | > 0.99 | Yes |
| | NCBI Nucleotide | Serines and threonines that have been experimentally shown to be phosphorylated | Phosphorylation regions: 317 S, 319 T, 336 S, 422 S, 426 S, 455 S, 457 S, 460 S, 462 S | | |
| O-glycosylation | NetOGlyc | Serines and threonines that are likely to be O-glycosylated | 159, 317, 424, 645 | > 0.9 | Yes |
| O-β-GlcNAc attachment sites | YinOYang | Serines and threonines that are likely to have O-β-GlcNAc attachment sites | 38 T, 157 T, 207 S, 460 S, 461 T, 593 S, 653 T, 660 S | ++ and +++ | Yes |
| Sumoylation | UniProt | Experimentally confirmed lysines that interact with SUMO-1 and SUMO-2 | 132 K, 279 K, 392 K, 409 K, 471 K, 599 K | | |
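The comparison between predicted and experimentally shown phosphorylation sites amounts to a set intersection. The sketch below uses the residue positions listed in the table; the variable names are illustrative, and residue types (S or T) are left out for brevity.

```python
# NetPhos predictions with score > 0.99 (positions from the table).
predicted = {214, 317, 336, 348, 349, 415, 457, 460, 462,
             578, 585, 587, 597, 647}

# Experimentally shown phosphorylation sites (positions from the table).
experimental = {317, 319, 336, 422, 426, 455, 457, 460, 462}

matching = sorted(predicted & experimental)           # found by both
predicted_only = sorted(predicted - experimental)     # predicted, not yet observed
experimental_only = sorted(experimental - predicted)  # observed, below the cutoff

print(matching)            # [317, 336, 457, 460, 462]
print(experimental_only)   # [319, 422, 426, 455]
```

Five of the nine experimentally shown sites are recovered at the > 0.99 cutoff. The other four would only appear in the prediction if they scored above a lower threshold, which this table does not report.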

Homology / Evolution

Paralogs of ZNF800

ZNF800 has 5 possible paralogs. However, none of them exceeds 33% identity, so it is unlikely that they are true paralogs of ZNF800. They may be similar only because they are all zinc finger proteins and contain zinc finger binding domains. When multiple sequence alignments were made, the zinc finger binding domains were the areas with the most conservation.[21]

| Paralog | Accession | Length | Query | ID | Similarity | E |
|---|---|---|---|---|---|---|
| ZNF837 | NP_612475.1 | 531 | 26% | 24% | 36% | 2e-04 |
| ZNFI3 | NP_443114.1 | 472 | 23% | 25% | 44% | 3e-04 |
| ZNF366 | NP_689838.1 | 744 | 12% | 33% | 48% | 4e-04 |
| ZNF335 | NP_071378.1 | 1342 | 24% | 21% | 34% | 0.007 |
| ZNF260 | NP_001012774.1 | 412 | 23% | 25% | 46% | 0.007 |
This graph was made using calculated values of n and m, where n = 100 − (% identity) of a protein and m = −ln(1 − n/100) × 100. Identity percentages were found using BLAST.[22]
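The formula for m can be written as a small function. The sketch below is a direct transcription of the expression in the caption, with `corrected_divergence` as a hypothetical name. A logarithmic transform of this form is often used as a correction for multiple substitutions at the same site, although the caption does not state that purpose.

```python
import math


def corrected_divergence(identity_pct: float) -> float:
    """Return m = -ln(1 - n/100) * 100, where n = 100 - identity_pct."""
    if not 0 < identity_pct <= 100:
        raise ValueError("identity must be in the range (0, 100]")
    n = 100 - identity_pct
    return -math.log(1 - n / 100) * 100


# Identity values taken from the paralog table and the mummichog ortholog.
for name, identity in [("ZNF366", 33), ("ZNF335", 21), ("mummichog ZNF800", 46)]:
    print(name, round(corrected_divergence(identity), 1))
```

Because 1 − n/100 equals the identity fraction, m is 0 for identical sequences and grows without bound as identity approaches 0, so low-identity pairs are spread out much more than on a plain percent-difference scale.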

Orthologs of ZNF800

ZNF800 has homologs in Trichoplax, invertebrates, fish, amphibians, reptiles, birds, and mammals. Based on the E values of the protein with its orthologs in these groups using NCBI BLAST,[23] ZNF800 is at least 930 million years old. It was at first hypothesized that ZNF800 has an ortholog in fungi, dating back 1150 million years. However, a BLAT[10] search of the fungal sequence against the human genome gave no results, which led to the conclusion that the sequences are not similar enough to show that they are truly related.

| Sequence # | Genus + species | Common name | Date of divergence | Accession # | Sequence length (aa) | Sequence ID | Notes |
|---|---|---|---|---|---|---|---|
| 1 | Galeopterus variegatus | Sunda flying lemur | 82 MYA | XP_008569698.1 | 665 | 98% | |
| 2 | Pteropus alecto | Black flying fox | 94 MYA | ELK13892.1 | 676 | 99% | |
| 3 | Enhydra lutris kenyoni | Sea otter | 94 MYA | XP_022365796.1 | 726 | 98% | |
| 4 | Elephantulus edwardii | Cape elephant shrew | 102 MYA | XP_006896138.1 | 665 | 95% | |
| 5 | Gavialis gangeticus | Gharial | 320 MYA | XP_019364835.1 | 666 | 86% | |
| 6 | Xenopus tropicalis | Western clawed frog | 353 MYA | NP_001120895.1 | 690 | 58% | |
| 7 | Latimeria chalumnae | West Indian Ocean coelacanth | 414 MYA | XP_005994297.1 | 708 | 72% | |
| 8 | Fundulus heteroclitus | Mummichog | 432 MYA | XP_012726041.1 | 914 | 46% | |
| 9 | Rhincodon typus | Whale shark | 465 MYA | XP_020372543.1 | 713 | 65% | |
| 10 | Limulus polyphemus | Atlantic horseshoe crab | 794 MYA | XP_022245667.1 | 796 | 21% | |
| 11 | Trichoplax adhaerens | Trichoplax | 930 MYA | XP_002111948.1 | 735 | 28% | E: 0.003; this large value may indicate it is not an ortholog |
This tree from Clustal W[24] shows an unrooted phylogenetic tree of the ZNF800 orthologs showing relatedness and animal type.

Interacting Proteins

The most common transcription factors with a high probability (>0.84) of binding the ZNF800 promoter are shown in the figure. These were found using the ElDorado genome database on Genomatix.[16]

STRING[25] prediction of ZNF800-interacting proteins.

A few potential interacting proteins (depicted in Figure "Proteins Predicted to Interact with ZNF800" on the right) were found using STRING.[25] In addition, yscP, a protein from the plague bacterium Yersinia pestis, was shown to bind ZNF800 in yeast two-hybrid (Y2H) data.[26]

Clinical Significance

This image from GEO Profiles of ZNF800[27] shows a correlation between expression of ZNF800 and chronic B-lymphocytic leukemia.

While there is no obvious link so far between ZNF800 and specific diseases it may cause, several studies on GEO Profiles[27] have shown a correlation between ZNF800 and disease condition, including a correlation between increased ZNF800 expression and chronic B-lymphocytic leukemia (ZNF800 GEO Profile 1), as well as a correlation between decreased ZNF800 expression and myotonic dystrophy type 2 (GEO Profile of ZNF800 Figure 2).

This image from GEO Profiles of ZNF800[27] compares ZNF800 expression in patients with myotonic dystrophy type 2 and in controls. Based on the graph, controls have significantly higher ZNF800 expression than patients with myotonic dystrophy, suggesting there may be some relationship.

The 20 most common SNPs in the protein-coding sequence of ZNF800 were analyzed. Even the most common of these was found in only 10% of the population, and none of the mutations were in the zinc finger binding portions of ZNF800.

Suggested Reading

Articles discussing ZNF800 are limited. Four existing patents mention ZNF800 in lists of hundreds to thousands of entries; they address concepts such as “prostate cancer progression”, “progression risk of glaucoma”, “method for inducing pluripotency in human somatic cells”, and “modifying transcriptional regulatory networks in stem cells”.[28] It is also mentioned in 27 patent applications, several of which concern cancer biomarkers.[29]

References

  1. "Compute pI/Mw Tool". https://web.expasy.org/compute_pi/. 
  2. "Toward a comprehensive characterization of a human cancer cell phosphoproteome". Journal of Proteome Research 12 (1): 260–71. January 2013. doi:10.1021/pr300630k. PMID 23186163. https://zenodo.org/record/3425400.
  3. "A quantitative atlas of mitotic phosphorylation". Proceedings of the National Academy of Sciences of the United States of America 105 (31): 10762–7. August 2008. doi:10.1073/pnas.0805139105. PMID 18669648. Bibcode: 2008PNAS..10510762D.
  4. "An enzyme assisted RP-RPLC approach for in-depth analysis of human liver phosphoproteome". Journal of Proteomics 96: 253–62. January 2014. doi:10.1016/j.jprot.2013.11.014. PMID 24275569.
  5. "Uncovering global SUMOylation signaling networks in a site-specific manner". Nature Structural & Molecular Biology 21 (10): 927–36. October 2014. doi:10.1038/nsmb.2890. PMID 25218447.
  6. "Mapping of SUMO sites and analysis of SUMOylation changes induced by external stimuli". Proceedings of the National Academy of Sciences of the United States of America 111 (34): 12432–7. August 2014. doi:10.1073/pnas.1413825111. PMID 25114211. Bibcode: 2014PNAS..11112432I.
  7. "SUMO-2 Orchestrates Chromatin Modifiers in Response to DNA Damage". Cell Reports 10 (10): 1778–1791. March 2015. doi:10.1016/j.celrep.2015.02.033. PMID 25772364.
  8. "System-wide Analysis of SUMOylation Dynamics in Response to Replication Stress Reveals Novel Small Ubiquitin-like Modified Target Proteins and Acceptor Lysines Relevant for Genome Stability". Molecular & Cellular Proteomics 14 (5): 1419–34. May 2015. doi:10.1074/mcp.O114.044792. PMID 25755297.
  9. "Site-specific mapping of the human SUMO proteome reveals co-modification with phosphorylation". Nature Structural & Molecular Biology 24 (3): 325–336. March 2017. doi:10.1038/nsmb.3366. PMID 28112733.
  10. "Human Blat Results for ZNF800". https://genome.ucsc.edu/cgi-bin/hgBlat.
  11. "UniProtKB - O43316 (PAX4_HUMAN)". https://www.uniprot.org/uniprot/O43316. 
  12. "ZNF800 zinc finger protein 800 [Homo sapiens (human)] - Gene - NCBI". https://www.ncbi.nlm.nih.gov/gene/168850#gene-expression.
  13. "The Human Protein Atlas". https://www.proteinatlas.org/. 
  14. "GEO Profiles". https://www.ncbi.nlm.nih.gov/geoprofiles/?term=GDS424.
  15. "Allen Brain Atlas (Mouse)". http://mouse.brain-map.org/. 
  16. "Gene2Promoter". https://www.genomatix.de/online_help/help_eldorado/Gene2Promoter_Intro.html.
  17. "Statistical Analysis of Protein Sequences "SAPS"". https://www.ebi.ac.uk/Tools/seqstats/saps/. 
  18. "NetPhos 3.1 Server". http://www.cbs.dtu.dk/services/NetPhos/. 
  19. "Nucleotide". https://www.ncbi.nlm.nih.gov/nucleotide?cmd=search. 
  20. "ExPASy Tools". https://www.expasy.org/tools/. 
  21. "Multiple Sequence Alignment". https://www.ebi.ac.uk/Tools/msa/clustalo/. 
  22. "Basic Local Alignment Search Tool". https://blast.ncbi.nlm.nih.gov/Blast.cgi. 
  23. "NCBI Blast". https://blast.ncbi.nlm.nih.gov/Blast.cgi. 
  24. "Clustal W". http://www.genome.jp/tools-bin/clustalw. 
  25. "Functional Protein Association Networks". https://string-db.org/.
  26. "The human-bacterial pathogen protein interaction networks of Bacillus anthracis, Francisella tularensis, and Yersinia pestis". PLOS ONE 5 (8): e12089. August 2010. doi:10.1371/journal.pone.0012089. PMID 20711500. Bibcode: 2010PLoSO...512089D.
  27. "ZNF800 GEO Profiles". https://www.ncbi.nlm.nih.gov/geoprofiles.
  28. "US Patent and Trademark Office". http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=0&p=1&f=S&l=50&Query=ZNF800%0D%0A&d=PTXT. 
  29. "PreGrant Publication Database Search Results: ZNF800 in PGPUB Production Database March 15th - September 30th 2001". http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=/netahtml/PTO/search-adv.html&r=0&f=S&l=50&d=PG01&Query=ZNF800.