※ Computational Resources for proteins which play a pivotal role in phospho-signaling events:

    <1>. Protein Kinase Resources.

        (1) Protein kinase databases.

        (2) Phosphorylation Sites Database.

        (3) Prediction of kinase-specific phosphorylation sites.

        (4) Protein kinases associated with diseases.

    <2>. Protein phosphatase Resources.

    <3>. Phosphoprotein-binding domain Resources.

※ Public Databases integrated in iEKPD 2.0:

         (1) Cancer Mutation

        (2) Genetic Variation

        (3) Disease-associated Information

        (4) mRNA Expression

        (5) DNA & RNA Element

        (6) DNA Methylation

        (7) Molecular Interaction

        (8) Drug-target relation

        (9) Protein 3D Structure

        (10) Post-translational Modification (PTM)

        (11) Protein Expression/Proteomics

        (12) Subcellular Localization

        (13) Protein Functional Annotation

        (14) Basic Annotation


==================================================================================

<1>. Protein Kinase Resources.


1. Protein kinase databases.

(1) EKPD: a resource for information on protein kinases and protein phosphatases in eukaryotic organisms (Wang, et al., 2014).

(2) Kinase.com: a resource covering the genomics, functions and evolution of protein kinases. Kinase.com also provides an accurate kinase database, KinBase, which contains information on over 3000 kinase genes found in a variety of species, including unicellular organisms, plants, invertebrates and vertebrates (Manning, et al., 2002).

(3) Kinomer v. 1.0: a protein kinase database that uses a highly sensitive and accurate hidden Markov model-based method for automatic detection and classification of protein kinases in 43 eukaryotic genomes, including fungi (16 species), plants (6), diatoms (1), amoebas (2), protists (1) and animals (17) (Martin, et al., 2009).

(4) Kinannote: a computer program to identify and classify members of the eukaryotic protein kinase superfamily (Goldberg, et al., 2013).

(5) KinG: a comprehensive collection of serine/threonine/tyrosine-specific kinases and their homologues identified in various completed genomes using sensitive sequence and profile search methods, including PSI-BLAST, HMMER-2 and RPS-BLAST. The database allows users to search for kinases with a specific combination of domains and a specific subfamily (Krupa, et al., 2004).

(6) PKR: a Protein Kinase Resource database that includes an expanded table set and features comprehensive coverage of kinase-related data and derived information, including genomic sequences, detailed organism and tissue cataloging, multiple sequence alignments, clustering into families, literature citations and structural data (Niedner, et al., 2006).

(7) RTKdb: a database dedicated to tyrosine kinase receptors (Grassot, et al., 2003).

(8) PlantsP: a curated database that combines information derived from sequences with experimental functional genomics information. PlantsP provides a framework for proteins involved in phosphorylation, i.e. protein kinases, protein phosphatases and their substrates in plants. PlantsP also provides a curated view of each protein that includes a comprehensive annotation of related sequence motifs, sequence family definitions and so on (Gribskov, et al., 2001).

2. Phosphorylation Sites Database.

(1) PhosphoSitePlus: A knowledgebase dedicated to mammalian post-translational modifications (PTMs) that contains over 330,000 non-redundant PTMs, including phospho, acetyl, ubiquityl and methyl groups (Hornbeck, et al., 2015).

(2) dbPTM 2016: a database that compiles information on protein post-translational modifications (PTMs), including the catalytic sites, solvent accessibility of amino acid residues, protein secondary and tertiary structures, protein domains and protein variations (Huang, et al., 2016).

(3) Phospho3D 2.0: a database of three-dimensional structures of phosphorylation sites derived from the Phospho.ELM database. The database also contains the results of a large-scale structural comparison procedure, providing clues for the identification of new putative phosphorylation sites (Zanzoni, et al., 2011).

3. Prediction of kinase-specific phosphorylation sites.

(1) GPS 2.1: a Group-based Prediction System that predicts kinase-specific phosphorylation sites for 408 human protein kinases in a hierarchy (Xue, et al., 2008).

(2) ScanSite 2.0: identifies short protein sequence motifs that are recognized by modular signaling domains, phosphorylated by Ser/Thr/Tyr kinases, or mediate specific interactions with protein or phospholipid ligands, using position-specific scoring matrices (PSSMs) (Obenauer, et al., 2003).

4. Protein kinases associated with diseases.

(1) KinMutBase: a comprehensive database of disease-causing mutations in protein kinase domains. This new release of the database contains 582 mutations in 20 tyrosine kinase domains and 13 serine/threonine kinase domains. The database refers to 1790 cases from 1322 families (Ortutay, et al., 2005).

(2) MoKCa (Mutations of Kinases in Cancer): developed to structurally and functionally annotate, and where possible predict, the phenotypic consequences of mutations in protein kinases implicated in cancer (Richardson, et al., 2009).

<2>. Protein Phosphatase Resources.


(1) Protein Tyrosine Phosphatases: a web-accessible resource of information on protein tyrosine phosphatases (PTPs). The website provides a peer-reviewed compendium on PTPs and integrates PTP-related information, such as sequence, structure, and cellular and biological function. It allows readers to explore the diversity of the PTP family and download files containing distinct PTP information, including multiple sequence alignments, phylogenetic trees, structures, molecular graphics files, chromosomal mapping data and so on. In addition, phylogenetic classification based on sequence similarity is available through a BLAST search.

(2) Protein Phosphatase Database: catalogs the human protein phosphatome, composed of 189 known and predicted human protein phosphatase genes (Chen, et al., 2017).

(3) DEPOD: a manually curated open access database providing human phosphatases, their protein and non-protein substrates, dephosphorylation sites, pathway involvements and external links to kinases and small molecule modulators (Duan, et al., 2015).

(4) PhosphaBase: a protein phosphatase information resource. The database contains protein phosphatases and information about their protein sequences. The data are drawn from the Swiss-Prot and TrEMBL databases, and phosphatases are classified into five superfamilies: PTP, DUSP, PTEN/MTM, Ser/Thr and Histidine. The database also provides three phosphatase analysis tools, PhosphaScan, PhosphaBase3D and PhosphaClass, for the analysis of sequence, structure and classification respectively (Wolstencroft, et al., 2005).

(5) HuPho: an online web resource to assist scientists in the task of recovering information about human phosphatases (Liberti, et al., 2013).

(6) PlantsP: a curated database that combines information derived from sequences with experimental functional genomics information. PlantsP provides a framework for proteins involved in phosphorylation, i.e. protein kinases, protein phosphatases and their substrates in plants. PlantsP also provides a curated view of each protein that includes a comprehensive annotation of related sequence motifs, sequence family definitions and so on (Gribskov, et al., 2001).

(7) TAIR (The Arabidopsis Information Resource): a database containing genetic and molecular biology data for the model higher plant Arabidopsis thaliana. TAIR provides information about Arabidopsis thaliana including gene structure, gene product, metabolism, gene expression, genome maps etc. Protein phosphatase information is also provided in the database (Berardini, et al., 2015).

<3>. Phosphoprotein-binding domain Resources.


(1) PepCyber: P~Pep: a database of human protein-protein interactions mediated by phosphoprotein-binding domains (Gong, et al., 2007).

※ Public Databases integrated in iEKPD 2.0:

(1) Cancer Mutation:

1. TCGA: The Cancer Genome Atlas (TCGA) has generated comprehensive, multi-dimensional maps of the key genomic changes in 33 types of cancer. The TCGA dataset, 2.5 petabytes of data describing tumor tissue and matched normal tissues from more than 11,000 patients, is publicly available (Cancer Genome Atlas Research Network, 2017).

2. ICGC: The Data Portal currently contains data from 24 cancer projects, and consists of 3478 genomes and 13 cancer types and subtypes (Zhang, et al., 2011).

3. COSMIC: Describes 2 002 811 coding point mutations in over one million tumor samples and across most human genes (Forbes, et al., 2014).

4. CGAP: The Cancer Genome Anatomy Project (CGAP) is an online database on normal, pre-cancerous and cancerous genomes (Schaefer, et al., 2001).

5. IntOGen: Integration and data mining of multidimensional oncogenomic data (Gundem, et al., 2010).

6. BioMuta: Part of BioMuta and BioXpress, mutation and expression knowledgebases for cancer biomarker discovery (Dingerdissen, et al., 2017).

7. TumorFusions: An integrative resource for cancer-associated transcript fusions (Hu, et al., 2018).

(2) Genetic Variation:

1. dbSNP: The NCBI database of genetic variation (Sherry, et al., 2001).

2. GVM: A data repository of genome variations in BIG Data Center (Song, et al., 2018).

3. ActiveDriverDB: Genome variation mapped against post-translational modifications (Krassowski, et al., 2018).

4. Kin-Driver: A database of driver mutations in protein kinases (Franco, et al., 2014).

5. VarCards: Interpretation of coding variants in the human genome (Li, et al., 2018).

6. m6AVar: A database of functional variants involved in m6A modification (Zheng, et al., 2018).

7. rSNPBase 3.0: An updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks (Guo, et al., 2018).

(3) Disease-associated Information:

1. ClinVar: A public archive of reports of the relationships among human variations and phenotypes with supporting evidence (Landrum, et al., 2016).

2. GWASdb: Contains a total of 252,530 unique trait-associated SNPs (TASs) and maps 1610 GWAS traits to 501 Human Phenotype Ontology (HPO) terms, 435 Disease Ontology (DO) terms and 228 Disease Ontology Lite (DOLite) terms (Li, et al., 2016).

3. PTMD: Contains 1,950 disease-associated PTM events in 749 proteins for 24 PTM types and 275 diseases.

4. OMIM: A comprehensive, authoritative and timely research resource of curated descriptions of human genes and phenotypes and the relationships between them (Amberger, et al., 2015).

5. MSDD: miRNA SNP Disease Database (Yue, et al., 2018).

6. DiseaseEnhancer: A resource of human disease-associated enhancer catalog (Zhang, et al., 2018).

7. BRONCO: Biomedical entity Relation ONcology COrpus (BRONCO) contains more than 400 variants and their relations with genes, diseases, drugs, and cell lines in the context of cancer and anti-tumor drug screening research (Lee, et al., 2016).

8. HGVTB: HGV&TB hosts genetic variations reported to be associated with tuberculosis (TB) susceptibility in humans. It currently houses information on 307 variations in 98 genes. In total, 101 of these variations are exonic, whereas 78 fall in intronic regions (Sahajpal, et al., 2014).

9. DisGeNET: A comprehensive platform integrating information on human disease-associated genes and variants, published in the NAR database issue 2017 (Furlong, et al., 2017).

10. PancanQTL: Systematic identification of cis-eQTLs and trans-eQTLs in 33 cancer types (Gong, et al., 2018).

(4) mRNA Expression:

1. TCGA: The Cancer Genome Atlas (TCGA) has generated comprehensive, multi-dimensional maps of the key genomic changes in 33 types of cancer. The TCGA dataset, 2.5 petabytes of data describing tumor tissue and matched normal tissues from more than 11,000 patients, is publicly available (Cancer Genome Atlas Research Network, 2017).

2. ICGC: The Data Portal currently contains data from 24 cancer projects, and consists of 3478 genomes and 13 cancer types and subtypes (Zhang, et al., 2011).

3. COSMIC: Describes 2 002 811 coding point mutations in over one million tumor samples and across most human genes (Forbes, et al., 2014).

4. GEO: NCBI gene expression and hybridization array data repository (Edgar, et al., 2002).

5. ArrayExpress: A public repository for microarray-based gene expression data, resulting from the implementation of the MAGE object model to ensure accurate data structuring and the MIAME standard, which defines the annotation requirements (Rocca-Serra, et al., 2003).

6. BioXpress: Part of BioMuta and BioXpress, mutation and expression knowledgebases for cancer biomarker discovery (Dingerdissen, et al., 2017).

7. TissGDB: Tissue-specific Gene DataBase in cancer (Kim, et al., 2018).

8. GXD: GXD includes >1.4 million expression results and 250,000 images (Smith, et al., 2013).

9. FFGED: The filamentous fungal gene expression database (Zhang, et al., 2010).

10. The Human Protein Atlas: 11,200 unique proteins corresponding to over 50% of all human protein-encoding genes have been analysed (Pontén, et al., 2011).

11. Human Proteome Map: Covers 30 histologically normal human samples, resulting in the identification of proteins encoded by 17,294 genes (Kim, et al., 2014).

12. SZDB: A comprehensive resource for schizophrenia research (Wu, et al., 2017).

13. TISSUES 2.0: An integrative web resource on mammalian tissue expression (Palasca, et al., 2018).

(5) DNA & RNA Element:

1. UTRdb: A curated database of 5' and 3' untranslated sequences of eukaryotic mRNAs (Grillo, et al., 2010).

2. circBase: A database for circular RNAs (Glažar, et al., 2014).

3. circRNADb: Containing 32,914 human exonic circRNAs carefully selected from diversified sources (Chen, et al., 2016).

4. CircNet: Profiles the expression of circRNAs in 464 RNA-seq samples (Liu, et al., 2016).

5. Circ2Traits: A comprehensive database for circular RNA potentially associated with disease and traits (Ghosal, et al., 2013).

6. miRTarBase: Contains 4966 articles, 7439 strongly validated MTIs (using reporter assays or western blots) and 348 007 MTIs from CLIP-seq (Chou, et al., 2011).

7. microRNA.org: A comprehensive resource of microRNA target predictions and expression profiles (Betel, et al., 2008).

8. TRANSFAC: Describes transcription factors, their binding sites, nucleotide distribution matrices and regulated genes; its complementing database, TRANSCompel, covers composite elements (Matys, et al., 2006).

9. miRWalk: Offers information on miRNAs, genes, epigenomics, pathways, ontologies, protein classes, phenotype, genotype, single-nucleotide polymorphisms, functional networks, tandem mass spectra and relevant PubMed articles (Dweep, et al., 2015).

10. TargetScan: Predicting effective microRNA target sites in mammalian mRNAs (Agarwal, et al., 2015).

11. miRecords: Includes 1135 records of validated miRNA-target interactions between 301 miRNAs and 902 target genes in seven animal species (Xiao, et al., 2009).

12. miRNAMap: Experimentally verified microRNAs and experimentally verified miRNA target genes in human, mouse, rat, and other metazoan genomes (Hsu, et al., 2008).

13. SomamiR DB 2.0: Maps 388 247 somatic mutations to experimentally identified miRNA target sites (Bhattacharya, et al., 2016).

14. miRcode: Includes 10 419 lncRNA genes in the current version (Jeggari, et al., 2012).

15. RAID v2.0: Collects more than 5.27 million RNA-associated interactions, referring to nearly 130 000 RNA/protein symbols across 60 species (Yi, et al., 2017).

16. LncRNADisease: Contains 2947 lncRNA-disease entries with controlled lncRNA and disease nomenclature (Chen, et al., 2013).

17. OverGeneDB: Overlapping protein-coding genes (Rosikiewicz, et al., 2018).

18. SEA: A super-enhancer archive (Wei, et al., 2016).

(6) DNA Methylation:

1. TCGA: The Cancer Genome Atlas (TCGA) has generated comprehensive, multi-dimensional maps of the key genomic changes in 33 types of cancer. The TCGA dataset, 2.5 petabytes of data describing tumor tissue and matched normal tissues from more than 11,000 patients, is publicly available (Cancer Genome Atlas Research Network, 2017).

2. ICGC: The Data Portal currently contains data from 24 cancer projects, and consists of 3478 genomes and 13 cancer types and subtypes (Zhang, et al., 2011).

3. COSMIC: Describes 2 002 811 coding point mutations in over one million tumor samples and across most human genes (Forbes, et al., 2014).

4. MethyCancer: Hosts both highly integrated data on DNA methylation, cancer-related genes, mutations and cancer information from public resources, and the CpG Island (CGI) clones derived from the authors' large-scale sequencing (He, et al., 2008).

(7) Molecular Interaction:

1. HINT: High-quality protein interactomes and their applications in understanding human disease (Das, et al., 2012).

2. Mentha: A resource for browsing integrated protein-interaction networks (Calderone, et al., 2013).

3. InWeb_IM: Contains >500,000 interactions, supporting functional interpretation of >4,700 cancer genomes and genes involved in autism (Li, et al., 2017).

4. MIST: Molecular Interaction Search Tool, an integrated resource for mining gene and protein interaction data (Hu, et al., 2018).

5. IID: A major replacement of the I2D interaction database, with larger PPI networks (a total of 1,566,043 PPIs among 68,831 proteins) (Kotlyar, et al., 2016).

6. iRefIndex: A consolidated protein interaction database with provenance (Razick, et al., 2008).

7. PINA: Includes multiple collections of interaction modules identified by different clustering approaches from the whole network of protein interactions ('interactome') for six model organisms (Cowley, et al., 2012).

8. RISE: A database of RNA interactome from sequencing experiments (Gong, et al., 2018).

9. DifferentialNet: A database of differential protein-protein interactions in human tissues (Basha, et al., 2018).

10. TRRUST v2: An expanded reference database of human and mouse transcriptional regulatory interactions (Han, et al., 2018).

11. TIMBAL v2: A database holding molecules of molecular weight <1200 Daltons that modulate protein–protein interactions (Alicia, et al., 2013).

12. BindingDB: A public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be drug targets with small, drug-like molecules. BindingDB contains 1,447,692 binding data points for 7,058 protein targets and 648,871 small molecules (Gilson, et al., 2016).

13. PLIC: Protein-ligand interaction cluster (Anand, et al., 2014).

14. RAIN: RNA–protein Association and Interaction Networks (Junge, et al., 2017).

15. YTRP: Aims to find transcriptional regulatory pathway (TRP) information for TF-gene regulatory pairs identified by TF perturbation experiments (TFPE) (Yang, et al., 2014).

16. RegNetwork: Gene regulatory networks for human and mouse, built by collecting the documented regulatory interactions among TFs, miRNAs and target genes (Liu, et al., 2015).

(8) Drug-target relation:

1. TTD: Contains 2,025 targets, including 364 successful, 286 clinical trial, 44 discontinued and 1,331 research targets, 17,816 drugs, including 1,540 approved, 1,423 clinical trial, 14,853 experimental drugs and 3,681 multi-target agents (Zhu, et al., 2012).

2. DrugBank: Contains 9591 drug entries including 2037 FDA-approved small molecule drugs, 241 FDA-approved biotech (protein/peptide) drugs, 96 nutraceuticals and over 6000 experimental drugs (Law, et al., 2014).

3. KPID: A searchable database of specificities of 243 commonly used signal transduction inhibitors (MRC PPU International Centre for Kinase Profiling, 2012).

4. GRAC: Provides pharmacological, chemical, genetic, functional and pathophysiological data on the targets of approved and experimental drugs (Pawson, et al., 2014).

5. PDTD: Contains 1207 entries covering 841 known and potential drug targets with structures from the Protein Data Bank (Gao, et al., 2008).

6. ADReCS-Target: Provides comprehensive information for illustrating ADRs caused by drug interactions with protein, gene and genetic variation (Zhang, et al., 2007).

7. ECOdrug : A database connecting drugs and conservation of their targets across species (Verbruggen, et al., 2018).

8. DGIdb 3.0: A redesign and expansion of the drug-gene interaction database (Kelsy, et al., 2018).

9. CTD: A CTD-Pfizer collaboration: manual curation of 88,000 scientific articles text mined for drug-disease and drug-phenotype interactions (Davis, et al., 2013).

(9) Protein 3D Structure:

1. PDB: Contains 41599 distinct protein sequences, 36830 structures of human sequences and 9465 nucleic acid containing structures (Berman, et al., 2000).

2. MMDB: Close to 60% of protein sequences tracked in comprehensive databases can be mapped to a known three-dimensional (3D) structure by standard sequence similarity searches (Madej, et al., 2012).

3. SCOP: A prototype of a new structural classification of proteins (Andreeva, et al., 2014).

(10) Post-translational Modification (PTM):

1. PLMD: Contains 284,780 modification events in 53,501 proteins (Xu, et al., 2017).

2. dbPAF: Collected and integrated 54,148 phosphoproteins with 483,001 phosphorylation sites (Ullah, et al., 2016).

3. dbPPT: Contains 82,175 phosphorylation sites in 31,012 proteins from 20 plant organisms (Cheng, et al., 2014).

4. PhosSNP: Collected 91,797 nsSNPs from NCBI dbSNP, used GPS 2.0 software (Xue, et al., 2008) to predict kinase-specific phosphorylation sites for human proteins and nsSNP data, and classified all phosSNPs into five groups (Ren, et al., 2010).

5. PhosphoSitePlus: Contains over 330,000 non-redundant PTMs, including phospho, acetyl, ubiquityl and methyl groups (Hornbeck, et al., 2015).

6. dbPTM 2016: Curates over 12 000 modified peptides, including the emerging S-nitrosylation, S-glutathionylation and succinylation (Huang, et al., 2016).

7. HPRD: Comprises 95,016 phosphosites mapped onto 13,041 proteins (Goel, et al., 2012).

8. Phospho.ELM: Currently comprises 42,574 serine, threonine and tyrosine non-redundant phosphorylation sites (Dinkel, et al., 2010).

9. UniProt: The UniProt knowledgebase is a large resource of protein sequences and associated detailed annotation (The UniProt Consortium, 2017).

10. PHOSIDA: Comprises more than 80,000 phosphorylated, N-glycosylated or acetylated sites from nine different species (Gnad, et al., 2011).

11. BioGRID: Contains 1 072 173 genetic and protein interactions, and 38 559 post-translational modifications (Chatr-Aryamontri, et al., 2017).

12. O-GlycBase: Has 242 glycoprotein entries (Gupta, et al., 1999).

13. PhosphoBase: Comprises 414 phosphoprotein entries covering 1052 phosphorylatable serine, threonine and tyrosine residues (Kreegipuu, et al., 1999).

14. mUbiSiDa: Contains 35,494 experimentally validated ubiquitinated proteins with 110,976 ubiquitination sites from five species (Chen, et al., 2014).

(11) Protein Expression/Proteomics:

1. The Human Protein Atlas: 11,200 unique proteins corresponding to over 50% of all human protein-encoding genes have been analysed (Pontén, et al., 2011).

2. Human Proteome Map: Covers 30 histologically normal human samples, resulting in the identification of proteins encoded by 17,294 genes (Kim, et al., 2014).

(12) Subcellular Localization:

1. NLSdb: Nuclear Localization Signals (Bernhofer, et al., 2018).

2. COMPARTMENTS: Unification and visualization of protein subcellular localization evidence (Binder, et al., 2014).

(13) Protein Functional Annotation:

1. CGDB: A database of circadian genes in eukaryotes (Li, et al., 2017).

2. THANATOS: An integrative data resource of proteins and post-translational modifications in the regulation of autophagy (Deng, et al., 2018).

3. RaftProt: Mammalian lipid raft proteome database (Shah, et al., 2015).

(14) Basic Annotation:

1. Ensembl: Ensembl 2017 (Aken, et al., 2017).

2. UniProt: The Universal Protein knowledgebase (UniProt Consortium, 2018).

3. GenBank: The NIH genetic sequence database (Benson, et al., 2013).

4. GO: The Gene Ontology (GO) project in 2006 (Gene Ontology Consortium, 2006).

5. KEGG: Kyoto Encyclopedia of Genes and Genomes (Ogata, et al., 1999).

6. PROSITE: New and continuing developments at PROSITE (Sigrist, et al., 2013).

7. InterPro: InterPro in 2017-beyond protein family and domain annotations (Finn, et al., 2017).

8. Pfam: The Pfam protein families database: towards a more sustainable future (Finn, et al., 2016).

9. SMART: A web-based tool for the study of genetically mobile domains (Schultz, et al., 2000).

10. RESID: The RESID Database of Protein Modifications as a resource and annotation tool (Garavelli, 2004).