UniProtKB/Swiss-Prot protein knowledgebase release 2014_10 statistics


1.  INTRODUCTION

Release 2014_10 of 29-Oct-14 of UniProtKB/Swiss-Prot contains 546790 sequence
entries, comprising 194613039 amino acids abstracted from 232403 references.

357 sequences have been added since release 2014_09, the sequence data of 12
existing entries has been updated and the annotations of 383389 entries have
been revised.

Number of fragments: 9122
Number of additional sequences produced by alternative splicing, initiation
or promoter usage, or ribosomal frameshifting: 38330

   Protein existence (PE):            entries      %
   1: Evidence at protein level         85041  15.6%
   2: Evidence at transcript level      62710  11.5%
   3: Inferred from homology           385647  70.5%
   4: Predicted                         11512   2.1%
   5: Uncertain                          1880   0.3%

The growth of the database is summarized below.


2.  TAXONOMIC ORIGIN

Total number of species represented in this release of UniProtKB/Swiss-Prot:
13155

The first twenty species represent 115291 sequences: 21.1% of the total
number of entries.


2.1  Table of the frequency of occurrence of species

        Species represented 1x: 5474
                            2x: 1898
                            3x: 1009
                            4x:  657
                            5x:  480
                            6x:  397
                            7x:  290
                            8x:  224
                            9x:  214
                           10x:  121
                       11- 20x:  712
                       21- 50x:  422
                       51-100x:  211
                         >100x: 1046


2.2  Table of the most represented species

------  ---------  --------------------------------------------
Number  Frequency  Species
------  ---------  --------------------------------------------
     1      20193  Homo sapiens (Human)
     2      16686  Mus musculus (Mouse)
     3      13297  Arabidopsis thaliana (Mouse-ear cress)
     4       7914  Rattus norvegicus (Rat)
     5       6621  Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)
     6       5988  Bos taurus (Bovine)
     7       5103  Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)
     8       4433  Escherichia coli (strain K12)
     9       4185  Bacillus subtilis (strain 168)
    10       4128  Dictyostelium discoideum (Slime mold)
    11       3476  Caenorhabditis elegans
    12       3390  Xenopus laevis (African clawed frog)
    13       3277  Oryza sativa subsp. japonica (Rice)
    14       3237  Drosophila melanogaster (Fruit fly)
    15       2934  Danio rerio (Zebrafish) (Brachydanio rerio)
    16       2261  Gallus gallus (Chicken)
    17       2219  Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)
    18       2035  Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv)
    19       2027  Escherichia coli O157:H7
    20       1887  Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh)
    21       1787  Methanocaldococcus jannaschii
    22       1779  Salmonella typhimurium (strain LT2 / SGSC1412 / ATCC 700720)
    23       1707  Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd)
    24       1701  Xenopus tropicalis (Western clawed frog) (Silurana tropicalis)
    25       1688  Escherichia coli O6:H1 (strain CFT073 / ATCC 700928 / UPEC)
    26       1679  Shigella flexneri
    27       1414  Sus scrofa (Pig)
    28       1346  Salmonella typhi
    29       1273  Pseudomonas aeruginosa (strain ATCC 15692 / PAO1 / 1C / PRS 101 / LMG 12228)
    30       1242  Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97)
    31       1170  Macaca fascicularis (Crab-eating macaque) (Cynomolgus monkey)
    32       1049  Synechocystis sp. (strain PCC 6803 / Kazusa)
    33       1025  Archaeoglobus fulgidus
    34       1019  Yersinia pestis
    35        956  Vibrio cholerae serotype O1 (strain ATCC 39315 / El Tor Inaba N16961)
    36        952  Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast)
    37        930  Salmonella paratyphi A (strain ATCC 9150 / SARB42)
    38        927  Ashbya gossypii (strain ATCC 10895 / CBS 109.51 / FGSC 9923 / NRRL Y-1056)
    39        925  Staphylococcus aureus (strain N315)
    40        924  Staphylococcus aureus (strain Mu50 / ATCC 700699)
    41        909  Acanthamoeba polyphaga mimivirus (APMV)
    42        905  Kluyveromyces lactis
    43        899  Staphylococcus aureus (strain COL)
    44        895  Staphylococcus aureus (strain MW2)
    45        889  Escherichia coli O6:K15:H31 (strain 536 / UPEC)
    46        889  Staphylococcus aureus (strain MSSA476)
    47        889  Oryctolagus cuniculus (Rabbit)
    48        888  Staphylococcus aureus (strain MRSA252)
    49        882  Salmonella choleraesuis (strain SC-B67)
    50        878  Shigella sonnei (strain Ss046)
    51        875  Rhizobium meliloti (strain 1021) (Ensifer meliloti) (Sinorhizobium meliloti)
    52        863  Yersinia pseudotuberculosis serotype I (strain IP32953)
    53        862  Candida glabrata
    54        841  Escherichia coli O9:H4 (strain HS)
    55        838  Neurospora crassa
    56        834  Escherichia coli O139:H28 (strain E24377A / ETEC)
    57        829  Shigella boydii serotype 4 (strain Sb227)
    58        825  Escherichia coli (strain UTI89 / UPEC)
    59        822  Shigella dysenteriae serotype 1 (strain Sd197)
    60        819  Escherichia coli (strain ATCC 8739 / DSM 1576 / Crooks)
    61        811  Canis familiaris (Dog) (Canis lupus familiaris)
    62        791  Escherichia coli (strain SMS-3-5 / SECEC)
    63        790  Pectobacterium atrosepticum (strain SCRI 1043 / ATCC BAA-672)
    64        788  Vibrio parahaemolyticus serotype O3:K6 (strain RIMD 2210633)
    65        787  Emericella nidulans
    66        784  Aquifex aeolicus (strain VF5)
    67        776  Pasteurella multocida (strain Pm70)
    68        771  Escherichia coli (strain K12 / DH10B)
    69        771  Escherichia coli O127:H6 (strain E2348/69 / EPEC)
    70        770  Streptomyces coelicolor (strain ATCC BAA-471 / A3(2) / M145)
    71        765  Escherichia coli (strain K12 / MC4100 / BW2952)
    72        764  Escherichia coli O17:K52:H18 (strain UMN026 / ExPEC)
    73        762  Escherichia coli (strain 55989 / EAEC)
    74        761  Escherichia coli O8 (strain IAI1)
    75        760  Shigella flexneri serotype 5b (strain 8401)
    76        759  Staphylococcus epidermidis (strain ATCC 35984 / RP62A)
    77        758  Staphylococcus epidermidis (strain ATCC 12228)
    78        756  Escherichia coli (strain SE11)
    79        756  Escherichia coli O45:K1 (strain S88 / ExPEC)
    80        753  Escherichia coli O7:K1 (strain IAI39 / ExPEC)
    81        748  Escherichia coli O157:H7 (strain EC4115 / EHEC)
    82        747  Zea mays (Maize)
    83        745  Photorhabdus luminescens subsp. laumondii (strain TT01)
    84        742  Staphylococcus aureus (strain NCTC 8325)
    85        737  Bacillus halodurans
    86        736  Bacillus anthracis
    87        736  Yersinia enterocolitica serotype O:8 / biotype 1B (strain NCTC 13174 / 8081)
    88        733  Vibrio vulnificus (strain CMCP6)
    89        731  Escherichia coli O81 (strain ED1a)
    90        722  Salmonella enteritidis PT4 (strain P125109)
    91        717  Vibrio vulnificus (strain YJ016)
    92        716  Salmonella paratyphi B (strain ATCC BAA-1250 / SPB7)
    93        715  Yersinia pestis bv. Antiqua (strain Nepal516)
    94        715  Enterobacter sp. (strain 638)
    95        714  Escherichia coli O1:K1 / APEC
    96        714  Salmonella paratyphi A (strain AKU_12601)
    97        713  Klebsiella pneumoniae subsp. pneumoniae (strain ATCC 700721 / MGH 78578)
    98        713  Salmonella agona (strain SL483)
    99        713  Salmonella newport (strain SL254)
   100        713  Yersinia pseudotuberculosis serotype O:1b (strain IP 31758)
   101        712  Salmonella schwarzengrund (strain CVM19633)
   102        711  Yersinia pestis bv. Antiqua (strain Antiqua)
   103        710  Salmonella heidelberg (strain SL476)
   104        702  Pseudomonas putida (strain KT2440)
   105        702  Salmonella dublin (strain CT_02021853)
   106        698  Klebsiella pneumoniae (strain 342)
   107        698  Shigella boydii serotype 18 (strain CDC 3083-94 / BS512)
   108        695  Escherichia fergusonii (strain ATCC 35469 / DSM 13698 / CDC 0568-73)
   109        691  Nostoc sp. (strain PCC 7120 / UTEX 2576)
   110        689  Pan troglodytes (Chimpanzee)
   111        687  Mycoplasma pneumoniae (strain ATCC 29342 / M129)
   112        684  Salmonella gallinarum (strain 287/91 / NCTC 13346)
   113        682  Oryza sativa subsp. indica (Rice)
   114        678  Citrobacter koseri (strain ATCC BAA-895 / CDC 4225-83 / SGSC4696)
   115        677  Pseudomonas syringae pv. tomato (strain DC3000)
   116        671  Neosartorya fumigata (strain ATCC MYA-4609 / Af293 / CBS 101355 / FGSC A1100)
   117        670  Serratia proteamaculans (strain 568)
   118        668  Mycobacterium leprae (strain TN)
   119        667  Yersinia pestis (strain Pestoides F)
   120        666  Staphylococcus aureus (strain USA300)
   121        665  Escherichia coli
   122        660  Bradyrhizobium diazoefficiens
   123        659  Bacillus cereus (strain ATCC 14579 / DSM 31)
   124        658  Rhizobium sp. (strain NGR234)
   125        653  Debaryomyces hansenii
   126        644  Yarrowia lipolytica (strain CLIB 122 / E 150) (Yeast) (Candida lipolytica)
   127        643  Staphylococcus aureus (strain bovine RF122 / ET3-1)
   128        642  Salmonella arizonae (strain ATCC BAA-731 / CDC346-86 / RSK2980)
   129        638  Yersinia pseudotuberculosis serotype O:3 (strain YPIII)
   130        638  Agrobacterium tumefaciens (strain C58 / ATCC 33970)
   131        636  Shewanella oneidensis (strain MR-1)
   132        634  Yersinia pseudotuberculosis serotype IB (strain PB1/+)
   133        622  Cronobacter sakazakii (strain ATCC BAA-894) (Enterobacter sakazakii)
   134        619  Methanothermobacter thermautotrophicus
   135        618  Treponema pallidum (strain Nichols)
   136        613  Staphylococcus haemolyticus (strain JCSC1435)
   137        609  Rhizobium loti (strain MAFF303099) (Mesorhizobium loti)
   138        607  Listeria monocytogenes serovar 1/2a (strain ATCC BAA-679 / EGD-e)
   139        606  Xanthomonas campestris pv. campestris (strain ATCC 33913 / NCPPB 528 / LMG 568)
   140        603  Ralstonia solanacearum (strain GMI1000) (Pseudomonas solanacearum)
   141        602  Photobacterium profundum (Photobacterium sp. (strain SS9))
   142        602  Staphylococcus saprophyticus subsp. saprophyticus
   143        601  Salmonella paratyphi C (strain RKS4594)
   144        600  Yersinia pestis bv. Antiqua (strain Angola)
   145        591  Bacillus cereus (strain ATCC 10987)
   146        591  Listeria innocua serovar 6a (strain CLIP 11262)
   147        591  Pectobacterium carotovorum subsp. carotovorum (strain PC1)
   148        589  Helicobacter pylori (strain ATCC 700392 / 26695) (Campylobacter pylori)
   149        586  Rickettsia prowazekii (strain Madrid E)
   150        579  Neisseria meningitidis serogroup B (strain MC58)
   151        576  Brucella suis biovar 1 (strain 1330)
   152        573  Brucella melitensis biotype 1 (strain 16M / ATCC 23456 / NCTC 10094)
   153        572  Buchnera aphidicola subsp. Acyrthosiphon pisum (strain APS)
   154        568  Caenorhabditis briggsae
   155        568  Caulobacter crescentus (strain ATCC 19089 / CB15)
   156        567  Bacillus thuringiensis subsp. konkukian (strain 97-27)
   157        567  Pseudomonas syringae pv. syringae (strain B728a)
   158        566  Helicobacter pylori (strain J99 / ATCC 700824) (Campylobacter pylori J99)
   159        565  Pseudomonas aeruginosa (strain UCBPP-PA14)
   160        564  Bacillus licheniformis (strain DSM 13 / ATCC 14580)
   161        564  Vibrio fischeri (strain ATCC 700601 / ES114)
   162        562  Buchnera aphidicola subsp. Schizaphis graminum (strain Sg)
   163        561  Bacillus cereus (strain ZK / E33L)
   164        558  Clostridium acetobutylicum
   165        556  Xanthomonas axonopodis pv. citri (strain 306)
   166        553  Oceanobacillus iheyensis (strain DSM 14371 / JCM 11309 / KCTC 3954 / HTE831)
   167        553  Neisseria meningitidis serogroup A / serotype 4A (strain Z2491)
   168        552  Pseudomonas fluorescens (strain Pf0-1)
   169        546  Pseudomonas fluorescens (strain Pf-5 / ATCC BAA-477)
   170        545  Pseudomonas syringae pv. phaseolicola (strain 1448A / Race 6)
   171        538  Thermotoga maritima (strain ATCC 43589 / MSB8 / DSM 3109 / JCM 10099)
   172        535  Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis)
   173        531  Erwinia tasmaniensis (strain DSM 17950 / Et1/99)
   174        529  Sodalis glossinidius (strain morsitans)
   175        529  Listeria monocytogenes serotype 4b (strain F2365)
   176        525  Bordetella bronchiseptica (strain ATCC BAA-588 / NCTC 13252 / RB50)
   177        522  Xylella fastidiosa (strain 9a5c)
   178        515  Chromobacterium violaceum
   179        515  Bordetella pertussis (strain Tohama I / ATCC BAA-589 / NCTC 13251)
   180        515  Corynebacterium glutamicum
   181        514  Vibrio cholerae serotype O1 (strain ATCC 39541 / Classical Ogawa 395 / O395)
   182        512  Xylella fastidiosa (strain Temecula1 / ATCC 700964)
   183        511  Pseudomonas aeruginosa (strain PA7)
   184        510  Haemophilus ducreyi (strain 35000HP / ATCC 700724)
   185        508  Staphylococcus aureus (strain Newman)
   186        508  Bordetella parapertussis (strain 12822 / ATCC BAA-587 / NCTC 13253)
   187        507  Buchnera aphidicola subsp. Baizongia pistaciae (strain Bp)
   188        507  Deinococcus radiodurans
   189        507  Geobacillus kaustophilus (strain HTA426)
   190        506  Streptomyces avermitilis
   191        502  Streptococcus pneumoniae serotype 4 (strain ATCC BAA-334 / TIGR4)
   192        500  Pseudomonas entomophila (strain L48)
   193        499  Brucella abortus biovar 1 (strain 9-941)
   194        497  Rickettsia conorii (strain ATCC VR-613 / Malish 7)
   195        496  Bacillus clausii (strain KSM-K16)
   196        495  Burkholderia pseudomallei (strain K96243)
   197        495  Haemophilus influenzae (strain 86-028NP)
   198        494  Proteus mirabilis (strain HI4320)
   199        494  Methanosarcina acetivorans (strain ATCC 35395 / DSM 2834 / JCM 12185 / C2A)
   200        492  Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1)
   201        492  Bacillus amyloliquefaciens subsp. plantarum
   202        491  Xanthomonas campestris pv. campestris (strain 8004)
   203        490  Vibrio campbellii (strain ATCC BAA-1116 / BB120)
   204        490  Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1)
   205        488  Mycobacterium smegmatis (strain ATCC 700084 / mc(2)155)
   206        487  Shewanella sp. (strain MR-7)
   207        486  Mannheimia succiniciproducens (strain MBEL55E)
   208        484  Pseudomonas aeruginosa (strain LESB58)
   209        484  Staphylococcus aureus (strain Mu3 / ATCC 700698)
   210        484  Thermosynechococcus elongatus (strain BP-1)
   211        484  Shewanella sp. (strain MR-4)
   212        483  Acinetobacter baylyi (strain ATCC 33305 / BD413 / ADP1)
   213        483  Methanosarcina mazei
   214        483  Mycoplasma genitalium (strain ATCC 33530 / G-37 / NCTC 10195)
   215        481  Pyrococcus horikoshii
   216        479  Synechococcus elongatus (strain PCC 7942) (Anacystis nidulans R2)
   217        477  Pseudomonas putida (strain F1 / ATCC 700007)
   218        477  Brucella abortus (strain 2308)
   219        477  Aspergillus oryzae (strain ATCC 42149 / RIB 40) (Yellow koji mold)
   220        476  Streptococcus pneumoniae (strain ATCC BAA-255 / R6)
   221        475  Pyrococcus abyssi (strain GE5 / Orsay)
   222        474  Burkholderia sp. (strain 383) (Burkholderia cepacia
   223        468  Clostridium perfringens (strain 13 / Type A)
   224        467  Cupriavidus necator (strain ATCC 17699 / H16 / DSM 428 / Stanier 337)
   225        466  Xanthomonas campestris pv. vesicatoria (strain 85-10)
   226        466  Shewanella frigidimarina (strain NCIMB 400)
   227        466  Pseudomonas putida (strain GB-1)
   228        466  Rhodopseudomonas palustris (strain ATCC BAA-98 / CGA009)
   229        464  Nicotiana tabacum (Common tobacco)
   230        464  Sulfolobus solfataricus (strain ATCC 35092 / DSM 1617 / JCM 11322 / P2)
   231        464  Aeromonas hydrophila subsp. hydrophila (strain ATCC 7966 / NCIB 9240)
   232        464  Anabaena variabilis (strain ATCC 29413 / PCC 7937)
   233        463  Rhodobacter sphaeroides (strain ATCC 17023 / 2.4.1 / NCIB 8253 / DSM 158)
   234        463  Shewanella sp. (strain ANA-3)
   235        462  Burkholderia mallei (strain ATCC 23344)
   236        461  Lactobacillus plantarum (strain ATCC BAA-793 / NCIMB 8826 / WCFS1)
   237        460  Enterococcus faecalis (strain ATCC 700802 / V583)
   238        459  Cupriavidus pinatubonensis (strain JMP 134 / LMG 1197) (Ralstonia eutropha
   239        457  Campylobacter jejuni subsp. jejuni serotype O:2 (strain NCTC 11168)
   240        456  Methylococcus capsulatus (strain ATCC 33009 / NCIMB 11132 / Bath)
   241        455  Staphylococcus aureus (strain JH1)
   242        455  Ovis aries (Sheep)
   243        454  Xanthomonas oryzae pv. oryzae (strain MAFF 311018)
   244        453  Pseudomonas putida (strain W619)
   245        453  Rickettsia felis (strain ATCC VR-1525 / URRWXCal2) (Rickettsia azadi)
   246        452  Shewanella baltica (strain OS185)
   247        451  Streptococcus mutans serotype c (strain ATCC 700610 / UA159)
   248        451  Aeromonas salmonicida (strain A449)
   249        450  Mycobacterium paratuberculosis (strain ATCC BAA-968 / K-10)
   250        450  Dechloromonas aromatica (strain RCB)


2.3  Taxonomic distribution of the sequences

Kingdom      sequences (% of the database)
Archaea          19295 (  4%)
Bacteria        331804 ( 61%)
Eukaryota       179212 ( 33%)
Viruses          16479 (  3%)

Within Eukaryota:

Category          sequences (% of Eukaryota) (% of the complete database)
Human                 20194 ( 11%)           (  4%)
Other Mammalia        46093 ( 26%)           (  8%)
Other Vertebrata      17765 ( 10%)           (  3%)
Viridiplantae         35737 ( 20%)           (  7%)
Fungi                 31315 ( 17%)           (  6%)
Insecta                8768 (  5%)           (  2%)
Nematoda               4352 (  2%)           (  1%)
Other                 14988 (  8%)           (  3%)


3.  SEQUENCE SIZE

Repartition of the sequences by size (excluding fragments)

   From   To   Number     From   To   Number
      1-  50     9103     1001-1100     3802
     51- 100    41832     1101-1200     2649
    101- 150    58222     1201-1300     2051
    151- 200    58159     1301-1400     1922
    201- 250    57018     1401-1500     1535
    251- 300    50608     1501-1600      751
    301- 350    50796     1601-1700      578
    351- 400    44017     1701-1800      483
    401- 450    36085     1801-1900      439
    451- 500    28981     1901-2000      356
    501- 550    20679     2001-2100      218
    551- 600    14819     2101-2200      297
    601- 650    12415     2201-2300      301
    651- 700     8981     2301-2400      186
    701- 750     7405     2401-2500      142
    751- 800     5305         >2500     1111
    801- 850     4614
    851- 900     5086
    901- 950     3928
    951-1000     2794

The average sequence length in UniProtKB/Swiss-Prot is 355 amino acids.

The shortest sequence is GWA_SEPOF (P83570): 2 amino acids.
The longest sequence is TITIN_MOUSE (A2ASS6): 35213 amino acids.


4.  JOURNAL CITATIONS

Note: the following citation statistics reflect the number of distinct
journal citations.

Total number of journals cited in this release of UniProtKB/Swiss-Prot: 2451


4.1  Table of the frequency of journal citations

        Journals cited 1x: 801
                       2x: 321
                       3x: 163
                       4x: 118
                       5x:  93
                       6x:  78
                       7x:  59
                       8x:  48
                       9x:  34
                      10x:  37
                  11- 20x: 195
                  21- 50x: 196
                  51-100x: 112
                    >100x: 196


4.2  List of the most cited journals in UniProtKB/Swiss-Prot

 Nb  Citations  Journal name
---  ---------  -------------------------------------------------------------
  1      22300  Journal of Biological Chemistry
  2      10112  Proceedings of the National Academy of Sciences of the U.S.A.
  3       6068  Journal of Bacteriology
  4       5223  Biochemical and Biophysical Research Communications
  5       4760  Biochemistry
  6       4714  Nucleic Acids Research
  7       4677  Gene
  8       4522  FEBS Letters
  9       4384  The EMBO Journal
 10       4062  Molecular and Cellular Biology
 11       3946  Nature
 12       3837  Journal of Molecular Biology
 13       3338  Biochimica et Biophysica Acta
 14       3298  European Journal of Biochemistry
 15       3204  Cell
 16       2717  Journal of Virology
 17       2640  Science
 18       2618  Biochemical Journal
 19       2517  Genomics
 20       2222  Molecular Microbiology
 21       2052  Plant Physiology
 22       1980  Journal of Cell Biology
 23       1772  The American Journal of Human Genetics
 24       1709  Plant Molecular Biology
 25       1680  Genes and Development
 26       1585  Virology
 27       1581  Nature Genetics
 28       1555  Human Molecular Genetics
 29       1503  The Plant Cell
 30       1473  Oncogene
 31       1436  Molecular Biology of the Cell
 32       1425  Development
 33       1370  Human Mutation
 34       1364  Molecular and General Genetics
 35       1356  The Plant Journal
 36       1283  Molecular Cell
 37       1277  Journal of Biochemistry
 38       1224  Journal of Immunology
 39       1164  Structure
 40       1154  Genetics
 41       1046  Journal of General Virology
 42       1032  Blood
 43       1022  Journal of Cell Science
 44       1015  Infection and Immunity
 45        968  Archives of Biochemistry and Biophysics
 46        949  Microbiology
 47        948  PLoS ONE
 48        875  Developmental Biology
 49        867  Current Biology
 50        831  Cancer Research
 51        814  Yeast
 52        771  FEMS Microbiology Letters
 53        744  Acta Crystallographica, Section D
 54        723  Applied and Environmental Microbiology
 55        703  Protein Science
 56        702  Journal of Neuroscience
 57        697  Toxicon
 58        644  Human Genetics
 59        635  Nature Structural Biology
 60        628  Mechanisms of Development
 61        627  Neuron
 62        611  Journal of Clinical Investigation
 63        593  American Journal of Physiology
 64        556  The Journal of Experimental Medicine
 65        549  Current Genetics
 66        539  Proteins
 67        537  Plant and Cell Physiology
 68        505  Molecular Endocrinology
 69        488  Journal of Neurochemistry
 70        483  Mammalian Genome
 71        480  Nature Cell Biology
 72        476  Journal of Medical Genetics
 73        473  Bioscience, Biotechnology, and Biochemistry
 74        463  Endocrinology
 75        460  Immunogenetics
 76        452  The Journal of Clinical Endocrinology and Metabolism
 77        442  Molecular and Biochemical Parasitology
 78        403  Journal of Molecular Evolution
 79        401  Experimental Cell Research
 80        390  Nature Structural and Molecular Biology
 81        382  Molecular Biology and Evolution
 82        379  DNA and Cell Biology
 83        377  Peptides
 84        369  The FEBS Journal
 85        368  DNA Sequence
 86        363  Developmental Cell
 87        361  RNA
 88        349  Antimicrobial Agents and Chemotherapy
 89        346  Eukaryotic Cell
 90        333  Planta
 91        332  Journal of Investigative Dermatology
 92        331  Brain Research. Molecular Brain Research
 93        327  Comparative Biochemistry and Physiology
 94        325  Tissue Antigens
 95        323  Molecular Pharmacology
 96        300  Biology of Reproduction
 97        300  Neurology
 98        294  Biological Chemistry Hoppe-Seyler
 99        284  Developmental Dynamics
100        283  EMBO Reports
101        281  Cytogenetics and Cell Genetics
102        278  The FASEB Journal
103        277  Biochimie
104        276  Immunity
105        273  Genes to Cells
106        271  Genome Research
107        271  Virus Research
108        268  The New England Journal of Medicine
109        263  Journal of General Microbiology
110        263  Journal of the American Chemical Society
111        262  Molecular Plant-Microbe Interactions
112        262  PLoS Genetics
113        246  European Journal of Human Genetics
114        246  Acta Crystallographica, Section F
115        239  Annals of Neurology
116        238  European Journal of Immunology
117        225  Journal of Experimental Botany
118        224  Investigative Ophthalmology and Visual Science
119        220  DNA Research
120        218  BMC Genomics
121        218  Hoppe-Seyler's Zeitschrift fur Physiologische Chemie
122        218  Nature Immunology
123        218  Journal of Human Genetics
124        216  Archives of Microbiology
125        208  American Journal of Medical Genetics. Part A
126        206  Journal of Cellular Biochemistry
127        199  Glycobiology
128        198  Journal of Medicinal Chemistry
129        196  Clinical Genetics
130        195  Molecular and Cellular Endocrinology
131        195  Archives of Virology
132        192  Molecular Immunology
133        187  Circulation Research
134        186  Diabetes
135        183  Cell Cycle
136        181  PLoS Pathogens
137        179  Traffic
138        177  Applied Microbiology and Biotechnology
139        177  Molecular Genetics and Metabolism
140        176  Insect Biochemistry and Molecular Biology
141        175  Phytochemistry
142        172  Protein Expression and Purification
143        170  International Journal of Cancer
144        169  Molecular Phylogenetics and Evolution
145        168  American Journal of Medical Genetics
146        164  PLoS Biology
147        162  Biological Chemistry
148        160  Molecular Reproduction and Development
149        160  Proteomics
150        160  Molecular and Cellular Neuroscience


5.  STATISTICS FOR SOME LINE TYPES

The following table summarizes the total number of some UniProtKB/Swiss-Prot
lines, as well as the number of entries with at least one such line, and the
frequency of the lines.

                                        Total Number of   Average
Line type / subtype                    number   entries per entry Rank
------------------------------------ -------- --------- --------- ----
References (RL)                       1088886                1.99
  Journal                              895309    434326      1.64    1
  Submitted to EMBL/GenBank/DDBJ       183532    162542      0.34    2
  Submitted to other databases           7029      6530      0.01    3
  Book citation                          1467      1453     <0.01    4
  Plant Gene Register                     593       581     <0.01    5
  Thesis                                  424       421     <0.01    6
  Unpublished observations                335       331     <0.01    7
  Patent                                  191       188     <0.01    8
  Worm Breeder's Gazette                    6         6     <0.01    9

Total number of distinct authors cited in UniProtKB/Swiss-Prot: 356242

                                        Total Number of   Average
Line type / subtype                    number   entries per entry Rank
------------------------------------ -------- --------- --------- ----
Comments (CC)                         2518894                4.61
  ALLERGEN                                664       664     <0.01   25
  ALTERNATIVE PRODUCTS                  23968     23968      0.04   13
  BIOPHYSICOCHEMICAL PROPERTIES          6111      6111      0.01   21
  BIOTECHNOLOGY                           424       422     <0.01   28
  CATALYTIC ACTIVITY                   252451    227452      0.46    5
  CAUTION                                9944      9755      0.02   18
  COFACTOR                             116002    106133      0.21    7
  DEVELOPMENTAL STAGE                   10376     10376      0.02   17
  DISEASE                                5785      3905      0.01   23
  DISRUPTION PHENOTYPE                   7838      7838      0.01   20
  DOMAIN                                42357     36858      0.08    9
  ENZYME REGULATION                     12930     12930      0.02   16
  FUNCTION                             436237    418207      0.80    2
  INDUCTION                             16834     16834      0.03   14
  INTERACTION                           13108     13108      0.02   15
  MASS SPECTROMETRY                      5837      4440      0.01   22
  MISCELLANEOUS                         33964     31372      0.06   12
  PATHWAY                              134068    121576      0.25    6
  PHARMACEUTICAL                           98        98     <0.01   29
  POLYMORPHISM                           1039       982     <0.01   24
  PTM                                   47552     36678      0.09    8
  RNA EDITING                             627       627     <0.01   26
  SEQUENCE CAUTION                      42082     42082      0.08   10
  SIMILARITY                           657850    521984      1.20    1
  SUBCELLULAR LOCATION                 334602    328192      0.61    3
  SUBUNIT                              256452    256452      0.47    4
  TISSUE SPECIFICITY                    40875     40875      0.07   11
  TOXIC DOSE                              591       549     <0.01   27
  WEB RESOURCE                           8228      6884      0.02   19

Total number of comment topics: 29

                                        Total Number of   Average
Line type / subtype                    number   entries per entry Rank
------------------------------------ -------- --------- --------- ----
Features (FT)                         3946777                7.22
  ACT_SITE                             149983     91898      0.27   10
  BINDING                              327328     86995      0.60    4
  CA_BIND                                3974      1671      0.01   35
  CARBOHYD                             109405     28088      0.20   15
  CHAIN                                554211    540434      1.01    1
  COILED                                20832     14347      0.04   26
  COMPBIAS                              56161     29929      0.10   18
  CONFLICT                             130387     45606      0.24   13
  CROSSLNK                               7176      4093      0.01   34
  DISULFID                             115296     31440      0.21   14
  DNA_BIND                              10741      9755      0.02   31
  DOMAIN                               171757    104062      0.31    8
  HELIX                                199296     19067      0.36    6
  INIT_MET                              17764     17764      0.03   27
  INTRAMEM                               2257       990     <0.01   37
  LIPID                                 12231      7846      0.02   30
  METAL                                344889     85516      0.63    3
  MOD_RES                              174036     62748      0.32    7
  MOTIF                                 37848     24546      0.07   24
  MUTAGEN                               50496     11562      0.09   19
  NON_CONS                               2069       758     <0.01   38
  NON_STD                                 358       283     <0.01   39
  NON_TER                               12281      9387      0.02   29
  NP_BIND                              131378     77578      0.24   12
  PEPTIDE                               10502      7183      0.02   32
  PROPEP                                13021     11217      0.02   28
  REGION                               162119     77821      0.30    9
  REPEAT                                97717     14370      0.18   16
  SIGNAL                                39388     39378      0.07   23
  SITE                                  49618     27771      0.09   21
  STRAND                               208675     18002      0.38    5
  TOPO_DOM                             133420     27697      0.24   11
  TRANSIT                                8741      8628      0.02   33
  TRANSMEM                             361429     74712      0.66    2
  TURN                                  48381     15457      0.09   22
  UNSURE                                 3439       732      0.01   36
  VAR_SEQ                               49985     21017      0.09   20
  VARIANT                               88491     16928      0.16   17
  ZN_FING                               29697     13122      0.05   25

Total number of feature keys: 39

                                        Total Number of   Average
Line type / subtype                    number   entries per entry Rank Category
------------------------------------ -------- --------- --------- ---- -------------------------------------------
Cross-references (DR)                16850797               30.82
  Allergome                              1633      1061     <0.01  104 Protein family/group databases
  ArachnoServer                           767       759     <0.01  116 Organism-specific databases
  Bgee                                  38836     38836      0.07   50 Gene expression databases
  BindingDB                              4993      4993      0.01   91 Chemistry
  BioCyc                               324527    307344      0.59   19 Enzyme and pathway databases
  BioGrid                               40184     39823      0.07   49 Protein-protein interaction databases
  BRENDA                                 4363      4351      0.01   94 Enzyme and pathway databases
  CAZy                                   7767      6989      0.01   78 Protein family/group databases
  CCDS                                  45492     33388      0.08   46 Sequence databases
  CGD                                     929       907     <0.01  111 Organism-specific databases
  ChEMBL                                 6007      6007      0.01   83 Chemistry
  ChiTaRS                               12559     12555      0.02   73 Other
  CleanEx                               30068     29429      0.05   55 Gene expression databases
  COMPLUYEAST-2DPAGE                       99        98     <0.01  129 2D gel databases
  ConoServer                              949       866     <0.01  110 Organism-specific databases
  CTD                                   72257     71559      0.13   38 Organism-specific databases
  CYGD                                   5596      5593      0.01   85 Organism-specific databases
  dictyBase                              4204      4089      0.01   96 Organism-specific databases
  DIP                                   15733     15668      0.03   70 Protein-protein interaction databases
  DisProt                                 605       602     <0.01  121 3D structure databases
  DMDM                                  16406     16406      0.03   69 Polymorphism databases
  DNASU                                 18760     18690      0.03   65 Protocols and materials databases
  DOSAC-COBS-2DPAGE                       149       147     <0.01  128 2D gel databases
  DrugBank                              11195      1762      0.02   75 Chemistry
  EchoBASE                               4161      4161      0.01   97 Organism-specific databases
  EcoGene                                4294      4292      0.01   95 Organism-specific databases
  eggNOG                               430839    430839      0.79   12 Phylogenomic databases
  EMBL                                 959256    535569      1.75    3 Sequence databases
  Ensembl                               82959     48989      0.15   36 Genome annotation databases
  EnsemblBacteria                      349512    330885      0.64   18 Genome annotation databases
  EnsemblFungi                          19089     18747      0.03   62 Genome annotation databases
  EnsemblMetazoa                        12180      9452      0.02   74 Genome annotation databases
  EnsemblPlants                         18955     16177      0.03   63 Genome annotation databases
  EnsemblProtists                        4460      4335      0.01   93 Genome annotation databases
  euHCVdb                                  55        44     <0.01  130 Organism-specific databases
  EuPathDB                                811       811     <0.01  114 Organism-specific databases
  EvolutionaryTrace                     16491     16490      0.03   68 Other
  ExpressionAtlas                       32079     32079      0.06   52 Gene expression databases
  FlyBase                                5939      5565      0.01   84 Organism-specific databases
  Gene3D                               466636    344162      0.85    8 Family and domain databases
  GeneCards                             20875     19800      0.04   59 Organism-specific databases
  GeneFarm                               3313      3301      0.01   98 Organism-specific databases
  GeneID                               497591    469564      0.91    6 Genome annotation databases
  GeneReviews                            1156      1153     <0.01  109 Organism-specific databases
  GeneTree                              56430     56405      0.10   45 Phylogenomic databases
  Genevestigator                        68309     68309      0.12   40 Gene expression databases
  GeneWiki                              10367     10281      0.02   76 Other
  GenoList                               7074      7062      0.01   79 Organism-specific databases
  GenomeRNAi                            21682     21682      0.04   58 Other
  GO                                  2551238    518194      4.67    1 Ontologies
  Gramene                                6231      6231      0.01   81 Organism-specific databases
  GuidetoPHARMACOLOGY                    2012      2012     <0.01  103 Chemistry
  H-InvDB                                5590      4770      0.01   86 Organism-specific databases
  HAMAP                                323963    321028      0.59   20 Family and domain databases
  HGNC                                  19981     19819      0.04   61 Organism-specific databases
  HOGENOM                              385677    385677      0.71   16 Phylogenomic databases
  HOVERGEN                              75661     75661      0.14   37 Phylogenomic databases
  HPA                                   22351     15981      0.04   57 Organism-specific databases
  InParanoid                           134777    134777      0.25   27 Phylogenomic databases
  IntAct                                42343     42343      0.08   48 Protein-protein interaction databases
  InterPro                            1902533    526070      3.48    2 Family and domain databases
  KEGG                                 476711    447937      0.87    7 Genome annotation databases
  KO                                   376977    376485      0.69   17 Phylogenomic databases
  LegioList                               765       763     <0.01  117 Organism-specific databases
  Leproma                                 671       668     <0.01  118 Organism-specific databases
  MaizeGDB                                503       498     <0.01  123 Organism-specific databases