ProDom-CG CG47

Release date = December 2001

 

Sequence and Family Data

  • Proteins used to build ProDom CG47 (a subset of the proteins used to build ProDom 2001.3): 158245
  • Number of complete genomes: 47
  • Domain families with at least 2 sequences: 49943
  • Domain families: 182217
  • PDB links: 2700
  • Prosite links (Patterns, Profiles and Prefiles): 1340
  • Pfam-A links: 2422
  • InterPro links: 2844


    How is ProDom-CG built?

    ProDom-CG (Complete Genome) is a version of the ProDom database that contains only data originating from complete genome sequencing projects. This release was built using the following process:

     

    About the AC numbers

    Because ProDom-CG was built from ProDom 2001.3, its AC (accession) numbers follow a scheme consistent with ProDom 2001.3: a ProDom-CG family keeps the number of the ProDom family it derives from, with the prefix PD replaced by CG.

    For instance, the ProDom family PD000271 (393 sequences) is found in ProDom-CG as CG000271 (192 sequences). By contrast, the ProDom family PD000646 (250 sequences) was removed because it did not contain any sequence from the complete genomes, so there is no CG000646 family in ProDom-CG.
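    The correspondence between the two AC schemes can be sketched in Python. This is only an illustration, not ProDom's own tooling: the names `pd_to_cg`, `cg_family_size` and `cg_family_sizes` are hypothetical, the six-digit AC format is assumed from the two examples above, and the only data used is the family size quoted in the text.

```python
import re

# Family sizes in ProDom-CG, keyed by CG accession (hypothetical in-memory table).
# Only the example quoted in the text is included here.
cg_family_sizes = {"CG000271": 192}

# Assumption: a ProDom AC is "PD" followed by six digits, as in PD000271.
_PD_AC = re.compile(r"PD(\d{6})")


def pd_to_cg(pd_ac):
    """Return the ProDom-CG accession that shares its number with a ProDom AC."""
    match = _PD_AC.fullmatch(pd_ac)
    if match is None:
        raise ValueError(f"not a ProDom accession: {pd_ac!r}")
    return "CG" + match.group(1)


def cg_family_size(pd_ac):
    """Return the size of the ProDom-CG family, or None if it has no CG counterpart."""
    return cg_family_sizes.get(pd_to_cg(pd_ac))
```

    With this sample table, `cg_family_size("PD000271")` returns 192, while `cg_family_size("PD000646")` returns `None`, because that family has no counterpart in ProDom-CG.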


    Complete genomes found in ProDom-CG47

    There are 9 archaeal, 34 bacterial and 4 eukaryotic genomes in ProDom-CG47.

    Type: A = archaeal, B = bacterial, E = eukaryotic. A dash (-) means no sequence count is listed for that release.

    Type  Organism                              TaxId   Short name  Number of sequences
                                                                    CG17   CG20   CG42   CG47
    A     Aeropyrium pernix K1                  56636   AERPE       -      -      2695   2700
    A     Archaeoglobus fulgidus                2234    ARCFU       2407   2407   2396   2400
    A     Halobacterium sp. (strain NRC-1)      64091   HALN1       -      -      2370   2425
    A     Methanococcus jannaschii              2190    METJA       1770   1770   1773   1771
    A     Methanobacterium thermoautotrophicum  145262  METTH       1869   1869   1869   1869
    A     Pyrococcus abyssi                     29292   PYRAB       -      -      1765   1762
    A     Pyrococcus horikoshii                 53953   PYRHO       2061   2061   2062   2066
    A     Sulfolobus solfataricus               2287    SULSO       -      -      -      2938
    A     Thermoplasma acidophilum              2303    THEAC       -      -      1478   1479
    B     Aquifex aeolicus                      63363   AQUAE       1522   1522   1550   1549
    B     Bacillus halodurans                   86665   BACHD       -      -      4066   3988
    B     Bacillus subtilis                     1423    BACSU       4100   4100   4133   4119
    B     Borrelia burgdorferi                  139     BORBU       1255   1255   1251   1247
    B     Buchnera aphidicola                   118099  BUCAI       -      -      568    573
    B     Campylobacter jejuni                  197     CAMJE       -      -      1613   1607
    B     Caulobacter crescentus                69394   CAUCR       -      -      -      3717
    B     Chlamydia muridarum                   83560   CHLMU       -      -      916    919
    B     Chlamydia pneumoniae                  83558   CHLPN       -      1052   1052   1052
    B     Chlamydia trachomatis                 813     CHLTR       894    894    894    895
    B     Deinococcus radiodurans               1299    DEIRA       -      -      3084   3083
    B     Escherichia coli                      562     ECOLI       4289   4289   4301   4301
    B     Haemophilus influenzae                727     HAEIN       1709   1709   1712   1707
    B     Helicobacter pylori J99               85963   HELPJ       -      -      1488   1486
    B     Helicobacter pylori                   210     HELPY       1565   1565   1554   1548
    B     Lactococcus lactis (subsp. lactis)    1360    LACLA       -      -      2222   2224
    B     Mycoplasma genitalium                 2097    MYCGE       480    480    483    485
    B     Mycobacterium leprae                  1769    MYCLE       -      -      1555   1590
    B     Mycoplasma pneumoniae                 2104    MYCPN       677    677    687    687
    B     Mycoplasma pulmonis                   2107    MYCPU       -      -      -      772
    B     Mycobacterium tuberculosis            1773    MYCTU       3918   3918   3874   3847
    B     Neisseria meningitidis (serogroup A)  65699   NEIMA       -      -      2039   2006
    B     Neisseria meningitidis (serogroup B)  491     NEIMB       -      -      1967   1964
    B     Pasteurella multocida                 747     PASMU       -      -      2010   2014
    B     Pseudomonas aeruginosa                287     PSEAE       -      -      5556   5553
    B     Rhizobium loti (Mesorhizobium loti)   381     RHILO       -      -      -      7255
    B     Rickettsia prowazekii                 782     RICPR       -      834    835    834
    B     Streptococcus pyogenes                1314    STRPY       -      -      -      1688
    B     Synechocystis sp. (strain PCC 6803)   1148    SYNY3       -      3166   3143   3138
    B     Thermotoga maritima                   2336    THEMA       -      -      1847   1852
    B     Treponema pallidum                    160     TREPA       1031   1031   1030   1029
    B     Ureaplasma parvum                     134821  UREPA       -      -      611    603
    B     Vibrio cholerae                       666     VIBCH       -      -      3784   3784
    B     Xylella fastidiosa                    2371    XYLFA       -      -      2774   2772
    E     Caenorhabditis elegans                6239    CAEEL       -      16332  19810  17964
    E     Arabidopsis thaliana                  3702    ARATH       -      -      28624  25306
    E     Drosophila melanogaster               7227    DROME       -      -      16387  13600
    E     Saccharomyces cerevisiae              4932    YEAST       -      -      6594   6081

    Global statistics

    Distribution of the number of domains per sequence

    NOTE - Sequences with more than 50 domains are not shown.
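    A distribution of this kind, with the cutoff from the note above, could be computed along the following lines. This is a minimal sketch rather than the procedure actually used for the ProDom-CG plot: the function name `domain_count_distribution`, the input mapping and the sample sequence identifiers are all hypothetical.

```python
from collections import Counter

MAX_DOMAINS_SHOWN = 50  # cutoff stated in the note above


def domain_count_distribution(domains_per_sequence, max_shown=MAX_DOMAINS_SHOWN):
    """Map each domain count to the number of sequences with that count.

    Sequences with more than `max_shown` domains are left out, as in the plot.
    """
    counts = Counter(
        n for n in domains_per_sequence.values() if n <= max_shown
    )
    return dict(sorted(counts.items()))


# Hypothetical input: sequence identifier -> number of domains it contains.
example = {"seqA": 1, "seqB": 2, "seqC": 1, "seqD": 73}
print(domain_count_distribution(example))  # prints {1: 2, 2: 1}
```

    In this made-up sample, `seqD` has 73 domains and is therefore excluded from the distribution.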


    Distribution of the radius of gyration

    NOTES -

    1. Families with a radius of gyration greater than 200 PAM are not shown.
    2. Only families with at least 2 sequences are used here.


    Distribution of the diameter

    NOTES -

    1. Families with a diameter greater than 500 PAM are not shown.
    2. Only families with at least 2 sequences are used here.