ProDom-CG CG42

Release date = August 2001

 

Sequence and Family Data

  • Proteins used to build ProDom-CG CG42 (a subset of the proteins used to build ProDom 2001.2): 148279
  • Number of complete genomes: 42
  • Domain families with at least 2 sequences: 53854
  • Domain families: 170857
  • PDB links: 16564
  • Prosite links (Patterns, Profiles and Prefiles): 7474
  • Pfam-A links: 34513
  • InterPro links: 51524


    How is ProDom-CG built?

    ProDom-CG (Complete Genome) is a version of the ProDom database that holds only data originating from complete genome sequencing projects. This release was built using the following process:

     

    About the AC numbers

    Because ProDom-CG was built from ProDom v 2001.2, we could use an AC numbering scheme consistent with ProDom 2001.2:

    For instance, the ProDom family PD000271 (177 sequences) is found in ProDom-CG as CG000271 (104 sequences). In contrast, the ProDom family PD000646 (238 sequences) was removed because it did not contain any sequence from the complete genomes, so there is no CG000646 family in ProDom-CG.
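
    The correspondence described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the AC layout (a PD or CG prefix followed by six digits) is an assumption inferred from the examples PD000271 and CG000271, not a published specification.

```python
import re
from typing import Optional, Set

# Assumption: an AC is a two-letter prefix (PD or CG) followed by six digits,
# as in the examples PD000271 / CG000271. This layout is inferred, not specified.
_AC_RE = re.compile(r"(PD|CG)(\d{6})")


def prodom_to_cg(prodom_ac: str) -> str:
    """Return the ProDom-CG AC that shares its number with a ProDom AC."""
    match = _AC_RE.fullmatch(prodom_ac)
    if match is None or match.group(1) != "PD":
        raise ValueError(f"not a ProDom AC: {prodom_ac!r}")
    return "CG" + match.group(2)


def find_cg_family(prodom_ac: str, cg_families: Set[str]) -> Optional[str]:
    """Return the matching CG AC, or None if the family is absent from ProDom-CG."""
    cg_ac = prodom_to_cg(prodom_ac)
    return cg_ac if cg_ac in cg_families else None


# Toy family set containing only the example quoted in the text.
cg_families = {"CG000271"}
print(find_cg_family("PD000271", cg_families))  # CG000271
print(find_cg_family("PD000646", cg_families))  # None
```

    Because the numeric part is shared, a missing CG entry simply means that the ProDom family had no sequence from a complete genome.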


    Complete genomes found in ProDom-CG

    There are 8 archaeal, 30 bacterial and 4 eukaryotic genomes in ProDom-CG.

    Type  Organism                              TaxId   Short name  Number of sequences
                                                                    CG17   CG20   CG42
    A     Aeropyrum pernix K1                   56636   AERPE       -      -      2695
    A     Archaeoglobus fulgidus                2234    ARCFU       2407   2407   2396
    A     Halobacterium sp. (strain NRC-1)      64091   HALN1       -      -      2370
    A     Methanococcus jannaschii              2190    METJA       1770   1770   1773
    A     Methanobacterium thermoautotrophicum  145262  METTH       1869   1869   1869
    A     Pyrococcus abyssi                     29292   PYRAB       -      -      1765
    A     Pyrococcus horikoshii                 53953   PYRHO       2061   2061   2062
    A     Thermoplasma acidophilum              2303    THEAC       -      -      1478
    B     Aquifex aeolicus                      63363   AQUAE       1522   1522   1550
    B     Bacillus halodurans                   86665   BACHD       -      -      4066
    B     Bacillus subtilis                     1423    BACSU       4100   4100   4133
    B     Borrelia burgdorferi                  139     BORBU       1255   1255   1251
    B     Buchnera aphidicola                   118099  BUCAI       -      -      568
    B     Campylobacter jejuni                  197     CAMJE       -      -      1613
    B     Chlamydia muridarum                   83560   CHLMU       -      -      916
    B     Chlamydia pneumoniae                  83558   CHLPN       -      1052   1052
    B     Chlamydia trachomatis                 813     CHLTR       894    894    894
    B     Deinococcus radiodurans               1299    DEIRA       -      -      3084
    B     Escherichia coli                      562     ECOLI       4289   4289   4301
    B     Haemophilus influenzae                727     HAEIN       1709   1709   1712
    B     Helicobacter pylori J99               85963   HELPJ       -      -      1488
    B     Helicobacter pylori                   210     HELPY       1565   1565   1554
    B     Lactococcus lactis (subsp. lactis)    1360    LACLA       -      -      2222
    B     Mycoplasma genitalium                 2097    MYCGE       480    480    483
    B     Mycobacterium leprae                  1769    MYCLE       -      -      1555
    B     Mycoplasma pneumoniae                 2104    MYCPN       677    677    687
    B     Mycobacterium tuberculosis            1773    MYCTU       3918   3918   3874
    B     Neisseria meningitidis (serogroup A)  65699   NEIMA       -      -      2039
    B     Neisseria meningitidis (serogroup B)  491     NEIMB       -      -      1967
    B     Pasteurella multocida                 747     PASMU       -      -      2010
    B     Pseudomonas aeruginosa                287     PSEAE       -      -      5556
    B     Rickettsia prowazekii                 782     RICPR       -      834    835
    B     Synechocystis sp. (strain PCC 6803)   1148    SYNY3       -      3166   3143
    B     Thermotoga maritima                   2336    THEMA       -      -      1847
    B     Treponema pallidum                    160     TREPA       1031   1031   1030
    B     Ureaplasma parvum                     134821  UREPA       -      -      611
    B     Vibrio cholerae                       666     VIBCH       -      -      3784
    B     Xylella fastidiosa                    2371    XYLFA       -      -      2774
    E     Caenorhabditis elegans                6239    CAEEL       -      16332  19810
    E     Arabidopsis thaliana                  3702    ARATH       -      -      28624
    E     Drosophila melanogaster               7227    DROME       -      -      16387
    E     Saccharomyces cerevisiae              4932    YEAST       -      -      6594

    Global statistics

    Distribution of the number of domains per sequence

    NOTE - Sequences with more than 50 domains are not shown.


    Distribution of the radius of gyration

    NOTES -

    1. Families with a radius of gyration greater than 200 PAM are not shown.
    2. Only families with at least 2 sequences are used here.


    Distribution of the diameter

    NOTES -

    1. Families with a diameter greater than 500 PAM are not shown.
    2. Only families with at least 2 sequences are used here.