The ProDom Help File
|
|
If you find bugs, or wish to make comments or suggestions about ProDom,
please send a message to Aurelie Laugraud.
|
|
Help on available Web Services can be found here:
http://prodom.prabi.fr/prodom/current/html/webservices.html
|
|
ProDom is a protein domain family database constructed automatically
by clustering homologous segments. The ProDom building procedure,
MKDOM2, is based on recursive PSI-BLAST searches [ALTS2]. The source
protein sequences are non-fragmentary sequences derived from UniProtKB
(Swiss-Prot and TrEMBL databases). ProDom was first established in
1993 [SONN] and was maintained by the Laboratoire de Génétique
Cellulaire and the Laboratoire de Interactions Plantes-Microorganismes
(INRA/CNRS) in Toulouse. It is now maintained by the PRABI
(bioinformatics center of Rhone-Alpes). The ProDom database consists
of domain family entries. Each entry provides a multiple sequence
alignment of homologous domains and a family consensus sequence.
|
|
|
The Main form
|
The main form is divided into two or three parts:
- ProDom Browsing
- Compare your sequence with ProDom
- Search by Kingdom (ProDom-CG only)
|
ProDom Browsing |
With this form, you may select one or several ProDom entries, using different search criteria:
- Display a ProDom entry
Type a ProDom AC number to display the corresponding entry
Ex: PD000039 (or, in short form, PD39)
- All Proteins in ProDom families
Type one or several ProDom ACs to display:
- The domain decomposition of proteins found in all of those families (AND button selected)
- The domain decomposition of proteins found in one or several of those families (OR button selected)
Ex: PD39 PD309
- Search by related databases
Type one or several IDs, ACs or entry names belonging to a cross-referenced database
(i.e. InterPro, Pfam-A, PROSITE, PDB) to get the list of ProDom families
that have a link to those entries.
Ex:
kringle: all the ProDom families that have a link to any related
database entry named kringle
interpro kringle: this restricts the previous request to the InterPro entries
1aac: all the ProDom families that have a link to this PDB protein.
IPR000001: all the ProDom families linked to the IPR000001 entry of InterPro.
PS00010: all the ProDom families linked to the PS00010 entry of PROSITE.
PF00034: all the ProDom families linked to the PF00034 entry of Pfam-A.
- UniProtKB entry
Type one or several UniProtKB IDs or ACs to retrieve the domain decomposition of those proteins:
fixj: all the FIXJ proteins
mouse: all the proteins from the MOUSE organism
UFO_HUMAN: only the UFO protein from HUMAN (not from little green men).
P30530: the same request as before, but with a UniProtKB AC instead of an ID.
- Keyword search
This request looks for the keyword(s) you typed in the KW line of the database. You may type
one or several keywords, which may be combined with the boolean operators AND or OR.
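
The short and long accession forms, and the AND/OR buttons of the "All Proteins in ProDom families" search, can be illustrated with a minimal Python sketch. It assumes that a full ProDom AC is "PD" followed by six zero-padded digits (as in PD000039), and it uses a hypothetical in-memory mapping from family AC to protein names; neither the function names nor the data come from the ProDom server itself.

```python
import re

def normalize_prodom_ac(ac: str) -> str:
    """Expand a short accession such as 'PD39' to the full form 'PD000039'.

    Assumption: full accessions are 'PD' plus six zero-padded digits.
    """
    match = re.fullmatch(r"PD0*(\d{1,6})", ac.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a ProDom accession: {ac!r}")
    return f"PD{int(match.group(1)):06d}"

def select_proteins(families, acs, mode="AND"):
    """Return proteins found in all (AND) or at least one (OR) of the families."""
    member_sets = [families[normalize_prodom_ac(ac)] for ac in acs]
    if not member_sets:
        return set()
    if mode == "AND":
        return set.intersection(*member_sets)
    if mode == "OR":
        return set.union(*member_sets)
    raise ValueError(f"unknown mode: {mode!r}")

# Hypothetical family contents, for illustration only.
families = {
    "PD000039": {"PROT_A", "PROT_B"},
    "PD000309": {"PROT_B", "PROT_C"},
}
both = select_proteins(families, "PD39 PD309".split(), mode="AND")
either = select_proteins(families, "PD39 PD309".split(), mode="OR")
```

With AND, only proteins present in every listed family are kept; with OR, a protein present in any of them is kept.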
|
Compare your sequence with ProDom
|
With this form, you may start a Blast-P or Blast-X search against:
- The consensus sequence provided with the ProDom families
- The multiple alignments provided with each ProDom family
The first search is faster, as it is less CPU-intensive, but
the second is more sensitive.
We use the Blast program from NCBI.
The Results page shows:
- The results as a graphical representation: the HSPs
are superimposed, with lower scores at the bottom and higher
scores at the top, so lower-scoring HSPs may be hidden
by higher-scoring ones.
- A form to run the multalin program to
align your sequence with the HSPs found
- If there are PDB links in the ProDom family
retrieved, a form to run Swiss-Model
jobs on your sequence.
- If there is a PDB link with your sequence, a form
to run a geno3d job with your sequence.
- The results as a textual representation: be careful,
however, as the original output is filtered to yield non-redundant similarities
|
Search by Kingdom (ProDom CG only)
|
When using ProDom-CG, you may search the database with kingdom-related criteria: for each kingdom
(archaea, bacteria, eukaryota), you may select:
- None: Only the families without any protein from an organism belonging to this kingdom.
- Some: Only the families with at least one protein from one or several
organisms belonging to this kingdom.
- Maybe: This kingdom is not taken into consideration for the search.
- All: Only the families with at least one protein from every organism of this kingdom
that is present in ProDom-CG.
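
The four choices above can be read as simple set tests. The following Python sketch is one possible interpretation, not the server's code: it assumes that the criteria chosen for the different kingdoms are combined with a logical AND, and all organism and family names in it are hypothetical.

```python
from typing import Dict, List, Set

def kingdom_matches(present: Set[str], known: Set[str], criterion: str) -> bool:
    """Test one family against one kingdom.

    present: organisms of the kingdom with at least one protein in the family.
    known:   every organism of the kingdom present in ProDom-CG.
    """
    if criterion == "Maybe":
        return True                 # kingdom ignored
    if criterion == "None":
        return not present          # no protein from this kingdom
    if criterion == "Some":
        return bool(present)        # at least one organism represented
    if criterion == "All":
        return known <= present     # every known organism represented
    raise ValueError(f"unknown criterion: {criterion!r}")

def search_by_kingdom(
    families: Dict[str, Dict[str, Set[str]]],
    known_organisms: Dict[str, Set[str]],
    criteria: Dict[str, str],
) -> List[str]:
    """Return the ACs of families satisfying every per-kingdom criterion (assumed AND)."""
    hits = []
    for ac, per_kingdom in families.items():
        if all(
            kingdom_matches(per_kingdom.get(k, set()), known_organisms[k], c)
            for k, c in criteria.items()
        ):
            hits.append(ac)
    return sorted(hits)

# Hypothetical data, for illustration only.
known_organisms = {
    "archaea": {"ARC1", "ARC2"},
    "bacteria": {"BAC1", "BAC2"},
    "eukaryota": {"EUK1"},
}
families = {
    "PD000001": {"archaea": {"ARC1", "ARC2"}, "bacteria": {"BAC1"}},
    "PD000002": {"bacteria": {"BAC1", "BAC2"}, "eukaryota": {"EUK1"}},
    "PD000003": {"archaea": {"ARC1"}},
}
hits = search_by_kingdom(
    families, known_organisms,
    {"archaea": "All", "bacteria": "Some", "eukaryota": "None"},
)
```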
|
 |
|
|
The ProDom entry (upper frame)
|
 |
|
|
What do all these icons mean?
|
 |
This motif is the graphic representation of this
family. Several families have a motif
representation, consistent throughout the whole
database. |
 |
Graphic representation of all proteins containing this
domain, with their decomposition into domains |
 |
The list of ProDom families which are related to the
current family. "Related" means that there are distant
homologies between them. This button is not drawn
if the family is not related to any other one, or for
families with poor alignment or homogeneity
(low normd values).
|
 |
If no PDB links are found for this family or its
related families, and if this family satisfies
several quality checks, then this family
could be a good candidate for structural
determination. If present, click the button for more information. |
 |
When in ProDom, use this button to access the corresponding ProDom-CG family,
if it exists. When in ProDom-CG, use it to access the corresponding
ProDom family, which should always exist. |
 |
To retrieve the ProDom family in MSF format. |
 |
To retrieve the ProDom family in Fasta format. |
 |
To compute a profile, using PSI-BLAST against this family.
Warning: you should retrieve the
binary file; unfortunately, this file will not
work on every architecture. This functionality is
still experimental. |
 |
The normd value is computed for every ProDom
family. If this stamp is displayed here, the normd
may be considered as "high" (> 0.4), meaning the
alignment is of "good quality". |
 |
To access the Predict Protein server, through a pre-filled form. |
 |
Fill ESPript with this family, to print a high quality representation of
this family. |
 |
Fill STRAP with this family, to see the alignment and phylogenetic tree of this family based on structure. |
 |
This frame may be printed: just press this button. |
|
The consistency indicators
|
- Distances are counted in PAM (percent accepted mutations = number of
accepted point mutations per hundred residues).
For example, 20 PAM corresponds to 82% identity.
- The DIAMETER is the largest distance between two domains in the family.
- The RADIUS OF GYRATION is the root mean square distance between the consensus and all
members of the family.
The smaller these two values, the more homogeneous the family.
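
As a worked illustration of the two indicators, here is a minimal Python sketch. It assumes that the PAM distances have already been computed, and it reads the radius of gyration as a root mean square distance to the consensus; the function names and the numbers are hypothetical.

```python
import math
from typing import Dict, List, Tuple

def diameter(pairwise: Dict[Tuple[str, str], float]) -> float:
    """Largest PAM distance between any two domains of the family."""
    return max(pairwise.values())

def radius_of_gyration(to_consensus: List[float]) -> float:
    """Root mean square PAM distance between the consensus and the members."""
    return math.sqrt(sum(d * d for d in to_consensus) / len(to_consensus))

# Hypothetical distances (in PAM) for a three-domain family.
pairwise = {("d1", "d2"): 10.0, ("d1", "d3"): 25.0, ("d2", "d3"): 18.0}
to_consensus = [6.0, 8.0, 14.0]

family_diameter = diameter(pairwise)
family_radius = radius_of_gyration(to_consensus)
```

A tight family gives small values for both numbers; a single distant member raises the diameter sharply, while the radius reflects the overall spread around the consensus.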
|
Gene Ontology Links |
- We tried to compute Gene Ontology annotations for as many ProDom families as possible.
- The data displayed are:
- The entry name
- The ontology: F for Molecular Function, P for Biological Process, C for Cellular Component
- The precision: from 0 to 1; the higher the value, the more precise the term is within the Gene Ontology graph.
- The probability of assignment
|
InterPro Links |
- The links between InterPro and ProDom were computed using the MatchDom program
(search for overlaps between InterPro and ProDom domains).
- We scanned all InterPro domain families (release 18.0)
with each ProDom family.
|
Pfam-A Links |
- Links between Pfam-A and ProDom were computed using the MatchDom program
(search for overlaps between Pfam-A and ProDom domains).
- We scanned all Pfam-A domains with each ProDom family.
|
PROSITE Links |
- Links between PROSITE and ProDom were computed using the
LASSAP program.
- We scanned ProDom consensus sequences with PROSITE patterns, excluding
the most frequent patterns.
- We also scanned ProDom consensus sequences with PROSITE profiles, using
the pfsearch program.
|
PDB Links |
- A FASTA file is generated from the current release of the PDB, using the
ATOM lines (not SEQRES).
- Links between the PDB and ProDom were then computed using the LASSAP program.
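
To show what deriving a sequence from the ATOM lines rather than from SEQRES means in practice, here is a hedged Python sketch. It is not the script used to build ProDom: it assumes the standard fixed-column PDB layout, takes one residue per C-alpha atom, keeps only the first model, and maps unknown residue names to X.

```python
from typing import Dict, Iterable

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def atom_sequences(pdb_lines: Iterable[str]) -> Dict[str, str]:
    """Build one sequence per chain from the C-alpha ATOM records only."""
    residues: Dict[str, list] = {}
    seen = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break                                  # first model only
        if not line.startswith("ATOM  ") or len(line) < 27:
            continue                               # skips SEQRES, HETATM, ...
        if line[12:16].strip() != "CA":
            continue                               # one atom per residue
        chain = line[21]
        key = (chain, line[22:27])                 # residue number + insertion code
        if key in seen:
            continue                               # alternate location of same residue
        seen.add(key)
        residues.setdefault(chain, []).append(THREE_TO_ONE.get(line[17:20], "X"))
    return {chain: "".join(seq) for chain, seq in residues.items()}

def to_fasta(pdb_id: str, sequences: Dict[str, str]) -> str:
    """Format the per-chain sequences as FASTA records."""
    return "".join(f">{pdb_id}_{c}\n{s}\n" for c, s in sorted(sequences.items()))
```

Residues listed in SEQRES but absent from the coordinates (disordered regions, for example) do not appear in such a sequence, so every linked position is one that was actually observed in the structure.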
|
 |
| |
The ProDom entry (lower frame)
|
 |
What is the use of these tools?
|
 |
The family, or the current subfamily, will be represented as a tree, using the displayFam program, if
you press this button.
|
 |
Develop the whole family, or develop only a subfamily (a cluster of domains).
Please note that you can adjust the parameters of the clustering process with the form
found under the alignment. |
 |
Lost inside a very deep subfamily? Press this button to return to the default starting display.
|
|
 |
The ProDom-CG evolutionary scenario (upper frame, ProDom-CG only)
|
 |
|
A most probable evolutionary scenario is proposed for every non-unique protein domain family.
It is computed using a Bayesian network algorithm, and displayed superimposed on the species tree.
Nodes are colored red when the protein domain is present.
|
 |
| |
Requests producing a list of proteins
|
 |
What is the use of these tools?
|
 |
The simplified output display mode (default)
shows one line
per protein architecture, not one line per
protein. Click this button to enter the complete
output mode, which shows one line for each
protein.
|
 |
When in complete output display, press this
button to enter the simplified output display.
When in simplified output display, press one
of those buttons (they are drawn near some
architecture representations) to open the "same
architecture" window.
|
 |
Press this button to display the list of proteins
related to the protein displayed; related means here
that they share at least one ProDom domain.
|
| FIXJ_AZOCA |
The name of the protein is a link, follow it to go to
the corresponding entry of the UniProtKB server.
|
 |
Protein or family lists may be very long. No more
than 200 items will be displayed together. Press
this button to display the first 200 items of the
list.
|
 |
Shift the displayed list by 200. There is no
overlap between the current and the new
displays. Convenient to rapidly scan the retrieved
proteins or families.
|
 |
Shift the displayed list by 50. There is a big
overlap between the current and the new
displays. Convenient to carefully compare the lines.
|
| |New Window| |
Open a new window, exactly the same as the current
window. Combine this with the shift buttons to compare
distant items in the list.
|
| |Close| |
Close this window.
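
The paging buttons described above amount to a sliding window over the result list. A minimal Python sketch, with hypothetical function names, shows why a shift of 200 gives no overlap while a shift of 50 gives a large one:

```python
PAGE_SIZE = 200     # at most 200 items are displayed together
LARGE_SHIFT = 200   # jump to the next page, no overlap
SMALL_SHIFT = 50    # small step, large overlap with the current display

def window(items, start, size=PAGE_SIZE):
    """Return the items displayed when the list starts at index `start`."""
    return items[start:start + size]

def shifted_start(start, step, total):
    """Move the start index by `step`, staying inside the list."""
    return max(0, min(start + step, max(0, total - 1)))

items = list(range(1000))           # stand-in for a long protein or family list
first = window(items, 0)
after_large = window(items, shifted_start(0, LARGE_SHIFT, len(items)))
after_small = window(items, shifted_start(0, SMALL_SHIFT, len(items)))

overlap_large = len(set(first) & set(after_large))
overlap_small = len(set(first) & set(after_small))
```

Here the large shift shares no item with the first display, and the small shift shares 150 of its 200 items.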
|
|
 |
| |
The same architecture screen
|
 |
This screen presents, for each architecture shared by
several proteins, the list of proteins sharing this
architecture.
The list of protein IDs is in fact a list of links:
you may click any name to go to the ExPASy server.
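
The grouping behind the simplified display and the "same architecture" screen can be sketched in Python as follows. This is only an illustration: it assumes that two proteins share an architecture when they carry the same ProDom domains in the same order, and the protein names and domain lists are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def group_by_architecture(
    proteins: Dict[str, List[str]],
) -> Dict[Tuple[str, ...], List[str]]:
    """Map each architecture (ordered tuple of domain ACs) to its proteins."""
    groups = defaultdict(list)
    for name, domains in proteins.items():
        groups[tuple(domains)].append(name)
    return dict(groups)

# Hypothetical domain decompositions, for illustration only.
proteins = {
    "PROT_A": ["PD000039", "PD000309"],
    "PROT_B": ["PD000039", "PD000309"],
    "PROT_C": ["PD000309"],
}
architectures = group_by_architecture(proteins)
shared = {arch: names for arch, names in architectures.items() if len(names) > 1}
```

The simplified display would then show one line per key of `architectures`, and the "same architecture" screen one list per entry of `shared`.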
|
|
 |
| |
Requests producing a list of families
|
 |
What is the use of these tools?
|
 |
Press this button to display the list of proteins
containing a domain of this family.
|
 |
Press this button to display the ProDom
entry of this family.
|
 |
The logo of this family, if any.
|
| PD001842 |
The accession number of this family.
|
| 64 |
The size (number of domains) of this family.
|
 |
The "Normd quality" logo, if the normd of this family is above 0.4.
Please note that the logo will never be displayed for families whose size
is lower than 3, as computing the normd makes no sense in this case.
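
The display rule for the logo can be summarised in a few lines of Python. The function and parameter names are hypothetical; the threshold of 0.4 and the minimum size of 3 are the values stated above, and a missing normd (not computed for the family) is treated as no logo.

```python
from typing import Optional

def shows_normd_logo(
    normd: Optional[float],
    size: int,
    threshold: float = 0.4,
    min_size: int = 3,
) -> bool:
    """Return True when the "Normd quality" logo should be displayed."""
    if normd is None or size < min_size:
        return False          # value not computed, or family too small
    return normd > threshold  # strictly above the threshold
```

For instance, a family of 64 domains with a normd of 0.55 gets the logo, while a family of 2 domains never does, whatever its normd.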
|
 |
Protein or family lists may be very long. No more
than 200 items will be displayed together. Press
this button to display the first 200 items of the
list.
|
 |
Shift the displayed list by 200. There is no
overlap between the current and the new
displays. Convenient to rapidly scan the retrieved
proteins or families.
|
 |
Shift the displayed list by 50. There is a big
overlap between the current and the new
displays. Convenient to carefully compare the lines.
|
| |New Window| |
Open a new window, exactly the same as the current
window. Combine this with the shift buttons to compare
distant items in the list.
|
| |Close| |
Close this window.
|
|
 |
| |
Known bugs |
 |
Something wrong? Not clear? Found a bug? Please press here and send us a mail!
|
| |
Due to some problems in the ProDom build process, some
families or some proteins may have disappeared from this
release. We apologize for any inconvenience.
|
 |
The biggest families in ProDom will not be correctly printed in ESPript |
 |
The biggest families in ProDom will not be correctly sent in MSF format |
 |
You should retrieve only binary profile files, as
text profile files are not really useful. However,
those files will not work on every computer architecture. |
 |
The normd value could not be computed for every
family. If there is no value displayed, it means that we
could not compute this value. |
| |
Some very long proteins (see
Q8WZ42_HUMAN for instance) are not correctly displayed by
any browser |
| |
This Graphical User Interface was tested with the following browsers:
- Netscape 4.7x
- Mozilla 1.x and others (Netscape 7.x, Mozilla Firefox)
- Opera 6.x
- Internet Explorer 5.5
Please note that your browser must have JavaScript and CSS enabled, and must be able
to display PNG graphics. Thus, older versions of those browsers are not
supported.
|
|
 |
|
© The ProDom database is copyrighted by INRA and CNRS
© UniProtKB copyright (c) 2002-2011 UniProt Consortium
ProDom - Server maintained by
Dominique Guyot ,
on behalf of the
ProDom team
Graphics design
Sandrine Dalmar
Last updated on December 22nd, 2011.
|