Protein Domains Analysis

Biological context:

In genome annotation and protein family curation, a recurrent task is to inventory and classify all proteins from a specific family encoded in a genome. Because most proteins are combinatorial arrangements of protein domains, it is not sufficient to identify all proteins belonging to a particular family. The user also needs to understand the various classes (domain arrangements) of proteins within this set. This workflow is meant to streamline that process and to accelerate the discovery of novel types of domain arrangements.


Inputs:

- “ProDomAC”: ProDom accession number for a protein domain family (e.g. PD000039).

- “Bank”: a set of protein sequences in FASTA format (e.g. all proteins encoded in a genome).

- “Param”: Blast parameters (enter “default” if you do not want any special parameters). The default is -e 0.01 -j 10 -h 0.001 -m 7; please see the Blast documentation for more details. The -m 7 parameter (XML output) is compulsory in this workflow.

- “Parammkdom”: Blast parameters to use in the Mkdom program. To keep the default parameters, type "default"; this is equivalent to -j 10 -e 0.01 -h 0.01. To change a parameter, use ITER=value for -j, E_EXPECT=value for -e, and H_EXPECT=value for -h.
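
The mapping between “Parammkdom” keys and Blast flags can be sketched as follows. This is only an illustration, not the workflow's actual code: the function name `parse_parammkdom` is hypothetical, and the sketch assumes that several KEY=value settings are separated by whitespace, which the description above does not specify.

```python
# Hypothetical helper illustrating the Parammkdom convention described above.
# Assumption: multiple KEY=value settings are separated by whitespace.

# Defaults stated in the input description: -j 10 -e 0.01 -h 0.01
MKDOM_DEFAULTS = {"-j": "10", "-e": "0.01", "-h": "0.01"}

# Key names accepted by the workflow and the Blast flag each one stands for.
KEY_TO_FLAG = {"ITER": "-j", "E_EXPECT": "-e", "H_EXPECT": "-h"}


def parse_parammkdom(param: str) -> dict[str, str]:
    """Return the Blast flags implied by a Parammkdom string."""
    flags = dict(MKDOM_DEFAULTS)
    text = param.strip()
    if text.lower() == "default":
        return flags
    for token in text.split():
        key, sep, value = token.partition("=")
        if not sep or not value or key not in KEY_TO_FLAG:
            raise ValueError(f"unrecognised Parammkdom token: {token!r}")
        # Override only the flag named by this key; others keep their default.
        flags[KEY_TO_FLAG[key]] = value
    return flags
```

Under these assumptions, `parse_parammkdom("default")` returns the three default values, while `parse_parammkdom("ITER=5 H_EXPECT=0.001")` overrides -j and -h and leaves -e at 0.01.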


Outputs:

- Sequences selected from the bank that belong to the ProDom family (FASTA format).

- Result from mkdom analysis in xdom format.

- The alignment file from Mkdom analysis in srs format.

- A simplified log file.

- A URL pointing to an application for easily visualising your results.

- Result from the InterProScan analysis.

- “url-interpro-visualisation”: the URL where you can find your InterProScan results with a graphical visualisation.

What it does:

Once the user has chosen a ProDom family, the fetchdom program extracts the corresponding domain sequences from ProDom and returns their alignment and consensus sequence. The psi-blast program is then run with these outputs as the query against the user-defined set of protein sequences (the “bank”). The sequences thus detected as relevant to the ProDom family of interest are extracted, and their domain arrangements are analyzed using the mkdom program on the one hand and InterProScan on the other. Domain arrangements derived from the mkdom program can be further inspected using the xdom visualization tool, which must be installed on the client machine (to be obtained from