NIF LinkOut Portal


Exhaustive enumeration of protein domain families.

Authors:
Heger A, Holm L
Journal:
Journal of molecular biology

Abstract

Domains are considered the basic units of protein folding, evolution, and function. Decomposing each protein into modular domains is thus a basic prerequisite for accurate functional classification of biological molecules. Here, we present ADDA, an automatic algorithm for domain decomposition and clustering of all protein domain families. We use alignments derived from an all-on-all sequence comparison to define domains within protein sequences based on a global maximum likelihood model. In all, 90% of domain boundaries are predicted within 10% of domain size when compared with the manual domain definitions given in the SCOP database. A representative database of 249,264 protein sequences was decomposed into 450,462 domains. These domains were clustered on the basis of sequence similarity into 33,879 domain families containing at least two members with less than 40% sequence identity. Validation against family definitions in the manually curated databases SCOP and PFAM indicates almost perfect unification of various large domain families, while contamination by unrelated sequences remains at a low level. The global survey of protein-domain space by ADDA confirms that most large and universal domain families are already described in PFAM and/or SMART. However, a survey of the complete set of mobile modules leads to the identification of 1479 interesting new domain families that shuffle around in multi-domain proteins. The data are publicly available at ftp://ftp.ebi.ac.uk/pub/contrib/heger/adda.
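
The boundary-accuracy figure quoted in the abstract (boundaries predicted within 10% of domain size) can be illustrated with a minimal sketch. This is not the ADDA evaluation code: the function name `boundary_accuracy`, the `(start, end)` domain representation, and the nearest-boundary matching rule are all assumptions made here for illustration, and the abstract does not specify how the authors matched predicted boundaries to SCOP boundaries.

```python
from typing import List, Tuple

# Hypothetical representation: a domain is (start, end), inclusive residue positions.
Domain = Tuple[int, int]


def boundary_accuracy(
    predicted: List[Domain],
    reference: List[Domain],
    tolerance: float = 0.10,
) -> float:
    """Fraction of reference boundaries that have a predicted boundary
    within `tolerance` times the length of the reference domain.

    Assumption: each reference boundary is matched to the nearest
    predicted boundary, regardless of which predicted domain it belongs to.
    """
    predicted_boundaries = [pos for domain in predicted for pos in domain]
    if not reference or not predicted_boundaries:
        return 0.0

    hits = 0
    total = 0
    for start, end in reference:
        size = end - start + 1
        allowed = tolerance * size  # permitted deviation in residues
        for ref_pos in (start, end):
            nearest = min(abs(ref_pos - p) for p in predicted_boundaries)
            if nearest <= allowed:
                hits += 1
            total += 1
    return hits / total


if __name__ == "__main__":
    # Made-up coordinates, purely for illustration.
    reference = [(1, 100), (101, 250)]
    print(boundary_accuracy([(1, 95), (96, 250)], reference))
    print(boundary_accuracy([(1, 60), (61, 250)], reference))
```

Under these assumptions, a predicted split a few residues away from the reference split counts as correct for both adjoining domains, while a split that is off by about 40 residues fails the 10% criterion for both.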
