This directory contains supplementary data relevant to:
Hegyi & Gerstein,
Functional divergence in multidomain proteins:
a large-scale survey of structural superfamilies
__________________
function-codes.txt
~~~~~~~~~~~~~~~~~~
We divided functions into enzymatic and non-enzymatic. Enzymatic
functions were classified by the EC system (Bairoch, 2000). As in our
earlier analysis, two enzymatic functions were considered different if
their EC numbers differ in any of the first 3 components. By this
criterion, our analysis dealt with a total of 112 enzymatic
functions. Non-enzymatic functions were classified into 508 categories
based on a simple thesaurus we assembled of synonymous keywords drawn
from Swiss-Prot description lines. In addition, we created 49
categories for functions that have an enzymatic component but are not
part of the EC system. This gave us a total of 669 functions
(112+508+49).
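
The comparison rule for enzymatic functions can be sketched in a few
lines of Python. This is only an illustration of the rule as stated
above, not the code used in the analysis; the function names are
hypothetical, and incomplete EC numbers such as 1.-.-.- are not given
any special treatment.

```python
def ec_key(ec_number: str) -> tuple:
    """Return the first three components of an EC number as a tuple.

    Example: '5.99.1.3' -> ('5', '99', '1').
    """
    return tuple(ec_number.strip().split(".")[:3])


def same_enzymatic_function(ec_a: str, ec_b: str) -> bool:
    """Two EC numbers count as the same function when their first
    three components agree; the fourth component is ignored."""
    return ec_key(ec_a) == ec_key(ec_b)


if __name__ == "__main__":
    # Differ only in the fourth component: treated as the same function.
    print(same_enzymatic_function("5.99.1.3", "5.99.1.2"))
    # Differ in the third component: treated as different functions.
    print(same_enzymatic_function("5.99.1.3", "5.99.2.3"))
```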
The file lists the Swiss-Prot synonyms used for our functional
classification. A given protein can have more than one function. In
the function codes, E marks a function in the EC system, and D marks
an enzymatic function that is not annotated in the EC system.
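
As a rough sketch, a function code can be sorted into one of the three
groups by its first character. Two points here are assumptions drawn
from the example lines in this README rather than from a format
specification: that codes without an E or D prefix (such as 45#) are
non-enzymatic, and that a trailing # is only a terminator. The
function name is hypothetical, and proteins carrying more than one
function are not handled.

```python
def classify_function_code(code: str) -> str:
    """Classify a single function code by its prefix.

    Assumption: a trailing '#' is a terminator and carries no meaning.
    Assumption: codes with neither an 'E' nor a 'D' prefix are
    non-enzymatic categories.
    """
    code = code.strip().rstrip("#")
    if code.startswith("E"):
        return "enzyme, EC system"
    if code.startswith("D"):
        return "enzyme, not in EC system"
    return "non-enzyme"


if __name__ == "__main__":
    for example in ("E5.99.1", "D369#", "45#"):
        print(example, "->", classify_function_code(example))
```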
__________________________________________
new_combo_wmultidomcoverage_wlen_wcode.lst
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4763 lines
XXXXXXXXXXXXXXXXXXXXXX|YYYYYYYYYY|ZZZ|WWW|U#| ... |VVVVVVVV....
1.100.1 3.5.4 4.33.1|CH60_ACTAC|546|1 1|1#|1.100.1 4.33.1 3.5.4 4.33.1 1.100.1|60 KD CHAPERONIN (PROTEIN CPN60) (GROEL PROTEIN)
1.100.1 3.5.4 4.33.1|CH60_ACTPL|546|1 1|1#|1.100.1 4.33.1 3.5.4 4.33.1 1.100.1|60 KD CHAPERONIN (PROTEIN CPN60) (GROEL PROTEIN)
1.100.1 3.5.4 4.33.1|CH60_ACYPS|548|1 0|1#|1.100.1 4.33.1 3.5.4 1.100.1|60 KD CHAPERONIN (PROTEIN CPN60) (GROEL PROTEIN) (SYMBIONIN)
1.100.1 3.5.4 4.33.1|CH60_AGRTU|544|1 0|1#|1.100.1 4.33.1 3.5.4 1.100.1|60 KD CHAPERONIN (PROTEIN CPN60) (GROEL PROTEIN)
2.1.1 4.15.1|HB2P_HUMAN|258|1 0|45#|4.15.1 2.1.1|HLA CLASS II HISTOCOMPATIBILITY ANTIGEN, DP(W4) BETA CHAIN PRECURSOR - HOMO SAPIENS (HUMAN)
2.1.1 4.15.1|HB2P_RABIT|257|1 0|45#|4.15.1 2.1.1|RLA CLASS II HISTOCOMPATIBILITY ANTIGEN, DP BETA CHAIN PRECURSOR (D10 HAPLOTYPE) - ORYCTOLAGUS CUNICULUS (RABBIT)
2.1.1 4.15.1|HB2Q_HUMAN|258|1 0|45#|4.15.1 2.1.1|HLA CLASS II HISTOCOMPATIBILITY ANTIGEN, DP(W2) BETA CHAIN PRECURSOR (SB-2-BETA) - HOMO SAPIENS (HUMAN)
2.1.1 4.15.1|HB2Q_MOUSE|265|1 0|45#|4.15.1 2.1.1|H-2 CLASS II HISTOCOMPATIBILITY ANTIGEN, A-Q BETA CHAIN PRECURSOR - MUS MUSCULUS (MOUSE)
2.1.1 4.15.1|HB2S_MOUSE|263|1 0|45#|4.15.1 2.1.1|H-2 CLASS II HISTOCOMPATIBILITY ANTIGEN, A-S BETA CHAIN PRECURSOR - MUS MUSCULUS (MOUSE)
4.72.1 5.13.1|GYRB_HELPY|773|1 0|E5.99.1|4.72.1 5.13.1|DNA GYRASE SUBUNIT B (EC 5.99.1.3)
4.72.1 5.13.1|GYRB_MYCCA|643|1 0|E5.99.1|4.72.1 5.13.1|DNA GYRASE SUBUNIT B (EC 5.99.1.3)
2.39.2 3.69.1|BISC_RHOSH|744|1 1|D369#|3.69.1 2.39.2|BIOTIN SULFOXIDE REDUCTASE (EC 1.-.-.-) (BDS REDUCTASE) (BSO REDUCTASE) - RHODOBACTER SPHAEROIDES (RHODOPSEUDOMONAS SPHAEROIDES)
2.39.2 3.69.1|BISZ_ECOLI|809|1 1|D369#|3.69.1 2.39.2|BIOTIN SULFOXIDE REDUCTASE 2 (EC 1.-.-.-) (BDS REDUCTASE 2) (BSO REDUCTASE 2)
X = SCOP superfamily identifiers, giving the non-redundant combination
Y = Swiss-Prot ID
Z = sequence length
W = is it fully covered? (1 1 = fully covered)
U = function code
V = Swiss-Prot description (DE) line
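
A minimal Python sketch for reading one line of this file, assuming
the seven '|'-separated fields seen in the examples above. The record
type and function names are hypothetical. The sixth field (shown as
"..." in the template) is not described in this README, so it is kept
as an uninterpreted string, and treating "1 1" as the only value that
means full coverage is an assumption.

```python
from typing import List, NamedTuple


class MultiDomainRecord(NamedTuple):
    superfamilies: List[str]  # X: SCOP superfamily combination
    swiss_id: str             # Y: Swiss-Prot ID
    length: int               # Z: length
    fully_covered: bool       # W: True only when the field reads "1 1" (assumed)
    function_code: str        # U: kept verbatim, including any trailing '#'
    undocumented: str         # sixth field, not described in the README
    description: str          # V: Swiss-Prot DE line


def parse_multidomain_line(line: str) -> MultiDomainRecord:
    # Split on the first six '|' only, so a '|' in the description survives.
    fields = line.rstrip("\n").split("|", 6)
    if len(fields) != 7:
        raise ValueError(f"expected 7 '|'-separated fields, got {len(fields)}")
    x, y, z, w, u, sixth, v = fields
    return MultiDomainRecord(
        superfamilies=x.split(),
        swiss_id=y.strip(),
        length=int(z),
        fully_covered=(w.split() == ["1", "1"]),
        function_code=u.strip(),
        undocumented=sixth.strip(),
        description=v.strip(),
    )


if __name__ == "__main__":
    example = ("4.72.1 5.13.1|GYRB_HELPY|773|1 0|E5.99.1|"
               "4.72.1 5.13.1|DNA GYRASE SUBUNIT B (EC 5.99.1.3)")
    record = parse_multidomain_line(example)
    print(record.swiss_id, record.length, record.superfamilies)
```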
____________________________________________
realsingle_sfam_wfun_nofrag_nohypo_wcode.lst
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1818 lines
Describes single-domain classifications, in a format similar to that
of the multi-domain file above.
YYYYYYYYYY|XXXXX|UUU#|VVVVVV.....
GLB1_CALSO|1.1.1|167#|GLOBIN I (HB I) - CALYPTOGENA SOYOAE (DEEP-SEA COLD-SEEP CLAM)
GLB1_PHESE|1.1.1|167#|GLOBIN I, EXTRACELLULAR (ERYTHROCRUORIN) - PHERETIMA SIEBOLDI (EARTHWORM)
GLB2_CALSO|1.1.1|167#|GLOBIN II (HB II) - CALYPTOGENA SOYOAE (DEEP-SEA COLD-SEEP CLAM)
GLBC_NIPBR|1.1.1|167#|GLOBIN, CUTICULAR ISOFORM PRECURSOR
GLB1_GLYDI|1.1.1|167#|GLOBIN, MAJOR MONOMERIC COMPONENT - GLYCERA DIBRANCHIATA (BLOODWORM)
X = SCOP superfamily identifier
Y = Swiss-Prot ID
U = function code
V = Swiss-Prot description (DE) line
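
The single-domain file can be read the same way. The sketch below
assumes the four '|'-separated fields shown in the examples and
collects the function codes seen for each superfamily; the function
names are hypothetical, and function codes are kept verbatim
(including the trailing #).

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def parse_single_domain_line(line: str) -> Tuple[str, str, str, str]:
    """Return (swiss_id, superfamily, function_code, description)."""
    # Split on the first three '|' only, so a '|' in the description survives.
    fields = line.rstrip("\n").split("|", 3)
    if len(fields) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {len(fields)}")
    swiss_id, superfamily, code, description = (f.strip() for f in fields)
    return swiss_id, superfamily, code, description


def function_codes_by_superfamily(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each superfamily identifier to the set of function codes it carries."""
    codes: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, superfamily, code, _ = parse_single_domain_line(line)
        codes[superfamily].add(code)
    return dict(codes)


if __name__ == "__main__":
    sample = [
        "GLB1_CALSO|1.1.1|167#|GLOBIN I (HB I) - CALYPTOGENA SOYOAE (DEEP-SEA COLD-SEEP CLAM)",
        "GLBC_NIPBR|1.1.1|167#|GLOBIN, CUTICULAR ISOFORM PRECURSOR",
    ]
    print(function_codes_by_superfamily(sample))
```

A superfamily that maps to more than one function code in this table
is one whose single-domain members were assigned different functions.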
---
Last updated on 2000.12.11