The SWISS-PROT protein database is constructed by Amos Bairoch of the University of
Geneva. SWISS-PROT also contains sequences translated from the EMBL Nucleotide
Sequence Database.
Example
ID TNFA_HUMAN STANDARD; PRT; 233 AA.
AC P01375;
DT 21-JUL-1986 (REL. 01, CREATED)
DT 21-JUL-1986 (REL. 01, LAST SEQUENCE UPDATE)
DT 01-FEB-1995 (REL. 31, LAST ANNOTATION UPDATE)
DE TUMOR NECROSIS FACTOR PRECURSOR (TNF-ALPHA) (CACHECTIN).
GN TNFA.
OS HOMO SAPIENS (HUMAN).
OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;
OC EUTHERIA; PRIMATES.
RN [1]
RP SEQUENCE FROM N.A.
RM 87217060
RA NEDOSPASOV S.A., SHAKHOV A.N., TURETSKAYA R.L., METT V.A.,
RA AZIZOV M.M., GEORGIEV G.P., KOROBKO V.G., DOBRYNIN V.N.,
RA FILIPPOV S.A., BYSTROV N.S., BOLDYREVA E.F., CHUVPILO S.A.,
RA CHUMAKOV A.M., SHINGAROVA L.N., OVCHINNIKOV Y.A.;
RL COLD SPRING HARB. SYMP. QUANT. BIOL. 51:611-624(1986).
RN [2]
RP SEQUENCE FROM N.A.
RM 85086244
RA PENNICA D., NEDWIN G.E., HAYFLICK J.S., SEEBURG P.H., DERYNCK R.,
RA PALLADINO M.A., KOHR W.J., AGGARWAL B.B., GOEDDEL D.V.;
RL NATURE 312:724-729(1984).
RN [3]
RP SEQUENCE FROM N.A.
RM 85137898
RA SHIRAI T., YAMAGUCHI H., ITO H., TODD C.W., WALLACE R.B.;
RL NATURE 313:803-806(1985).
CC -!- FUNCTION: CYTOKINE WITH A WIDE VARIETY OF FUNCTIONS: IT CAN
CC CAUSE CYTOLYSIS OF CERTAIN TUMOR CELL LINES, IT IS IMPLICATED
CC IN THE INDUCTION OF CACHEXIA, IT IS A POTENT PYROGEN CAUSING
CC FEVER BY DIRECT ACTION OR BY STIMULATION OF IL-1 SECRETION, IT
CC CAN STIMULATE CELL PROLIFERATION & INDUCE CELL DIFFERENTIATION
CC UNDER CERTAIN CONDITIONS.
CC -!- SUBUNIT: HOMOTRIMER.
CC -!- SUBCELLULAR LOCATION: TYPE II MEMBRANE PROTEIN. ALSO EXISTS AS
CC AN EXTRACELLULAR SOLUBLE FORM.
CC -!- PTM: THE SOLUBLE FORM DERIVES FROM THE MEMBRANE FORM BY
CC PROTEOLYTIC PROCESSING.
CC -!- DISEASE: CACHEXIA ACCOMPANIES A VARIETY OF DISEASES, INCLUDING
CC CANCER AND INFECTION, AND IS CHARACTERIZED BY GENERAL ILL
CC HEALTH AND MALNUTRITION.
CC -!- SIMILARITY: BELONGS TO THE TUMOR NECROSIS FACTOR FAMILY.
DR EMBL; X02910; HSTNFA.
DR EMBL; M16441; HSTNFAB.
DR EMBL; X01394; HSTNFR.
DR EMBL; M10988; HSTNFAA.
DR PIR; B23784; QWHUN.
DR PIR; A44189; A44189.
DR PDB; 1TNF; 15-JAN-91.
DR PDB; 2TUN; 31-JAN-94.
DR MIM; 191160; 11TH EDITION.
DR PROSITE; PS00251; TNF.
KW CYTOKINE; CYTOTOXIN; TRANSMEMBRANE; GLYCOPROTEIN; SIGNAL-ANCHOR;
KW MYRISTYLATION; 3D-STRUCTURE.
FT PROPEP 1 76
FT CHAIN 77 233 TUMOR NECROSIS FACTOR.
FT TRANSMEM 36 56 SIGNAL-ANCHOR (TYPE-II PROTEIN).
FT LIPID 19 19 MYRISTATE.
FT LIPID 20 20 MYRISTATE.
FT DISULFID 145 177
FT MUTAGEN 108 108 R->W: BIOLOGICALLY INACTIVE.
FT MUTAGEN 112 112 L->F: BIOLOGICALLY INACTIVE.
FT MUTAGEN 162 162 S->F: BIOLOGICALLY INACTIVE.
FT MUTAGEN 167 167 V->A,D: BIOLOGICALLY INACTIVE.
FT MUTAGEN 222 222 E->K: BIOLOGICALLY INACTIVE.
FT CONFLICT 63 63 F -> S (IN REF. 5).
SQ SEQUENCE 233 AA; 25644 MW; 279986 CN;
MSTESMIRDV ELAEEALPKK TGGPQGSRRC LFLSLFSFLI VAGATTLFCL LHFGVIGPQR
EEFPRDLSLI SPLAQAVRSS SRTPSDKPVA HVVANPQAEG QLQWLNRRAN ALLANGVELR
DNQLVVPSEG LYLIYSQVLF KGQGCPSTHV LLTHTISRIA VSYQTKVNLL SAIKSPCQRE
TPEGAEAKPW YEPIYLGGVF QLEKGDRLSA EINRPDYLDF AESGQVYFGI IAL
//
- ID
- This line is always the first line of an entry. It consists of the
entry name, the data class, the word 'PRT' indicating the molecule type
(PRoTein), and the sequence length. The entry name has the form X_Y, where X
is a mnemonic code representing the protein name, and Y is a mnemonic species
identification code. There are two data classes:
- STANDARD ---- Data which are complete to the standards laid down by
the SWISS-PROT data bank
- PRELIMINARY - Data for which only the sequence and bibliographic
information have been submitted to thorough checks
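As an illustration, the ID line of the example entry can be taken apart as
follows. This is only a sketch: the function name parse_id_line and the keys
of the returned dictionary are made up for this example, and the pattern
assumes the field order shown in the example entry above.

```python
import re

def parse_id_line(line):
    """Parse an ID line such as 'ID TNFA_HUMAN STANDARD; PRT; 233 AA.'"""
    match = re.match(
        r"ID\s+(\w+)\s+(STANDARD|PRELIMINARY);\s+PRT;\s+(\d+)\s+AA\.", line
    )
    if match is None:
        raise ValueError(f"not a recognizable ID line: {line!r}")
    entry_name, data_class, length = match.groups()
    # The entry name is X_Y: protein mnemonic, underscore, species mnemonic.
    protein_code, _, species_code = entry_name.partition("_")
    return {
        "entry_name": entry_name,
        "protein_code": protein_code,
        "species_code": species_code,
        "data_class": data_class,
        "length": int(length),
    }

info = parse_id_line("ID TNFA_HUMAN STANDARD; PRT; 233 AA.")
print(info["protein_code"], info["species_code"], info["length"])
```

Whitespace between the fields is matched loosely, so the same pattern should
accept both the compact form shown here and a column-aligned form.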
- AC
- This line lists the accession numbers associated with an entry. Entries
will have more than one accession number if they have been merged or split.
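A minimal sketch of collecting the accession numbers of an entry is given
below. It assumes that the numbers on an AC line are separated by semicolons,
as in the example entry; the function name parse_accessions is hypothetical.

```python
def parse_accessions(ac_lines):
    """Collect accession numbers from one or more AC lines."""
    accessions = []
    for line in ac_lines:
        body = line[2:]  # drop the 'AC' line code
        for item in body.split(";"):
            item = item.strip()
            if item:  # skip the empty piece after the final ';'
                accessions.append(item)
    return accessions

print(parse_accessions(["AC P01375;"]))
```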
- DT
- These lines show the dates on which the entry was created and on which its
sequence and its annotation were last updated.
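The DT lines of the example entry can be read as sketched below. The layout
'DD-MMM-YYYY (REL. nn, EVENT)' is inferred from the example entry only, and
the names parse_dt_line and MONTHS are made up for this illustration.

```python
import re
from datetime import date

# Month abbreviations as they appear in the example entry (upper case).
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}

def parse_dt_line(line):
    """Parse a DT line such as 'DT 21-JUL-1986 (REL. 01, CREATED)'."""
    match = re.match(
        r"DT\s+(\d{2})-([A-Z]{3})-(\d{4})\s+\(REL\.\s*(\d+),\s*(.+)\)", line
    )
    if match is None:
        raise ValueError(f"not a recognizable DT line: {line!r}")
    day, month, year, release, event = match.groups()
    return {
        "date": date(int(year), MONTHS[month], int(day)),
        "release": int(release),
        "event": event,
    }

created = parse_dt_line("DT 21-JUL-1986 (REL. 01, CREATED)")
print(created["date"].isoformat(), created["release"], created["event"])
```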
- DE
- These lines contain general descriptive free-format information about
the sequence stored.
- GN
- This line contains the name(s) of the gene(s) that encode the
stored protein sequence. When more than one name has been
assigned to an individual locus, the synonyms are listed, separated
by the word `OR'. When multiple genes encode an identical
protein, all the different gene names are listed, separated by the
word `AND'.
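The `OR' and `AND' conventions can be sketched in code as follows. The
function name parse_gene_names is hypothetical, and stripping parentheses
around a group of synonyms is an assumption that the description above does
not state.

```python
def parse_gene_names(gn_line):
    """Return a list of loci; each locus is a list of synonymous gene names."""
    body = gn_line[2:].strip().rstrip(".")  # drop 'GN' and the final period
    loci = []
    for locus in body.split(" AND "):       # different genes, same protein
        # Synonyms for one locus; parentheses are stripped (assumption).
        names = [name.strip(" ()") for name in locus.split(" OR ")]
        loci.append([name for name in names if name])
    return loci

print(parse_gene_names("GN TNFA."))
```

A line with synonyms yields one inner list with several names, whereas a line
with several genes yields several inner lists.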
- KW
- These lines provide information which can be used to generate
cross-reference indexes of the sequence entries based on functional,
structural, or other categories.
- OS, OG, OC
- These fields contain information about the source organism.
- OS
- This line specifies the organism(s) from which the stored sequence was
obtained. The species designation consists of the Latin genus and species
names followed by the English name (in parentheses).
- OG
- This line indicates whether the gene coding for a protein originates from
the mitochondria, the chloroplast, a cyanelle, or a plasmid.
- OC
- These lines contain the taxonomic classification of the source organism.
The classification is listed top-down as nodes in a taxonomic tree in which
the most general grouping is given first.
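The OS and OC lines of the example entry can be read as sketched below. The
function names are made up for this illustration, and the code assumes a
single organism per OS line and semicolon-separated nodes on the OC lines, as
in the example.

```python
import re

def parse_organism(os_line):
    """Split an OS line into (latin_name, english_name)."""
    body = os_line[2:].strip().rstrip(".")
    match = re.match(r"(.+?)\s*\((.+)\)$", body)
    if match:
        return match.group(1), match.group(2)
    return body, None  # no English name in parentheses

def parse_classification(oc_lines):
    """Join OC lines and return the taxonomic nodes, most general first."""
    text = " ".join(line[2:].strip() for line in oc_lines)
    return [node.strip() for node in text.rstrip(".").split(";") if node.strip()]

print(parse_organism("OS HOMO SAPIENS (HUMAN)."))
lineage = parse_classification([
    "OC EUKARYOTA; METAZOA; CHORDATA; VERTEBRATA; TETRAPODA; MAMMALIA;",
    "OC EUTHERIA; PRIMATES.",
])
print(lineage[0], lineage[-1])
```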
- RN, RP, RC, RM, RA, RL
- These fields comprise the literature citations within SWISS-PROT. The
citations indicate the papers from which the data has been abstracted.
- RN
- This line gives a sequential number to each reference citation in an entry.
- RP
- This line describes the extent of the work carried out by the authors
of the reference cited.
- RC
- These lines are used to store comments relevant to the reference cited.
The format is 'TOKEN1=TEXT; TOKEN2=TEXT; ... ', where the currently
defined tokens are PLASMID, SPECIES, STRAIN, TISSUE, and TRANSPOSON. The
`SPECIES' token is only used when an entry describes a sequence which is
identical in more than one species; similarly, the `PLASMID' token is only
used if an entry describes a sequence identical in more than one plasmid.
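The 'TOKEN=TEXT;' format lends itself to a simple dictionary, as in the
sketch below. The function name parse_rc_lines is hypothetical, and the RC
line used here is invented for illustration, since the example entry has no
RC lines.

```python
def parse_rc_lines(rc_lines):
    """Turn 'TOKEN1=TEXT; TOKEN2=TEXT;' comments into a dictionary."""
    text = " ".join(line[2:].strip() for line in rc_lines)
    comments = {}
    for item in text.split(";"):
        if "=" in item:
            token, _, value = item.partition("=")
            comments[token.strip()] = value.strip()
    return comments

# Invented RC line, for illustration only.
print(parse_rc_lines(["RC STRAIN=K12; TISSUE=LIVER;"]))
```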
- RM
- This line indicates the Medline Unique ID of a reference.
- RA
- These lines list the authors of the paper (or other work) cited. All of
the authors are included, and are listed in the order given in the paper.
- RL
- These lines contain the conventional citation information for the
reference. The following kinds of citation occur:
- Journal citations
- Book citations
- Unpublished results
- Unpublished observations
- Thesis
- Patent applications
- Submissions
When a reference is made to a paper which is `in press' at the time when the
data bank is released, the page range, and possibly the volume number, are
indicated as '0' (zero).
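For journal citations, the RL lines of the example entry follow the pattern
'JOURNAL VOLUME:FIRST-LAST(YEAR).'. The sketch below reads that pattern and
flags `in press' references. The function name is hypothetical, other kinds
of citation are simply not recognized, and the exact spelling of a zero page
range ('0' or '0-0') is an assumption, so both are accepted.

```python
import re

def parse_journal_citation(line):
    """Parse an RL journal citation; return None for other citation kinds."""
    match = re.match(
        r"RL\s+(.+?)\s+(\d+):(\d+)(?:-(\d+))?\((\d{4})\)\.?$", line.strip()
    )
    if match is None:
        return None
    journal, volume, first, last, year = match.groups()
    first_page = int(first)
    last_page = int(last) if last is not None else first_page
    return {
        "journal": journal,
        "volume": int(volume),
        "first_page": first_page,
        "last_page": last_page,
        "year": int(year),
        # A page range of zero marks a paper that was still in press.
        "in_press": first_page == 0,
    }

citation = parse_journal_citation("RL NATURE 312:724-729(1984).")
print(citation["journal"], citation["volume"], citation["year"])
```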
- DR
- These lines are used as pointers to information related to SWISS-PROT
entries and found in data collections other than SWISS-PROT. Each line gives
a database identifier followed by the identifiers of the entry in that database.
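The cross-references of an entry can be grouped by database as sketched
below. The function names are made up for this example. The fields after the
database identifier are kept as plain strings, because their meaning differs
between databases (in the example entry, the PDB lines carry a date in the
last field).

```python
from collections import defaultdict

def parse_dr_line(line):
    """Return (database, [remaining fields]) for one DR line."""
    fields = [f.strip() for f in line[2:].strip().rstrip(".").split(";")]
    return fields[0], fields[1:]

def group_cross_references(dr_lines):
    """Group the fields of all DR lines by database identifier."""
    by_database = defaultdict(list)
    for line in dr_lines:
        database, identifiers = parse_dr_line(line)
        by_database[database].append(identifiers)
    return dict(by_database)

refs = group_cross_references([
    "DR EMBL; X02910; HSTNFA.",
    "DR EMBL; M16441; HSTNFAB.",
    "DR PROSITE; PS00251; TNF.",
])
print(sorted(refs))
```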
- FT
- This table describes regions or sites of interest in the sequence. In
general the feature table lists post-translational modifications, binding
sites, enzyme active sites, local secondary structure or other characteristics
reported in the cited references. Sequence conflicts between references are
also included in the feature table. Each line of this table consists of a
feature key, `FROM' and `TO' endpoint specifications, and optionally a
description which contains additional information about the feature.
- If the `FROM' and `TO' specifications are equal, the feature indicated
consists of the single amino acid at that position.
- When a feature is known to extend beyond the end(s) of the sequenced
region, the endpoint specification will be preceded by < for features which
continue to the left end (N-terminal direction) or by > for features which
continue to the right end (C-terminal direction).
- Unknown endpoints are denoted by `?'.
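The endpoint rules above can be sketched as follows. The function names and
the status labels are made up for this illustration. The code splits on
whitespace, which is a simplification: it assumes that every FT line carries
a key and both endpoints, and it does not handle descriptions continued on a
following line.

```python
def parse_endpoint(spec):
    """Return (position, status) for a FROM or TO endpoint specification."""
    if spec == "?":
        return None, "unknown"
    if spec.startswith("<") or spec.startswith(">"):
        # The feature extends beyond the end of the sequenced region.
        return int(spec[1:]), "beyond_sequenced_region"
    return int(spec), "exact"

def parse_feature(line):
    """Parse one FT line into key, endpoints and optional description."""
    parts = line.split(None, 4)
    if len(parts) < 4 or parts[0] != "FT":
        raise ValueError(f"not a recognizable FT line: {line!r}")
    start = parse_endpoint(parts[2])
    end = parse_endpoint(parts[3])
    return {
        "key": parts[1],
        "from": start,
        "to": end,
        "description": parts[4].strip() if len(parts) > 4 else "",
        # Equal exact endpoints denote a single amino acid.
        "single_residue": start == end and start[1] == "exact",
    }

feature = parse_feature("FT MUTAGEN 108 108 R->W: BIOLOGICALLY INACTIVE.")
print(feature["key"], feature["from"][0], feature["single_residue"])
```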
- SQ
- This line marks the beginning of the sequence data and gives a quick
summary of its content: the sequence length (AA), the molecular weight (MW),
and the checking number (CN). For the checking number, please refer to:
Bairoch A., Biochem. J. 203:527-528(1982).
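The SQ line and the sequence lines that follow it can be read as sketched
below, using the data of the example entry. The function names are made up
for this illustration. Only the length is verified here; the checking number
is not recomputed, because its algorithm is described in the reference cited
above rather than in this document.

```python
import re

def parse_sq_header(line):
    """Parse 'SQ SEQUENCE 233 AA; 25644 MW; 279986 CN;'."""
    match = re.match(
        r"SQ\s+SEQUENCE\s+(\d+)\s+AA;\s+(\d+)\s+MW;\s+(\d+)\s+CN;", line
    )
    if match is None:
        raise ValueError(f"not a recognizable SQ line: {line!r}")
    length, weight, check = (int(group) for group in match.groups())
    return {"length": length, "molecular_weight": weight, "checking_number": check}

def read_sequence(lines):
    """Join sequence lines up to the '//' terminator, removing blanks."""
    residues = []
    for line in lines:
        if line.strip() == "//":
            break
        residues.append("".join(line.split()))
    return "".join(residues)

header = parse_sq_header("SQ SEQUENCE 233 AA; 25644 MW; 279986 CN;")
sequence = read_sequence([
    "MSTESMIRDV ELAEEALPKK TGGPQGSRRC LFLSLFSFLI VAGATTLFCL LHFGVIGPQR",
    "EEFPRDLSLI SPLAQAVRSS SRTPSDKPVA HVVANPQAEG QLQWLNRRAN ALLANGVELR",
    "DNQLVVPSEG LYLIYSQVLF KGQGCPSTHV LLTHTISRIA VSYQTKVNLL SAIKSPCQRE",
    "TPEGAEAKPW YEPIYLGGVF QLEKGDRLSA EINRPDYLDF AESGQVYFGI IAL",
    "//",
])
print(len(sequence) == header["length"])
```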
- CC
- These lines are free text comments on the entry, and may be used to convey
any useful information. Most of the comment blocks, each of which starts with
the mark `-!-', are arranged according to what we designate as `topics'.
See also the topics table.
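The comment blocks of an entry can be separated into topics as sketched
below. The function name parse_comments is hypothetical. The code assumes
that the topic is the text before the first colon of a block, as in the
example entry, and that lines without the `-!-' mark continue the preceding
block.

```python
def parse_comments(cc_lines):
    """Return a list of (topic, text) pairs, one per '-!-' comment block."""
    blocks = []
    for line in cc_lines:
        body = line[2:].strip()  # drop the 'CC' line code
        if body.startswith("-!-"):
            blocks.append(body[3:].strip())  # start of a new block
        elif blocks:
            blocks[-1] += " " + body         # continuation line
    topics = []
    for block in blocks:
        topic, separator, text = block.partition(":")
        if separator:
            topics.append((topic.strip(), text.strip()))
        else:
            topics.append((None, block))     # block without a topic
    return topics

comments = parse_comments([
    "CC -!- SUBUNIT: HOMOTRIMER.",
    "CC -!- PTM: THE SOLUBLE FORM DERIVES FROM THE MEMBRANE FORM BY",
    "CC PROTEOLYTIC PROCESSING.",
])
for topic, text in comments:
    print(topic, "=>", text)
```

A list of pairs is used rather than a dictionary so that a topic occurring
more than once in an entry is not overwritten.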