The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of biological macromolecules, serving a global community of researchers, educators, and students. The archive contains atomic coordinates, bibliographic citations, primary and secondary structure information, crystallographic structure factors, and NMR experimental data.
Every PDB file consists of a number of lines, each terminated by an end-of-line indicator. Each line in a PDB entry file is 80 columns wide, and the last character of each PDB entry should be an end-of-line indicator.
Each line in the PDB file is self-identifying. The first six columns of every line contain a record name, left-justified and blank-filled. The record name must exactly match one of the record names defined by the format.
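As a minimal sketch of the rule above (not the program's actual code), the record name can be read by slicing the first six columns of a line. The function name `record_name` is hypothetical, and padding short lines to 80 columns is an assumption made here to tolerate files whose trailing blanks have been stripped.

```python
def record_name(line: str) -> str:
    """Return the record name held in columns 1-6 of a PDB line."""
    # Drop the end-of-line indicator, then pad to the nominal 80 columns
    # (assumption: some files have had trailing blanks stripped).
    padded = line.rstrip("\r\n").ljust(80)
    # Columns 1-6 are string indices 0-5; the name is left-justified and
    # blank-filled, so the trailing blanks are removed.
    return padded[:6].rstrip()


print(record_name("TITLE     RHIZOPUSPEPSIN COMPLEXED WITH REDUCED PEPTIDE INHIBITOR\n"))
```

Slicing by position rather than splitting on whitespace matters because PDB fields are defined by column, not by separators.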
The program ignores all other lines and uses only three record types: 'HEADER', 'TITLE' and 'SEQRES'.
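The selection of records might be sketched as follows. This is an illustration under stated assumptions, not the program's own implementation: the names `WANTED` and `select_records` are hypothetical, and the REMARK line in the sample is only a stand-in for a record that should be skipped.

```python
WANTED = ("HEADER", "TITLE", "SEQRES")


def select_records(lines):
    """Group lines by record name, keeping only the three types used."""
    selected = {name: [] for name in WANTED}
    for line in lines:
        # The record name occupies columns 1-6, left-justified.
        name = line[:6].rstrip()
        if name in selected:
            selected[name].append(line.rstrip("\r\n"))
    return selected


# Sample lines taken from separate example entries, plus one record
# type (REMARK) that is expected to be ignored.
sample = [
    "HEADER    MUSCLE PROTEIN                          02-JUN-93   1MYS",
    "REMARK   1",
    "SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU",
]
records = select_records(sample)
print({name: len(found) for name, found in records.items()})
```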
For 'SEQRES' records, after reading the record name in columns 1 to 6, the program uses the chain identifier in column 12 to select the appropriate chain of residues (a PDB file can contain more than one chain). If this identifier matches the chain selected by the user, the program collects all the residues on the line, given as 3-letter codes, from column 20 to column 70.
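The chain selection described above can be sketched like this. It is a hedged illustration rather than the program's source: the function name `chain_residues` is hypothetical, and padding each line to 80 columns is an assumption so that short final lines of a chain can be sliced safely.

```python
def chain_residues(lines, chain_id):
    """Collect the 3-letter residue codes of one chain from SEQRES records."""
    residues = []
    for line in lines:
        padded = line.rstrip("\r\n").ljust(80)
        # Only SEQRES records (columns 1-6) carry the sequence.
        if padded[:6] != "SEQRES":
            continue
        # Column 12 (string index 11) holds the chain identifier.
        if padded[11] != chain_id:
            continue
        # Columns 20-70 (indices 19-69) hold the residue names,
        # separated by blanks.
        residues.extend(padded[19:70].split())
    return residues


example = [
    "SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU",
    "SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN",
    "SEQRES   1 B   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU",
]
print(chain_residues(example, "A"))
```

Each full line contributes 13 residues, since 13 three-letter codes separated by single blanks fill exactly columns 20 to 70.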
More information about the Protein Data Bank (mirror site at EMBL)
Protein Data Bank Contents Guide (mirror site at EMBL)
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
HEADER    MUSCLE PROTEIN                          02-JUN-93   1MYS
HEADER    HYDROLASE (CARBOXYLIC ESTER)            08-APR-93   2PHI
HEADER    COMPLEX (LECTIN/TRANSFERRIN)            07-JAN-94   1LGB
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
TITLE     RHIZOPUSPEPSIN COMPLEXED WITH REDUCED PEPTIDE INHIBITOR
TITLE     BETA-GLUCOSYLTRANSFERASE, ALPHA CARBON COORDINATES ONLY
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN
SEQRES   1 B   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 B   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 B   30  THR PRO LYS ALA
SEQRES   1 C   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 C   21  TYR GLN LEU GLU ASN TYR CYS ASN
SEQRES   1 D   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 D   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 D   30  THR PRO LYS ALA