The Protein Data Bank (PDB) is an archive of experimentally determined three-dimensional structures of biological macromolecules, serving a global community of researchers, educators, and students. The archive contains atomic coordinates, bibliographic citations, primary and secondary structure information, as well as crystallographic structure factors and NMR experimental data.
Every PDB file may be broken into a number of lines terminated by an end-of-line indicator. Each line in the PDB entry file consists of 80 columns. The last character in each PDB entry should be an end-of-line indicator.
Each line in the PDB file is self-identifying. The first six columns of every line contain a record name, left-justified and blank-filled. This must be an exact match to one of the stated record names.
The program ignores all the description lines and uses only three types of records: 'HEADER', 'TITLE' and 'SEQRES'.
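The record selection described above can be sketched in a few lines of Python. This is only an illustration, not the program's actual code: the names `USED_RECORDS`, `record_name` and `select_used_lines` are hypothetical, and the sketch assumes the file has already been read into a list of text lines.

```python
# Record names the program is described as using; every other line is skipped.
# (Hypothetical names, chosen for this illustration only.)
USED_RECORDS = {"HEADER", "TITLE", "SEQRES"}


def record_name(line):
    """Return the record name held in columns 1-6, without its blank padding."""
    return line[:6].rstrip()


def select_used_lines(lines):
    """Keep only the lines whose record name is one of USED_RECORDS."""
    return [line for line in lines if record_name(line) in USED_RECORDS]
```

Because the record name is left-justified and blank-filled, stripping the trailing blanks from the first six characters is enough to compare it against the expected names.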
In SEQRES lines, after the record name in columns 1 to 6, the program reads the chain identifier in column 12 to select the appropriate chain of residues (a PDB file can contain more than one chain). If this identifier matches the chain selected by the user, the program collects all the residues on that line (given as 3-letter codes) from column 20 to column 70.
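As a rough sketch of this chain selection, the following Python function gathers the residues of one chain from SEQRES lines. The function name `residues_for_chain` is hypothetical and the code is an assumption about how such a step might look, not the program's own implementation. Columns are numbered from 1 in the text, so column 12 is index 11 and columns 20 to 70 are the slice `[19:70]`.

```python
def residues_for_chain(lines, chain_id):
    """Collect the 3-letter residue codes of one chain from SEQRES lines."""
    residues = []
    for line in lines:
        # Columns 1-6 hold the record name; skip everything but SEQRES.
        if line[:6].rstrip() != "SEQRES":
            continue
        # Column 12 (index 11) holds the chain identifier.
        if line[11:12] != chain_id:
            continue
        # Columns 20-70 hold the blank-separated 3-letter residue codes.
        residues.extend(line[19:70].split())
    return residues
```

Splitting on blanks rather than slicing fixed 4-character fields keeps the sketch tolerant of short final lines, where fewer residues are present.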
More information about the Protein Data Bank (mirror site at EMBL)
Protein Data Bank Contents Guide (mirror site at EMBL)
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
HEADER    MUSCLE PROTEIN                          02-JUN-93   1MYS
HEADER    HYDROLASE (CARBOXYLIC ESTER)            08-APR-93   2PHI
HEADER    COMPLEX (LECTIN/TRANSFERRIN)            07-JAN-94   1LGB
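The HEADER examples can be split into their fields with fixed-column slices. The column ranges used here (classification in columns 11 to 50, deposition date in columns 51 to 59, ID code in columns 63 to 66) are taken from the PDB format description rather than from this page, so treat them as an assumption; `parse_header` is a hypothetical helper name.

```python
def parse_header(line):
    """Split a HEADER line into classification, deposition date and ID code.

    Column ranges are assumed from the PDB format description.
    """
    return {
        "classification": line[10:50].strip(),  # columns 11-50
        "date": line[50:59].strip(),            # columns 51-59
        "id_code": line[62:66].strip(),         # columns 63-66
    }
```

The slices depend on the original column alignment, so the line must be passed in with its blanks intact.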
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
TITLE     RHIZOPUSPEPSIN COMPLEXED WITH REDUCED PEPTIDE INHIBITOR
TITLE     BETA-GLUCOSYLTRANSFERASE, ALPHA CARBON COORDINATES ONLY
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
SEQRES   1 A   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 A   21  TYR GLN LEU GLU ASN TYR CYS ASN
SEQRES   1 B   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 B   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 B   30  THR PRO LYS ALA
SEQRES   1 C   21  GLY ILE VAL GLU GLN CYS CYS THR SER ILE CYS SER LEU
SEQRES   2 C   21  TYR GLN LEU GLU ASN TYR CYS ASN
SEQRES   1 D   30  PHE VAL ASN GLN HIS LEU CYS GLY SER HIS LEU VAL GLU
SEQRES   2 D   30  ALA LEU TYR LEU VAL CYS GLY GLU ARG GLY PHE PHE TYR
SEQRES   3 D   30  THR PRO LYS ALA