Modules currently running in DAM-Bio

I. Sequence-related tools

Study of Residue Periodicities in Sequences: This package uses a Fast Fourier Transform (FFT) algorithm to locate residue periodicities in protein or DNA sequences. Searching for residue periodicities is a powerful tool for the structural and functional study of proteins whose spatial conformation has not yet been solved. Such periodicities may reveal repeating patterns, help in understanding the molecular structure of a fibrous or structural protein, and suggest how such proteins assemble. Periodicities in DNA may dictate structural and functional characteristics of the molecule.
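
The idea of a Fourier-based periodicity search can be illustrated with a minimal sketch. It is not the package's code: it swaps in a naive discrete Fourier transform (adequate for short sequences) in place of an FFT, and the function names and the 0/1 indicator encoding of residues are assumptions made for illustration.

```python
import cmath
import math

def periodicity_spectrum(sequence, residues):
    """Return (period, power) pairs for a 0/1 indicator signal of the chosen residues."""
    # Assumed encoding: 1.0 where the residue belongs to the chosen set, else 0.0.
    signal = [1.0 if r in residues else 0.0 for r in sequence]
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant offset
    spectrum = []
    for k in range(1, n // 2 + 1):
        # Naive DFT coefficient at frequency k/n (an FFT gives the same values faster).
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append((n / k, abs(coeff) ** 2 / n))
    return spectrum

def strongest_period(sequence, residues):
    """Period (in residues) with the highest spectral power."""
    return max(periodicity_spectrum(sequence, residues), key=lambda p: p[1])[0]

# Toy collagen-like repeat: glycine at every third position.
toy = "GPA" * 20
print(strongest_period(toy, {"G"}))  # prints 3.0
```

For real sequences one would typically inspect the whole spectrum rather than only its maximum, since several periodicities may coexist.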

Secondary Structure Prediction: SecStr is a tool that predicts the secondary structure of a protein from its amino acid sequence alone. The SecStr package uses six different secondary structure prediction methods (Nagano; Garnier et al.; Burgess et al.; Chou and Fasman; Lim; Dufton and Hider). The results of these methods are combined into a Joint Prediction Histogram (JPH), as described by Hamodrakas (1988).


A web-based classification system of DNA-binding protein families: The DnaProt resource is an annotated and searchable collection of protein sequences for the families of DNA-binding proteins. The database contains 3238 full-length sequences (retrieved from the SWISS-PROT database, release 38) that include at least one DNA-binding domain. Sequence entries are organised into families defined by PROSITE patterns, PRINTS motifs and de novo excised signatures. DNA-binding proteins are classified into 33 unique classes, which helps to reveal comprehensive family relationships. To maximise family information retrieval, DnaProt contains a collection of multiple alignments for each DNA-binding family, and the recognised motifs can be used as diagnostic functional fingerprints. All available structural class representatives have been referenced. The resource was developed as a Web-based management system for free online access to customised data sets. Entries are fully hyper-linked to facilitate easy retrieval of the original records from the source databases.

Prediction of Transmembrane Segments in proteins based on statistical analysis: PRED-TMR is a novel method that predicts transmembrane domains in proteins using only information contained in the sequence itself. The algorithm refines a standard hydrophobicity analysis by detecting the potential termini ("edges", starts and ends) of transmembrane regions. This makes it possible both to discard highly hydrophobic regions that are not delimited by clear start and end configurations and to confirm putative transmembrane segments that cannot be distinguished by their hydrophobic composition alone.
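
The "standard hydrophobicity analysis" that such a method starts from can be sketched as a sliding-window average. The sketch below is not PRED-TMR: it omits the edge detection entirely, and the Kyte-Doolittle scale, the 19-residue window, the 1.6 threshold and the function names are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (a widely used scale; assumed here,
# not necessarily the scale used by PRED-TMR).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[r] for r in sequence]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def candidate_segments(sequence, window=19, threshold=1.6):
    """Merge overlapping windows whose mean hydropathy reaches the threshold."""
    segments = []
    for i, mean in enumerate(hydropathy_profile(sequence, window)):
        if mean < threshold:
            continue
        start, end = i, i + window  # half-open interval [start, end)
        if segments and start <= segments[-1][1]:
            segments[-1] = (segments[-1][0], end)  # extend the previous segment
        else:
            segments.append((start, end))
    return segments

# Toy sequence: a 19-residue hydrophobic stretch flanked by charged residues.
toy = "DKERS" * 4 + "LIVALLAVFLIGLAVILLA" + "DKERS" * 4
print(candidate_segments(toy))
```

Because windows that only partly overlap the hydrophobic stretch can still pass the threshold, the reported segment is wider than the true stretch; locating the actual starts and ends is the step that an edge-detection refinement addresses.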

PRED-TMR application with Neural Network preprocessing: We have extended PRED-TMR with a pre-processing stage, an artificial neural network that discriminates transmembrane proteins from soluble or fibrous ones with high accuracy. Applied to several test sets of transmembrane proteins, the system classifies all sequences in the transmembrane class (100% correct). Applied to 995 non-transmembrane proteins extracted from the PDBSELECT database, the neural network falsely predicts 23 of them to be transmembrane (97.7% correct assignment).
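
The quoted percentage follows directly from the two counts given above; the short check below uses a hypothetical helper name.

```python
def correct_rejection_rate(total_negatives, false_positives):
    """Fraction of non-transmembrane proteins that are correctly rejected."""
    return (total_negatives - false_positives) / total_negatives

# 995 non-transmembrane proteins, 23 of them falsely predicted as transmembrane.
print(f"{100 * correct_rejection_rate(995, 23):.1f}%")  # prints 97.7%
```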

Topology Prediction of transmembrane proteins and segments: Software that predicts the topology of transmembrane proteins from sequence alone, starting from an initial definition of the transmembrane segments. It uses position-specific statistical information for amino acid residues belonging to putative non-transmembrane segments, derived from a statistical analysis of the non-transmembrane regions of membrane proteins stored in the SwissProt database. Its accuracy compares well with that of other popular methods.

Classification of proteins into one of four possible classes: A system of cascading neural networks that classifies any protein, given its amino acid sequence alone, into one of four possible classes:
  • the membrane protein class,
  • the globular protein class,
  • the fibrous protein class,
  • the mixed (fibrous and globular) protein class
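
The control flow of a cascade can be sketched as a chain of binary deciders, each of which either assigns its class or passes the sequence on. Everything specific below is an assumption: the order of the stages, the fallback class, and above all the deciders, which are arbitrary composition rules with no biological validity standing in for trained neural networks.

```python
def classify_cascade(sequence, stages, fallback):
    """Run binary deciders in order; the first one that accepts names the class."""
    for label, accepts in stages:
        if accepts(sequence):
            return label
    return fallback  # no stage accepted the sequence

def fraction(sequence, residues):
    """Fraction of the sequence made up of the given residues."""
    return sum(r in residues for r in sequence) / len(sequence)

# Placeholder deciders (hypothetical thresholds, for illustration only).
TOY_STAGES = [
    ("membrane", lambda s: fraction(s, "AILMFVW") > 0.6),
    ("fibrous", lambda s: fraction(s, "GP") > 0.5),
    ("mixed", lambda s: fraction(s, "GP") > 0.25),
]

print(classify_cascade("GPAGPKGPE", TOY_STAGES, fallback="globular"))  # prints fibrous
```

A cascade keeps each decider binary, so each stage only has to separate one class from "everything that remains".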

Multiple sequence alignment: A fast algorithm for the simultaneous alignment of multiple sequences. Alignments are based on the determination of highly conserved oligopeptides present in each sequence. Several transformation tables are also offered to produce alignments based on specific properties of the residues (such as hydrophobicity).
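
The first step of such an approach, finding oligopeptides shared by every sequence, can be sketched as below. This is not the module's algorithm: the function name, the fixed oligopeptide length k and the two-letter hydrophobicity table are assumptions, and the later steps (choosing consistent anchors and aligning the regions between them) are not shown.

```python
def shared_oligopeptides(sequences, k, table=None):
    """Return k-mers present in every sequence, with their first position in each."""
    if table is not None:
        # Optional transformation table: recode residues by property before searching.
        sequences = [seq.translate(table) for seq in sequences]
    indexes = []
    for seq in sequences:
        first = {}
        for i in range(len(seq) - k + 1):
            first.setdefault(seq[i:i + k], i)  # keep the first occurrence only
        indexes.append(first)
    common = set(indexes[0]).intersection(*indexes[1:])
    return {kmer: [idx[kmer] for idx in indexes] for kmer in sorted(common)}

# Hypothetical transformation table: hydrophobic residues -> 'h', all others -> 'p'.
HYDROPHOBIC = "AILMFVWC"
PROPERTY_TABLE = str.maketrans(
    {aa: ("h" if aa in HYDROPHOBIC else "p") for aa in "ACDEFGHIKLMNPQRSTVWY"}
)

print(shared_oligopeptides(["MKTAYIAKQR", "GGTAYIAKLL", "TAYIAKW"], k=6))
# prints {'TAYIAK': [2, 2, 0]}
print(shared_oligopeptides(["DLIVK", "ELVIR"], k=3, table=PROPERTY_TABLE))
# prints {'hhh': [1, 1], 'hhp': [2, 2], 'phh': [0, 0]}
```

The second call shows why a transformation table is useful: the two toy sequences share no identical tripeptide, but they share the same hydrophobicity pattern.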


