A tool that predicts the coupling specificity of G-protein
coupled receptors to G-proteins

Version 2.00

General description of the Method

PRED-COUPLE2 is a method for predicting the coupling specificity of G-protein coupled receptors (GPCRs) to the four families of G-proteins (including G12/13). The method can predict coupling to more than one family of G-proteins, as exemplified by several experimentally determined promiscuous receptors.
Like its predecessor, the PRED-COUPLE2 system implements a library of refined profile Hidden Markov Models (pHMMs). These profiles were trained on the intracellular domain sequences of 188 GPCRs with known coupling properties, in a way that provides high discriminative power. Profiles that indicate coupling to G12/13 proteins have also been developed. All pHMMs were constructed and calibrated with the HMMER software package, in a way similar to the construction of signatures in the Pfam database. The HMMER package is also used to calculate scores and E-values in queries against the refined library (for a more in-depth discussion of profile HMMs and HMMER scores, see the documentation of the HMMER software package).
To produce the final prediction, scores from the 25 individual profiles are combined by a feed-forward Artificial Neural Network, originally constructed with the NevProp platform and currently implemented in the Perl script that runs the method. Scores are fed directly into the network, without applying any profile-specific cutoffs.
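The combination step can be pictured with a minimal sketch of a feed-forward pass that maps 25 profile scores to four outputs between 0 and 1. Only the number of inputs and outputs comes from the description above; the single hidden layer, its size, the sigmoid activations, the random placeholder weights and all names are assumptions made for illustration and do not reproduce the trained PRED-COUPLE2 network.

```python
import math
import random

N_PROFILES = 25  # number of pHMM scores (from the description above)
N_FAMILIES = 4   # one output per G-protein family (from the description above)
N_HIDDEN = 6     # hypothetical hidden-layer size, not stated in this document


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def forward(scores, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: raw scores -> hidden layer -> one output per family."""
    hidden = [
        sigmoid(sum(w * s for w, s in zip(row, scores)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(w_out, b_out)
    ]


# Placeholder weights drawn at random; a real network would load trained values.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_PROFILES)]
            for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [[rng.uniform(-1.0, 1.0) for _ in range(N_HIDDEN)]
         for _ in range(N_FAMILIES)]
b_out = [0.0] * N_FAMILIES

# Made-up scores, passed in directly with no per-profile cutoff.
scores = [rng.uniform(-50.0, 50.0) for _ in range(N_PROFILES)]
outputs = forward(scores, w_hidden, b_hidden, w_out, b_out)
```

Note that the scores enter the network unfiltered, which mirrors the statement that no profile-specific cutoffs are applied.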

Interpretation of results

The final outputs of the Artificial Neural Network are four numbers ranging from 0 to 1, which correspond to the posterior probabilities that the query GPCR couples to each of the four families of G-proteins. At this step, PRED-COUPLE2 applies a safe cutoff of 0.3 (exclusive), which has been found to discriminate between positive and negative predictions. Results below this limit are therefore not considered positive predictions, although they are still presented to the user. For promiscuous receptors, more than one output is expected to exceed the threshold, giving multiple predictions.
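As a small illustration of how the cutoff might be applied, the sketch below keeps every family whose output exceeds 0.3. The function name and the output values are invented for the example, "exclusive" is read here as a strict greater-than comparison, and the family labels other than G12/13 are the standard G-protein family names rather than names taken from this page.

```python
CUTOFF = 0.3  # from the text; "exclusive" is assumed to mean strictly greater than


def call_couplings(outputs, cutoff=CUTOFF):
    """Return the families whose network output exceeds the cutoff."""
    return {family: p for family, p in outputs.items() if p > cutoff}


# Invented outputs for a hypothetical promiscuous receptor.
outputs = {"Gs": 0.82, "Gi/o": 0.41, "Gq/11": 0.30, "G12/13": 0.05}
positive = call_couplings(outputs)
```

With these invented values, two families pass the cutoff, which corresponds to a multiple prediction; the two remaining values would still be reported to the user but not counted as positive.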

Filtering out non-GPCR sequences

The PRED-COUPLE2 method applies a novel approach to screen out sequences that are not GPCRs. This process has two parts that run independently before the prediction algorithm described above is executed. In the first part, results (E-values) from all 25 profiles in the refined library are combined by the QFAST algorithm (Bailey and Gribskov, 1998), regardless of their coupling selectivity. A threshold is then applied to the combined E-value of all profiles, to exclude non-GPCR sequences. For excluded sequences, the message "No matches found" is shown and the prediction is halted.
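The quantity that QFAST is generally described as computing is the probability that a product of n independent uniform p-values is at least as small as the observed product. A minimal sketch of the closed-form expression follows; it takes p-values as input, and how PRED-COUPLE2 derives them from HMMER E-values is not stated here, so that step is left out. The function name is hypothetical, and this direct formula is only an illustration of the statistic, not the QFAST implementation itself.

```python
import math


def combined_p_value(p_values):
    """P(product of n uniform p-values <= observed product).

    Closed form: p * sum_{i=0}^{n-1} (-ln p)^i / i!, where p is the product
    of the individual p-values.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    log_product = sum(math.log(p) for p in p_values)
    t = -log_product  # t = -ln(product), always >= 0

    # Accumulate sum_{i=0}^{n-1} t^i / i! term by term.
    term = 1.0
    total = 1.0
    for i in range(1, n):
        term *= t / i
        total += term

    return min(1.0, math.exp(log_product) * total)


# Two invented p-values of 0.5 give 0.25 * (1 + ln 4).
example = combined_p_value([0.5, 0.5])
```

A single p-value is returned unchanged, and the combined value grows towards 1 as more uninformative (large) p-values are added.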
In addition, profile HMMs from Pfam Version 17.0 that describe seven (7) transmembrane domain receptors or putative GPCRs are used to verify that a sequence belongs to a GPCR. Profiles that characterize the mlo family of fungal and plant receptors and the novel PTH11 family of fungal receptors have also been included in this selected library from Pfam. The hypothesis that a query sequence belongs to a GPCR is tested in a preceding step, by querying against those fifteen pHMMs. If the sequence is not recognized by any of the Pfam profiles, the message "CAUTION!!!!Probable non-GPCR sequence" is shown.
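The two checks described in this section can be summarized in a short control-flow sketch. The two message strings are taken from the text; the function name, the threshold value, the direction of the comparison, and the assumption that the Pfam check produces only a warning (rather than halting the prediction) are illustrative guesses.

```python
NO_MATCH_MSG = "No matches found"
CAUTION_MSG = "CAUTION!!!!Probable non-GPCR sequence"


def screen_sequence(combined_e_value, pfam_hits, e_value_threshold):
    """Return (proceed, messages) for a query sequence.

    combined_e_value  -- combined E-value over the 25 refined profiles
    pfam_hits         -- names of the selected Pfam profiles that matched
    e_value_threshold -- hypothetical cutoff; the real value is not given here
    """
    # Part 1: a combined E-value above the threshold halts the prediction.
    if combined_e_value > e_value_threshold:
        return False, [NO_MATCH_MSG]

    # Part 2: no Pfam hit is assumed to trigger a warning only.
    messages = []
    if not pfam_hits:
        messages.append(CAUTION_MSG)
    return True, messages


# Invented inputs: a strong combined E-value but no Pfam profile match.
proceed, messages = screen_sequence(1e-10, [], e_value_threshold=1e-3)
```

In this invented case the prediction would go ahead, accompanied by the caution message.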
More signatures for GPCRs exist (or are expected to be discovered) in Pfam and in other protein pattern and domain databases such as PROSITE, PRINTS and InterPro. We strongly suggest querying your sequence(s) against those databases before running the PRED-COUPLE2 method.

Original papers:

  • "Prediction of the coupling specificity of GPCRs to four families of G-proteins using Hidden Markov Models and Artificial Neural Networks"
    Sgourakis N.G., Bagos P.G. and Hamodrakas S.J. Submitted 2005

  • "A method for the prediction of GPCRs coupling specificity to G-proteins using refined profile Hidden Markov Models"
    Sgourakis N.G., Bagos P.G., Papasaikas P.G. and Hamodrakas S.J. BMC Bioinformatics, 6:104, 2005