Machine Learning and Neural Networks group
Dipartimento di Sistemi e Informatica
Università di Firenze
Via Santa Marta 3
50139 Firenze - Italy
Tel:+39 055 4796361
Fax:+39 055 4796363

METAL DETECTOR

Cysteines and Histidines Bonding State Predictor

A new updated version of the predictor is available at
http://metaldetector.dsi.unifi.it/v2.0

Help and References


The server: description



MetalDetector is a classifier that predicts the bonding state of cysteines and histidines, given a protein sequence. For cysteines, it discriminates three classes: disulfide bonded (D), metal bonded (M), and free (F); for histidines, only two classes are allowed: metal bonded (M) or free (F). The predictor was built by combining two different classifiers, Disulfind and Metal Ligand Predictor, in a single decision tree architecture, as depicted in the following figure:

Decision tree architecture


1st stage: Disulfind

At the first level of the decision tree, Disulfind [3] is run on the query sequence. Disulfind (an online server available at http://disulfind.dsi.unifi.it) is a binary classifier that predicts the disulfide bonding state of cysteines.

2nd stage: Metal Ligand Predictor

At the second level of the decision tree, Metal Ligand Predictor [2] is run on the same sequence. This classifier discriminates the three possible bonding states of cysteines (disulfide bonded, metal bonded, free) and the two possible bonding states of histidines (metal bonded, free).

Combining the two classifiers

The outputs of the two classifiers are then combined according to the values of two thresholds, which can be chosen by the user: a threshold TD for the disulfide class and a threshold TM for the metal class. To understand the meaning of the two thresholds, let us first assume that both are equal to 1. In this case, if Disulfind predicts the disulfide class for a certain residue, the prediction of Metal Ligand Predictor is ignored; this is done because Disulfind has a very high precision on the disulfide class. If, on the other hand, Disulfind outputs the negative class, the more probable class between M and F is predicted. Note that for histidines only Metal Ligand Predictor predictions are used, since Disulfind predicts the bonding state of cysteines only. The two thresholds allow Metal Ligand Predictor to contradict the response of Disulfind. To enhance Disulfind recall, we can accept disulfide class predictions from Metal Ligand Predictor as well, but only if the probability of class D predicted by Metal Ligand Predictor is greater than the threshold TD. In the same way, Metal Ligand Predictor can predict class M even if Disulfind has already predicted the disulfide class: the prediction is accepted only if the probability of class M exceeds the chosen threshold TM.
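The combination rule described above can be sketched in a few lines of Python. This is only an illustrative reading of the text, not the server's actual code: the function name `combine`, the dictionary of class probabilities, and the interpretation of "Metal Ligand Predictor predicts class X" as "X is its most probable class" are all assumptions.

```python
def combine(residue, disulfind_bonded, mlp_probs, td=1.0, tm=1.0):
    """Return 'D', 'M' or 'F' for a single residue (hypothetical sketch).

    residue          -- 'C' (cysteine) or 'H' (histidine)
    disulfind_bonded -- Disulfind's binary output; ignored for histidines
    mlp_probs        -- assumed dict of Metal Ligand Predictor probabilities,
                        keys 'D', 'M', 'F' for cysteines and 'M', 'F' for histidines
    td, tm           -- disulfide and metal class thresholds
    """
    if residue == "H":
        # Histidines: only Metal Ligand Predictor is consulted.
        return "M" if mlp_probs["M"] > mlp_probs["F"] else "F"

    # Most probable class according to Metal Ligand Predictor.
    top = max(mlp_probs, key=mlp_probs.get)

    if disulfind_bonded:
        # Disulfind says D; Metal Ligand Predictor may override with M
        # only if its metal probability exceeds TM.
        if top == "M" and mlp_probs["M"] > tm:
            return "M"
        return "D"

    # Disulfind says "not disulfide"; Metal Ligand Predictor may still
    # recover a D, but only if its probability exceeds TD.
    if top == "D" and mlp_probs["D"] > td:
        return "D"
    # Otherwise fall back to the more probable class between M and F.
    return "M" if mlp_probs["M"] > mlp_probs["F"] else "F"


probs = {"D": 0.1, "M": 0.8, "F": 0.1}   # made-up probabilities
print(combine("C", True, probs))          # both thresholds at 1
print(combine("C", True, probs, tm=0.7))  # lowered metal threshold
```

With both thresholds at 1 no probability can exceed them, so Disulfind's positive predictions are never overridden; lowering TM or TD is what lets the second stage change the outcome.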

Disulfide Class Threshold

Our architecture works in two stages: first the Disulfind binary predictor is called to obtain disulfide bond predictions; then Metal Ligand Predictor is run on the same sequence to predict metal binding. Disulfide class predictions made by the Metal Ligand Predictor stage are also accepted, but only if the probability of that class is greater than this threshold; otherwise they are ignored and the second most probable class predicted by Metal Ligand Predictor is output.
In practice, by setting this threshold to 1 you obtain exactly the disulfide class predictions made by Disulfind; by lowering it, you increase the recall of the disulfide class by recovering some Disulfind false negatives through Metal Ligand Predictor, at the cost of some precision on the same class.

Metal Class Threshold

When Disulfind predicts the disulfide class, the prediction can still be changed, but only if the metal class probability predicted by Metal Ligand Predictor exceeds this second threshold.
In practice, by lowering this threshold you increase the recall on the metal class, at the cost of a small loss in disulfide class performance.

Input formats

Email Address

Your email address, where the prediction will be delivered if the Email option is selected.
NOTE: Check that you typed your address correctly.

Query Name

An optional name for your query. We strongly suggest that you use one, especially if sending more than one query. The order in which you send your queries may not correspond to the order in which you receive the answers.

Amino Acid Sequence

The sequence of amino acids:

  • A bare sequence is accepted. Please do not use FASTA format.
  • Spaces, newlines and tabs are simply ignored.
  • Non-alphabetical characters cause the query to be rejected.
  • Only the one-letter amino acid code is accepted. Please do not send nucleotide sequences: if you do, A will be treated as Alanine, C as Cysteine, etc.
  • At least one Cysteine or Histidine must be present in the sequence.
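The rules above can be checked locally before submitting a query. The following is a minimal sketch, not the server's own validation code: the function name `clean_sequence` is hypothetical, and converting the sequence to upper case is an assumption, since the rules do not say how letter case is handled.

```python
def clean_sequence(raw):
    """Normalise a query sequence and check it against the rules listed above.

    Hypothetical helper: returns the cleaned sequence or raises ValueError.
    """
    # Spaces, newlines and tabs are ignored.
    # Upper-casing is an assumption, not a documented server behaviour.
    seq = "".join(raw.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    # Non-alphabetical characters (including a FASTA '>' header) are rejected.
    if not (seq.isascii() and seq.isalpha()):
        raise ValueError("non-alphabetical character in sequence")
    # At least one cysteine (C) or histidine (H) must be present.
    if "C" not in seq and "H" not in seq:
        raise ValueError("no cysteine or histidine in sequence")
    return seq


print(clean_sequence("MKT CAH\tLLG\nHC"))
```

Note that the check cannot tell a nucleotide sequence from a protein one: a string such as ACGT passes, with each letter read as an amino acid.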


Options

By varying the thresholds TD and TM in the above decision tree, different operating points in the Recall-Precision curve below are selected. You can choose one of the following three presets: high accuracy (default), high precision, or high recall for the metal class, as estimated by 5-fold cross-validation on our data set. The three preset operating points are shown in the curve below.
Recall-precision curve
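Each operating point on such a curve is a (recall, precision) pair computed for one class at one setting of the thresholds. As a reminder of how the two quantities are defined, here is a small sketch using made-up labels; the function name `precision_recall` and the toy data are illustrative assumptions and do not reproduce the cross-validation results.

```python
def precision_recall(true_labels, predicted_labels, cls="M"):
    """Precision and recall of class `cls` from per-residue labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)  # correctly predicted cls
    fp = sum(1 for t, p in pairs if t != cls and p == cls)  # predicted cls, wrongly
    fn = sum(1 for t, p in pairs if t == cls and p != cls)  # missed cls residues
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels for five residues (not real data).
true = list("MMFFD")
pred = list("MFMFD")
print(precision_recall(true, pred, cls="M"))
```

A high precision preset favours fewer false positives for the metal class, while a high recall preset favours fewer missed metal-bonded residues.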

Output

Replies are sent either by email or as an HTML page. In the latter case, after you press the send button, an intermediate page appears and refreshes automatically every 10 seconds until the final output is returned.


Dataset

The dataset used in the experiments can be downloaded here.

References

Please cite:

[1] M. Lippi, A. Passerini, M. Punta, B. Rost, and P. Frasconi. MetalDetector: a web server for predicting metal binding sites and disulfide bridges in proteins from sequence. Bioinformatics, 2008. doi: 10.1093/bioinformatics/btn371.

For Metal Ligand Predictor see also:

[2] A. Passerini, M. Punta, A. Ceroni, B. Rost, and P. Frasconi. Identifying Cysteines and Histidines in Transition-Metal-Binding Sites Using Support Vector Machines and Neural Networks. Proteins: Structure, Function, and Bioinformatics, 65:305-316, 2006.

For DISULFIND see also:

[3] A. Ceroni, A. Passerini, A. Vullo, and P. Frasconi. DISULFIND: a Disulfide Bonding State and Cysteine Connectivity Prediction Server. Nucleic Acids Research, 34(Web Server issue):W177-W181, 2006.

MetalDetector was implemented by Marco Lippi.

Copyright notice

The documents listed in this site are provided as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


© 2008 Machine Learning and Neural Networks Group.
For questions and comments: metaldetector at dsi dot unifi dot it.