Run RESCUE V 2.0 (1H & 15N)...
The assignment of the 1H spectrum of a protein or a polypeptide
is a prerequisite to the NMR study of the molecule. We present here a computer
tool, based on artificial neural network technology, which tries to
identify the type of amino acid from the values of the chemical shifts.
Artificial neural networks design:
The artificial neural networks used in this work follow a classical
perceptron design (a 3-layer network with one hidden layer), in which
the input data (chemical shifts) are presented to the input layer and the
results (the amino-acid type) are obtained from the output layer. We used
an additional fuzzy logic layer in order to encode the set of chemical
shifts to analyse on a constant number of inputs.
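As an illustration of the design described above, the following minimal sketch encodes a variable number of chemical shifts on a fixed number of inputs and passes them through a network with one hidden layer. The bin grid, the triangular membership function, the sigmoid activation and all function names are assumptions made for this example; they are not taken from RESCUE itself.

```python
import math

# Hypothetical grid of bin centres (ppm) covering a typical 1H range.
BIN_CENTRES = [0.5 * k for k in range(21)]  # 0.0, 0.5, ..., 10.0 ppm
BIN_WIDTH = 0.5  # assumed half-width of each triangular membership function

def fuzzy_encode(shifts):
    """Map any number of chemical shifts (ppm) onto a fixed-length vector.

    Each shift activates the bins close to it through a triangular
    membership function; a bin keeps the strongest activation it receives.
    """
    encoded = [0.0] * len(BIN_CENTRES)
    for shift in shifts:
        for k, centre in enumerate(BIN_CENTRES):
            membership = max(0.0, 1.0 - abs(shift - centre) / BIN_WIDTH)
            encoded[k] = max(encoded[k], membership)
    return encoded

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def perceptron(shifts, hidden_w, hidden_b, output_w, output_b):
    """Fuzzy input coding, one hidden layer, then the output layer."""
    hidden = layer(fuzzy_encode(shifts), hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)
```

Whatever the number of shifts in the spin system, `fuzzy_encode` returns a vector of the same length, which is what allows a network with a fixed input layer to be used.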
The analysis is performed in two steps: a first artificial neural network
determines in which group a given spin system falls; then, if this group
contains more than one amino acid, a second independent network, specialised
for this group, determines the amino acid more precisely.
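The two-step analysis can be sketched as follows. The networks are represented here by plain functions returning a score per label; the group names, group contents and toy outputs are hypothetical and only serve to make the example runnable.

```python
def best_label(scores):
    """Return the label with the highest network output."""
    return max(scores, key=scores.get)

def classify_spin_system(shifts, group_network, group_members, specialised_networks):
    """Two-step typing: pick a group first, then refine inside that group."""
    group = best_label(group_network(shifts))
    members = group_members[group]
    if len(members) == 1:
        # The group identifies the amino acid unambiguously.
        return members[0]
    # Otherwise a second, independent network specialised for this group decides.
    return best_label(specialised_networks[group](shifts))

# Toy stand-ins (hypothetical groups and outputs, for illustration only).
GROUP_MEMBERS = {"group_1": ["Gly"], "group_2": ["Asp", "Asn"]}

def toy_group_network(shifts):
    # Pretend that spin systems with few shifts belong to group_1.
    if len(shifts) <= 3:
        return {"group_1": 0.9, "group_2": 0.1}
    return {"group_1": 0.2, "group_2": 0.8}

TOY_SPECIALISED = {"group_2": lambda shifts: {"Asp": 0.3, "Asn": 0.7}}

print(classify_spin_system([8.3, 3.9, 3.9], toy_group_network,
                           GROUP_MEMBERS, TOY_SPECIALISED))
print(classify_spin_system([8.4, 4.7, 2.8, 2.7], toy_group_network,
                           GROUP_MEMBERS, TOY_SPECIALISED))
```

The specialised network is consulted only when the first answer is ambiguous, so single-member groups need no second network.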
Training process:
The artificial neural network used in this work was trained on a set
of chemical shifts extracted from the BioMagResBank
(BMRB) database [Seavey, B.R., et al. (1991) J. Biomol. NMR, 1,
217-236]. In this database 1H chemical shifts are referenced
to TSP or DSS, and no corrections for reference, pH or temperature bias
were applied.
Many BMRB entries were rejected for the training set (proteins or
peptides not fully assigned, proteins with a paramagnetic centre, homologous
proteins...). The retained set contains 142 different
proteins and was used as the training set when building the artificial
neural network.
Test step:
The BMRB was also used to carry out the test procedures of the artificial
neural network, with the condition that entries used for training
were not used in the test phase. Tests were made on a total of 8037 assigned
amino acid entries. RESCUE presents a mean rate of success above 90% on
the test set.
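A minimal sketch of how such a test can be scored is given below. It assumes that each test result is stored as a (true type, predicted type) pair and that the rate of success is the fraction of correct predictions; the function names and the toy data are hypothetical.

```python
from collections import defaultdict

def split_is_disjoint(training_ids, test_ids):
    """Check that no entry used for training is reused in the test phase."""
    return set(training_ids).isdisjoint(test_ids)

def success_rates(results):
    """Compute the rate of success per amino-acid type and over all residues.

    `results` is an iterable of (true_type, predicted_type) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for true_type, predicted_type in results:
        totals[true_type] += 1
        if predicted_type == true_type:
            hits[true_type] += 1
    per_type = {t: hits[t] / totals[t] for t in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return per_type, overall

# Toy data (invented for the example).
toy_results = [("Ala", "Ala"), ("Ala", "Gly"), ("Gly", "Gly"), ("Gly", "Gly")]
per_type, overall = success_rates(toy_results)
print(per_type, overall)
```

The per-type rates computed this way correspond to the kind of quantity that can later be used to weight the reliability of an answer for a given amino acid.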
Reliability:
The difference between the actual output vector O and the ideal vector
for the target amino acid t
is used to evaluate the reliability of the answer. The program computes
the quantity pt(O) from this difference, from
the variance of the ith element of the output vector,
as observed when evaluating the neural network output for all the amino acids
of type t
in the test set, and from Rt, the rate of success observed
for this amino acid. The output vector is issued to the user, as well as the
quantity pt(O) expressed as a percentage.
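To make the role of these ingredients concrete, here is a purely illustrative score built from the same three quantities (distance to the ideal vector, per-element variances, rate of success). The Gaussian-like form used below is an assumption chosen for the example and should not be read as the expression actually computed by RESCUE.

```python
import math

def reliability_percent(output, ideal, variances, success_rate):
    """Illustrative reliability score, in percent.

    ASSUMPTION: a Gaussian-like weighting of the squared deviations,
    scaled by the rate of success; this is not RESCUE's own formula.
    """
    distance = sum(
        (o - t) ** 2 / (2.0 * v)
        for o, t, v in zip(output, ideal, variances)
    )
    return 100.0 * success_rate * math.exp(-distance)

# Toy values (invented): an output close to the ideal vector, then a poorer one.
ideal = [1.0, 0.0, 0.0]
variances = [0.02, 0.02, 0.02]
print(reliability_percent([0.95, 0.05, 0.0], ideal, variances, 0.9))
print(reliability_percent([0.60, 0.30, 0.1], ideal, variances, 0.9))
```

With this kind of score, an output identical to the ideal vector yields the rate of success itself, and the value decreases as the output moves away from the ideal vector relative to the observed variances.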
This program has been called RESCUE, for RESidue prediCtion
with neUral nEtworks. It has been used to help the manual
assignment of peptides and proteins, and can also be used as a step in
an automated approach to assignment.
RESCUE V 2.0 is optimized to predict the amino-acid type
from the 15N, HN, Halpha and Hbeta chemical shifts (3D 15N TOCSY-HSQC).
It was designed in the same way as RESCUE V 1.0 (the perceptron
design with a 3-layer network, the training process, the test step and the
reliability estimate are identical to those of RESCUE V 1.0).
RESCUE V 2.0 presents a mean rate of success above 77.2% on the test
set.