Predicting secondary structures of membrane proteins with neural networks
Back-propagation, feed-forward neural networks are used to predict the secondary structures of membrane proteins whose structures are known to atomic resolution. These networks are trained on globular proteins and can predict globular protein structures having no homology to those of the training set with correlation coefficients (C) of 0.45, 0.32 and 0.43 for α-helix, beta-strand and random coil structures, respectively. When tested on membrane proteins, neural networks trained on globular proteins do, on average, correctly predict (Qi) 62%, 38% and 69% of the residues in the α-helix, beta-strand and random coil structures. These scores rank higher than those obtained with the currently used statistical methods and are comparable to those obtained with the joint approaches tested so far on membrane proteins. The lower success score for beta-strand as compared to the other structures suggests that the sample of beta-strand patterns contained in the training set is less representative than those of α-helix and random coil. Our analysis, which includes the effects of the network parameters and of the structural composition of the training set on the prediction, shows that regular patterns of secondary structures can be successfully extrapolated from globular to membrane proteins.
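The abstract above reports two per-class scores, C and Qi, without defining them. The sketch below assumes the conventions common in secondary structure prediction: Qi is the percentage of residues observed in class i that are predicted as class i, and C is the Matthews correlation coefficient for that class. The function name and the one-letter labels (H, E, C) are illustrative choices, not taken from the paper.

```python
import math

def per_class_scores(observed, predicted, label):
    """Return (Q_i in percent, Matthews correlation C) for one structure class.

    Assumed definitions (not stated in the abstract):
      Q_i = 100 * TP / (TP + FN)
      C   = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    tp = tn = fp = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs == label and pred == label:
            tp += 1
        elif obs != label and pred != label:
            tn += 1
        elif obs != label and pred == label:
            fp += 1
        else:
            fn += 1
    # Q_i is undefined when the class never occurs in the observed structure.
    q_i = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report C = 0 when any marginal count is zero.
    c = (tp * tn - fp * fn) / denom if denom else 0.0
    return q_i, c

# Toy three-state strings: H = helix, E = strand, C = coil.
observed = "HHHHEEEECCCC"
predicted = "HHHCEECCCCCC"
for state in "HEC":
    q_i, c = per_class_scores(observed, predicted, state)
    print(state, round(q_i, 1), round(c, 3))
```

Reporting both scores is useful because Qi alone rewards over-prediction of a class, whereas C also penalizes false positives.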
A stochastic and computational method for estimating the folding rates of wild type and mutant proteins
Dynamics of the Minimally Frustrated Helices Determine the Hierarchical Folding of Small Helical Proteins
In this paper we aim at determining the key residues of small helical proteins in order to build up reduced
models of the folding dynamics. We start by arguing that the folding process can be dissected into concurrent
fast and slow dynamics. The fast events are the quasiautonomous coil-to-helix transitions occurring in the
minimally frustrated initiation sites of folding in the early stages of the process. The slow processes consist in
the docking of the fluctuating helices formed in these critical sites. We show that a neural network devised to
predict native secondary structures from sequence can be used to estimate the probabilities of formation of
these helical traits as they are embedded in the protein. The resulting probabilities are shown to correlate well
with the experimental helicities measured in the same isolated peptides. The relevance of this finding to the
hierarchical character of folding is confirmed within the framework of a diffusion-collision-like mechanism.
We demonstrate that thermodynamic and topological features of these critical helices allow accurate estimation
of the folding times of five proteins that have been kinetically studied. This suggests that these critical helices
determine the fundamental events of the whole folding process. A remarkable feature of our model is that not
all of the native helices are eligible as critical helices, whereas the whole set of the native helices has been used
so far in other reconstructions of the folding mechanism. This stresses that the minimally frustrated helices of
these helical proteins comprise the minimal set of determinants of the folding process
A predictor of transmembrane alpha-helix domains of proteins based on neural networks
Back-propagation, feed-forward neural networks are used to predict alpha-helical transmembrane segments of proteins. The networks are trained on the few membrane proteins whose transmembrane alpha-helix domains are known to atomic or nearly atomic resolution. When testing is performed with a jackknife procedure on the proteins of the training set, the fraction of total correct assignments is as high as 0.87, with an average length for the transmembrane segments of 20 residues. The method correctly fails to predict any transmembrane domain for porin, whose transmembrane segments are beta-sheets. When tested on globular proteins, lower and upper limits of 1.6 and 3.5% for a total of 26826 residues are determined for the mispredicted cases, indicating that the predictor is highly specific for alpha-helical domains of membrane proteins. The predictor is also tested on 37 membrane proteins whose transmembrane topology is partially known. The overall accuracy is 0.90, two percentage points higher than that obtained with statistical methods. The reliability of the prediction is 100% for 60% of the total 18242 predicted residues of membrane proteins. Our results show that the local directional information automatically extracted by the neural networks during the training phase plays a key role in determining the accuracy of the prediction
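The jackknife test mentioned above can be sketched as a leave-one-protein-out loop: each protein is withheld in turn, a predictor is trained on the rest, and per-residue accuracy is pooled over the withheld proteins. This is a minimal sketch under stated assumptions. The `train` and `predict` callables are hypothetical placeholders, and the majority-label predictor merely stands in for the neural network, whose architecture the abstract does not specify.

```python
from collections import Counter, defaultdict

def jackknife_accuracy(proteins, train, predict):
    """Leave-one-protein-out accuracy, pooled over all residues.

    proteins: list of (sequence, labels) pairs of equal length.
    train, predict: hypothetical callables standing in for the real model.
    """
    correct = total = 0
    for i, (sequence, labels) in enumerate(proteins):
        training_set = proteins[:i] + proteins[i + 1:]  # withhold protein i
        model = train(training_set)
        predicted = predict(model, sequence)
        correct += sum(1 for p, t in zip(predicted, labels) if p == t)
        total += len(labels)
    return correct / total

def train_majority(training_set):
    """Toy stand-in model: most frequent label seen for each residue type."""
    counts = defaultdict(Counter)
    for sequence, labels in training_set:
        for residue, label in zip(sequence, labels):
            counts[residue][label] += 1
    return {res: c.most_common(1)[0][0] for res, c in counts.items()}

def predict_majority(model, sequence, default="-"):
    """Label each residue with its majority label; unseen residues get default."""
    return [model.get(residue, default) for residue in sequence]

# Toy data: 'M' marks a transmembrane residue, '-' anything else.
toy_proteins = [("LLAK", "MMM-"), ("LAKK", "MM--"), ("ALKL", "MM-M")]
print(jackknife_accuracy(toy_proteins, train_majority, predict_majority))
```

Withholding whole proteins rather than individual residues keeps neighbouring residues of the test protein out of the training set, which matters when the available structures are as few as the abstract describes.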
Neural networks to study invariant features of protein folding
Protein secondary structures result from both short-range and long-range interactions. Here neural networks are used to implement a procedure to detect regions of the protein backbone where local interactions have an overwhelming effect in determining the formation of stretches in α-helical conformation. Within the framework of a modular view of protein folding we have argued that these structures correspond to the initiation sites of folding. The hypothesis to be tested in this paper is that sequence identity, besides ensuring similarity of the three-dimensional conformation, also entails similar folding mechanisms. In particular, we compare the location and sequence variability of the initiation sites extracted from a set of proteins homologous to horse heart cytochrome c. We present evidence that the initiation sites conserve their position in the aligned sequences and exhibit lower variability in residue composition than the rest of the protein.
An entropy criterion to detect minimally frustrated intermediates in native proteins
The analysis of the information flow in a feed-forward neural network suggests that the output of the network can be used to compute a structural entropy for the sequence-to-secondary structure mapping. On this basis, we formulate a minimum entropy criterion for the identification of minimally frustrated traits with helical conformation that correspond to initiation sites of protein folding. The entropy of protein segments can be viewed as a nucleation propensity that is useful to characterize putative regions where folding is likely to be initiated with the formation of stretches of α-helices under the predominant influence of local interactions. Our procedure is successfully tested in the search for initiation sites of protein folding for which independent experimental and computational evidence exists. Our results lend support to the view that folding is a hierarchical event in which, in harmony with the minimal frustration principle, the final conformation preserves structural modules formed in the early stages of the process.
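The abstract does not give the entropy formula, so the following is only one plausible reading, offered as a sketch: normalize the three network outputs for each residue into probabilities, take their Shannon entropy, and flag windows where helix is the top-scoring class and the mean entropy is low. The window length, the threshold, and the function names are all assumptions made for illustration.

```python
import math

def residue_entropy(outputs):
    """Shannon entropy (in nats) of one residue's normalized network outputs."""
    total = sum(outputs)
    probs = [o / total for o in outputs]
    return -sum(p * math.log(p) for p in probs if p > 0)

def low_entropy_helical_windows(network_outputs, window=5, threshold=0.5):
    """Return (start, mean_entropy) for windows that look like helical nuclei.

    network_outputs: one (helix, strand, coil) activation tuple per residue.
    window and threshold are illustrative values, not taken from the paper.
    """
    hits = []
    for start in range(len(network_outputs) - window + 1):
        segment = network_outputs[start:start + window]
        mean_entropy = sum(residue_entropy(o) for o in segment) / window
        helix_everywhere = all(o[0] == max(o) for o in segment)
        if helix_everywhere and mean_entropy < threshold:
            hits.append((start, mean_entropy))
    return hits

# Five confidently helical residues followed by three ambiguous ones.
toy_outputs = [(0.9, 0.05, 0.05)] * 5 + [(0.34, 0.33, 0.33)] * 3
print(low_entropy_helical_windows(toy_outputs))
```

Under this reading, a low entropy means the network assigns the segment to one conformation almost unambiguously from local sequence alone, which matches the abstract's interpretation of entropy as a nucleation propensity.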
Predictions of protein segments with the same aminoacid sequence and different secondary structure: A benchmark for predictive methods
The most stringent test for predictive methods of protein secondary structure is whether identical short sequences that are known to be present with different conformations in different proteins known at atomic resolution can be correctly discriminated. In this study, we show that the prediction efficiency for segments of this type in unrelated proteins reaches an average accuracy per residue ranging from about 72 to 75% (depending on the alignment method used to generate the input sequence profile) only when methods of the third generation are used. A comparison of different methods based on segment statistics (2nd generation methods) and/or also including evolutionary information (3rd generation methods) indicates that the discrimination of the different conformations of identical segments depends on the method used for the prediction. Accuracy is similar among methods that perform similarly on secondary structure prediction. When evolutionary information is taken into account, as compared to single sequence input, the number of correctly discriminated pairs is increased twofold. The results also highlight the predictive capability of neural networks for identical segments whose conformation differs in different proteins. © 2000 Wiley-Liss, Inc.
Neural networks predict protein folding and structure: Artificial intelligence faces biomolecular complexity
In the genomic era DNA sequencing is increasing our knowledge of the molecular structure of genetic codes from bacteria to man at a hyperbolic rate. Billions of nucleotides and millions of amino acids are already filling the electronic files of the databases presently available, which contain a tremendous amount of information on the most biologically relevant macromolecules, such as DNA, RNA and proteins. The most urgent problem originates from the need to single out the relevant information amidst a wealth of general features. Intelligent tools are therefore needed to optimise the search. Data mining for sequence analysis in biotechnology has been substantially aided by the development of new powerful methods borrowed from the machine learning approach. In this paper we discuss the application of artificial feedforward neural networks to some fundamental problems tied to the folding process and the structure-function relationship in proteins.
