
    Structural adaptation of enzymes to low temperatures

    A systematic comparative analysis of 21 psychrophilic enzymes belonging to different structural families from prokaryotic and eukaryotic organisms is reported. The sequences of these enzymes were multiply aligned to 427 homologous proteins from mesophiles and thermophiles. The net flux of amino acid exchanges from meso/thermophilic to psychrophilic enzymes was measured. To assign the observed preferred exchanges to different structural environments, such as secondary structure, solvent accessibility and subunit interfaces, homology modeling was used to predict the secondary structure and accessibility of amino acid residues for the psychrophilic enzymes for which no experimental three-dimensional structure is available. Our results show a clear tendency for the charged residues Arg and Glu to be replaced at exposed sites on alpha-helices by Lys and Ala, respectively, in the direction from 'hot' to 'cold' enzymes. Val is replaced by Ala at buried regions in alpha-helices. Compositional analysis of psychrophilic enzymes shows a significant increase in Ala and Asn and a decrease in Arg at exposed sites. Buried sites in beta-strands tend to be depleted of Val. Possible implications of the observed structural variations for protein stability and engineering are discussed.
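The abstract does not define how the net flux of exchanges was computed, so the following is only a minimal sketch of one plausible reading: count each directed substitution X→Y across aligned warm/cold sequence pairs and subtract the count of the reverse substitution Y→X. The function name `net_exchange_flux` and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter


def net_exchange_flux(pairs):
    """Return net directed substitution counts from 'warm' to 'cold' sequences.

    `pairs` is an iterable of (warm_seq, cold_seq) strings that are already
    aligned to equal length; '-' marks a gap. This counting scheme is an
    assumption made for illustration, not the published procedure.
    """
    counts = Counter()
    for warm, cold in pairs:
        if len(warm) != len(cold):
            raise ValueError("aligned sequences must have equal length")
        for a, b in zip(warm, cold):
            # Skip gaps and conserved positions: they carry no exchange.
            if a == "-" or b == "-" or a == b:
                continue
            counts[(a, b)] += 1

    flux = {}
    for (a, b), n in counts.items():
        # Net flux = forward count minus reverse count; keep positive values.
        net = n - counts.get((b, a), 0)
        if net > 0:
            flux[(a, b)] = net
    return flux


# Toy aligned pairs (hypothetical data, not real enzyme sequences).
toy_pairs = [("REVK", "KAAK"), ("RKVE", "KRAA")]
for (src, dst), net in sorted(net_exchange_flux(toy_pairs).items()):
    print(f"{src} -> {dst}: net {net}")
```

In a fuller analysis the counts would be kept separately for each structural environment (for example exposed helix sites versus buried strand sites), which is how preferences such as Arg→Lys at exposed helical positions could be localized.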

    A databank (3D_ALI) collecting related protein sequences and structures

    An updated version of the 3D_ali databank (Pascarella and Argos, 1992) was constructed to incorporate new protein structural and sequence data acquired since the original release in 1992. The databank has proved useful in many research fields, such as protein sequence and structure analysis and comparison, protein folding, engineering and design, evolution, and the like. The collection enhances present protein structural knowledge by merging information from proteins having a similar main-chain fold with homologous primary structures taken from large databases of known sequences. However, the construction philosophy of the databank has been modified. Originally, the Protein Data Bank (PDB; Bernstein et al., 1977) of known 3-D structures was exhaustively scanned for fold redundancy, and all the possible unique structures were incorporated either in multiple structural alignment files or in those containing only one structure where no relatives could be detected. The tertiary structural superpositioning of the backbones, which yielded spatial and topological Cα atom equivalencing and thus corresponding sequence alignments, was mostly taken from the literature, but was sometimes determined by the authors using the Rossmann-Argos superposition technique (Argos and Rossmann, 1979). In the updated 3D_ali databank, only published alignments based on superpositioning by the authors of the tertiary structures were collected and only folds with more than one sample structure were considered. Different literature alignments were also merged if they included common folds. As in the former release, only full coordinate sets with assigned side chains were included, while NMR structures, excluded in the 1992 release, are now incorporated.

    Repeating structure of chick tropoelastin revealed by complementary DNA cloning.

    A cDNA library was constructed from chick aorta poly(adenylic acid)-containing RNA in the expression vector pEX1. Several clones were identified by screening the library with a polyclonal antiserum raised against chick tropoelastin and confirmed by DNA sequencing. Analysis of the deduced amino acid sequence, corresponding to the mature tropoelastin and most of the signal peptide, revealed that the molecule is composed of at least 8, and possibly 13, repeating units. The common features of each unit include an N-terminal region composed largely of alanines and lysines and ending with an aromatic amino acid, followed by a GAG span and then a C-terminal region consisting mostly of valines, prolines, and glycines often present in several copies of the sequence (VPGV). This structure is discussed in terms of the functional properties of the molecule

    Easy method to predict solvent accessibility from multiple protein sequence alignment

    An easy and uncomplicated method to predict the solvent accessibility state of a site in a multiple protein sequence alignment is described. The approach is based on amino acid exchange and compositional preference matrices for each of three accessibility states: buried, exposed, and intermediate. Calculations utilized a modified version of the 3D_ali databank, a collection of multiple sequence alignments anchored through protein tertiary structural superpositions. The technique achieves the same accuracy as much more complex methods and thus provides such advantages as computational affordability, facile updating, and easily understood residue substitution patterns useful to biochemists involved in protein engineering, design, and structural prediction. The program is available from the authors; and, due to its simplicity, the algorithm can be readily implemented on any system. For a given alignment site, a hand calculation can yield a comparative prediction

    Prediction of secondary structural content of proteins from their amino acid composition alone. I. New analytic vector decomposition methods

    The predictive limits of the amino acid composition for the secondary structural content (percentage of residues in the secondary structural states helix, sheet, and coil) in proteins are assessed quantitatively. For the first time, techniques for prediction of secondary structural content are presented which rely on the amino acid composition as the only information on the query protein. In our first method, the amino acid composition of an unknown protein is represented by the best (in a least-squares sense) linear combination of the characteristic amino acid compositions of the three secondary structural types computed from a learning set of tertiary structures. The second technique is a generalization of the first one and also takes into account possible compositional couplings between any two sorts of amino acids. Its mathematical formulation results in an eigenvalue/eigenvector problem of the second moment matrix describing the amino acid compositional fluctuations of secondary structural types in various proteins of a learning set. Possible correlations of the principal directions of the eigenspaces with physical properties of the amino acids were also checked. For example, the first two eigenvectors of the helical eigenspace correlate with the size and hydrophobicity of the residue types, respectively. As learning and test sets of tertiary structures, we utilized representative, automatically generated subsets of the Protein Data Bank (PDB) consisting of non-homologous protein structures at the resolution thresholds ≤1.8 Å, ≤2.0 Å, ≤2.5 Å, and ≤3.0 Å. We show that the consideration of compositional couplings improves prediction accuracy, albeit not dramatically.
Whereas in the self-consistency test (learning with the protein to be predicted), a clear decrease of prediction accuracy with worsening resolution is observed, the jackknife test (leave the predicted protein out) yielded best results for the largest dataset (≤3.0 Å, almost no difference to the self-consistency test!), i.e., only this set, with more than 400 proteins, is sufficient for stable computation of the parameters in the prediction function of the second method. The average absolute errors in predicting the fractions of helix, sheet, and coil from the amino acid composition of the query protein are 13.7, 12.6, and 11.4%, respectively, with r.m.s. deviations in the range of 8.6-11.8% for the 3.0 Å dataset in a jackknife test. The absolute precision of the average absolute errors is in the range of 1-3% as measured for other representative subsets of the PDB. Secondary structural content prediction methods found in the literature have been clustered in accordance with their prediction accuracies. To our surprise, much more complex secondary structure prediction methods utilized for the same purpose of secondary structural content prediction achieve prediction accuracies very similar to those of the present analytic techniques, implying that all the information beyond the amino acid composition is, in fact, mainly utilized for positioning the secondary structural state in the sequence but not for determination of the overall number of residues in a secondary structural type. This result implies that higher prediction accuracies cannot be achieved relying solely on the amino acid composition of an unknown query protein as prediction input. Our prediction program SSCP has been made available as a World Wide Web and E-mail service. © 1996 Wiley-Liss, Inc.
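The first method described above, a least-squares fit of the query composition by the characteristic compositions of helix, sheet, and coil, can be illustrated with a small sketch. It is not the SSCP program. It assumes an unconstrained fit solved through the normal equations (the abstract does not say whether the fractions are constrained), and the names `solve3` and `predict_content`, along with the four-component toy composition vectors, are hypothetical stand-ins for real 20-residue compositions.

```python
def solve3(m, v):
    """Solve a 3x3 linear system m x = v by Gaussian elimination with pivoting."""
    n = 3
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("characteristic compositions are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= factor * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - s) / a[i][i]
    return x


def predict_content(query, basis):
    """Least-squares coefficients of `query` in terms of the three basis vectors.

    `basis` maps 'helix', 'sheet', 'coil' to characteristic composition
    vectors of the same length as `query`. The coefficients are read as the
    predicted fractions of each state (an interpretive assumption).
    """
    states = ["helix", "sheet", "coil"]
    cols = [basis[s] for s in states]
    # Normal equations: (B^T B) f = B^T q
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, query)) for ci in cols]
    return dict(zip(states, solve3(gram, rhs)))


# Toy characteristic compositions over four residue classes (hypothetical).
toy_basis = {
    "helix": [0.40, 0.30, 0.20, 0.10],
    "sheet": [0.10, 0.20, 0.30, 0.40],
    "coil": [0.25, 0.15, 0.35, 0.25],
}
# A synthetic query built as 50% helix, 20% sheet, 30% coil.
toy_query = [
    0.5 * h + 0.2 * s + 0.3 * c
    for h, s, c in zip(toy_basis["helix"], toy_basis["sheet"], toy_basis["coil"])
]
print(predict_content(toy_query, toy_basis))
```

Because the synthetic query is an exact mixture of the basis vectors, the fit recovers the mixing fractions up to rounding. Real compositions are noisy, which is consistent with the errors of roughly 11-14% reported in the abstract.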