
    Transiently beneficial insertions could maintain mobile DNA sequences in variable environments

    The maintenance of mobile DNA sequences in clonal organisms has been seen as a paradox. If selfish mobile sequences spread through genomes only by overreplication in transposition, then sexuality is necessary for their spread through populations. The persistence of bacterial transposable elements without obvious dominant selectable markers has previously been explained by horizontal transfer. However, advantageous insertions of mobile DNAs are known in bacteria. Here we model maintenance of an otherwise selfish mobile DNA element in a clonal species in which selection for null mutations occurs during one of two temporally alternating environments. Large areas of parameter space permit maintenance of mobile DNAs where, without selection, they would have gone extinct. Horizontal transfer diminishes, rather than enhances, mean copy number. In finite populations, effective population sizes are greatly reduced by selective sweeps, and mean copy number can be increased as the reduced variance in copy number results in reduced selection.

    Computational prediction of short linear motifs from protein sequences

    Short Linear Motifs (SLiMs) are functional protein microdomains that typically mediate interactions between a short linear region in one protein and a globular domain in another. SLiMs usually occur in structurally disordered regions and mediate low affinity interactions. Most SLiMs are 3-15 amino acids in length and have 2-5 defined positions, making them highly likely to occur by chance and extremely difficult to identify. Nevertheless, our knowledge of SLiMs and capacity to predict them from protein sequence data using computational methods has advanced dramatically over the past decade. By considering the biological, structural, and evolutionary context of SLiM occurrences, it is possible to differentiate functional instances from chance matches in many cases and to identify new regions of proteins that have the features consistent with a SLiM-mediated interaction. Their simplicity also makes SLiMs evolutionarily labile and prone to independent origins on different sequence backgrounds through convergent evolution, which can be exploited for predicting novel SLiMs in proteins that share a function or interaction partner. In this review, we explore our current knowledge of SLiMs and how it can be applied to the task of predicting them computationally from protein sequences. Rather than focusing on specific SLiM prediction tools, we provide an overview of the methods available and concentrate on principles that should continue to be paramount even in the light of future developments. We consider the relative merits of using regular expressions or profiles for SLiM discovery and discuss the main considerations for both predicting new instances of known SLiMs and de novo prediction of novel SLiMs. In particular, we highlight the importance of correctly modelling evolutionary relationships and the probability of false positive predictions.
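
    As a rough illustration of the regular-expression approach mentioned in the abstract above, the sketch below scans a protein sequence for every occurrence of a short motif with wildcard positions. The function name `find_motif`, the pattern `P..P` and the toy sequence are illustrative assumptions, not taken from the review or from any SLiM tool.

```python
import re

def find_motif(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for every match, including overlaps."""
    # A zero-width lookahead lets the regex engine report overlapping matches,
    # which a plain finditer(pattern) would skip.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# Toy sequence invented for this example (not a real protein).
toy_sequence = "MAPSAPPLPKRTPPSPQ"

# "P..P": two defined positions separated by two wildcard positions.
for start, text in find_motif("P..P", toy_sequence):
    print(start, text)
```

    Even this 17-residue toy sequence contains several matches, which hints at why motifs with only a few defined positions produce many chance hits and need additional context to interpret.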

    BADASP: predicting functional specificity in protein families using ancestral sequences

    Burst After Duplication with Ancestral Sequence Predictions (BADASP) is a software package for identifying sites that may confer subfamily-specific biological functions in protein families following functional divergence of duplicated proteins. A given protein phylogeny is grouped into subfamilies based on orthology/paralogy relationships and/or user definitions. Ancestral sequences are then predicted from the sequence alignment and the functional specificity is calculated using variants of the Burst After Duplication method, which tests for radical amino acid substitutions following gene duplications that are subsequently conserved. Statistics are output along with subfamily groupings and ancestral sequences for easy analysis with other packages.

    Computational identification and analysis of protein short linear motifs

    Short linear motifs (SLiMs) in proteins can act as targets for proteolytic cleavage, sites of post-translational modification, determinants of sub-cellular localization, and mediators of protein-protein interactions. Computational discovery of SLiMs involves assembling a group of proteins postulated to share a potential motif, masking out residues less likely to contain such a motif, down-weighting shared motifs arising through common evolutionary descent, and calculation of statistical probabilities allowing for the multiple testing of all possible motifs. Much of the challenge for motif discovery lies in the assembly and masking of datasets of proteins likely to share motifs, since the motifs are typically short (between 3 and 10 amino acids in length), so that potential signals can be easily swamped by the noise of stochastically recurring motifs. Focusing on disordered regions of proteins, where SLiMs are predominantly found, and masking out non-conserved residues can reduce the level of noise but more work is required to improve the quality of high-throughput experimental datasets (e.g. of physical protein interactions) as input for computational discovery.

    CompariMotif: quick and easy comparisons of sequence motifs

    CompariMotif is a novel tool for making motif–motif comparisons, identifying and describing similarities between regular expression motifs. CompariMotif can identify a number of different relationships between motifs, including exact matches, variants of degenerate motifs and complex overlapping motifs. Motif relationships are scored using shared information content, allowing the best matches to be easily identified in large comparisons. Many input and search options are available, enabling a list of motifs to be compared to itself (to identify recurring motifs) or to datasets of known motifs.

    SLiMDisc: short, linear motif discovery, correcting for common evolutionary descent

    Many important interactions of proteins are facilitated by short, linear motifs (SLiMs) within a protein's primary sequence. Our aim was to establish robust methods for discovering putative functional motifs. The strongest evidence for such motifs is obtained when the same motifs occur in unrelated proteins, evolving by convergence. In practice, searches for such motifs are often swamped by motifs shared in related proteins that are identical by descent. Predictions of motifs among sets of biologically related proteins, including those both with and without detectable similarity, were made using the TEIRESIAS algorithm. The number of motif occurrences arising through common evolutionary descent was normalized based on treatment of BLAST local alignments. Motifs were ranked according to a score derived from the product of the normalized number of occurrences and the information content. The method was shown to significantly outperform methods that do not discount evolutionary relatedness, when applied to known SLiMs from a subset of the eukaryotic linear motif (ELM) database. An implementation of Multiple Spanning Tree weighting outperformed two other weighting schemes, in a variety of settings.
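
    The ranking score described in the abstract above (normalized number of occurrences multiplied by information content) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SLiMDisc implementation: the per-protein weights are taken as given (the BLAST-based normalization and spanning-tree weighting are not reproduced), and the information content formula (1 for a fixed residue, 0 for a wildcard, 1 - log20(k) for a position allowing k residues) is an assumed definition chosen for the example.

```python
import math

def information_content(motif_positions: list[str]) -> float:
    """Assumed IC: fixed residue = 1, wildcard '.' = 0, k allowed residues = 1 - log20(k)."""
    total = 0.0
    for allowed in motif_positions:
        if allowed == ".":
            continue  # wildcard positions carry no information
        total += 1.0 - math.log(len(allowed), 20)
    return total

def motif_score(motif_positions: list[str], protein_weights: list[float]) -> float:
    """Score = (sum of relatedness weights of proteins containing the motif) * IC.

    protein_weights is a hypothetical input: one weight in (0, 1] per protein
    that contains the motif, with related proteins sharing a reduced weight.
    """
    normalized_occurrences = sum(protein_weights)
    return normalized_occurrences * information_content(motif_positions)

# Motif P..P found in one unrelated protein (weight 1.0) and two close
# homologues that together count as a single occurrence (0.5 each).
print(motif_score(["P", ".", ".", "P"], [1.0, 0.5, 0.5]))
```

    With weights like these, a motif seen in three proteins counts as only two independent occurrences, which is the intent of discounting occurrences that are identical by descent.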

    Masking residues using context-specific evolutionary conservation significantly improves short linear motif discovery

    Motivation: Short linear motifs (SLiMs) are important mediators of protein–protein interactions. Their short and degenerate nature presents a challenge for computational discovery. We sought to improve SLiM discovery by incorporating evolutionary information, since SLiMs are more conserved than surrounding residues. Results: We have developed a new method that assesses the evolutionary signal of a residue in its sequence and structural context. Under-conserved residues are masked out prior to SLiM discovery, allowing incorporation into the existing statistical model employed by SLiMFinder. The method shows considerable robustness in terms of both the conservation score used for individual residues and the size of the sequence neighbourhood. Optimal parameters significantly improve return of known functional motifs from benchmarking data, raising the return of significant validated SLiMs from typical human interaction datasets from 20% to 60%, while retaining the high level of stringency needed for application to real biological data. The success of this regime indicates that it could be of general benefit to computational annotation and prediction of protein function at the sequence level.

    QSLiMFinder: improved short linear motif prediction using specific query protein data

    MOTIVATION: The sensitivity of de novo short linear motif (SLiM) prediction is limited by the number of patterns (the motif space) being assessed for enrichment. QSLiMFinder uses specific query protein information to restrict the motif space and thereby increase the sensitivity and specificity of predictions. RESULTS: QSLiMFinder was extensively benchmarked using known SLiM-containing proteins and simulated protein interaction datasets of real human proteins. Exploiting prior knowledge of a query protein likely to be involved in a SLiM-mediated interaction increased the proportion of true positives correctly returned and reduced the proportion of datasets returning a false positive prediction. The biggest improvement was seen if a short region of the query protein flanking the interaction site was known. AVAILABILITY AND IMPLEMENTATION: All the tools and data used in this study, including QSLiMFinder and the SLiMBench benchmarking software, are freely available under a GNU license as part of SLiMSuite, at: http://bioware.soton.ac.uk. CONTACT: [email protected]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    Estimation and efficient computation of the true probability of recurrence of short linear protein sequence motifs in unrelated proteins.

    Background: Large datasets of protein interactions provide a rich resource for the discovery of Short Linear Motifs (SLiMs) that recur in unrelated proteins. However, existing methods for estimating the probability of motif recurrence may be biased by the size and composition of the search dataset, such that p-value estimates from different datasets, or from motifs containing different numbers of non-wildcard positions, are not strictly comparable. Here, we develop more exact methods and explore the potential biases of computationally efficient approximations. Results: A widely used heuristic for the calculation of motif over-representation approximates motif probability by assuming that all proteins have the same length and composition. We introduce pv, which calculates the probability exactly. Secondly, the recently introduced SLiMFinder statistic, Sig, accounts for multiple testing (across all possible motifs) in motif discovery. However, it approximates the probability of all other possible motifs, occurring with a score of p or less, as being equal to p. Here, we show that the exhaustive calculation of the probability of all possible motif occurrences that are as rare or rarer than the motif of interest, Sig', may be carried out efficiently by grouping motifs of a common probability (i.e. those which have permuted orders of the same residues). Sig'v, which corrects both approximations, is shown to be uniformly distributed in a random dataset when searching for non-ambiguous motifs, indicating that it is a robust significance measure. Conclusions: A method is presented to compute exactly the true probability of a non-ambiguous short protein sequence motif, and the utility of an approximate approach for novel motif discovery across a large number of datasets is demonstrated.
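
    The equal-length, equal-composition heuristic that the abstract above describes as approximate can be sketched as a binomial tail probability. This is an illustrative simplification only: it is not the exact pv, Sig' or Sig'v calculation from the paper, it treats sequence positions as independent, and the function names, uniform residue frequencies and dataset sizes are assumptions made for the example.

```python
import math

def per_position_probability(motif: str, freqs: dict[str, float]) -> float:
    """Probability that a non-ambiguous motif matches at one fixed position."""
    p = 1.0
    for residue in motif:
        if residue != ".":  # wildcards match any residue
            p *= freqs[residue]
    return p

def protein_hit_probability(motif: str, freqs: dict[str, float], length: int) -> float:
    """Probability of at least one match in a protein of the given length."""
    p = per_position_probability(motif, freqs)
    n_positions = max(length - len(motif) + 1, 0)
    return 1.0 - (1.0 - p) ** n_positions

def recurrence_p_value(k: int, n_proteins: int, p_hit: float) -> float:
    """Binomial probability that k or more of n_proteins contain the motif."""
    return sum(
        math.comb(n_proteins, i) * p_hit**i * (1.0 - p_hit) ** (n_proteins - i)
        for i in range(k, n_proteins + 1)
    )

# Assumed uniform composition over the 20 standard amino acids.
uniform = {aa: 0.05 for aa in "ACDEFGHIKLMNPQRSTVWY"}
p_hit = protein_hit_probability("P..P", uniform, length=300)
print(recurrence_p_value(5, 20, p_hit))
```

    Because `p_hit` is computed once for a single assumed length and composition, the resulting p-value shifts with the dataset's real lengths and compositions, which is the bias the abstract's exact methods are meant to remove.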

    Metabotropic glutamate receptors: modulators of context-dependent feeding behaviour in C. elegans.

    Glutamatergic neurotransmission is evolutionarily conserved across animal phyla. A major class of glutamate receptors is the metabotropic glutamate receptors (mGluRs). In C. elegans, three mGluR genes, mgl-1, mgl-2 and mgl-3, are organised into three sub-groups, similar to their mammalian counterparts. Cellular reporters identified expression of the mgls in the nervous system of C. elegans and overlapping expression in the pharyngeal microcircuit that controls pharyngeal muscle activity and feeding behaviour. The overlapping expression of mgls within this circuit allowed investigation of receptor signalling per se and in the context of receptor interactions within a neural network that regulates feeding. We utilized the pharmacological manipulation of neuronally regulated pumping of the pharyngeal muscle in wild type and mutants to investigate mgl function. This defined a net mgl-1 dependent inhibition of pharyngeal pumping which is modulated by mgl-3 excitation. Optogenetic activation of the pharyngeal glutamatergic inputs combined with electrophysiological recordings from the isolated pharyngeal preparations provided further evidence for a presynaptic mgl-1 dependent regulation of pharyngeal activity. Analysis of mgl-1, mgl-2 and mgl-3 mutant feeding behaviour in the intact organism after acute food removal identified a significant role for mgl-1 in the regulation of an adaptive feeding response. Our data describe the molecular and cellular organisation of mgl-1, mgl-2 and mgl-3. Pharmacological analysis identified that in these paradigms mgl-1 and mgl-3, but not mgl-2, can modulate the pharyngeal microcircuit. Behavioural analysis identified mgl-1 as a significant determinant of the glutamate-dependent modulation of feeding, further highlighting the significance of mGluRs in complex C. elegans behaviour.