    Bioinformatic study of selection in animal genomes

    No full text
    Orthologous genes diverge in several respects during evolution. After reviewing the literature for evidence of divergence between human and mouse orthologs, I highlighted the different causes of this divergence. Comparing genes that diverge in function, I found no link with sequence divergence, which led me to study expression. In particular, I studied the level, the specificity, and the presence or absence of expression of human-mouse orthologs associated with Mendelian diseases. Despite the similarities found between human and mouse, I detected an expression difference specific to one of the two species that is linked to a precise phenotype (essential/non-essential gene). This allowed me to conclude that the phenotypic difference between human and mouse is better explained by expression patterns than by expression level or selection. I was also interested in the evolution of protein-coding DNA sequences, in particular the role of selection. I began with a study of the reliability of detecting positive selection when comparing divergent sequences. Using the branch-site model, I found that selection can be detected in sequences that diverged more than 500 million years ago. I analysed GC bias between sequences and found no influence on the estimation of positive selection. Finally, I believe this work is a first step toward establishing a link between selection and gene expression patterns in vertebrates.

    Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.

    Full text link
    First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as experts can. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert-curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates that machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All data are available as described in the supplementary material.

    JelisavetaDjordjevic/Sex_biased_gene_expression: Version 1.1

    No full text
    Version corresponding to the PCI peer reviewed manuscript (https://doi.org/10.24072/pci.evolbiol.100135) Djordjevic J, Dumas Z, Robinson-Rechavi M, Schwander T, Parker DJ (2021) Dynamics of sex-biased gene expression during development in the stick insect Timema californicum. bioRxiv, 2021.01.23.427895, ver. 6 peer-reviewed and recommended by Peer community in Evolutionary Biology

    Comparative modular analysis of gene expression in vertebrate development

    No full text
    The focus of my PhD research was the concept of modularity. In the last 15 years, modularity has become a classic term in different fields of biology. On the conceptual level, a module is a set of interacting elements that remain mostly independent from the elements outside of the module. I used modular analysis techniques to study gene expression evolution in vertebrates. In particular, I identified "natural" modules of gene expression in mouse and human, and I showed that expression of organ-specific and system-specific genes tends to be conserved between such distant vertebrates as mammals and fishes. Also with a modular approach, I studied patterns of developmental constraints on transcriptome evolution. I showed that neither of the two commonly accepted models of the evolution of embryonic development ("evo-devo") is exclusively valid. In particular, I found that the conservation of the sequences of regulatory regions is highest during mid-development of zebrafish, and thus it supports the "hourglass model". In contrast, events of gene duplication and new gene introduction are rarest in early development, which supports the "early conservation model". In addition to the biological insights on transcriptome evolution, I have also discussed in detail the advantages of modular approaches in large-scale data analysis. Moreover, I re-analyzed several studies (published in high-ranking journals), and showed that their conclusions do not hold up under a detailed analysis. This demonstrates that complex analysis of high-throughput data requires co-operation between biologists, bioinformaticians, and statisticians.

    Evolution of proteins after whole-genome duplication

    No full text
    Gene and genome duplication are considered major mechanisms in the creation of new functions in genomes, or in the refinement of networks by the division of function among more genes. In animals, the best demonstrated whole genome duplication occurred at the origin of Teleost fishes. This makes fishes an ideal model to study the consequences of genome duplication, particularly since we have a good sampling of genome sequences, abundant functional information, and a very well studied outgroup: the tetrapods (including human). More specifically, I studied the consequences of duplication on proteins using evolutionary models to infer adaptive events. I analysed the influence of positive selection in vertebrate genes, by contrasting singleton genes and duplicated genes. The conclusion of the analyses was threefold: (i) positive selection affects diverse phylogenetic branches and diverse gene categories during vertebrate evolution; (ii) it concerns only a small proportion of sites (1%-5%); and (iii) whole genome duplication had no detectable impact on the prevalence of this positive selection. I also studied evolution at the amino acid level with different methods to detect functional shifts (covarion process and constant-but-different process). As in my previous research, I found similar numbers of functional shifts between duplicates and between orthologs. The accepted framework for studies of molecular evolution is that orthologs share the same function, whereas the function of paralogs diverges. This framework gives a special place to gene duplication in evolution, as the main mechanism for generating novelty. With my previous results showing that duplication and speciation are not so different, we investigated the literature to question the evidence for similar or divergent evolution of gene function after duplication relative to speciation genes. This led us to propose a more rigorous design of future studies of gene duplication. Finally, based on my automated protocol, we built a database of positive selection in vertebrates' genes, Selectome. This database is freely available on the web and will help future evolutionary as well as biochemical studies.

    DarrenJParker/Timema_convergent_gene_expression: version 1

    No full text
    This is the version used to produce the results in: Parker, D. J., Bast, J., Jalvingh, K., Dumas, Z., Robinson-Rechavi, M., Schwander, T. 2018. Repeated evolution of asexuality involves convergent gene expression changes. Molecular Biology and Evolution. msy217, https://doi.org/10.1093/molbev/msy217.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    Full text link
    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed

    Combining molecular information on chromatin organisation with eQTLs and evolutionary conservation provides strong candidates for the evolution of gene regulation in mammalian brains

    No full text
    Cite as: Robinson-Rechavi, M. 2017. Combining molecular information on chromatin organisation with eQTLs and evolutionary conservation provides strong candidates for the evolution of gene regulation in mammalian brains. Peer Community in Evolutionary Biology, 100035. In this manuscript [1], Francisco J. Novo proposes candidate non-coding genomic elements regulating neurodevelopmental genes. What is very nice about this study is the way in which public molecular data, including physical interaction data, is used to leverage recent advances in our understanding of molecular mechanisms of gene regulation in an evolutionary context. More specifically, evolutionarily conserved non-coding sequences are combined with enhancers from the FANTOM5 project, DNAse hypersensitive sites, chromatin segmentation, ChIP-seq of transcription factors and of p300, gene expression and eQTLs from GTEx, and physical interactions from several Hi-C datasets. The candidate regulatory regions thus identified are linked to candidate regulated genes, and the author shows their potential implication in brain development.

    Variations on the Author

    Full text link
    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.