
    The genome of the tardigrade Hypsibius dujardini

    These data files accompany the bioRxiv preprint "The genome of the tardigrade Hypsibius dujardini"
    Edinburgh genome assembly and annotation
    ========================================
    1. nHd.2.3.abv500.fna.gz - Edinburgh (EDI) genome assembly version 2.3. Reads were assembled as single-end with CLC to calculate the insert size distributions of the libraries and check for contaminants. Insert size distributions are calculated by mapping the reads back to the assembly with CLC. The MP library insert distribution wasn't normally distributed. The single-end assembly is checked for contamination using the blobtools software package which creates a TAGC plot. Inspection of the TAGC plot revealed multiple contaminations with distinct coverage and GC content that did not have a reference genome in public databases. The PE reads were normalised with one-pass khmer and were assembled with Velvet using a k-mer size of 55. Contaminants in the Velvet assembly were identified based on the coverage and GC of the scaffolds. The non-normalised reads were mapped to the assembly using CLC and reads were removed if either pair mapped to a contig identified as contaminant. The process was repeated two more times since newly assembled contaminants could be identified. Gaps were filled in the final assembly using GapFiller. Finally the MP library was used to scaffold the gap-filled assembly with SSPACE, accepting only the information from reads mapping 2 kb from the ends of the scaffolds. The final assembly spans 140 megabases (Mb) with median coverage of 86X.
    2. nHd.2.3.1.aug.gff.gz - Gene model GFF file as predicted by Augustus for nHd.2.3 genome assembly. This is Augustus run as a second pass annotation (using transcriptome assembly as evidence) after a first pass Maker (see below)
    3. nHd.2.3.1.aug.proteins.fasta.gz - Protein fasta file generated by Augustus for nHd.2.3 genome assembly.
    4. nHd.2.3.1.aug.transcripts.fasta.gz - Transcript CDS fasta file generated by Augustus for nHd.2.3 genome assembly.
    Edinburgh genome assembly and annotation - intermediate files
    =============================================================
    1. nHd.1.0.contigs.cov.fna.gz - Preliminary assembly of all data, without any contamination screening
    2. maker1.gff3.gz - Gene model GFF file as generated by MAKER run as a first pass to generate enough genes to train genefinders more thoroughly
    3. all.maker.proteins.edit.fasta.gz - Protein fasta file generated by MAKER run as a first pass.
    4. all.maker.transcripts.edit.fasta.gz - Transcript CDS file generated by MAKER run as a first pass.
    Blob plots
    ==========
    1. nHd.2.3.nHd_lib350-cov.BlobDB.json.gz - A blobDB (a JSON file generated using the blobtools package) which contains mapping, assembly and taxonomic information for the Edinburgh assembly and our read data. http://drl.github.io/blobtools/
    2. nHd.1.0.BlobDB.json.gz - A blobDB (a JSON file generated using the blobtools package) which contains mapping, assembly and taxonomic information for the Edinburgh preliminary assembly nHd.1.0 and Edinburgh read data. http://drl.github.io/blobtools/
    3. unc.TG-cov.BlobDB.json.gz - A blobDB (a JSON file generated using the blobtools package) which contains mapping, assembly and taxonomic information for the UNC assembly and their read data. http://drl.github.io/blobtools/
    4. unc.nHd-cov.uniref.nt.BlobDB.json.gz - A blobDB (a JSON file generated using the blobtools package) which contains mapping, assembly and taxonomic information for the UNC assembly and the Edinburgh read data. http://drl.github.io/blobtools/
    5. tardi_RNASeq.vs.unc.bam.reads_cov.catcolour.txt.gz - Space delimited text file with classification of each UNC scaffold by avg coverage of each base by PolyA-selected RNAseq reads
    6. tardi_RNASeq.vs.nHd.2.3.bam.reads_cov.catcolour.txt.gz - Space delimited text file with classification of each Edinburgh scaffold by avg coverage of each base by PolyA-selected RNAseq reads
    H dujardini transcriptome data
    ==============================
    1. Trinity.fasta.c99.gz - Preliminary transcriptome assembly by Itai Yanai's lab. Please do not use in any publications without checking with yanailab.technion.ac.il first
    Abstract of bioRxiv paper at http://dx.doi.org/10.1101/033464
    ======================================
    The genome of the tardigrade Hypsibius dujardini
    ======================================
    Background: Tardigrades are meiofaunal ecdysozoans that may be key to understanding the origins of Arthropoda. Many species of Tardigrada can survive extreme conditions through adoption of a cryptobiotic state. A recent high profile paper suggested that the genome of a model tardigrade, Hypsibius dujardini, has been shaped by unprecedented levels of horizontal gene transfer (HGT) encompassing 17% of protein coding genes, and speculated that this was likely formative in the evolution of stress resistance. We tested these findings using an independently sequenced and assembled genome of H. dujardini, derived from the same original culture isolate.
    Results: Whole-organism sampling of meiofaunal species will perforce include gut and surface microbiotal contamination, and our raw data contained bacterial and algal sequences. Careful filtering generated a cleaned H. dujardini genome assembly, validated and annotated with GSSs, ESTs and RNA-Seq data, with superior assembly metrics compared to the published, HGT-rich assembly. A small amount of additional microbial contamination likely remains in our 135 Mb assembly. Our assembly length fits well with multiple empirical measurements of H. dujardini genome size, and is 120 Mb shorter than the HGT-rich version. Among 23,021 protein coding gene predictions we found 216 genes (0.9%) with similarity to prokaryotes, 196 of which were expressed, suggestive of HGT. We also identified ~400 genes (<2%) that could be HGT from other non-metazoan eukaryotes. Cross-comparison of the assemblies, using raw read and RNA-Seq data, confirmed that the overwhelming majority of the putative HGT candidates in the previous genome were predicted from scaffolds at very low coverage and were not transcribed. Crucially much of the natural contamination in both projects was non-overlapping, confirming it as foreign to the shared target animal genome.
    Conclusions: We find no support for massive horizontal gene transfer into the genome of H. dujardini. Many of the bacterial sequences in the previously published genome were not present in our raw reads. In construction of our assembly we removed most, but still not all, contamination with approaches derived from metagenomics, which we show are very appropriate for meiofaunal species. We conclude that HGT into H. dujardini accounts for 1-2% of genes and that the proposal that 17% of tardigrade genes originate from HGT events is an artefact of undetected contamination.

    No evidence for extensive horizontal gene transfer in the genome of the tardigrade Hypsibius dujardini

    These files accompany the peer-reviewed version of http://dx.doi.org/10.1101/033464. A previous dataset, https://zenodo.org/record/45162, accompanied the version of this manuscript at bioRxiv (biorxiv.org/content/early/2015/12/13/033464). This dataset includes all files from https://zenodo.org/record/45162 plus all the Supplemental files, and one additional file, HGT_phylogenetic_files.tgz. All files are described in Hypsibius_dujardini_files_README.md.
    Abstract: Tardigrades are meiofaunal ecdysozoans that are key to understanding the origins of Arthropoda. Many species of Tardigrada can survive extreme conditions through cryptobiosis. In a recent paper (Boothby TC et al (2015) Evidence for extensive horizontal gene transfer from the draft genome of a tardigrade. Proc Natl Acad Sci USA 112:15976-15981) the authors concluded that the tardigrade Hypsibius dujardini had an unprecedented proportion (17%) of genes originating through functional horizontal gene transfer (fHGT), and speculated that fHGT was likely formative in the evolution of cryptobiosis. We independently sequenced the genome of H. dujardini. As expected from whole-organism DNA sampling, our raw data contained reads from non-target genomes. Filtering using metagenomics approaches generated a draft H. dujardini genome assembly of 135 Mb with superior assembly metrics to the previously published assembly. Additional microbial contamination likely remains. We found no support for extensive fHGT. Among 23,021 gene predictions we identified 0.2% strong candidates for fHGT from bacteria, and 0.2% strong candidates for fHGT from non-metazoan eukaryotes. Cross-comparison of assemblies showed that the overwhelming majority of HGT candidates in the Boothby et al. genome derived from contaminants. We conclude that fHGT into H. dujardini accounts for at most 1-2% of genes and that the proposal that one sixth of tardigrade genes originate from functional HGT events is an artefact of undetected contamination

    Defining operational taxonomic units using DNA barcode data

    The scale of diversity of life on this planet is a significant challenge for any scientific programme hoping to produce a complete catalogue, whatever means is used. For DNA barcoding studies, this difficulty is compounded by the realization that any chosen barcode sequence is not the gene 'for' speciation and that taxa have evolutionary histories. How are we to disentangle the confounding effects of reticulate population genetic processes? Using the DNA barcode data from meiofaunal surveys, here we discuss the benefits of treating the taxa defined by barcodes without reference to their correspondence to 'species', and suggest that using this non-idealist approach facilitates access to taxon groups that are not accessible to other methods of enumeration and classification. Major issues remain, in particular the methodologies for taxon discrimination in DNA barcode data

    A conserved set of maternal genes? Insights from a molluscan transcriptome

    The early animal embryo is entirely reliant on maternal gene products for a ‘jump-start’ that transforms a transcriptionally inactive embryo into a fully functioning zygote. Despite extensive work on model species, it has not been possible to perform a comprehensive comparison of maternally-provisioned transcripts across the Bilateria because of the absence of a suitable dataset from the Lophotrochozoa. As part of an ongoing effort to identify the maternal gene that determines left-right asymmetry in snails, we have generated transcriptome data from 1 to 2-cell and ~32-cell pond snail (Lymnaea stagnalis) embryos. Here, we compare these data to maternal transcript datasets from other bilaterian metazoan groups, including representatives of the Ecdysozoa and Deuterostomia. We found that between 5 and 10% of all L. stagnalis maternal transcripts (~300-400 genes) are also present in the equivalent arthropod (Drosophila melanogaster), nematode (Caenorhabditis elegans), urochordate (Ciona intestinalis) and chordate (Homo sapiens, Mus musculus, Danio rerio) datasets. While the majority of these conserved maternal transcripts (“COMATs”) have housekeeping gene functions, they are a non-random subset of all housekeeping genes, with an overrepresentation of functions associated with nucleotide binding, protein degradation and activities associated with the cell cycle. We conclude that a conserved set of maternal transcripts and their associated functions may be a necessary starting point of early development in the Bilateria. For the wider community interested in discovering conservation of gene expression in early bilaterian development, the list of putative COMATs may be a useful resource

    Bacterial colonization and weathering of terrestrial obsidian in Iceland

    Through weathering processes, volcanic rocks contribute both to nutrient flux into the biosphere and atmospheric CO2 drawdown. As rhyolitic rocks are of higher silica content and have lower concentrations of biologically-important elements than basalts they might be expected to be less easily weathered by a biota. Investigations on the microbial diversity and weathering of silica-rich rhyolitic glass (obsidian) from a lava flow in Iceland are reported. 16S rDNA analysis of rock whole genome DNA shows that the rock hosts remarkable eubacterial diversity. Irregular pitted weathering textures correspond to regions of eubacterial colonization as shown by FISH. Weathering processes proceed at alteration fronts, with a preference for potentially nutrient-rich regions containing plagioclase and pyroxene crystals, although these features are less well defined than those previously reported from basaltic glass, consistent with the lower rates of chemical weathering previously reported for rhyolites compared to basalts. In-vitro weathering of the rock was tested by culturing in the laboratory resulting in a biofilm examined by FIB-SEM. This biofilm contained a population consisting of one dominant organism that did not correspond to any sequence in the environmental 16S rDNA analysis, showing that laboratory weathering experiments are unrepresentative of the potential complexity of prokaryotic weathering in nature

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
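    The two counting schemes contrasted in this abstract can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function name `cocitation_counts`, the `max_authors` parameter, and the toy reference lists are all hypothetical, and the sketch assumes that authors of the same cited work also count as co-cited with each other.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often each pair of authors is cited together.

    citing_papers: one reference list per citing paper; each reference
        is a list of author names in byline order (hypothetical format).
    max_authors: how many leading authors of each cited work are credited
        (1 = traditional first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Toy data: two citing papers, each with two references (illustrative only).
papers = [
    [["Smith", "Jones"], ["Lee", "Kim"]],
    [["Smith"], ["Kim", "Lee"]],
]
first_only = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

    With first-author counting the Smith and Kim pair is seen only once in this toy data, whereas crediting up to five authors links them in both citing papers, which is the kind of denser co-citation signal the abstract attributes to the non-traditional scheme.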

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship

    Transcriptional profiling of shell calcification in bivalves

    Mollusc shells are unique adaptations that serve to protect the organisms that make them, and are a defining feature of the phylum. However the molecular underpinnings of shell forming processes are still largely unexplored. To further understand mollusc shell formation, I studied three bivalve species in this project: the blue mussel Mytilus edulis, the Pacific oyster Crassostrea gigas, and the king scallop Pecten maximus. While previous analyses of the shell proteomes showed species specificity, transcriptomes of the mantle tissues revealed more commonalities. To reconcile these differences, I studied differential gene expression in shell damage-repair experiments and during the formation of the first larval shell, to produce a comprehensive overview of shell formation processes. Expression data showed large biological variability between individuals, requiring matched-pair experimental designs to detect differential gene expression during shell repair. Loci differentially expressed during shell repair and in the larvae encoded shell matrix proteins, transmembrane transporters, and novel transcripts. A large number of shell matrix proteins, encoded in differentially expressed loci, were common in all three species during shell formation, indicating that shell forming proteins between different species may be more common than previously thought. Differential expression of transmembrane transporters during shell repair indicated that the animals may be regulating bicarbonate ions during shell formation. Finally, the experiments revealed novel transcripts, with unknown annotations to public datasets, that may putatively be involved in shell formation

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
    Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
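    To make the contrast between the Pearson correlation and the cosine concrete, here is a minimal sketch using made-up cocitation profiles. It illustrates one property often raised in this debate, that padding both profiles with zeros (authors never cocited with either) leaves the cosine unchanged but shifts the Pearson correlation. It is not a reproduction of the paper's own argument or example, and it does not cover the two other measures the abstract mentions, which are not named here.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm


def pearson(x, y):
    """Pearson correlation, computed as the cosine of mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles of two authors against four other authors.
profile_a = [10, 8, 1, 0]
profile_b = [9, 7, 0, 2]

# The same profiles after adding six authors never cocited with either one.
padded_a = profile_a + [0] * 6
padded_b = profile_b + [0] * 6

cosine_before, cosine_after = cosine(profile_a, profile_b), cosine(padded_a, padded_b)
pearson_before, pearson_after = pearson(profile_a, profile_b), pearson(padded_a, padded_b)
```

    In this toy case the two cosine values are identical while the two Pearson values differ, because mean-centring depends on how many zero entries the profiles contain.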