
    SUPERSEDED - Hi-C genomes from "Assembly of hundreds of microbial genomes from the cow rumen reveals novel microbial species encoding enzymes with roles in carbohydrate metabolism"

    ## This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2296. ## The cow rumen is a specialised organ adapted for the efficient breakdown of plant material into energy and nutrients, and it is largely the rumen microbiome that encodes the enzymes responsible. Many of these enzymes are of significant industrial interest. Despite this, rumen microbes are under-represented in public databases. Here we present 283 draft bacterial and archaeal genomes assembled directly from over 800 gigabases of rumen metagenomic sequence data and 43 samples, using both metagenomic binning and Hi-C-based Proximity-Guided Assembly. Comparative analysis with current publicly available genomes reveals that the majority of these represent previously unsequenced strains and species of bacteria and archaea. The genomes contain over 16,000 proteins predicted to be involved in carbohydrate metabolism, over 90% of which do not have a good match in public databases. Inclusion of the 283 genomes presented here improves metagenomic read classification by 2-3-fold, both in our data and in other publicly available rumen datasets. This release improves the coverage of rumen microbes in the public databases, and represents a highly valuable resource for biomass-degrading enzyme discovery and studies of the rumen microbiome

    SUPERSEDED - Assembly of hundreds of microbial genomes from the cow rumen reveals novel microbial species encoding enzymes with roles in carbohydrate metabolism

    ## This item has been replaced by the one which can be found at https://doi.org/10.7488/ds/2296. ## The cow rumen is a specialised organ adapted for the efficient breakdown of plant material into energy and nutrients, and it is the rumen microbiome that encodes the enzymes responsible. Many of these enzymes are of huge industrial interest. Despite this, rumen microbes are under-represented in the public databases. Here we present 220 high quality bacterial and archaeal genomes assembled directly from 768 gigabases of rumen metagenomic sequence data. Comparative analysis with current publicly available genomes reveals that the majority of these represent previously unsequenced strains and species of bacteria and archaea. The genomes contain over 13,000 proteins predicted to be involved in carbohydrate metabolism, over 90% of which do not have a good match in the public databases. Inclusion of the 220 genomes presented here improves metagenomic read classification by 2-3-fold, both in our data and in other publicly available rumen datasets. This release improves the coverage of rumen microbes in the public databases, and represents a hugely valuable resource for biomass-degrading enzyme discovery and studies of the rumen microbiome

    Chromosome level genome assembly and comparative genomics between three falcon species reveals a pattern of genome organization not typical for birds

    Whole genome assemblies are crucial for understanding a wide range of aspects of falcon biology, including morphology, ecology and physiology, and are thus essential for their care and conservation. A key aspect of the genome of any species is its karyotype, the arrangement of its chromosomes, which can then be linked to the whole genome sequence to generate a so-called chromosome level assembly. Chromosome level assemblies are essential for marker assisted selection and genotype-phenotype correlations in breeding regimes as well as determining patterns of gross genomic evolution. To date only two falcon species have been sequenced and neither initially to chromosome level. Falcons have atypical avian karyotypes with fewer chromosomes than other birds, presumably brought about by wholesale fusion. To date, however, published chromosome preparations are of poor quality, few chromosomes have been distinguished and standard ideograms have not been made. The purpose of this study was to generate analyzable karyotypes and ideograms of peregrine, saker and gyr falcons, report on our recent generation of chromosome level sequence assemblies of peregrine and saker falcons, and for the first time sequence the gyr falcon genome. Finally, we aimed to generate comparative genomic data between all three species and the reference chicken genome. Results revealed a diploid number of 2n=50 for peregrine falcon and 2n=52 for saker and gyr through high quality banded chromosomes. Standard ideograms generated here helped to map predicted chromosomal fragments (PCFs) from the genome sequences directly to chromosomes and thus generate chromosome level sequence assemblies for peregrine and saker falcons. Whole genome sequencing was successful in gyr falcon but read depth and coverage were not sufficient to generate a chromosome level assembly. Nonetheless, comparative genomics revealed no differences in genome organization between gyr and saker falcons. When compared to peregrine falcon, saker/gyr differed by 1 interchromosomal and 7 intrachromosomal rearrangements (a fusion plus 7 inversions), whereas peregrine and saker/gyr differ from the reference chicken genome by 14/13 fusions (11 microchromosomal) and 6 fissions. The chromosomal differences between the species could possibly provide the basis of a screening test for hybrid animals. We have preserved these partial assemblies here for future use. The final assemblies for this genome will be submitted to the nucleotide archives GenBank/ENA. Files: falcon.contig.fasta.gz - version 0.1 of the assembly (contigs); falcon_contigs_260911.fasta.gz - version 0.2 of the assembly (contigs); falcon_scaffolds_260911.fasta.gz - version 0.2 of the assembly (scaffolds)

    Open prediction of polysaccharide utilisation loci (PUL) in 5414 public Bacteroidetes genomes using PULpy

    Polysaccharide utilisation loci (PUL) are regions within the genomes of Bacteroidetes that encode all the necessary machinery for the cleavage of particular carbohydrates. Prediction of PUL from genomic data alone involves the identification of carbohydrate-active enzymes (CAZymes) co-localised with susCD gene pairs. Here we present the open prediction of PUL in 5414 public Bacteroidetes genomes, and an open-source pipeline to reproduce or extend the results. The PULpy code "Open prediction of Polysaccharide Utilisation Loci (PUL)" can be obtained via GitHub as documented in the attached README.txt file. Files: samples.tsv describes the 5414 genomes used in the analysis; all.pul.tsv is a data table providing one row of information for each gene within each PUL prediction; sum.pul.tsv is a data table providing one row of information for each PU
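The co-localisation idea in the abstract above (CAZymes near a susCD gene pair) can be illustrated with a minimal sketch. This is not the PULpy implementation: the function name `find_pul_candidates`, the annotation labels, and the fixed `window` of neighbouring genes are all assumptions made for illustration, and strand and intergenic distance are ignored.

```python
def find_pul_candidates(genes, window=5):
    """Return candidate loci: an adjacent susC/susD pair with a CAZyme nearby.

    genes: list of (gene_id, annotation) tuples in genome order, where the
    annotation is one of "susC", "susD", "CAZyme" or "other" (hypothetical
    labels). window: how many genes either side of the pair to scan
    (an illustrative assumption, not a published threshold).
    """
    candidates = []
    for i in range(len(genes) - 1):
        # Look for a susC gene directly next to a susD gene, in either order.
        if {genes[i][1], genes[i + 1][1]} != {"susC", "susD"}:
            continue
        lo = max(0, i - window)
        hi = min(len(genes), i + 2 + window)
        # Collect CAZymes within the window around the pair.
        cazymes = [gid for gid, ann in genes[lo:hi] if ann == "CAZyme"]
        if cazymes:
            candidates.append(
                {"susCD": (genes[i][0], genes[i + 1][0]), "cazymes": cazymes}
            )
    return candidates


# Toy contig with made-up gene identifiers.
toy_genes = [
    ("g1", "other"),
    ("g2", "CAZyme"),
    ("g3", "susC"),
    ("g4", "susD"),
    ("g5", "other"),
    ("g6", "CAZyme"),
]
toy_candidates = find_pul_candidates(toy_genes, window=2)
```

A real pipeline would first annotate genes against CAZyme and susC/susD profile models; the sketch only covers the final co-localisation step.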

    Assembly of 913 microbial genomes from metagenomic sequencing of the cow rumen

    This dataset represents 913 draft bacterial and archaeal genomes assembled from over 800 gigabases of rumen metagenomic sequence data derived from 43 Scottish cattle, using both metagenomic binning and Hi-C-based proximity-guided assembly. Most of these genomes represent previously unsequenced strains and species. The draft genomes contain over 1.2 million predicted protein sequences, and 69,000 proteins predicted to be involved in carbohydrate metabolism. ## Relation to earlier versions ## This data is referenced by Watson et al. (In Submission). A previous paper, on bioRxiv, referenced the earlier dataset "Assembly of hundreds of microbial genomes from the cow rumen reveals novel microbial species encoding enzymes with roles in carbohydrate metabolism" https://datashare.is.ed.ac.uk/handle/10283/2772. This in turn was superseded by the more recent version, Hi-C genomes from "Assembly of hundreds of microbial genomes from the cow rumen reveals novel microbial species encoding enzymes with roles in carbohydrate metabolism" https://datashare.is.ed.ac.uk/handle/10283/2911. The paper underwent many rounds of review: the first-round revised paper referenced the second (Hi-C) dataset, and the final, accepted version will reference the DOI of this dataset. The datasets changed in nature and in name during this process

    Simulated metagenomic dataset for Smith et al. 2022

    This dataset is simulated metagenomic data created by Rebecca (Becky) Smith, PhD student at the Roslin Institute in Mick Watson's group. The data are described in detail in Smith et al. 2022; briefly, the reads were simulated using InSilicoSeq (https://doi.org/10.1093/bioinformatics/bty630) with the hiseq exponential model and a read length of 150 bp. The genomes used to create this data are from the Hungate Collection (paper at https://www.nature.com/articles/nbt.4110 and sequences at https://genome.jgi.doe.gov/portal/HungateCollection/HungateCollection.info.html). Citation: Smith, Rebecca; Watson, Mick. (2022). Simulated metagenomic dataset for Smith et al. 2022, [dataset]. University of Edinburgh. The Roslin Institute. https://doi.org/10.7488/ds/3444

    Chicken cecal metagenome assembled genomes

    We sequenced DNA from cecal contents samples taken from 24 chickens belonging to either a fast or slower growing breed consuming either a vegetable-only diet or a diet containing fish meal. We utilised 1.6T of Illumina data to construct 469 draft metagenome-assembled bacterial genomes, including 460 novel strains, 283 novel species and 42 novel genera. Citation: Glendinning, Laura; Watson, Kellie; Pallen, Mark; Stewart, Robert; Watson, Mick. (2019). Chicken cecal metagenome assembled genomes, [dataset]. University of Edinburgh. The Roslin Institute. https://doi.org/10.7488/ds/258

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
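The two counting schemes compared in the abstract above can be sketched as follows. This is an illustrative reconstruction, not the study's code: the function name `cocitation_counts`, the input layout and the toy author names are assumptions, and the sketch makes the simplifying choice of also counting co-authors of one cited work as co-cited.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, max_authors=1):
    """Count how often each pair of authors is cited by the same paper.

    reference_lists: one entry per citing paper; each entry is a list of
    cited works, and each cited work is an ordered list of author names.
    max_authors=1 reproduces first-author counting; max_authors=5 credits
    the first five authors of every cited work.
    """
    counts = Counter()
    for cited_works in reference_lists:
        # Authors credited by this citing paper, each counted once.
        credited = set()
        for authors in cited_works:
            credited.update(authors[:max_authors])
        # Every unordered pair of credited authors is co-cited once.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Two toy citing papers with made-up reference lists.
toy_papers = [
    [["White", "Griffith"], ["Small"]],
    [["White", "McCain"], ["Small", "Griffith"]],
]
first_author_only = cocitation_counts(toy_papers, max_authors=1)
first_five_authors = cocitation_counts(toy_papers, max_authors=5)
```

In this toy input, first-author counting sees only the White and Small pair, while five-author counting also links Griffith and McCain, which is the kind of denser author grouping the abstract describes.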

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship