    The challenge of genome sequence assembly

    Background: Although whole genome sequencing is enabling numerous advances in many fields, achieving complete chromosome-level sequence assemblies for diverse species presents difficulties. The problems in part reflect the limitations of current sequencing technologies. Chromosome assembly from ‘short read’ sequence data is confounded by the presence of repetitive genome regions with numerous similar sequence tracts which cannot be accurately positioned in the assembled sequence. Longer sequence reads often have higher error rates and may still be too short to span the larger gaps between contigs. Objective: Given the emergence of exciting new applications using sequencing technology, such as the Earth BioGenome Project, it is necessary to further develop and apply a range of strategies to achieve robust chromosome-level sequence assembly. Reviewed here are a range of methods to enhance assembly which include the use of cross-species synteny to understand relationships between sequence contigs, the development of independent genetic and/or physical scaffold maps as frameworks for assembly (for example, radiation hybrid, optical motif and chromatin interaction maps) and the use of patterns of linkage disequilibrium to help position, orient and locate contigs. Results and Conclusion: A range of methods exist which might be further developed to facilitate cost-effective large-scale sequence assembly for diverse species. A combination of strategies is required to best assemble sequence data into chromosome-level assemblies. There are a number of routes towards the development of maps which span chromosomes (including physical, genetic and linkage disequilibrium maps) and construction of these whole chromosome maps greatly facilitates the ordering and orientation of sequence contigs.

    Allelic association: linkage disequilibrium structure and gene mapping

    The linkage disequilibrium (LD) structure of the human genome is now well understood and characterised for a number of human populations. The LD structure underpins the design and execution of candidate gene and genome-wide association mapping studies. Successful association mapping studies completed to date provide vital new insights into the genetic influences on common diseases, such as diabetes, some cancers and heart disease. The LD structure also presents new avenues of research into the genetic history of human populations, the effects of natural selection and the impact of recombination on the genomic landscape. This review introduces this exciting and complex field by encompassing this range of topic
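As an illustration of what "LD structure" measures, the sketch below computes three standard pairwise LD statistics (D, D' and r²) for two biallelic loci from haplotype and allele frequencies. It is not taken from the review itself; the function name `ld_stats` and its parameter names are assumptions chosen for this example.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D_prime, r2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")

    # D: deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b

    # D_max: the largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


if __name__ == "__main__":
    # Complete association: allele A always occurs with allele B.
    print(ld_stats(0.5, 0.5, 0.5))
    # Independence: haplotype frequency equals the product of allele frequencies.
    print(ld_stats(0.25, 0.5, 0.5))
```

D' and r² answer slightly different questions: D' reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two loci are perfectly correlated.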

    Genetic epidemiology of complex phenotypes

    A theory is given for complex phenotypes represented by an ordered polychotomy separately for affected (as severity) and for normals (as diathesis), with consideration of history, ascertainment, sampling frames, and phenotype systems. Nonrandom selection of probands by severity is permitted. Both probit and logistic models are developed in a form compatible with segregation and/or linkage analysis. Probabilities are set out in detail in the Appendix. This approach avoids problems that have been encountered with quantitative traits and correlated phenotypes, although using this information

    Mapping in the sequencing era

    The present phase of the Human Genome Project is concerned with sequencing. The shift of emphasis has left an impression that mapping is in some sense complete or finished. On the contrary, faced with the challenges of mapping genes for complex traits and efforts to understand recombination and other biological processes, the need for accurate integrated metric maps is greater than ever. Furthermore, sequencing could be regarded as merely a way of improving the map, since the most useful 'end product' of the sequencing effort must be the annotated sequence that gives precise physical coordinates for markers and expressed sequences. Integration of both location and functional information, the latter provided by homology, expression and other functional studies, is the main target for the future

    A metric map of humans: 23,500 loci in 850 bands

    High-resolution maps integrated with the enhanced location data base software (ldb+) give improved estimates of genetic parameters and reveal characteristics of cytogenetic bands. Chiasma interference is intermediate between Kosambi and Carter–Falconer levels, as in Drosophila and the mouse. The autosomal genetic map is 2832 and 4348 centimorgans in males and females, respectively. Telomeric T-bands are strikingly associated with male recombination and gene density. Position and centromeric heterochromatin have large effects, but nontelomeric R-bands are not significantly different from G-bands. Several possible reasons are discussed. These regularities validate the maps, despite their high resolution and inevitable local errors. No other approach has been demonstrated to integrate such a large number of loci, which are increasing at about 45% per year. The maps and the data and
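The interference levels named in the abstract correspond to mapping functions that convert a recombination fraction into a map distance. The sketch below shows the textbook Haldane (no interference), Kosambi and Carter–Falconer forms for comparison. It is not the ldb+ implementation, and the function names are assumptions made for this example.

```python
import math


def haldane_cm(r):
    """Map distance in centimorgans assuming no chiasma interference."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Map distance in centimorgans under Kosambi-level interference."""
    return 50.0 * math.atanh(2.0 * r)


def carter_falconer_cm(r):
    """Map distance in centimorgans under the stronger Carter-Falconer interference."""
    return 25.0 * (math.atanh(2.0 * r) + math.atan(2.0 * r))


def kosambi_r(cm):
    """Inverse Kosambi function: recombination fraction from centimorgans."""
    return 0.5 * math.tanh(cm / 50.0)


if __name__ == "__main__":
    # Valid for recombination fractions 0 <= r < 0.5.
    for r in (0.05, 0.10, 0.20, 0.30):
        print(r, haldane_cm(r), kosambi_cm(r), carter_falconer_cm(r))
```

For any recombination fraction strictly between 0 and 0.5, stronger assumed interference gives a shorter map distance, so the Carter–Falconer value lies below the Kosambi value, which lies below the Haldane value.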

    Clearwing hunting in the 21st century.


    The genomic and functional characteristics of disease genes

    Increasing evidence indicates that genes containing disease causal variation have distinct functional and genomic properties. The importance of understanding these properties is highlighted by efforts to filter lists of variants from next-generation sequencing studies, where the number of potentially deleterious variants, which are in fact unrelated to disease, may be large. Available evidence indicates that the majority of disease genes are ‘non-essential’ and their products occupy functionally peripheral positions in protein networks. They tend to be intermediate between genes that have core biological functions, particularly low mutation rates and low haplotype diversity, and genes for which high haplotype diversity and high mutation rates are advantageous (such as those involved in sensory perception and some immune system functions). Evidence presented here supports these conclusions through analysis of integrated data sets incorporating the latest mutational profiles, linkage disequilibrium structure and other genomic properties of individual genes. The analysis highlights the contrasting functions of genes predicted as least and most likely to contain disease variation and provides a basis for filtering gene variant lists to exclude the least plausible disease candidates