
    Relative contributions of the host genome, microbiome, and environment to the metabolic profile

    Background: Metabolic syndrome is a well-known risk factor for cardiovascular disease, which is associated with both genetic and environmental factors. Recently, the microbiome composition has been shown to affect the development of metabolic syndrome. Thus, it is expected that the complex interplay among host genetics, the microbiome, and environmental factors could affect metabolic syndrome. Objective: To evaluate the relative contributions of genetic, microbiome, and environmental factors to metabolic syndrome using statistical approaches. Methods: Data from the prospective Korean Association REsource project cohort (N = 8476) were used in this study, including single-nucleotide polymorphisms, phenotypes and lifestyle factors, and the urine-derived microbial composition. The effect of each data source on metabolic phenotypes was evaluated using a heritability estimation approach and a prediction model separately. We further experimented with various types of metagenomic relationship matrices to estimate the phenotypic variance explained by the microbiome. Results: With the heritability estimation, five of the 11 metabolic phenotypes were significantly associated with metagenome-wide similarity. We found significant heritability for fasting glucose (4.8%), high-density lipoprotein cholesterol (4.9%), waist-hip ratio (7.7%), and waist circumference (5.6%). Microbiome compositions provided more accurate estimations than genetic factors for the same sample size. In the prediction model, the contribution of each source to the prediction accuracy varied for each phenotype. Conclusion: The effects of host genetics, the metagenome, and environmental factors on metabolic syndrome were minimal. Our statistical analysis suffers from a small sample size, and the measurement error is expected to be substantial. Further analysis is necessary to quantify the effects with better accuracy.
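
    The abstract does not say which metagenomic relationship matrices were tried. As a minimal sketch of one plausible form, assuming a samples-by-taxa abundance table and a construction analogous to a genetic relationship matrix (the function name and the synthetic data are hypothetical, not taken from the study):

```python
import numpy as np

def relationship_matrix(abundance):
    """Turn a samples x taxa table into a samples x samples matrix K = Z Z^T / p.

    Z holds each taxon's abundance standardized to mean 0 and variance 1;
    p is the number of taxa that actually vary across samples.
    """
    X = np.asarray(abundance, dtype=float)
    sd = X.std(axis=0)
    keep = sd > 0  # taxa with no variation carry no similarity information
    Z = (X[:, keep] - X[:, keep].mean(axis=0)) / sd[keep]
    return Z @ Z.T / keep.sum()

# Synthetic example: 20 samples, 50 taxa (values are made up for illustration).
rng = np.random.default_rng(0)
table = rng.gamma(shape=2.0, scale=1.0, size=(20, 50))
K = relationship_matrix(table)
```

    A matrix built this way is symmetric with an average diagonal of 1, so it could stand in for the similarity matrix in a variance-component model; other choices (for example, distance-based kernels) would be equally consistent with the abstract.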

    missForest with feature selection using binary particle swarm optimization improves the imputation accuracy of continuous data

    Background: Missing data are a common problem in large-scale datasets, and their appropriate handling is crucial for data analyses. Missingness can be categorized as (1) missing completely at random (MCAR), (2) missing at random (MAR), and (3) missing not at random (MNAR). Different missingness mechanisms require different imputation strategies. Multiple imputation, an approach for averaging outcomes across multiple imputed datasets, is more suitable than single imputation for dealing with various missingness mechanisms. missForest, a nonparametric missing value imputation strategy using random forest, is one of the most prevalent multiple imputation methods for missing data because it can be applied to mixed-type data and does not require distributional assumptions. However, a recent study found that missForest can produce biased results for non-normal data. In addition, missForest is computationally expensive. Objective: We aimed to further develop the missForest algorithm by combining it with a binary particle swarm optimization (BPSO)-based feature-selection strategy. Methods: The BPSO is an evolutionary algorithm that is well known for global optimization and computational efficiency. By using the BPSO-based feature selection step prior to imputing missing values with missForest, the imputation accuracy for continuous variables could be increased by pruning redundant variables. Results: In this study, missForest with BPSO (BPSOmf) showed better imputation accuracy for continuous variables than missForest alone, owing to feature selection prior to the imputation step. Conclusions: BPSOmf is an appropriate and robust method when the imputation target data consist mainly of continuous variables.
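
    The feature-selection step can be illustrated with a generic binary PSO. This is a sketch under stated assumptions, not the BPSOmf implementation: the parameter values, the function names, and the toy fitness are hypothetical. In practice the fitness would presumably be the negative imputation error obtained by running missForest on the selected columns.

```python
import math
import random

def bpso_select(fitness, n_features, n_particles=12, n_iter=40, seed=0,
                w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Binary PSO: return the best 0/1 feature mask found and its fitness (higher is better)."""
    rng = random.Random(seed)
    # Random initial masks and zero velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Inertia plus pulls toward the personal and global best masks.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))
                vel[i][d] = v
                # Sigmoid transfer: the velocity becomes the probability that the bit is 1.
                pos[i][d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

def toy_fitness(mask):
    """Hypothetical fitness: reward informative features 0 and 2, penalize every selected feature."""
    hits = sum(1 for j in (0, 2) if mask[j])
    return hits - 0.25 * sum(mask)

best_mask, best_fit = bpso_select(toy_fitness, n_features=6)
```

    The returned mask marks which variables would be passed on to the imputation step; variables with a 0 bit are pruned as redundant.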

    Prediction models using outdoor environmental data for real-time PM10 concentrations in daycare centers, kindergartens, and elementary schools

    Children spend a considerable amount of time in daycare centers, kindergartens, and elementary schools. Poor indoor air quality (IAQ) in educational facilities can affect the health of children and impair their academic performance. Predicting real-time PM10 concentrations could help in intervening against poor IAQ. This study developed models to predict real-time indoor PM10 concentrations in daycare centers, kindergartens, and elementary schools using outdoor environmental data. Indoor PM10 concentrations were measured in 54 daycare centers, 12 kindergartens, and 21 elementary schools in Seoul, South Korea, using a real-time monitor (AirGuard K) over a period of one year. Multiple linear regression models were used to predict real-time indoor PM10 concentrations in these educational facilities using outdoor PM10 and meteorological data as input variables. Four forms of the dependent variable (original, indoor-to-outdoor ratio, root-transformed, and log-transformed) were compared to determine which gave the best model performance. A 10-fold cross-validation method was used to evaluate the accuracy of the prediction models. Daycare centers showed the highest indoor PM10 concentration. Root-transformed models with high accuracy were developed to predict the real-time indoor PM10 concentration in educational facilities every 10 min. The R² values of the prediction models were 0.64 in the daycare centers, 0.45 in the kindergartens, and 0.43 in the elementary schools. The 24 h profile of the predicted indoor PM10 was similar to the measured PM10 concentration. The prediction models could provide real-time PM10 levels in educational facilities without direct indoor measurement and observation.
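
    The modeling approach described above (multiple linear regression on a root-transformed outcome, scored by 10-fold cross-validation) can be sketched as follows. The data are synthetic and the predictor names, coefficients, and function name are assumptions for illustration only; the study's actual predictors and preprocessing are not given in the abstract.

```python
import numpy as np

def kfold_r2_sqrt_model(X, y, k=10, seed=0):
    """Fit ordinary least squares on sqrt(y); return cross-validated R^2 on the original scale."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    pred = np.empty(len(y))
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        coef, *_ = np.linalg.lstsq(A_train, np.sqrt(y[train]), rcond=None)
        # Back-transform: square the predicted root (clipped at 0) to return to concentration units.
        pred[test] = np.clip(A_test @ coef, 0, None) ** 2
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic data (made up): outdoor PM10, temperature, and humidity as inputs.
rng = np.random.default_rng(1)
n = 500
outdoor_pm10 = rng.uniform(10, 120, n)
temperature = rng.uniform(-5, 30, n)
humidity = rng.uniform(20, 90, n)
X = np.column_stack([outdoor_pm10, temperature, humidity])
root = 2.0 + 0.05 * outdoor_pm10 + 0.01 * humidity + rng.normal(0, 0.5, n)
indoor_pm10 = np.clip(root, 0, None) ** 2
r2 = kfold_r2_sqrt_model(X, indoor_pm10)
```

    Random folds treat the 10-minute readings as independent. With real monitoring data, which are autocorrelated in time, folds split by facility or by time block would give a more conservative estimate.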

    Association between prenatal cadmium exposure and cord blood DNA methylation

    Prenatal cadmium exposure is known to affect infant growth and organ development. Nonetheless, the role of DNA methylation in cadmium-related health effects has yet to be determined. To this end, we investigated the relationship between prenatal cadmium exposure and cord blood DNA methylation in Korean infants through an epigenome-wide association study. Cadmium concentrations in maternal blood during early and late pregnancy and in cord blood collected from newborns were measured using atomic absorption spectrometry, and DNA methylation analysis was conducted using HumanMethylationEPIC BeadChip kits. After adjusting for infant sex, maternal pregnancy body mass index, smoking status, and estimated leukocyte composition, we analyzed the association between CpG methylation and cadmium concentration in 364 samples. Among 835,252 CpG sites, maternal blood cadmium concentration in early pregnancy was significantly associated with two differentially methylated CpG sites, cg05537752 and cg24904393, which were annotated to ATP9A and to no gene, respectively. The study findings indicate that prenatal cadmium exposure is significantly associated with the methylation status of several CpG sites and regions in Korean infants, especially during early pregnancy.

    TBC: A clustering algorithm based on prokaryotic taxonomy

    High-throughput DNA sequencing technologies have revolutionized the study of microbial ecology. Massive sequencing of PCR amplicons of the 16S rRNA gene has been widely used to understand the microbial community structure of a variety of environmental samples. The resulting sequencing reads are clustered into operational taxonomic units that are then used to calculate various statistical indices that represent the degree of species diversity in a given sample. Several algorithms have been developed to perform this task, but they tend to produce different outcomes. Herein, we propose a novel sequence clustering algorithm, namely Taxonomy-Based Clustering (TBC). This algorithm incorporates the basic concept of prokaryotic taxonomy in which only comparisons to the type strain are made and used to form species while omitting full-scale multiple sequence alignment. The clustering quality of the proposed method was compared with those of MOTHUR, BLASTClust, ESPRIT-Tree, CD-HIT, and UCLUST. A comprehensive comparison using three different experimental datasets produced by pyrosequencing demonstrated that the clustering obtained using TBC is comparable to those obtained using MOTHUR and ESPRIT-Tree and is computationally efficient. The program was written in Java and is available from http://sw.ezbiodoud.net/tbc.

    Role of an unclassified Lachnospiraceae in the pathogenesis of type 2 diabetes: a longitudinal study of the urine microbiome and metabolites

    Recent investigations have revealed that the human microbiome plays an essential role in the occurrence of type 2 diabetes (T2D). However, despite the importance of understanding the involvement of the microbiota throughout the body in T2D, most studies have focused specifically on the intestinal microbiota. Extracellular vesicles (EVs) have been recently found to provide important evidence regarding the mechanisms of T2D pathogenesis, as they act as key messengers between intestinal microorganisms and the host. Herein, we explored microorganisms potentially associated with T2D by tracking changes in microbiota-derived EVs from patient urine samples collected three times over four years. Mendelian randomization analysis was conducted to evaluate the causal relationships among microbial organisms, metabolites, and clinical measurements to provide a comprehensive view of how microbiota can influence T2D. We also analyzed EV-derived metagenomic (N = 393), clinical (N = 5032), genomic (N = 8842), and metabolite (N = 574) data from a prospective longitudinal Korean community-based cohort. Our data revealed that GU174097_g, an unclassified Lachnospiraceae, was associated with T2D (beta = -189.13; p = 0.00006), and it was associated with the ketone bodies acetoacetate and 3-hydroxybutyrate (r = -0.0938 and -0.0829, respectively; p = 0.0022 and 0.0069, respectively). Furthermore, a causal relationship was identified between acetoacetate and HbA1c levels (beta = 0.0002; p = 0.0154). GU174097_g reduced ketone body levels, thus decreasing HbA1c levels and the risk of T2D. Taken together, our findings indicate that GU174097_g may lower the risk of T2D by reducing ketone body levels.
    Diabetes: a little help from the microbiome. A microbe that may help protect against type II diabetes has been detected by examining extracellular vesicles (EVs), tiny membrane-wrapped packages secreted by human cells and by the bacteria making up the microbiome. Examining EVs allows researchers to sample microbial populations other than the intensively studied intestinal microbiome. Sungho Won, Seoul National University, and Geum-Sook Hwang, Korea Basic Science Institute, Seoul, and coworkers studied the microbial EVs in urine samples collected from South Korean subjects over four years. They identified a previously unclassified bacterial species in the family Lachnospiraceae that was associated with lower risk of developing type II diabetes. Further investigation showed that these bacteria may break down ketone bodies, metabolic byproducts that signal disrupted sugar metabolism leading to diabetes. These results contribute to understanding how the microbiome contributes to metabolic health and disease.

    Evaluation of a human glycated hemoglobin test in canine diabetes mellitus

    Glycated hemoglobin A1c (HbA1c) is widely used for monitoring and diagnosing human diabetes mellitus, but is rarely used in veterinary clinics. The goal of our study was to validate the commercial HbA1c testing system SD A1cCare analyzer (Bionote, Gyeoggi-do, South Korea) for use in dogs. Dogs were recruited with their owners' consent. Diabetic status was determined based on clinical signs, fasting hyperglycemia, and glycosuria. Intra-assay precision and linearity were evaluated with EDTA, heparin, or citrate as anticoagulants; precision was excellent, with mean coefficients of variation (CVs) of 2.47%, 2.26%, and 1.92%, respectively. Diluted anticoagulated blood samples showed excellent linear relationships, with R² values of 0.991, 0.996, and 0.994, respectively. Inter-assay precision testing showed that the mean CV of the normal control was 2.18% and that of the high control was 2.01% (30 repeats). The observed total error was 7.81% for the normal control and 6.12% for the high control. HbA1c levels measured before and after removal of plasma and replacement by saline showed minimal interference by lipid contents (p = 0.929). The HbA1c concentrations of diabetic dogs were significantly higher than those of non-diabetic dogs (p < 0.001). An HbA1c value >6.2% indicated canine diabetes in a classification and regression tree model. In most cases, fructosamine and HbA1c were highly correlated (r = 0.674, p < 0.001). The HbA1c testing system could be a valuable tool for evaluating canine diabetes mellitus, providing an alternative in-house option for use by veterinary clinicians.
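
    The two quantities that carry the validation, the coefficient of variation of repeat measurements and the single HbA1c cut-point, are simple to compute. The sketch below uses made-up repeat values and hypothetical function names; only the 6.2% cut-off comes from the abstract.

```python
import statistics

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def flag_diabetic(hba1c_percent, cutoff=6.2):
    """Apply the single cut-point reported in the abstract (HbA1c > 6.2%)."""
    return hba1c_percent > cutoff

# Hypothetical repeat measurements of one control sample (HbA1c, %).
repeats = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8]
cv = coefficient_of_variation(repeats)
```

    A lower CV across repeats of the same sample means better precision, which is how the intra-assay and inter-assay figures above should be read.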

    Family‐Based Rare Variant Association Test for X‐Linked Genes

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. © 2016 Wiley Periodicals, Inc.

    Heritability Analyses Uncover Shared Genetic Effects of Lung Function and Change over Time

    Genetic influences on lung function have been identified in previous studies; however, the relative longitudinal effects of genetic factors and their interactions with smoking on lung function remain unclear. Here, we identified the longitudinal effects of genetic variants on lung function by determining single nucleotide polymorphism (SNP) heritability and genetic correlations, and by analyzing interactions with smoking. Subject-specific means and annual change rates were calculated for eight spirometric measures obtained from 6622 Korean adults aged 40–69 years every two years for 14 years, and their heritabilities were estimated separately. Statistically significant (p < 0.05) heritability was detected for the subject-specific means of all spirometric measures (8–32%) and for the change rates of the ratio of forced expiratory volume in 1 s to forced vital capacity (FEV1/FVC; 16%) and post-bronchodilator FEV1/FVC (17%). Significant genetic correlations of the change rate with the subject-specific mean were observed for FEV1/FVC (genetic correlation = 0.64) and post-bronchodilator FEV1/FVC (genetic correlation = 0.47). Furthermore, post-bronchodilator FEV1/FVC showed significant heritability of SNP-by-smoking interaction (estimate = 0.4) for the annual change rate. A genome-wide association study (GWAS) also detected genome-wide significant SNPs for FEV1 (rs4793538), FEV1/FVC (rs2704589, rs62201158, and rs9391733), and post-bronchodilator FEV1/FVC (rs2445936). We found statistically significant evidence of a heritable contribution to the change in lung function, and this was shared with the effects on cross-sectional measurements. We also found some evidence of interaction with smoking for the change in lung function.
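
    The abstract does not state how the subject-specific means and annual change rates were computed. One simple possibility, shown here as an assumption rather than the authors' method (a mixed model would be another), is a per-subject least-squares line through the repeated measurements. The function name and example values are hypothetical.

```python
def mean_and_annual_slope(years, values):
    """Return (mean, least-squares slope per year) for one subject's repeated measures."""
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired observations")
    t_bar = sum(years) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in years)
    if sxx == 0:
        raise ValueError("all visits share the same time point")
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(years, values))
    return y_bar, sxy / sxx

# Hypothetical subject: FEV1/FVC (%) measured every two years for 14 years.
years = [0, 2, 4, 6, 8, 10, 12, 14]
ratio = [80.0, 79.4, 78.8, 78.2, 77.6, 77.0, 76.4, 75.8]
subject_mean, annual_change = mean_and_annual_slope(years, ratio)
```

    Each subject then contributes two phenotypes, a level and a rate of change, whose heritabilities can be estimated separately as the abstract describes.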

    Machine Learning Characterization of COPD Subtypes: Insights From the COPDGene Study

    COPD is a heterogeneous syndrome. Many COPD subtypes have been proposed, but there is not yet consensus on how many COPD subtypes there are and how they should be defined. The COPD Genetic Epidemiology Study (COPDGene), which has generated 10-year longitudinal chest imaging, spirometry, and molecular data, is a rich resource for relating COPD phenotypes to underlying genetic and molecular mechanisms. In this article, we place COPDGene clustering studies in context with other highly cited COPD clustering studies, and summarize the main COPD subtype findings from COPDGene. First, most manifestations of COPD occur along a continuum, which explains why continuous aspects of COPD or disease axes may be more accurate and reproducible than subtypes identified through clustering methods. Second, continuous COPD-related measures can be used to create subgroups through the use of predictive models to define cut-points, and we review COPDGene research on blood eosinophil count thresholds as a specific example. Third, COPD phenotypes identified or prioritized through machine learning methods have led to novel biological discoveries, including novel emphysema genetic risk variants and systemic inflammatory subtypes of COPD. Fourth, trajectory-based COPD subtyping captures differences in the longitudinal evolution of COPD, addressing a major limitation of clustering analyses that are confounded by disease severity. Ongoing longitudinal characterization of subjects in COPDGene will provide useful insights about the relationship between lung imaging parameters, molecular markers, and COPD progression that will enable the identification of subtypes based on underlying disease processes and distinct patterns of disease progression, with the potential to improve the clinical relevance and reproducibility of COPD subtypes.