    learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data

    We introduce the R package learnMET, a flexible framework for analyzing multi-environment trial breeding data with machine learning-based models. learnMET allows genomic information to be combined with environmental data such as climate and/or soil characteristics. Notably, the package can incorporate weather data from field weather stations or retrieve global meteorological datasets from a NASA database. Daily weather data can be aggregated over specific periods of time based on naive (for instance, nonoverlapping 10-day windows) or phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient-boosted decision trees, random forests, stacked ensemble models, and multilayer perceptrons. These prediction models can be evaluated in a user-friendly way via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with multi-environment trial experimental data. The package is published under an MIT license and is accessible on GitHub.
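
The naive aggregation mentioned in the abstract above (nonoverlapping 10-day windows) can be sketched in a few lines. This is only an illustration in Python, not learnMET's R interface: the function name `aggregate_windows`, the choice of the mean as summary, the decision to drop a trailing partial window, and the temperature values are all assumptions made for the example.

```python
from statistics import mean


def aggregate_windows(daily_values, window=10):
    """Average daily values over consecutive non-overlapping windows.

    Assumption for this sketch: a trailing partial window is dropped.
    """
    n_full = len(daily_values) // window
    return [
        mean(daily_values[i * window:(i + 1) * window])
        for i in range(n_full)
    ]


# Hypothetical daily mean temperatures (deg C) for 25 days of a trial.
daily_temp = [float(day) for day in range(1, 26)]

# Two full 10-day windows are summarized; the last 5 days are dropped.
print(aggregate_windows(daily_temp))  # [5.5, 15.5]
```

A phenological approach would replace the fixed window length with stage boundaries (for example, estimated from accumulated growing degree-days), which this sketch does not attempt.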

    Imputation of low-density marker chip data in plant breeding: Evaluation of methods based on sugar beet

    Low-density genotyping followed by imputation reduces genotyping costs while still providing high-density marker information. An increased marker density has the potential to improve the outcome of all applications that are based on genomic data. This study investigates techniques for imputing from 1k to 20k genomic markers in plant breeding programs, with sugar beet (Beta vulgaris L. ssp. vulgaris) as an example crop for which these are realistic marker numbers in modern breeding applications. The generally accepted ‘gold standard’ for imputation, Beagle 5.1, was compared with the recently developed software AlphaPlantImpute2, which is designed specifically for plant breeding. For both Beagle 5.1 and AlphaPlantImpute2, the imputation strategy as well as the imputation parameters were optimized in this study. We found that the imputation accuracy of Beagle could be improved substantially (from 0.22 to 0.67) by tuning parameters, mainly by lowering the effective population size parameter and increasing the number of iterations performed. Separating the phasing and imputation steps further improved accuracies when optimized parameters were used (from 0.67 to 0.82). We also found that the imputation accuracy of Beagle decreased when more low-density lines were included for imputation. AlphaPlantImpute2 produced very high accuracies without optimization (0.89) and was generally less responsive to optimization. Overall, AlphaPlantImpute2 performed better for imputation, whereas Beagle was better for phasing. Combining both tools yielded the highest accuracies.

    An R Framework for the Partitioning of Linkage Disequilibrium between and Within Populations

    Patterns of linkage disequilibrium (LD) across the genome result from many contributing factors, including selection and genetic drift. Natural selection can increase LD near individually selected loci, or it can influence LD between epistatically selected groups of loci. Statistics have previously been derived that compare levels of linkage disequilibrium in subpopulations relative to the total population. These statistics may be leveraged to identify loci that may be under selection or epistatic selection. This is a powerful approach, but to date no framework exists to support its use on a genome-wide scale. We present ohtadstats, an R package designed to facilitate the implementation of Ohta’s D statistics in a variety of use cases. Statistics calculated by this package can be used to determine whether a locus is under selection, and can provide insight into the nature of the selection that is taking place (hard sweep or epistatic selection). This package is available on the Comprehensive R Archive Network (CRAN). Funding statement: This research was supported by funding from the USDA Agricultural Research Service. PFP is funded by the University of Missouri Life Sciences Fellowship and a training grant from the National Institutes of Health (T32GM008396).

    Comparing Different Statistical Models and Multiple Testing Corrections for Association Mapping in Soybean and Maize

    Association mapping (AM) is a powerful tool for fine mapping complex trait variation down to nucleotide sequences by exploiting historical recombination events. A major problem in AM is controlling false positives that can arise from population structure and family relatedness. False positives are often controlled by incorporating covariates for structure and kinship in mixed linear models (MLM). These MLM-based methods are single-locus models and can introduce false negatives due to overfitting of the model. In this study, eight different statistical models, ranging from single-locus to multilocus, were compared for AM for three traits differing in heritability in two crop species: soybean (Glycine max L.) and maize (Zea mays L.). Soybean and maize were chosen, in part, due to their highly differentiated rate of linkage disequilibrium (LD) decay, which can influence false positive and false negative rates. The fixed and random model circulating probability unification (FarmCPU) performed better than other models based on an analysis of Q-Q plots and on the identification of the known number of quantitative trait loci (QTLs) in a simulated data set. These results indicate that FarmCPU controls both false positives and false negatives. Six qualitative traits in soybean with known published genomic positions were also used to compare these models, and results indicated that FarmCPU consistently identified a single highly significant SNP closest to these known published genes. Multiple comparison adjustments (Bonferroni, false discovery rate, and positive false discovery rate) were compared for these models using a simulated trait having 60% heritability and 20 QTLs. Multiple comparison adjustments were overly conservative for MLM, CMLM, ECMLM, and MLMM and did not find any significant markers; in contrast, ANOVA, GLM, and SUPER models found an excessive number of markers, far more than 20 QTLs. The FarmCPU model, using less conservative methods (false discovery rate and positive false discovery rate), identified 10 QTLs, which was closer to the simulated number of QTLs than the number found by other models.
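
To illustrate why the false discovery rate is described above as less conservative than Bonferroni, here is a minimal Python sketch that applies both to the same marker p-values. It assumes the Benjamini-Hochberg step-up procedure as the FDR method, which the abstract does not specify; the positive false discovery rate is not covered; and the p-values are made up for the example.

```python
def bonferroni(pvalues, alpha=0.05):
    """Flag tests with p <= alpha / m, where m is the number of tests."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for FDR control."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= k/m * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_k = rank
    # Every test ranked at or below max_k is declared significant.
    significant = [False] * m
    for idx in order[:max_k]:
        significant[idx] = True
    return significant


# Hypothetical p-values for five markers (not taken from the study).
pvals = [0.001, 0.008, 0.012, 0.041, 0.2]

print(sum(bonferroni(pvals)))          # 2 markers pass Bonferroni
print(sum(benjamini_hochberg(pvals)))  # 3 markers pass Benjamini-Hochberg
```

With these values, Bonferroni requires p <= 0.01 for every marker, whereas Benjamini-Hochberg relaxes the threshold with rank, so the third marker (p = 0.012) is retained only under the FDR procedure.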

    Fostering Active Learning in an International Joint Classroom: A Case Study

    Engaging students in an international online setting that is interdisciplinary and culturally diverse is a challenge. A joint classroom between German and Ugandan universities used a formative assessment approach paired with active learning elements to foster individual and peer learning in an international virtual setting. A survey administered at three points during the semester explored students’ perceptions of the value of the active learning activities and evaluated how these perceptions changed over time. Overall, students enjoyed the diverse active learning activities and perceived them as valuable for their success in class. This was more pronounced and unidirectional for individual tasks than it was for group work. In addition to the findings of the structured survey, observation and feedback indicated that other elements contributed to effective course delivery. These included clear and frequent communication to the students from the primary instructor, prompt feedback from the instructor on graded exercises, such as a reflective learning diary and ungraded quizzes, and student confidence that sincere effort would achieve a good grade.

    Ghat: An R package for identifying adaptive polygenic traits

    Identifying selection on polygenic complex traits in crops and livestock is important for understanding evolution and helps prioritize important characteristics for breeding. QTL that contribute to polygenic trait variation often exhibit small or infinitesimal effects. This hinders the ability to detect QTL controlling polygenic traits because enormously high statistical power is needed for their detection. Recently, we circumvented this challenge by introducing a method to identify selection on complex traits by evaluating the relationship between genome-wide changes in allele frequency and estimates of effect size. The approach involves calculating a composite statistic across all markers that captures this relationship, followed by a linkage disequilibrium-aware permutation test to evaluate whether the observed pattern differs from that expected due to drift during evolution and population stratification. In this manuscript, we describe “Ghat”, an R package developed to implement this method to test for selection on polygenic traits. We demonstrate the package by applying it to test for polygenic selection on 15 published European wheat traits including yield, biomass, quality, morphological characteristics, and disease resistance traits. Moreover, we applied Ghat to different simulated populations with different breeding histories and genetic architectures. The results highlight the power of Ghat to identify selection on complex traits. The Ghat package is accessible on CRAN, the Comprehensive R Archive Network, and on GitHub. Funding: Open-Access-Publikationsfonds 202
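
The two-step idea in the abstract above (a composite statistic relating allele-frequency change to effect size, then a permutation test) can be sketched as follows. This is not the Ghat package's code or interface. The sketch assumes the composite statistic is the sum over markers of allele-frequency change times estimated effect, and it uses a plain permutation of effects across markers that ignores linkage disequilibrium, whereas the published method is LD-aware. All names and numbers are hypothetical.

```python
import random


def composite_statistic(delta_freq, effects):
    """Sum over markers of allele-frequency change times estimated effect.

    Assumed form of the composite statistic for this sketch.
    """
    return sum(d * a for d, a in zip(delta_freq, effects))


def permutation_pvalue(delta_freq, effects, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling effects.

    Simplification: markers are treated as exchangeable (no LD correction).
    """
    rng = random.Random(seed)
    observed = composite_statistic(delta_freq, effects)
    shuffled = list(effects)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(composite_statistic(delta_freq, shuffled)) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical allele-frequency changes and marker effects for six markers.
delta_freq = [0.10, -0.05, 0.20, 0.02, -0.15, 0.08]
effects = [1.0, -2.0, 0.5, 0.1, -1.2, 0.7]

stat, pval = permutation_pvalue(delta_freq, effects)
print(stat, pval)
```

A positive statistic here means that alleles with positive estimated effects tended to increase in frequency; the permutation step asks how often such alignment arises when effects are assigned to markers at random.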

    Genetic diversity of pea (Pisum sativum L.) genotypes differing in leaf type using SNP markers

    A collection of 46 pea (Pisum sativum L.) accessions, mostly from Europe, was analysed for genetic diversity using the GenoPea 13.2 K SNP Array chip. Of these accessions, 24 were normal-leaved and 22 semi-leafless. Principal components analysis (PCA) separated the peas into two groups characterized by the two different leaf types, although some genotypes were exceptions and appeared in the opposite group. Cluster analysis confirmed the two groups. A dendrogram showed larger genetic distances between genotypes in the normal-leaved group than between semi-leafless genotypes. Both PCA and cluster analysis show that the two leaf types are genetically divergent. Normal-leaved peas are therefore an interesting genetic resource, even if the breeding goal is to develop semi-leafless varieties. Funding: DAAD (http://dx.doi.org/10.13039/501100001655); Georg-August-Universität Göttingen (http://dx.doi.org/10.13039/50110000338)

    Individual plant genetics reveal the control of local adaptation in European maize landraces

    Background: European maize landraces encompass a large amount of genetic diversity, allowing them to be well-adapted to their local environments. This diversity can be exploited to improve the fitness of elite material in the face of a changing climate. Results: We characterized the genetic diversity of 333 individual plants from 40 European maize landrace populations (EMLPs). We identified five genetic groups that mirrored the proximities of their geographical origins. Fixation indices showed moderate differentiation among genetic groups (0.034 to 0.093). More than half of the genetic variance was observed to be partitioned among individuals. Nucleotide diversity of EMLPs decreased significantly as latitude increased (from 0.16 to 0.04), suggesting serial founder events during maize expansion in Europe. GWAS with latitude, longitude, and elevation as response variables identified 28, 347, and 68 significant SNP positions, respectively. We pinpointed significant SNPs near dwarf8, tb1, ZCN7, ZCN8, and ZmMADS69 and identified 126 candidate genes with ontology terms indicative of local adaptation in maize, regulating adaptation to diverse abiotic and biotic environmental stresses. Conclusions: This study suggests a quick and cost-efficient approach to identifying genes involved in local adaptation without requiring field data. The EMLPs used in this study have been assembled to serve as a continuing resource of genetic diversity for further research aimed at improving agronomically relevant adaptation traits. Funding: Open-Access-Publikationsfonds 202