Bayesian selection of nucleotide substitution models and their site assignments
Probabilistic inference of a phylogenetic tree from molecular sequence data is predicated on a substitution model describing the relative rates of change between character states along the tree for each site in the multiple sequence alignment. Commonly, one assumes that the substitution model is homogeneous across sites within large partitions of the alignment, assigns these partitions a priori, and then fixes their underlying substitution model to the best-fitting model from a hierarchy of named models. Here, we introduce an automatic model selection and model averaging approach within a Bayesian framework that simultaneously estimates the number of partitions, the assignment of sites to partitions, the substitution model for each partition, and the uncertainty in these selections. This new approach is implemented as an add-on to the BEAST 2 software platform. We find that this approach dramatically improves the fit of the nucleotide substitution model compared with existing approaches, and we show, using a number of example data sets, that as many as nine partitions are required to explain the heterogeneity in nucleotide substitution process across sites in a single gene analysis. In some instances, this improved modeling of the substitution process can have a measurable effect on downstream inference, including the estimated phylogeny, relative divergence times, and effective population size histories
Phenotypic Bayesian phylodynamics : hierarchical graph models, antigenic clustering and latent liabilities
Combining models for phenotypic and molecular evolution can lead to powerful inference tools. Under the flexible framework of Bayesian phylogenetics, I develop statistical methods to address phylodynamic problems in this intersection. First, I present a hierarchical phylogeographic method that combines information across multiple datasets to draw inference on a common geographical spread process. Each dataset represents a parallel realization of this geographic process on a different group of taxa, and the method shares information between these realizations through a hierarchical graph structure. Additionally, I develop a multivariate latent liability model for assessing phenotypic correlation among sets of traits, while controlling for shared evolutionary history. This method can efficiently estimate correlations between multiple continuous traits, binary traits and discrete traits with many ordered or unordered outcomes. Finally, I present a method that uses phylogenetic information to study the evolution of antigenic clusters in influenza. The method builds an antigenic cartography map informed by the assignment of each influenza strain to one of the antigenic clusters
Recommended from our members
Integrating dynamical modeling and phylogeographic inference to characterize global influenza circulation.
Global seasonal influenza circulation involves a complex interplay between local (seasonality, demography, host immunity) and global factors (international mobility) shaping recurrent epidemic patterns. No studies so far have reconciled the two spatial levels by evaluating the coupling between national epidemics, accounting for heterogeneous coverage of epidemiological and virological data, and integrating different data sources. We propose a novel combined approach based on a dynamical model of global influenza spread (GLEAM), integrating high-resolution demographic and mobility data, and a generalized linear model of phylogeographic diffusion that accounts for time-varying migration rates. Seasonal migration fluxes across countries simulated with GLEAM are tested as phylogeographic predictors to provide model validation and calibration based on genetic data. Seasonal fluxes obtained with a specific transmissibility peak time and recurrent travel outperformed the raw air-transportation predictor, previously considered the optimal indicator of global influenza migration. Influenza A subtypes supported an autumn-winter reproductive number as high as 2.25 and an average immunity duration of 2 years. Similar dynamics were preferred by influenza B lineages, with a lower autumn-winter reproductive number. Comparing simulated epidemic profiles against FluNet data offered comparatively limited resolution power. The multiscale approach enables model selection, yielding a novel computational framework for describing global influenza dynamics at different scales: local transmission and national epidemics vs. international coupling through mobility and imported cases. Our findings have important implications for improving preparedness against seasonal influenza epidemics. The approach can be generalized to other epidemic contexts, such as emerging disease outbreaks, to improve the flexibility and predictive power of modeling
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
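The two counting schemes compared in this abstract can be illustrated with a minimal Python sketch. The function name `cocitation_counts`, the `max_authors` parameter, and the toy data are hypothetical and not taken from the study; the sketch also assumes that co-authors of the same cited work count as co-cited with each other, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often each pair of authors is cited by the same paper.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names. Only the first `max_authors` authors of each
    cited work are credited (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        # Authors credited by this citing paper, deduplicated.
        credited = set()
        for authors in references:
            credited.update(authors[:max_authors])
        # Each unordered pair of credited authors is co-cited once.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers with two references each.
papers = [
    [["Adams", "Baker"], ["Chen"]],
    [["Adams", "Diaz"], ["Chen", "Baker"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting only the pair Adams/Chen is recorded, while crediting up to five authors brings in Baker and Diaz and yields a denser co-citation matrix.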
Scalable Methods for Survival Analysis using Massive Observational Data
Emerging observational health data, such as electronic health records and administrative claims, provide a rich resource for learning about treatment effects and risks. However, computational challenges arise when fitting statistical models to such large-scale and high-dimensional data. In this dissertation, I employ parallel computing techniques to address the computational bottlenecks associated with widely used statistical models in observational studies. First, I present a novel parallel scan algorithm to scale up the Cox proportional hazards model and the Fine-Gray model. This advancement accelerates the execution of large-scale comparative effectiveness and safety studies involving millions of patients and thousands of patient characteristics by an order of magnitude. Second, I apply an efficient parallel segmented-scan algorithm to accelerate the computationally intensive parts shared by the stratified Cox model, the Cox model with time-varying covariates, and the Cox model with time-varying coefficients. This innovation enables efficient large-scale and high-dimensional Cox modeling with stratification or time-varying effects, delivering an order of magnitude speedup over traditional central processing unit-based methods. Third, I introduce a memory-efficient approach for fitting pooled logistic regression models with massive sample-size data. This approach offers a valuable tool, allowing for pooled logistic regression analysis on massive sample sizes, even when computational resources are limited. I have implemented all of the above work in the open-source R package Cyclops
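To see why a scan fits the Cox model, note that once subjects are sorted by descending follow-up time, each risk-set sum in the partial likelihood is an inclusive prefix sum. The Python sketch below is an illustration only and is not the Cyclops implementation: it uses the serial `numpy.cumsum` as a stand-in for a parallel scan, assumes no tied event times, and the function name and toy data are hypothetical.

```python
import numpy as np


def cox_neg_log_partial_likelihood(beta, X, time, event):
    """Negative log partial likelihood of a Cox model (no tied times assumed).

    X: (n, p) covariates; time: (n,) follow-up times;
    event: (n,) 1 if the event was observed, 0 if censored.
    """
    order = np.argsort(-time)      # latest follow-up time first
    eta = X[order] @ beta          # linear predictors in sorted order
    # Risk-set sum for subject i: sum of exp(eta_j) over all j with
    # time_j >= time_i. After sorting, this is an inclusive prefix sum,
    # which is the step a parallel scan would replace.
    # Note: exp() can overflow for large |eta|; no stabilization here.
    risk = np.cumsum(np.exp(eta))
    d = event[order]
    return -np.sum(d * (eta - np.log(risk)))


# Hypothetical toy data: four subjects, one covariate, all events observed.
X = np.array([[0.5], [-1.0], [2.0], [0.0]])
time = np.array([5.0, 2.0, 8.0, 3.0])
event = np.array([1.0, 1.0, 1.0, 1.0])
nll_at_zero = cox_neg_log_partial_likelihood(np.zeros(1), X, time, event)
```

At `beta = 0` every subject contributes the log of its risk-set size, so the toy value equals log(1) + log(2) + log(3) + log(4) = log(24), which gives a quick sanity check.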
Genomic and transcriptomic mediators of resistance and response to antibody-drug conjugates (ADCs) in metastatic breast cancer (MBC)
Antibody-drug conjugates (ADCs) have transformed the treatment landscape for metastatic breast cancer (MBC), demonstrating significant efficacy. However, the emergence of resistance remains a substantial challenge, limiting their long-term effectiveness. This study investigates genomic and transcriptomic alterations associated with primary and acquired resistance to sacituzumab govitecan (SG), trastuzumab deruxtecan (T-DXd), and trastuzumab emtansine (T-DM1) using next-generation sequencing (NGS) data from the Tempus database. Non-paired pre- and post-treatment biopsies were analyzed to identify biomarkers of resistance, while correlations between pre-treatment profile and clinical response were assessed. High expression of drug efflux pump genes was significantly associated with shorter duration of treatment for patients receiving T-DXd and SG. Additionally, ERBB2 overexpression was linked to improved OS in the T-DXd group. Notably, no significant genomic or transcriptomic markers of resistance were identified for T-DM1. These findings enhance the understanding of molecular mechanisms underlying ADC resistance, highlighting potential predictive biomarkers and therapeutic targets. Future investigations are warranted to validate these results and explore strategies to overcome resistance, ultimately informing personalized treatment approaches in MBC
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
Using Smartwatch and Bluetooth Beacons to Monitor Physical Activity of Older Adults
Objective: We used a novel Sensing At-Risk Population (SARP) system to monitor patients’ physical activity and locations during post-acute rehabilitation, in order to (1) examine the correlation between SARP measurements and standard physical therapist (PT), occupational therapist (OT), and nurse (RN) evaluations; and (2) examine the effectiveness of SARP in discriminating discharge dispositions. Methods: Participants were instructed to wear the smartwatch and receive physical and occupational therapy. Spearman correlations were used to determine the associations between SARP measurements and in-person evaluations. Univariate logistic regression was used to identify predictors of discharge dispositions. Results: SARP measurements and PT/OT/RN evaluations were correlated significantly. SARP indicated that participants were active for only 5 minutes/hour during post-acute rehabilitation. SARP significantly predicted hospital readmission (AUC > 70%). Conclusions: SARP provides physical activity information during post-acute rehabilitation in real time. Not only is SARP significantly correlated with PT/OT/RN evaluations, but it also helps to discern discharge dispositions
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance. Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
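A small Python sketch can make the contrast between the Pearson correlation and the cosine concrete. It illustrates one commonly discussed property, sensitivity to added zeros, which may or may not be the argument the authors themselves make. The profiles are hypothetical, and the two other measures mentioned in the abstract are not shown because the abstract does not name them.

```python
import numpy as np


def cosine(x, y):
    """Cosine of the angle between two co-citation profiles."""
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))


def pearson(x, y):
    """Pearson correlation, i.e. the cosine of the mean-centered profiles."""
    return cosine(x - x.mean(), y - y.mean())


# Hypothetical co-citation profiles of two authors over four other authors.
a = np.array([3.0, 1.0, 0.0, 2.0])
b = np.array([1.0, 3.0, 2.0, 0.0])

# Add 20 further authors who are co-cited with neither a nor b.
a_padded = np.concatenate([a, np.zeros(20)])
b_padded = np.concatenate([b, np.zeros(20)])

cos_before, cos_after = cosine(a, b), cosine(a_padded, b_padded)
r_before, r_after = pearson(a, b), pearson(a_padded, b_padded)
```

In this toy case the cosine is unchanged by the added zeros, whereas the Pearson correlation moves from negative to positive even though the relationship between the two authors has not changed.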
