
    From Centralized to Federated: The journey of data in healthcare

    Real-World Data (RWD) in healthcare holds tremendous potential, but its effective use is often hampered by significant challenges. The fragmentation of data across various healthcare systems, combined with strict privacy and legal regulations, greatly limits the ability to make RWD easily Findable, Accessible, Interoperable, and Reusable (FAIR). These challenges become even more pronounced when developing analytical models, where the issue of data centralization presents a significant obstacle. Conventional data analysis methods typically depend on a single, unified dataset. However, for conditions like Multiple Sclerosis (MS), where the low prevalence of the disease is compounded by the dispersion of RWD across numerous repositories, the situation becomes even more complex. The variability in data formats, quality standards, and regulatory guidelines further complicates the aggregation of data. Consequently, these obstacles not only intensify the fragmentation of RWD but also hinder the generation of robust, large-scale evidence critical for advancing healthcare outcomes. Recognizing the gaps and needs in current practices, this thesis advocates for a shift towards federated data analysis as a core strategy. This approach enables the examination of distributed datasets without requiring centralization, thus preserving data privacy and integrity. Although federated analysis shows enormous potential as a solution to overcome many of the challenges associated with RWD, its widespread adoption is still limited. This is primarily due to the complexities involved in its practical implementation, amplified by the multidisciplinary nature of its domain and the heterogeneous characteristics of RWD.
In this thesis, we outline the evolution of data management pipelines, transitioning from a centralized to a fully federated framework, and introduce three foundational pillars where technical solutions are combined with clinical perspectives aiming to advance the use of sophisticated yet inclusive, privacy-aware, but pragmatic technologies in healthcare. The thesis introduces its first pillar: a comprehensive, research-agnostic hybrid data management pipeline. This pipeline is designed to support the integration of diverse data sources and formats, facilitating a more inclusive and practical approach to data analysis. This pipeline was effectively implemented in the Global Data Sharing Initiative for COVID-19 and MS, leading to the collection of the largest cohort of people with MS and COVID-19, showcasing the potential of collaborative, evidence-based healthcare advancements. As the second pillar of this thesis, the "Federated Learning For Everyone" framework is presented. This framework empowers its diverse stakeholders to more effectively leverage RWD through an adaptable and inclusive federated data analysis ecosystem. In addition, the framework introduces the novel concept of the "degree of federation", which allows for flexible adjustments between data centralization and decentralization to meet specific healthcare needs. Finally, the thesis explores the pioneering application of federated data analysis in MS research. It utilizes routine clinical data to evaluate the effectiveness of federated analysis in predicting disability progression, employing one of the largest available cohorts of people with MS. This evaluation includes assessing various federated configurations and optimizing models to demonstrate that federated analysis is a robust alternative to conventional centralized approaches.
Additionally, the thesis proposes novel federated modeling techniques to enhance federation performance, further highlighting the potential of federated analysis in complex healthcare research settings. Overall, this thesis underscores the necessity and benefits of transitioning towards inclusive federated data management in healthcare by addressing critical gaps and leveraging pragmatic, privacy-aware technologies. This approach paves the way for broader adoption and advocates for impactful innovations in the field, highlighting the significant potential for enhancing healthcare research and practice.

    Unlocking the Power of Real-World Data: A Framework for Sustainable Healthcare

    Real-world data (RWD) has the potential to revolutionize healthcare by offering valuable insights into patient outcomes and treatment efficacy. However, leveraging RWD effectively presents challenges, including its inherent limitations, diverse stakeholders, and insufficient data management pipelines. A proposed framework advocates three essential elements: adherence to FAIR principles (Findable, Accessible, Interoperable, and Reusable), stakeholder engagement and education, and inclusive, pragmatic federated hybrid pipelines. By employing these strategies, healthcare organizations can overcome obstacles to RWD utilization and foster sustainable progress in patient care.

    The Journey of Data Within a Global Data Sharing Initiative: A Federated 3-Layer Data Analysis Pipeline to Scale Up Multiple Sclerosis Research

    Background: Investigating low-prevalence diseases such as multiple sclerosis is challenging because of the rather small number of individuals affected by this disease and the scattering of real-world data across numerous data sources. These obstacles impair data integration, standardization, and analysis, which negatively impact the generation of significant meaningful clinical evidence. Objective: This study aims to present a comprehensive, research question-agnostic, multistakeholder-driven end-to-end data analysis pipeline that accommodates 3 prevalent data-sharing streams: individual data sharing, core data set sharing, and federated model sharing. Methods: A demand-driven methodology is employed for standardization, followed by 3 streams of data acquisition, a data quality enhancement process, a data integration procedure, and a concluding analysis stage to fulfill real-world data-sharing requirements. This pipeline's effectiveness was demonstrated through its successful implementation in the COVID-19 and multiple sclerosis global data sharing initiative. Results: The global data sharing initiative yielded multiple scientific publications and provided extensive worldwide guidance for the community with multiple sclerosis. The pipeline facilitated gathering pertinent data from various sources, accommodating distinct sharing streams and assimilating them into a unified data set for subsequent statistical analysis or secure data examination. This pipeline contributed to the assembly of the largest data set of people with multiple sclerosis infected with COVID-19. Conclusions: The proposed data analysis pipeline exemplifies the potential of global stakeholder collaboration and underlines the significance of evidence-based decision-making.
It serves as a paradigm for how data sharing initiatives can propel advancements in health care, emphasizing its adaptability and capacity to address diverse research inquiries. The author(s) have disclosed that they received financial support for the research, authorship, or publication of this paper from the following sources: the operational costs associated with this study were funded by the Multiple Sclerosis International Federation and the Multiple Sclerosis Data Alliance (MSDA) operating under the European Charcot Foundation. The MSDA is a global not-for-profit multistakeholder collaboration acting under the umbrella of the European Charcot Foundation, financially supported by a combination of industry partners, including Novartis, Merck, Biogen, Janssen, Bristol-Myers Squibb, and Roche. Additionally, this work was supported by the Flemish government through the Onderzoeksprogramma Artificiële Intelligentie Vlaanderen program and the Research Foundation Flanders for ELIXIR Belgium. QMENTA provided the central platform, while Amazon supplied the computational resources utilized in this work. The statistical analysis was conducted at the Clinical Outcomes Research Unit, The University of Melbourne, with support from National Health and Medical Research Council (1129189 and 1140766). The authors wish to extend their sincere appreciation to Nikola Lazovski for his invaluable guidance and collaboration throughout the global data sharing initiative project, especially concerning the central platform. They are also profoundly grateful to Dr Ilse Vermeulen for her unwavering support and encouragement throughout the various stages of drafting and conceptualizing the manuscript.

    Accessible Ecosystem for Clinical Research (Federated Learning for Everyone): Development and Usability Study

    Background: The integrity and reliability of clinical research outcomes rely heavily on access to vast amounts of data. However, the fragmented distribution of these data across multiple institutions, along with ethical and regulatory barriers, presents significant challenges to accessing relevant data. While federated learning offers a promising solution to leverage insights from fragmented data sets, its adoption faces hurdles due to implementation complexities, scalability issues, and inclusivity challenges. Objective: This paper introduces Federated Learning for Everyone (FL4E), an accessible framework facilitating multistakeholder collaboration in clinical research. It focuses on simplifying federated learning through an innovative ecosystem-based approach. Methods: The “degree of federation” is a fundamental concept of FL4E, allowing for flexible integration of federated and centralized learning models. This feature provides a customizable solution by enabling users to choose the level of data decentralization based on specific health care settings or project needs, making federated learning more adaptable and efficient. By using an ecosystem-based collaborative learning strategy, FL4E encourages a comprehensive platform for managing real-world data, enhancing collaboration and knowledge sharing among its stakeholders. Results: Evaluating FL4E’s effectiveness using real-world health care data sets has highlighted its ecosystem-oriented and inclusive design. By applying hybrid models to 2 distinct analytical tasks—classification and survival analysis—within real-world settings, we have effectively measured the “degree of federation” across various contexts. These evaluations show that FL4E’s hybrid models not only match the performance of fully federated models but also avoid the substantial overhead usually linked with these models. 
Achieving this balance greatly enhances collaborative initiatives and broadens the scope of analytical possibilities within the ecosystem. Conclusions: FL4E represents a significant step forward in collaborative clinical research by merging the benefits of centralized and federated learning. Its modular ecosystem-based design and the “degree of federation” feature make it an inclusive, customizable framework suitable for a wide array of clinical research scenarios, promising to revolutionize the field through improved collaboration and data use. Detailed implementation and analyses are available on the associated GitHub repository.
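
The idea of mixing centralized and federated learning can be illustrated with a toy sketch. This is not FL4E's implementation: the function names, the `centralized` parameter, and the choice of a sample mean as the "model" are all assumptions made for illustration. Sites that agree to share raw data are pooled into a single client, the remaining sites stay local, and all client updates are combined by a sample-size-weighted average in the style of federated averaging.

```python
def local_mean_update(data):
    """Hypothetical local 'model': the site's mean value plus its sample count."""
    return sum(data) / len(data), len(data)


def hybrid_estimate(sites, centralized):
    """Pool the sites named in `centralized`, keep the rest federated, then average.

    sites: dict mapping a site name to a list of numeric observations.
    centralized: set of site names assumed willing to share raw data.
    """
    pooled = [x for name in centralized for x in sites[name]]
    clients = [sites[name] for name in sites if name not in centralized]
    if pooled:
        clients.append(pooled)  # the pooled data acts as one extra client
    updates = [local_mean_update(d) for d in clients]
    total = sum(n for _, n in updates)
    # Sample-size-weighted average of the client updates
    return sum(m * n for m, n in updates) / total


# Hypothetical toy data for three sites
sites = {"A": [1, 2, 3], "B": [4, 5], "C": [6]}
fully_federated = hybrid_estimate(sites, centralized=set())
hybrid = hybrid_estimate(sites, centralized={"A", "B"})
fully_centralized = hybrid_estimate(sites, centralized={"A", "B", "C"})
```

For a simple mean, every setting yields the same estimate, so the toy only shows the mechanics of moving between the two extremes. For iteratively trained models the settings generally differ in both accuracy and communication overhead, which is the trade-off the abstract describes.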

    The Multiple Sclerosis Data Alliance Catalogue

    Background: One of the major objectives of the Multiple Sclerosis Data Alliance (MSDA) is to enable better discovery of multiple sclerosis (MS) real-world data (RWD). Methods: We implemented the MSDA Catalogue, which is available worldwide. The current version of the MSDA Catalogue collects descriptive information on governance, purpose, inclusion criteria, procedures for data quality control, and how and which data are collected, including the use of e-health technologies and data on collection of COVID-19 variables. The current cataloguing procedure is performed in several manual steps, securing an effective catalogue. Results: Herein we summarize the status of the MSDA Catalogue as of January 6, 2021. To date, 38 data sources across five continents are included in the MSDA Catalogue. These data sources differ in purpose, maturity, and variables collected, but this landscaping effort shows that there is substantial alignment on some domains. The MSDA Catalogue shows that personal data and basic disease data are the most collected categories of variables, whereas data on fatigue measurements and cognition scales are the least collected in MS registries/cohorts. Conclusions: The Web-based MSDA Catalogue provides strategic overview and allows authorized end users to browse metadata profiles of data cohorts and data sources. There are many existing and arising RWD sources in MS. Detailed cataloguing of MS RWD is a first and useful step toward reducing the time needed to discover MS RWD sets and promoting collaboration.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first author of a cited work, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
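
The difference between the two counting schemes can be sketched in a few lines. This is a minimal illustration, not the study's procedure: the function name, the `max_authors` parameter, and the toy data are hypothetical, and the sketch assumes that two authors are co-cited once per citing paper whenever both appear among the counted authors of its references (including co-authors of the same cited work).

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations across a set of citing papers.

    citing_papers: list of reference lists; each reference is an ordered
        list of author names.
    max_authors: number of leading authors counted per cited work
        (1 reproduces first-author-only counting).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect the counted authors appearing in this paper's references
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered pair of distinct authors is co-cited once per paper
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers with two references each
papers = [
    [["Smith", "Lee"], ["Garcia", "Chen", "Smith"]],
    [["Smith", "Lee"], ["Chen"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

In this toy data, first-author counting never sees Lee at all, whereas counting the first 5 authors links Lee to Smith and Chen in both citing papers.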

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance. Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
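
One property often raised in this discussion can be checked numerically: the cosine between two cocitation profiles is unchanged when authors with whom neither is cocited are added to the profiles, whereas the Pearson correlation shifts. The abstract does not spell out its argument, so treating this as the relevant concern is an assumption, and the profiles below are hypothetical.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two profile vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles of two authors
a = [10, 5, 0, 2]
b = [8, 6, 1, 0]

# The same profiles after adding three authors cocited with neither
a_padded = a + [0, 0, 0]
b_padded = b + [0, 0, 0]

cosine_shift = abs(cosine(a, b) - cosine(a_padded, b_padded))
pearson_shift = abs(pearson(a, b) - pearson(a_padded, b_padded))
```

Here `cosine_shift` is zero while `pearson_shift` is not, because the added zeros change the profile means that Pearson subtracts before comparing.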

    Dispelling the Myths Behind First-author Citation Counts

    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.