The arisal of data spaces: why I am excited and worried
This paper explores the significant role of real-world data (RWD) in advancing our understanding and management of Multiple Sclerosis (MS). RWD has proven invaluable in MS research and care, offering insights from larger and diverse patient populations. A key focus of the paper is the European Health Data Space (EHDS), a significant development that promises to change how healthcare data is managed across Europe. This initiative is particularly relevant to the MS community. The paper highlights various data initiatives, discussing their importance for those affected by MS. Despite the potential benefits, there are challenges and concerns, especially about ensuring that the growth of various data platforms remains beneficial for MS patients. The paper suggests practical actions for the global MS community to consider, aimed at optimizing the use of RWD. The emphasis of this discussion is on the secondary use of health data, particularly in the European context. The content is based on the author's own experiences and interpretations, offering a personal yet informed view on using RWD to improve MS research and patient care. The authors acknowledge the assistance of ChatGPT4, an AI language model developed by OpenAI, for its support in structuring and refining the content of this paper
Strategic Oversight Across Real-World Health Data Initiatives in a Complex Health Data Space: A Call for Collective Responsibility
Reusing real-world health data is useful, but challenging. Multiple initiatives exist and more are continuously arising to overcome these challenges, but strategic oversight across these initiatives is lacking, which leads to a fragmented ecosystem. An overview of which initiatives work on unlocking real-world health data and making this data accessible for research, innovation, and/or policy, together with an idea of which aspect of the ecosystem each initiative is working on, would be very helpful. It could help in figuring out how initiatives can work in synergy so that consortia can be formed more efficiently. We tried to create such an overview, resulting in a static list, but ran into many difficulties and noticed that the information is even more scattered than expected, and often ambiguous and unclear. This paper highlights the need for strategic oversight in our complex health data space, defines key challenges and focuses on solutions and strategies for overcoming these challenges, and aims to guide the future of health data research and innovation on a global scale, offering a valuable resource for stakeholders in the field. We would like to thank all the experts who generously contributed their insights and expertise during the interviews. Their valuable input has been instrumental in shaping our thoughts formulated in this position paper. This work was supported by Research Foundation - Flanders (FWO) for ELIXIR Belgium (I000323N)
Unlocking the Power of Real-World Data: A Framework for Sustainable Healthcare
Real-world data (RWD) has the potential to revolutionize healthcare by offering valuable insights into patient outcomes and treatment efficacy. However, leveraging RWD effectively presents challenges, including its inherent limitations, diverse stakeholders, and insufficient data management pipelines. A proposed framework advocates three essential elements: adherence to FAIR principles (Findable, Accessible, Interoperable, and Reusable), stakeholder engagement and education, and highlighting the need for inclusive, pragmatic federated hybrid pipelines. By employing these strategies, healthcare organizations can overcome obstacles to RWD utilization and foster sustainable progress in patient care
Schema Matching with Large Language Models: an Experimental Study
Large Language Models (LLMs) have shown useful applications in a variety of tasks, including data wrangling. In this paper, we investigate the use of an off-the-shelf LLM for schema matching. Our objective is to identify semantic correspondences between elements of two relational schemas using only names and descriptions. Using a newly created benchmark from the health domain, we propose different so-called task scopes. These are methods for prompting the LLM to do schema matching, which vary in the amount of context information contained in the prompt. Using these task scopes we compare LLM-based schema matching against a string similarity baseline, investigating matching quality, verification effort, decisiveness, and complementarity of the approaches. We find that matching quality suffers from a lack of context information, but also from providing too much context information. In general, using newer LLM versions increases decisiveness. We identify task scopes that have acceptable verification effort and succeed in identifying a significant number of true semantic matches. Our study shows that LLMs have potential in bootstrapping the schema matching process and are able to assist data engineers in speeding up this task solely based on schema element names and descriptions without the need for data instances. S. Vansummeren was supported by the Bijzonder Onderzoeksfonds (BOF) of Hasselt University under Grant No. BOF20ZAP02. This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme. This work was supported by Research Foundation - Flanders (FWO) for ELIXIR Belgium (I002819N). The resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Research Foundation - Flanders (FWO) and the Flemish Government
Motor evoked potentials for multiple sclerosis, a multiyear follow-up dataset
Multiple sclerosis (MS) is a chronic disease affecting millions of people worldwide. Through the demyelinating and axonal pathology of MS, the signal conduction in the central nervous system is affected. Evoked potential measurements allow clinicians to monitor this process and can be used for decision support. We share a dataset that contains motor evoked potential (MEP) measurements, in which the brain is stimulated and the resulting signal is measured in the hands and feet. This results in time series that are 100 milliseconds long. Typically, both hands and feet are measured in one hospital visit. The dataset contains 5586 visits of 963 patients, performed in day-to-day clinical care over a period of 6 years. The dataset consists of approximately 100,000 MEPs. Clinical metadata such as the expanded disability status scale, sex, and age are also available. This dataset can be used to explore the role of evoked potentials in MS research and patient care. It may also be used as a benchmark for time series analysis and predictive modelling
Empowering Health Care Actors to Contribute to the Implementation of Health Data Integration Platforms: Retrospective of the medEmotion Project
Health data integration platforms are vital to drive collaborative, interdisciplinary medical research projects. Developing such a platform requires input from different stakeholders. Managing these stakeholders and steering platform development is challenging, and misaligning the platform to the partners' strategies might lead to a low acceptance of the final platform. We present the medEmotion project, a collaborative effort among 7 partners from health care, academia, and industry to develop a health data integration platform for the region of Limburg in Belgium. We focus on the development process and stakeholder engagement, aiming to give practical advice for similar future efforts based on our reflections on medEmotion. We introduce Personas to paraphrase different roles that stakeholders take and Demonstrators that summarize personas' requirements with respect to the platform. Both the personas and the demonstrators serve 2 purposes. First, they are used to define technical requirements for the medEmotion platform. Second, they represent a communication vehicle that simplifies discussions among all stakeholders. Based on the personas and demonstrators, we present the medEmotion platform based on components from the Microsoft Azure cloud. The demonstrators are based on real-world use cases and showcase the utility of the platform. We reflect on the development process of medEmotion and distill takeaway messages that will be helpful for future projects. Investing in community building, stakeholder engagement, and education is vital to building an ecosystem for a health data integration platform. Instead of academic-led projects, the health care providers themselves ideally drive collaboration among health care providers. The providers are best positioned to address hospital-specific requirements, while academics take a neutral mediator role. This also includes the ideation phase, where it is vital to ensure the involvement of all stakeholders. 
Finally, balancing innovation with implementation is key to developing an innovative yet sustainable health data integration platform. We thank our 3 partner hospitals Jessa Ziekenhuis, Noorderhart, and Ziekenhuis Oost-Limburg for their contributions toward the medEmotion project. Further, we thank the Limburg Clinical Research Center for sharing their expertise in clinical research projects. The software development of the medEmotion platform was funded by LRM, with the support of the European Regional Development Fund (EFRO-1308). This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” program
Patient level dataset to study the effect of COVID-19 in people with Multiple Sclerosis
Multiple Sclerosis (MS) is an inflammatory autoimmune disease of the central nervous system, causing increased vulnerability to infections and disability among young adults. Ever since the outbreak of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 infections, there have been concerns among people with MS (PwMS) about the potential interactions between various disease-modifying therapies and COVID-19. The COVID-19 in MS Global Data Sharing Initiative (GDSI) was initiated in 2020 with the aim of addressing these concerns. This paper focuses on the anonymisation and public release of a GDSI sub-dataset, comprising data entered by PwMS and clinicians using a fast data entry tool. The dataset includes information on the demographics, comorbidities, hospital stays, and COVID-19 symptoms of PwMS. The dataset can be used to perform different statistical analyses to improve our understanding of COVID-19 in MS. Furthermore, this dataset can also be used within the context of educational activities to educate different stakeholders on the complex data science topics that were used within the GDSI
Exploring the Correlation between Disability Status and Brain Volumetric Measurements Using Real-World Retrospective Magnetic Resonance Images in People with Multiple Sclerosis
Flemish Government; Stichting Multiple Sclerosis Research [19-1040 MS]; Bijzonder OnderzoeksFonds [BOF19DOCMA10
