Innovations in monitoring and data quality control in clinical trials
Clinical trials serve a key role in drug development by providing scientific knowledge on the risks and benefits of medical treatments or therapies. Conducting a clinical trial is a time-consuming endeavor which typically requires a large financial budget. It is therefore important to critically reflect on how resources are being spent. Indeed, the efficiency of clinical trial conduct is becoming an increasingly active topic of research and discussion, in view of rising healthcare costs. Estimates suggest that a considerable portion (15 to 30 percent) of a phase III clinical trial budget is spent on the monitoring of the local investigators (i.e. the treating physicians) and their teams. Traditionally, this is the responsibility of a Clinical Research Associate (CRA), who is employed by the sponsor or contract research organization. The CRA visits the local centers at regular intervals and spends a large portion of their time comparing all submitted data against the source data, correcting any discrepancy that is detected. This thesis examines and further develops alternative monitoring and trial management strategies which aim to conduct clinical trials more efficiently, while at the same time preserving or even improving data quality. Mainly, it focuses on methods that reduce the reliance on the physical presence of the CRA at the local centers and make better use of centrally available data, and it reflects on specific aspects of trial management whose effectiveness has been the subject of discussion. After a general introduction, the thesis first focuses on the use of statistical sampling methodology to reduce the effort of source data verification. Next, it examines how central statistical monitoring can be used for the detection of data fabrication (i.e. possible fraud).
Third, to improve selection and management of clinical trial sites, it evaluates to what extent center-level information can predict site performance in terms of meeting recruitment targets. Furthermore, the impact of outcome misclassification is assessed, and the value of so-called adjudication to overcome this problem is critically evaluated. As a last topic, the thesis addresses the application of so-called ‘risk-based monitoring’ procedures in the context of pragmatic trials. The thesis ends with a general discussion and framework on the topic of clinical trial monitoring.
Transfusion data: from collection to reflection
Blood transfusion is an important medical treatment for many diverse patient groups, saving lives but sometimes also causing adverse transfusion reactions in transfusion recipients. For this reason blood use should ideally be as low as possible. The fact that significant differences exist in the amount of blood used between countries, between hospitals and even within hospitals indicates that there is room for improvement. Moreover, there are likely to be unrecognized risk factors in donors and blood products that might affect patient outcomes. In order to study these various aspects and the interplay between them, data on the complete transfusion chain are needed. Therefore the Dutch Transfusion Data warehouse (DTD) was set up, in which data from the national blood bank and a (growing) number of Dutch hospitals are linked. These data have a broad range of applications: identifying risk factors, predicting future blood use, benchmarking blood use, and optimizing process efficiency. A structured stepwise approach is applied to validate data quality, addressing external validity (e.g. concordance with external reports, previous studies and expert feedback) and internal validity (e.g. completeness, uniformity and plausibility). In addition, an algorithm is developed to identify, out of all diagnostic and procedural data available, the most likely indication (i.e. reason) for transfusion. The algorithm was evaluated against a gold standard based on expert review and adjusted accordingly. The final algorithm was able to predict the majority of cases correctly (about 75%). However, before implementation the algorithm should be optimized and externally validated in independent hospital datasets. New hospitals are included in the DTD continuously. Different strategies for selecting hospitals for inclusion in the DTD are simulated to compare their effect on representativeness for the Netherlands.
The ‘maximum variation’ strategy, which is to include hospitals that differ from each other maximally (the smallest and largest hospitals), resulted in the highest representativeness. Finally, analyses of donor and patient data show trends in blood use and in the donor population composition. Over the past 20 years, the use of red blood cell units (RBCs) decreased. Retrospective analysis of various patient groups revealed that RBC use changed from largely surgical to predominantly medical blood use, suggesting a more restrictive transfusion policy for surgical patients as well as an increase in medical indications for transfusion. A special group of donors provides antibodies that are required for RhesusD (RhD)-negative women pregnant with an RhD-positive child in order to prevent hemolytic disease of the newborn. Due to the success of the RhD prevention program, the number of naturally immunized women has decreased, thereby also reducing the number of potential donors. Various donor recruitment scenarios were compared by simulating donor population size and age using data on Dutch anti-RhD donors in 1994-2013. This relatively simple simulation model could describe and predict the size of the anti-RhD donor population and the impact of ageing with sufficient accuracy. Recommendations for research topics and possible extensions of the DTD offer a perspective on future applications of the data following the process from collection to reflection.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
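The two counting rules compared in this abstract can be illustrated with a minimal Python sketch. The function name `cocitation_counts`, the `max_authors` parameter and the toy reference lists are hypothetical and not taken from the study. The sketch also assumes that two authors are co-cited once per citing paper whenever both are credited in its reference list, including co-authors of the same cited work; the study may define this differently.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations under a chosen counting rule.

    citing_papers: list of reference lists; each reference is an
        ordered list of author names of one cited work.
    max_authors: number of leading authors credited per cited work
        (1 = traditional first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        # Authors credited by this citing paper under the chosen rule.
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Each unordered author pair is co-cited once per citing paper.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, each with two references.
papers = [
    [["Smith", "Lee"], ["Jones", "Kim"]],
    [["Smith"], ["Kim", "Jones"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)

print(first_author)
print(first_five)
```

In this toy data, Jones and Kim never appear as a pair under first-author counting, while the first-five rule links them in both citing papers. This is the kind of additional linkage that could make author groups more coherent.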
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who propels discourse rather than expressing opinions, an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
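As a rough illustration of how the Pearson correlation and the cosine can behave differently on cocitation profiles, the sketch below compares two hypothetical profiles before and after adding authors with whom neither is cocited (extra zero entries). The abstract does not state its specific objection to the Pearson correlation, so this sensitivity to added zeros is only one possible concern, assumed here for illustration. The vectors are invented, and the two further measures mentioned in the abstract are not shown.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two profile vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation between two profile vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical cocitation profiles of two authors over four other authors.
a = [4, 0, 2, 1]
b = [1, 3, 0, 2]

# The same profiles after adding 20 authors cocited with neither a nor b.
a_padded = a + [0] * 20
b_padded = b + [0] * 20

print("cosine :", cosine(a, b), cosine(a_padded, b_padded))
print("pearson:", pearson(a, b), pearson(a_padded, b_padded))
```

In this example the cosine is unaffected by the added zeros, whereas the Pearson correlation changes sign, so the apparent similarity depends on which unrelated authors happen to be included in the analysis.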
Predicting an optimal function for diagnostic and prognostic analyses with gene expression data
The completion of the human genome and the advancement of high-throughput technologies have enabled the quantification of thousands of genes for precision medicine. The problem with gene expression data is that the number of genes (as variables) greatly exceeds the number of samples, which renders unsatisfactory the regular statistical models that often require more samples than variables. As such, several machine learning algorithms (functions) have been proposed in the literature to tackle this problem of small sample size relative to the number of variables, commonly referred to as the “curse of dimensionality”. Nevertheless, there is no universal algorithm that performs best on all gene expression data. The varying performance of these algorithms across datasets is a clear indication that there are data characteristics associated with the performance of the algorithms. In order to determine an optimal function for a given gene expression dataset, several algorithms are often compared and the one with the smallest cross-validated error is chosen. This approach leads to selection bias because an algorithm might have the smallest cross-validated error by chance. To combat this, a number of selection bias correction methods have been proposed, but no such method is guaranteed to be effective when several least optimal functions are compared on a dataset with a small sample size. Alternative approaches utilize the combined input of several algorithms (super-learners) with the goal of improving predictions. Such approaches are hardly accepted in medical applications because they are often considered black boxes whose models are hard to interpret, and because they utilize the entire genome instead of a selected profile, making practical application time-consuming and costly.
Hence traditional algorithms that can perform variable selection to yield a gene signature and interpretable models are often preferred over super-learners, but the question of which of these algorithms is optimal for a given dataset remains unanswered. In this thesis, we have identified gene expression data characteristics that are associated with the performance of often-used traditional machine learning algorithms, using publicly available microarray gene expression data. With the identified data characteristics we systematically varied the variables and assessed their associative effects on the performance of these functions using simulations. Additionally, we analyzed our simulated results to provide predictive models for selecting an optimal algorithm for diagnostic or prognostic analysis on any given dataset, with little or no bias. Application of our models to several real-life gene expression datasets showed high correlations between predicted and actually achieved performance of the functions. One such model was used to select an optimal algorithm that was subsequently utilized to identify and validate prognostic biomarkers for disease severity in respiratory syncytial virus (RSV) infected infants. The identified 84-gene signature might serve as the basis for the management of RSV disease in pediatric wards.
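The selection bias described in this abstract, where an algorithm has the smallest cross-validated error by chance, can be made concrete with a toy simulation. This sketch is not the thesis's method. It assumes, purely for illustration, that every candidate algorithm has the same true error rate and that their error estimates are independent, which is a simplification because real algorithms evaluated on the same dataset are correlated. The function name and all parameter values are hypothetical.

```python
import random


def min_estimated_error(n_samples, n_algorithms, true_error, rng):
    """Return the smallest estimated error among equally good algorithms.

    Each algorithm's estimate is the fraction of n_samples misclassified,
    with every sample misclassified independently with probability
    true_error (an illustrative stand-in for a cross-validated error).
    """
    estimates = []
    for _ in range(n_algorithms):
        n_errors = sum(rng.random() < true_error for _ in range(n_samples))
        estimates.append(n_errors / n_samples)
    return min(estimates)


rng = random.Random(0)  # fixed seed so the run is reproducible
runs = 2000

# Small sample size (30) and ten candidate algorithms, all with true error 0.5.
avg_min = sum(min_estimated_error(30, 10, 0.5, rng) for _ in range(runs)) / runs
# Baseline: a single algorithm, so no selection takes place.
avg_single = sum(min_estimated_error(30, 1, 0.5, rng) for _ in range(runs)) / runs

print("average error of the selected (best-looking) algorithm:", avg_min)
print("average error of a single, unselected algorithm:       ", avg_single)
```

Although no algorithm here is better than chance, the winner's reported error falls noticeably below 0.5 on average, while the unselected estimate stays close to 0.5. The gap widens as the sample size shrinks or the number of compared algorithms grows.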
Hybrid Bayesian - frequentist approaches for randomized trial design in small populations
Randomized Controlled Trials (RCTs) are considered the gold standard for evaluating medical interventions. In small populations, where resources and patients available for participation in research are scarce, the design and conduct of RCTs are especially challenging. Both main schools of statistical inference (frequentist and Bayesian) have shortcomings in that respect. In this thesis, we suggest methods that combine ideas from both schools in order to borrow their strengths and mitigate their weaknesses. The focus is on efficient use of prior information (a Bayesian concept) while controlling operating characteristics (a frequentist concern).
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.
