
    Robustness of κ-type coefficients for clinical agreement

    The degree of inter-rater agreement is usually assessed through κ-type coefficients, and the extent of agreement is then characterized by comparing the value of the adopted coefficient against a benchmark scale. Two motivating examples show how some κ-type coefficients behave differently when marginal frequencies are asymmetrically distributed over categories. An extensive Monte Carlo simulation study has been conducted to investigate the robustness of four κ-type coefficients for nominal and ordinal classifications, and of an inferential benchmarking procedure that, unlike straightforward benchmarking, does not neglect the influence of the experimental conditions. Robustness has been investigated for several scenarios, differing in sample size, rating scale dimension, number of raters, frequency distribution of rater classifications, and pattern of agreement across raters. Simulation results reveal a more pronounced paradoxical behavior of Fleiss' kappa and Conger's kappa with ordinal rather than nominal classifications; the coefficients' robustness improves with increasing sample size and number of raters for both nominal and ordinal classifications, whereas it improves with rating scale dimension only for nominal classifications. By identifying the scenarios (i.e., minimum sample size, number of raters, rating scale dimension) with acceptable robustness, this study provides guidelines for the design of robust agreement studies.
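
    The sensitivity of kappa to asymmetric marginal frequencies can be illustrated with a minimal sketch of the standard Fleiss' kappa formula. The two rating tables below are hypothetical, not taken from the paper; they were chosen so that observed agreement is identical while the marginal distributions differ.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to category j;
    every subject must be rated by the same number of raters.
    Returns (observed agreement, kappa).
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: mean proportion of agreeing rater pairs per subject.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects

    # Chance agreement from the pooled marginal category proportions.
    total = n_subjects * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_exp = sum(p * p for p in p_cat)

    return p_obs, (p_obs - p_exp) / (1 - p_exp)


# Hypothetical data: 10 subjects, 3 raters, 2 categories.
balanced = [[3, 0]] * 4 + [[0, 3]] * 4 + [[2, 1]] * 2  # symmetric marginals
skewed = [[3, 0]] * 8 + [[2, 1]] * 2                   # one dominant category

p_obs_balanced, kappa_balanced = fleiss_kappa(balanced)
p_obs_skewed, kappa_skewed = fleiss_kappa(skewed)
print(f"balanced: observed={p_obs_balanced:.3f} kappa={kappa_balanced:.3f}")
print(f"skewed:   observed={p_obs_skewed:.3f} kappa={kappa_skewed:.3f}")
```

    Both tables have the same observed agreement (13/15), yet kappa is about 0.73 for the balanced table and about -0.07 for the skewed one, because chance agreement grows when one category dominates.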

    Benchmarking procedures for characterizing the extent of rater agreement: a comparative study

    Decision-making processes often rely on subjective evaluations provided by human raters. In the absence of a gold standard against which to check evaluation trueness, raters' evaluative performance is generally measured through rater agreement coefficients. This study illustrates some parametric and non-parametric inferential benchmarking procedures for characterizing the extent of rater agreement, assessed via kappa-type agreement coefficients. A Monte Carlo simulation study has been conducted to compare the performance of each procedure in terms of the weighted misclassification rate computed over all agreement categories. Moreover, to investigate whether the procedures overestimate or underestimate the level of agreement, misclassifications have also been computed for each category separately. The practical application of coefficients and inferential benchmarking procedures is illustrated via two real data sets exemplifying different experimental conditions, so as to highlight performance differences due to sample size.
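
    One plausible reading of non-parametric inferential benchmarking is sketched below: the lower bound of a bootstrap confidence interval, rather than the point estimate, is compared against a benchmark scale. The abstract does not specify the procedures, so the choice of Cohen's kappa, the percentile bootstrap, the Landis and Koch scale, the function names, and the data are all assumptions made for illustration.

```python
import random
from collections import Counter

# Landis and Koch upper bounds; negative values are labeled "poor".
SCALE = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
         (0.80, "substantial"), (1.00, "almost perfect")]


def benchmark(value):
    """Map an agreement value to a Landis-Koch category."""
    if value < 0:
        return "poor"
    for upper, label in SCALE:
        if value <= upper:
            return label
    return "almost perfect"


def cohen_kappa(pairs):
    """Cohen's kappa for a list of (rater_a, rater_b) classifications."""
    n = len(pairs)
    p_obs = sum(a == b for a, b in pairs) / n
    freq_a = Counter(a for a, _ in pairs)
    freq_b = Counter(b for _, b in pairs)
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_exp == 1:  # degenerate resample: both raters used one category only
        return 0.0  # convention chosen for this sketch, not a standard
    return (p_obs - p_exp) / (1 - p_exp)


def bootstrap_lower_bound(pairs, n_boot=2000, alpha=0.05, seed=0):
    """One-sided percentile bootstrap lower bound for Cohen's kappa."""
    rng = random.Random(seed)
    n = len(pairs)
    stats = sorted(
        cohen_kappa([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    return stats[int(alpha * n_boot)]


# Hypothetical data: 40 subjects classified by two raters into two categories.
pairs = [(0, 0)] * 15 + [(1, 1)] * 15 + [(0, 1)] * 5 + [(1, 0)] * 5

point = cohen_kappa(pairs)
lower = bootstrap_lower_bound(pairs)
print(f"point estimate {point:.2f} -> {benchmark(point)}")
print(f"lower bound    {lower:.2f} -> {benchmark(lower)}")
```

    With a small sample the lower bound sits well below the point estimate, so benchmarking on it gives a more conservative characterization; the gap narrows as the sample size grows.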

    Testing inter-group ranking heterogeneity: do patient characteristics matter for prioritization of quality improvements in healthcare service?

    In many research contexts, such as social science, marketing, education, psychology and medicine, it is frequently of interest to compare two or more groups of subjects (e.g. people of different gender, age or nationality) who are asked to rank a set of alternatives according to their personal liking or opinion, in order to investigate the presence of a group effect. The common aim is to detect customers with homogeneous preferences (or priorities) so that each group can be served as well as possible. Several approaches have been proposed in the literature for testing ranking heterogeneity among groups of subjects. This paper focuses on an approach that treats diversity as a generalization of the notion of variation, and investigates the performance of a testing procedure for ranking heterogeneity based on the index of segregation power. The performance of the testing procedure has been investigated via a Monte Carlo simulation study under several scenarios, differing in group size, number of ranked alternatives and system of hypotheses. Furthermore, the testing procedure is applied to a real data set to investigate whether patient age and gender matter for patient prioritization of quality improvements in healthcare service.

    Evaluating classifier predictive performance in multi-class problems with balanced and imbalanced data sets

    A major issue in classification problems arises when dealing with class imbalance, which requires a performance measure able to handle imbalanced data sets. This paper introduces the Balanced AC1 and its weighted version, Balanced AC2, as classifier performance measures suitable for both balanced and imbalanced data sets. The performance of the proposed measures is compared against that of other well-known performance measures through an empirical comparison using several algorithms and data sets. Moreover, the applicability of Balanced AC1 is showcased through an illustrative example dealing with steel plate fault classification, where class imbalance typically occurs because of uncommon defects which, though rare, may seriously impact steel quality.

    Alpha 2-adrenergic stimulation within the nucleus tractus solitarius attenuates vasopressin release induced by depletion of cardiovascular volume.

    The functional role of the nucleus tractus solitarius (NTS) in the regulation of arginine-vasopressin (AVP) release mediated by baroreceptor activation was investigated by examining the effects induced by the presynaptic alpha-adrenergic agonist clonidine. The present data show that microinjection of clonidine into the NTS resulted in a significant attenuation of AVP secretion induced by hypovolemia in the rat. This effect, produced by NTS injection of 8 and 10 nmol clonidine, was prevented by NTS pretreatment with the alpha-2-adrenoceptor blocker yohimbine (10 nmol), indicating that alpha-2-adrenergic receptors were required for the biological response. These findings suggest that catecholaminergic projections from the NTS to hypothalamic vasopressinergic neurons play a facilitatory role in controlling AVP secretion.

    Fair evaluation of classifier predictive performance based on binary confusion matrix

    Estimating the ability of a classifier to make predictions on unseen data, and improving that ability by tweaking the learning algorithm, are two of the main reasons for evaluating classifier predictive performance. In this study the behavior of Balanced AC1, a novel classifier accuracy measure, is investigated under different class imbalance conditions via a Monte Carlo simulation. The behavior of Balanced AC1 is compared against that of several well-known performance measures based on the binary confusion matrix. Study results reveal the suitability of Balanced AC1 for both balanced and imbalanced data sets. A real example of the effects of class imbalance on the behavior of the investigated classifier performance measures is provided by comparing the performance of several machine learning algorithms in a churn prediction problem.
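
    The effect of class imbalance on measures derived from a binary confusion matrix can be illustrated with a minimal sketch. It computes three well-known measures (accuracy, balanced accuracy, Cohen's kappa) and does not implement Balanced AC1, whose definition is not given in the abstract. The confusion matrix and the function name are hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, balanced accuracy and Cohen's kappa from a binary confusion matrix.

    Assumes both classes occur in the reference labels (tp + fn > 0 and
    tn + fp > 0) and that chance agreement is below 1.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n

    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    balanced_accuracy = (sensitivity + specificity) / 2

    # Chance agreement from predicted and actual marginal totals.
    p_exp = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_exp) / (1 - p_exp)

    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Hypothetical imbalanced case: 10 positives, 90 negatives, and a classifier
# that always predicts the negative (majority) class.
metrics = binary_metrics(tp=0, fn=10, fp=0, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

    Here accuracy is 0.90 even though the classifier never detects a positive case, while balanced accuracy (0.50) and kappa (0.00) expose the lack of skill; this is the kind of discrepancy that motivates imbalance-aware measures.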