
    Handling heterogeneity among units in Quantile Regression

    In many real data applications, statistical units belong to different groups, and a statistical model should be tailored to incorporate and exploit this heterogeneity among units. The same holds when analyzing the relationship between a response variable and a set of regressors, which cannot be carried out while neglecting the membership of the units in the different groups. Several approaches have been proposed in the literature to analyze group effects in a dependence model (dummy variables denoting group membership and multilevel models, among others). All of them aim to inspect how the group structure affects the impact of the regressors on the dependent variable, without providing details on the dependence structure inside the groups. Moreover, they are tailored to the estimation of average effects. To estimate group effects at different points of the conditional distribution of the response, Davino & Vistocco (2008) proposed to exploit quantile regression (QR) (Koenker & Bassett, 1978; Davino et al., 2013), a method able to model the entire conditional distribution of a response variable. This paper discusses the strengths and properties of this proposal through a simulation study.

    DESPOTA: a permutation-test-based approach for finding the partition on a dendrogram

    Hierarchical clustering is one of the most widely used classification methods in many application settings (Everitt et al., 2001). The possibility of choosing among alternative partitions according to the internal homogeneity of the corresponding classes is certainly one of the most appealing features of hierarchical methods, and it often leads them to be preferred to traditional partitioning methods. The natural graphical representation of the set of partitions produced by a hierarchical clustering algorithm is the dendrogram. The partition is usually chosen through a horizontal cut of the dendrogram, but in doing so a number of partitions hosted in the tree are never explored. Such partitions can be identified only by cutting at different levels. The proposal aims precisely at exploring the whole set of available partitions, and it exploits permutation tests to carry out this search, starting from the root of the tree and descending to its terminal elements. A further advantage of the proposed algorithm is the automatic identification of the number of classes to choose, a feature that makes the technique easy to implement in automatic classification systems.

    Qu test for structural breaks in quantile regressions

    The Qu (2008) test for changing coefficients in quantile regression is analyzed in order to investigate its diagnostic properties when appropriately decomposed. The goal is to reveal the diagnostic capabilities of this test: it can be used to detect the location of a break and to pinpoint its impact on each regression coefficient. In addition, it is possible to analyze how the impact of a break on a coefficient evolves as the quantile changes. To implement the Qu test as a diagnostic tool, the original definition is modified. The behavior of the modified test is investigated using real data and through a Monte Carlo study.

    DESPOTA: DEndrogram Slicing through a PermutatiOn Test Approach

    Hierarchical clustering is one of the most widespread analytical approaches to classification problems, mainly because of the visual power of the associated graphical representation, the dendrogram. That said, choosing an appropriate number of clusters remains the main difficulty for the final user. We introduce DESPOTA (DEndrogram Slicing through a PermutatiOn Test Approach), a novel approach that exploits permutation tests to automatically detect a partition among those embedded in a dendrogram. Unlike the traditional approach, DESPOTA also includes in the search space partitions that do not correspond to horizontal cuts of the dendrogram. Applications to both real and synthetic datasets show the effectiveness of our proposal.

    A comparison among estimators for linear regression methods

    Koenker & Bassett (1978) introduce the quantile regression estimator, which gives a more complete view of the effects that a set of explanatory variables exerts on the response, not only on average but at different points of the conditional distribution: the conditional quantiles. The core of quantile regression is an asymmetric check function that moves the regression line above or below the conditional median, making it possible to consider location, scale and shape effects in the study of a statistical relationship. Quantile regression is increasingly used owing to the variety of its possible applications, as evidenced by the growing number of related papers in recent years. For further details see Koenker (2005), Hao & Naiman (2007) and Davino et al. (2013). A computationally simpler alternative to quantile regression, introduced by Newey & Powell (1987), is expectile regression. Expectiles allow the analysis of a regression model at various points of the conditional distribution through an asymmetric weighting system. Analogously to the asymmetric check function of the quantile regression estimator, this weighting system moves the regression line estimated by ordinary least squares above or below the regression passing through the conditional mean. Compared with quantile regression, expectile regression is computationally convenient and still characterizes the complete conditional distribution of the response. The main characteristic of expectiles is the adoption of the L2 norm, which makes the expectile estimator less robust than the quantile regression estimator. The class of robust regression estimators, the M-estimators (Huber, 1981), computes the regression at the conditional mean while curbing the impact of outliers on the estimated coefficients. Once again, this estimator relies on a weighting system to detect and bound outliers while estimating the regression coefficients. Breckling & Chambers (1988) propose to merge the M-estimators and the expectile approach. Even though both methods are implemented within the least squares framework, robustness is ensured by a weighting system that controls the outlying observations. The asymmetric weighting system of expectiles is combined with weights bounding outliers, which makes it possible to compute a robust regression away from the conditional mean. Along with the above estimators, it is worth mentioning modal linear regression (Kemp & Santos Silva, 2012; Yao & Li, 2014). Here the focus is on modeling the conditional mode of the response variable, which is well suited to situations where conditional distributions are highly skewed: exploiting the features of the mode, modal regression proves robust to outliers, in particular to heavy-tailed conditional error distributions.
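
The contrast between the check function and the asymmetric least squares weights can be illustrated in the simplest possible setting, a model with an intercept only (no regressors). The following is a minimal sketch under that assumption, not the estimators as implemented in the cited papers; the function names and the toy data are hypothetical.

```python
def check_loss(u, tau):
    """Asymmetric absolute (check) loss: tau * u if u >= 0, (tau - 1) * u otherwise."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_quantile(y, tau):
    """Intercept-only quantile fit: a minimiser of the summed check loss
    is always attained at one of the observations, so we search over them."""
    return min(y, key=lambda q: sum(check_loss(v - q, tau) for v in y))


def sample_expectile(y, tau, n_iter=100, tol=1e-12):
    """Intercept-only expectile fit by iteratively reweighted least squares:
    observations above the current fit get weight tau, the others 1 - tau."""
    m = sum(y) / len(y)  # starting value; for tau = 0.5 this is already the answer
    for _ in range(n_iter):
        w = [tau if v > m else 1.0 - tau for v in y]
        new_m = sum(wi * vi for wi, vi in zip(w, y)) / sum(w)
        if abs(new_m - m) < tol:
            break
        m = new_m
    return m


y = [1.0, 2.0, 3.0, 4.0, 100.0]          # toy data with one gross outlier
median = sample_quantile(y, 0.5)          # L1-type fit, barely moved by the outlier
mean = sample_expectile(y, 0.5)           # L2-type fit, pulled towards the outlier
upper_expectile = sample_expectile(y, 0.9)
```

On this toy sample the 0.5 quantile stays among the bulk of the data while the 0.5 expectile (the mean) is dragged upwards by the single outlier, which is the lack of robustness of the L2 norm mentioned in the abstract.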

    The burnt matchstick

    In an informal study, two versions of a story involving probability are presented to undergraduates. The findings reveal that students have trouble detecting equal probabilities in a sampling scheme without replacement in which no information on earlier draws is available.
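
The equal-probability property can be checked by exact enumeration. The abstract does not describe the story itself, so the setup below (a hypothetical box of `n` items, exactly one of them marked, drawn one at a time without replacement) is an assumption made only for illustration.

```python
from fractions import Fraction
from itertools import permutations


def prob_marked_at_each_position(n):
    """Exact probability that the single marked item (label 0) is drawn at
    each position when n items are drawn in random order without replacement
    and nothing is known about earlier draws."""
    orders = list(permutations(range(n)))  # all equally likely drawing orders
    return [
        Fraction(sum(1 for order in orders if order[k] == 0), len(orders))
        for k in range(n)
    ]


probs = prob_marked_at_each_position(4)
```

Every entry of `probs` equals 1/4: without information on the earlier draws, the first, second, third and last draw are all equally likely to produce the marked item.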

    Handling heterogeneity among units in quantile regression. Investigating the impact of students' features on University outcome

    In many real data applications, statistical units belong to different groups, and statistical models should be tailored to incorporate and exploit this heterogeneity among units. This paper proposes an innovative approach to identify group effects through a quantile regression model. The method assigns a conditional quantile to each group and provides a separate analysis of the dependence structure inside the groups. The relevance of the proposal is shown through an empirical analysis investigating the impact of students' features on University outcome. The analysis is performed on a sample of graduates; the degree mark is the response variable, a set of variables describing the students' profile is used as regressors, and the School attended determines the group effects. A working example and a small simulation study are introduced to highlight the main features of the proposed approach.

    Testing heterogeneity in quantile regression: a multigroup approach

    The paper introduces a multigroup approach to assess group effects in quantile regression. The procedure estimates the same regression model at different quantiles and for different groups of observations. Such groups are defined by the levels of one or more stratification variables. The proposed approach exploits a computational procedure to test group effects. In particular, a parametric bootstrap test and a permutation test are compared on artificial data, taking into account different sample sizes and comparing their performance in detecting low, medium, and high differences among coefficients pertaining to different groups. An empirical analysis of MOOC students' performance is used to show the proposal in action. The effect of the two main drivers of performance, learning and engagement, is explored at different conditional quantiles, comparing self-paced courses with instructor-paced courses offered on the EdX platform.
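
The logic of a permutation test for a group effect at a given quantile can be sketched as follows. This is a deliberately simplified stand-in, not the procedure of the paper: it compares the sample quantiles of two groups (an intercept-only model) instead of quantile regression coefficients, and all names, the toy data and the number of permutations are hypothetical.

```python
import math
import random


def quantile(values, tau):
    """Empirical tau-quantile (inverse-CDF definition, no interpolation)."""
    s = sorted(values)
    k = min(len(s) - 1, max(0, math.ceil(tau * len(s)) - 1))
    return s[k]


def permutation_pvalue(group_a, group_b, tau, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference between the
    tau-quantiles of two groups, obtained by reshuffling group labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(quantile(group_a, tau) - quantile(group_b, tau))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of units to groups
        stat = abs(quantile(pooled[:n_a], tau) - quantile(pooled[n_a:], tau))
        if stat >= observed:
            hits += 1
    # the +1 terms count the observed labelling as one of the permutations
    return (hits + 1) / (n_perm + 1)


group_a = [float(i) for i in range(30)]
group_b = [float(i + 50) for i in range(30)]   # same shape, shifted upwards
p_diff = permutation_pvalue(group_a, group_b, tau=0.5)
p_same = permutation_pvalue(group_a, list(group_a), tau=0.5)
```

With clearly separated groups the p-value is small, while two identical groups give a p-value of 1. In a regression setting the statistic would be replaced by the difference between the group-specific coefficients estimated at the chosen quantile.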

    The evaluation of university educational processes: a quantile regression approach

    The paper analyses the internal effectiveness of a university educational process by means of quantile regression. In particular, the goal is to evaluate how students' features affect the outcome of their University careers, taking into account that this effect can differ between students with good and bad performance.