    Robust correlation coefficient goodness-of-fit test for the extreme value distribution

    A simple robust method is provided to test goodness of fit for the extreme value distribution (Type I family) using a recently developed diagnostic tool, the Forward Search. The Forward Search is a powerful general method that provides diagnostic plots for finding outliers, discovering their effects on models fitted to the data, and assessing the adequacy of the model. The Forward Search algorithm has previously been developed for regression modeling and multivariate analysis. The correlation coefficient test is one of the more powerful goodness-of-fit tests, but it is sensitive to outliers. We introduce a Forward Search version of this test that is not affected by outliers. An application to the two-parameter Weibull distribution is also investigated by means of a transformation. The performance of the procedure and its ability to capture the structure of the data are illustrated by simulation studies.
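
    As a rough illustration of why the classical correlation coefficient test is sensitive to outliers, the sketch below computes a probability-plot correlation coefficient for the Type I (Gumbel) distribution. It is not the authors' Forward Search procedure. The maximum form of the distribution, the Gringorten plotting positions and the function names are all assumptions made for this example.

```python
import math
import random

def gumbel_quantile(p):
    """Quantile of the standard Gumbel (maximum) distribution, for 0 < p < 1."""
    return -math.log(-math.log(p))

def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def gumbel_ppcc(sample):
    """Correlation between the ordered sample and Gumbel quantiles.

    Assumption: Gringorten plotting positions (i - 0.44) / (n + 0.12).
    """
    ordered = sorted(sample)
    n = len(ordered)
    probs = [(i - 0.44) / (n + 0.12) for i in range(1, n + 1)]
    scores = [gumbel_quantile(p) for p in probs]
    return pearson(ordered, scores)

# Simulated Gumbel data (location 3, scale 2) by inverse-CDF sampling.
rng = random.Random(0)
clean = [3.0 + 2.0 * gumbel_quantile(rng.uniform(1e-9, 1 - 1e-9))
         for _ in range(200)]
# Replace five observations with gross outliers.
contaminated = clean[:-5] + [60.0] * 5

r_clean = gumbel_ppcc(clean)
r_contaminated = gumbel_ppcc(contaminated)
print(f"clean: {r_clean:.4f}  contaminated: {r_contaminated:.4f}")
```

    A handful of gross outliers pulls the coefficient well below its value on the clean sample, which is the weakness a robust version of the test is meant to address.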

    Robustness for multilevel models: Fraud detection with the forward search

    Several methods based on multiple regression or classification tools are commonly adopted to identify outliers, which are perhaps the most important statistical units for fraud detection. For the European Union data analysed here, the presence of clusters of several firms and several countries may hide structures and information, often making standard classical tools unreliable. Moreover, even the parameter estimates of classical models can be severely biased by influential observations or outliers. A methodological solution is to exploit the natural hierarchical structure of multilevel models to take into account the time-varying evolution of quantities traded, and their prices, for each country. Multilevel models, however, are not robust, as they simply generalize linear models and ANOVA. A forward search algorithm is presented that makes parameter estimation robust in the presence of outliers and avoids masking and swamping, leading to a more accurate identification of suspicious firms. The influence of outliers, if any are present in the dataset, is monitored at each step of the sequential procedure, which is the key element of the forward search. Preliminary results on simulated data highlight the benefit of adopting the forward search algorithm, which can reveal masked outliers and influential observations and show hidden structures. An application to real data is also illustrated.
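
    The sequential procedure described above can be sketched for the much simpler case of a single-predictor linear regression. This is an illustrative simplification, not the multilevel algorithm of the paper. The random-pair start chosen by median absolute residual, the unscaled residual that is monitored, and all function names are assumptions made for this example.

```python
import random
import statistics

def fit_line(points):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def abs_residuals(points, coef):
    """Absolute residuals of ALL points from the line given by coef."""
    a, b = coef
    return [abs(y - a - b * x) for x, y in points]

def forward_search(points, n_starts=300, seed=0):
    """Return a list of (subset size, subset indices, min outside residual)."""
    rng = random.Random(seed)
    n = len(points)
    # Assumed start: the random pair whose line has the smallest
    # median absolute residual over the whole dataset.
    best_crit, subset = None, None
    for _ in range(n_starts):
        i, j = rng.sample(range(n), 2)
        if points[i][0] == points[j][0]:
            continue  # a vertical pair cannot define a line
        res = abs_residuals(points, fit_line([points[i], points[j]]))
        crit = statistics.median(res)
        if best_crit is None or crit < best_crit:
            best_crit, subset = crit, [i, j]
    history = []
    while len(subset) < n:
        # Fit on the current subset, then monitor the units left outside.
        res = abs_residuals(points, fit_line([points[k] for k in subset]))
        inside = set(subset)
        min_outside = min(res[k] for k in range(n) if k not in inside)
        history.append((len(subset), sorted(subset), min_outside))
        # Grow the subset by one: keep the units with the smallest residuals.
        ranked = sorted(range(n), key=lambda k: res[k])
        subset = ranked[: len(subset) + 1]
    return history

# Simulated data: a clean line plus three shifted observations.
rng = random.Random(1)
points = [(float(i), 1.0 + 2.0 * i + rng.gauss(0.0, 0.5)) for i in range(30)]
outlier_idx = [5, 15, 25]
for k in outlier_idx:
    x, y = points[k]
    points[k] = (x, y + 15.0)

history = forward_search(points)
for m, _, min_outside in history[-5:]:
    print(m, round(min_outside, 2))
```

    In this sketch the three shifted observations stay outside the fitting subset until the final steps, and the monitored residual jumps at the point where the next unit to enter is an outlier.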

    The analysis of transformations for profit‐and‐loss data

    We analyse data on the performance of investment funds, 99 out of 309 of which report a loss, and on the profitability of 1405 firms, 407 of which report losses. The problem in both cases is to use regression to predict performance from sets of explanatory variables. In one case, it is clear from scatter plots of the data that the negative responses have a lower variance than the positive responses and a different relationship with the explanatory variables. Because the data include negative responses, the Box–Cox transformation cannot be used. We develop a robust version of an extension to the Yeo–Johnson transformation which allows different transformations for positive and negative responses. Tests and graphical methods from our robust analysis enable the detection of outliers, the assessment of values of the two transformation parameters and the building of simple regression models. Performance comparisons are made with non-parametric transformations.
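
    For reference, a minimal sketch of a Yeo–Johnson transformation with separate parameters for non-negative and negative responses is given below. It uses the standard unnormalized Yeo–Johnson formula and omits the robust estimation developed in the paper. The function and parameter names (`lam_pos`, `lam_neg`) and the example values are assumptions made for this example.

```python
import math

def yeo_johnson(y, lam):
    """Standard (unnormalized) Yeo-Johnson transformation of one value."""
    if y >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(y)
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-y)
    return -((1.0 - y) ** (2.0 - lam) - 1.0) / (2.0 - lam)

def extended_yeo_johnson(y, lam_pos, lam_neg):
    """Apply lam_pos to non-negative responses and lam_neg to negative ones."""
    return yeo_johnson(y, lam_pos if y >= 0 else lam_neg)

# Hypothetical profit-and-loss values, including losses.
returns = [-8.0, -1.5, 0.0, 2.0, 10.0]
transformed = [extended_yeo_johnson(y, lam_pos=0.5, lam_neg=1.5)
               for y in returns]
print([round(t, 3) for t in transformed])
```

    With both parameters equal to 1 the transformation leaves the data unchanged, so departures from 1 on either side indicate how strongly the positive or negative responses need to be transformed.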

    A forward approach for supervised classification with model selection

    Supervised methods of classification naturally exploit linear and nonlinear relationships between explanatory variables and a response. However, the presence of clusters may lead to a different pattern within each group. For instance, data can naturally be grouped in several linear structures, so that simple linear regression models can be used for classification. Estimation of linear models can be severely biased by influential observations or outliers. A practical problem arises when the groups identifying the different relationships are unknown and the number of "relevant" variables is high. In such a context, the supervised classification problem can become cumbersome. As a solution, within the general framework of generalized linear models, a new robust approach is to exploit the sequential ordering of the data provided by the forward search algorithm. The algorithm is used in two ways: to address the problem of variable selection for model fit, and to group the data naturally "around" the model. The influence of outliers, if any are present in the dataset, is monitored at each step of the sequential procedure. Preliminary results on simulated data highlight the benefit of adopting the forward search algorithm, which can reveal masked outliers and influential observations and show hidden structures.

    Labor market analysis through transformations and robust multivariate models

    The work presents a robust approach to labor share analysis. Estimating labor share involves various complexities related to the nature of the data sets to be analyzed. Typically, labor share is evaluated using discriminant analysis and linear or generalized linear models, which do not take into account the presence of possible outliers. Moreover, the variables to be considered are often characterized by a high-dimensional structure. The proposed approach aims to improve the estimation of the model using robust multivariate regression techniques and data transformation.