Robust multivariate estimation based on statistical depth filters
In the classical contamination models, such as the gross-error model (the Huber and Tukey contamination model, or case-wise contamination), observations are considered as the units to be identified as outliers or not. This model is very useful when the number of considered variables is moderately small. Alqallaf et al. (Ann Stat 37(1):311–331, 2009) showed the limits of this approach for a larger number of variables and introduced the independent contamination model (cell-wise contamination), where the cells are the units to be identified as outliers or not. One approach to dealing with both types of contamination at the same time is to filter out the contaminated cells from the data set and then apply a robust procedure able to handle case-wise outliers and missing values. Here, we develop a general framework to build filters in any dimension based on statistical data depth functions. We show that previous approaches, e.g., Agostinelli et al. (TEST 24(3):441–461, 2015b) and Leung et al. (Comput Stat Data Anal 111:59–76, 2017), are special cases. We illustrate our method by using the half-space depth.
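The filter-then-estimate idea can be illustrated with a deliberately simplified, column-by-column sketch. It is not the procedure of the paper: it assumes a normal reference model fitted with the median and the MAD, uses the univariate half-space depth under that model, min(F(x), 1 - F(x)), and flags a cell when its depth falls below a hypothetical cut-off `threshold`. The function names and the example data are made up for illustration.

```python
from statistics import NormalDist, median

def model_depth(value, center, scale):
    """Univariate half-space depth of `value` under a N(center, scale) model."""
    if scale == 0:
        # Degenerate fit: only the center itself has positive depth.
        return 0.5 if value == center else 0.0
    p = NormalDist(center, scale).cdf(value)
    return min(p, 1.0 - p)

def depth_filter(rows, threshold=0.01):
    """Replace low-depth cells with None, column by column.

    `threshold` is a hypothetical tuning constant, not a value from the paper.
    """
    n_cols = len(rows[0])
    fits = []
    for j in range(n_cols):
        column = [row[j] for row in rows]
        center = median(column)
        mad = median([abs(v - center) for v in column])
        fits.append((center, 1.4826 * mad))  # MAD rescaled for the normal model
    return [
        [v if model_depth(v, *fits[j]) >= threshold else None
         for j, v in enumerate(row)]
        for row in rows
    ]

# Made-up data: the first cell of the last row is contaminated.
rows = [
    [1.0, 10.0], [1.2, 10.5], [0.9, 9.5], [1.1, 10.2],
    [1.05, 9.8], [0.95, 10.1], [50.0, 10.3],
]
filtered = depth_filter(rows)
```

Only the contaminated cell is set to `None`; the rest of its row is kept, so a downstream robust estimator that handles missing values can still use the clean cell.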
Likelihood disparity estimation for the skew-normal distribution.
In this paper we introduce an estimator based on the Likelihood Disparity, a likelihood-based distance between distributions. This estimator is used to estimate the location, scale and shape parameters of the skew-normal distribution. Maximum Likelihood, the Method of Moments and other estimation methods are often unreliable estimators of the shape parameter (they are biased towards infinity), especially for small sample sizes. Our estimator performs very well for all sample sizes and shape values. Its behavior is similar to that of the Maximum Likelihood estimator when the latter is finite, but it provides a finite estimate in all remaining cases.
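The divergence of the Maximum Likelihood shape estimate can be seen in a small sketch. It does not implement the Likelihood Disparity estimator. It only evaluates the standard skew-normal log-likelihood, assuming for illustration that location and scale are fixed at 0 and 1, on a made-up sample whose values are all positive. In that situation the log-likelihood never decreases as the shape grows, so no finite maximizer exists.

```python
import math

def skew_normal_logpdf(x, location, scale, shape):
    """Log-density of the skew-normal: 2/scale * phi(z) * Phi(shape * z)."""
    z = (x - location) / scale
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    # Naive Phi via erf; it can underflow to 0 when shape * z is very negative.
    big_phi = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return math.log(2.0) - math.log(scale) + log_phi + math.log(big_phi)

def log_likelihood(sample, location, scale, shape):
    return sum(skew_normal_logpdf(x, location, scale, shape) for x in sample)

# Made-up sample with only positive values; location=0 and scale=1 are assumed.
sample = [0.3, 0.8, 1.1, 0.5, 1.7]
shapes = (1.0, 5.0, 25.0, 125.0)
profile = [log_likelihood(sample, 0.0, 1.0, a) for a in shapes]
```

The values in `profile` are non-decreasing in the shape, which is the "biased towards infinity" behavior described above.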
ARFIMA processes and outliers: a weighted likelihood approach
In this paper, we consider the problem of robust estimation of the fractional parameter, d, in long memory autoregressive fractionally integrated moving average (ARFIMA) processes, when two types of outliers, i.e. additive and innovation, are taken into account without knowing their number, position or intensity. The proposed method is a weighted likelihood estimation (WLE) approach, for which the needed definitions and algorithm are given. By an extensive Monte Carlo simulation study, we compare the performance of the WLE method with the performance of both the approximated maximum likelihood estimation (MLE) and the robust M-estimator proposed by Beran (Statistics for Long-Memory Processes, Chapman & Hall, London, 1994). We find that robustness against the two types of considered outliers can be achieved without loss of efficiency. Moreover, as a byproduct of the procedure, we can classify the suspicious observations into different kinds of outliers. Finally, we apply the proposed methodology to the Nile River annual minima time series. Keywords: ARFIMA processes, outliers, robust estimation, weighted likelihood.
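For readers unfamiliar with the fractional parameter d, the sketch below shows the fractional differencing filter (1 - B)^d that defines ARFIMA processes. It does not implement the WLE method. The weights follow the standard binomial-expansion recursion; the function names are illustrative, and the filter is truncated at the start of the series, which assumes that pre-sample values are zero.

```python
def frac_diff_weights(d, n_terms):
    """First n_terms coefficients of (1 - B)^d, with B the backshift operator."""
    weights = [1.0]
    for k in range(1, n_terms):
        # pi_k = pi_{k-1} * (k - 1 - d) / k
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

def frac_diff(series, d):
    """Apply the truncated filter (1 - B)^d to a series (pre-sample values = 0)."""
    w = frac_diff_weights(d, len(series))
    return [
        sum(w[k] * series[t - k] for k in range(t + 1))
        for t in range(len(series))
    ]
```

With d = 1 the weights reduce to 1, -1, 0, 0, ..., i.e. ordinary first differencing; for 0 < d < 0.5 they decay slowly, which is the source of the long memory.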
A robust seemingly unrelated regressions for row-wise and cell-wise contamination
The Seemingly Unrelated Regressions (SUR) model is widely used in econometrics, insurance and finance, where the regression model very often contains more than one equation. The unknown parameters, i.e. the regression coefficients and the covariances among the error terms, are estimated using algorithms based on Generalized Least Squares or Maximum Likelihood, and the method, as a whole, is very sensitive to outliers. To overcome this problem, M-estimators and S-estimators have been proposed in the literature, together with fast algorithms. However, these procedures are only able to cope with row-wise outliers in the error terms, while their performance becomes very poor in the presence of cell-wise outliers and as the number of equations increases. A new robust approach is proposed which performs well under both contamination types and is fast to compute. Illustrations based on Monte Carlo simulations and a real data example are provided.
Network depth: Identifying median and contours in complex networks
Centrality descriptors are widely used to rank nodes according to specific concept(s) of importance. Despite the large number of centrality measures available nowadays, it is still poorly understood how to identify the node which can be considered as the 'centre' of a complex network. In fact, this problem corresponds to finding the median of a complex network. The median is a non-parametric (or better, distribution-free) and robust estimator of the location parameter of a probability distribution. In this work, we present the statistical and most natural generalization of the concept of median to the realm of complex networks, discussing its advantages for defining the centre of the system and percentiles around that centre. To this aim, we introduce a new statistical data depth and we apply it to networks embedded in a geometric space induced by different metrics. The application of our framework to empirical networks allows us to identify central nodes which are socially or biologically relevant.
Weighted likelihood methods for robust fitting of wrapped models for p-torus data
We consider robust estimation of wrapped models for multivariate circular data, that is, points on the surface of a p-torus, based on the weighted likelihood methodology. Robust model fitting is achieved by a set of weighted likelihood estimating equations, based on the computation of data-dependent weights aimed at down-weighting anomalous values, such as unexpected directions that do not share the main pattern of the bulk of the data. Weighted likelihood estimating equations with weights evaluated on the torus or obtained after unwrapping the data onto the Euclidean space are proposed and compared. Asymptotic properties and robustness features of the estimators are studied, whereas their finite sample behavior is investigated by Monte Carlo numerical experiments and real data examples.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
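The two counting schemes can be contrasted with a minimal sketch. It is an illustration under stated assumptions rather than the study's procedure: a pair of authors is counted once per citing paper in which both appear among the cited authors, co-authors of the same cited work are counted as a pair too, and the function name, the `max_authors` parameter and the placeholder author labels are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs co-cited by the same citing paper.

    citing_papers: list of reference lists; each reference is a list of authors.
    max_authors=1 gives first-author counting; max_authors=5 takes the first five.
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for author_list in references:
            cited_authors.update(author_list[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts

# Placeholder data: two citing papers, each citing two works.
papers = [
    [["A", "B"], ["C"]],
    [["A", "D"], ["C", "E"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

On this toy input, first-author counting yields a single pair, while counting the first five authors brings in the secondary authors and produces many more pairs from the same citations.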
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
