A Geometric Approach to Subset Selection and Sparse Sufficient Dimension Reduction
Sufficient dimension reduction methods make it possible to estimate lower-dimensional subspaces while retaining most of the information about the regression of a response variable on a set of predictors. However, it may happen that only a subset of the predictors is actually required. We propose a geometric approach to subset selection by imposing sparsity constraints on certain coefficients which determine the estimated directions. This method can be applied to most existing dimension reduction methods, such as sliced inverse regression and sliced average variance estimation, and may help to improve the estimation accuracy and facilitate interpretation.
dispmod: Dispersion models
An R package for modelling dispersion in Generalized Linear Models
GA: A Package for Genetic Algorithms in R
Genetic algorithms (GAs) are stochastic search algorithms inspired by the basic principles of biological evolution and natural selection. GAs simulate the evolution of living organisms, where the fittest individuals dominate over the weaker ones, by mimicking the biological mechanisms of evolution, such as selection, crossover and mutation. GAs have been successfully applied to solve optimization problems, both for continuous (whether differentiable or not) and discrete functions.
This paper describes the R package GA, a collection of general-purpose functions that provide a flexible set of tools for applying a wide range of genetic algorithm methods. Several examples are discussed, ranging from mathematical functions in one and two dimensions known to be hard to optimize with standard derivative-based methods, to some selected statistical problems which require the optimization of user-defined objective functions.
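The selection, crossover and mutation steps named in this abstract can be illustrated with a minimal sketch. It is written in plain Python and is not the GA package's API: the function name `genetic_minimize`, its parameters, and the particular operators (binary tournament selection, blend crossover, uniform random mutation, one-individual elitism) are all assumptions chosen for brevity.

```python
import random


def genetic_minimize(f, lower, upper, pop_size=40, generations=60,
                     crossover_prob=0.8, mutation_prob=0.1, seed=0):
    """Minimise a one-dimensional function f on [lower, upper].

    Hypothetical illustration of a real-valued genetic algorithm;
    the operator choices are assumptions, not those of any package.
    """
    rng = random.Random(seed)
    # Initial population: random points in the search interval.
    pop = [rng.uniform(lower, upper) for _ in range(pop_size)]
    best = min(pop, key=f)
    for _ in range(generations):
        # Elitism: the best individual so far always survives.
        new_pop = [best]
        while len(new_pop) < pop_size:
            # Selection: binary tournament, the fitter of two random
            # individuals becomes a parent.
            p1 = min(rng.sample(pop, 2), key=f)
            p2 = min(rng.sample(pop, 2), key=f)
            # Crossover: blend the two parents with a random weight.
            if rng.random() < crossover_prob:
                w = rng.random()
                child = w * p1 + (1 - w) * p2
            else:
                child = p1
            # Mutation: occasionally replace the child with a fresh
            # random point to keep exploring the interval.
            if rng.random() < mutation_prob:
                child = rng.uniform(lower, upper)
            new_pop.append(child)
        pop = new_pop
        best = min(pop, key=f)
    return best
```

Because only function values are compared, the same loop applies to objectives that are not differentiable, which is the setting the abstract highlights.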
Poisson change-point models estimated by genetic algorithms
Change-point analysis aims at both detecting whether or not a sharp change has occurred, or whether several changes might have occurred, and identifying the times of any such changes. Numerous approaches to conduct a change-point analysis are available in the literature. In this paper we propose the use of Genetic Algorithms (GAs) for estimating Poisson change-point models. GAs are stochastic search and optimisation techniques inspired by natural evolution. They provide a robust and flexible framework that can be applied to a wide range of learning and optimisation problems, in particular when traditional optimisation techniques break down. A data analysis on the annual number of patients with haemolytic uremic syndrome is presented, with change-point models estimated using the GA R package.
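To make the objective in a Poisson change-point model concrete, the sketch below scores every possible position of a single change point by its profile log-likelihood and returns the best one. This is an assumption-laden illustration, not the paper's method: it uses exhaustive search rather than a genetic algorithm, handles only one change point, and the function names `segment_loglik` and `best_single_changepoint` are hypothetical. With several change points the number of candidate configurations grows quickly, which is where a stochastic search such as a GA becomes attractive.

```python
import math


def segment_loglik(counts):
    """Poisson log-likelihood of one segment at its MLE rate (the mean),
    dropping the log(y!) terms, which do not depend on the rate."""
    total = sum(counts)
    if total == 0:
        # Rate estimate is 0 and every count is 0: the contribution is 0.
        return 0.0
    rate = total / len(counts)
    return total * math.log(rate) - len(counts) * rate


def best_single_changepoint(counts):
    """Return (k, loglik): the rate changes after the first k observations.

    Hypothetical helper: tries every split with two non-empty segments.
    """
    best_k, best_ll = None, -math.inf
    for k in range(1, len(counts)):
        ll = segment_loglik(counts[:k]) + segment_loglik(counts[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k, best_ll
```

A GA-based version would treat the vector of change-point positions as the individual and use the summed segment log-likelihood as the fitness function.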
GA: Genetic Algorithms
An R package for optimization using genetic algorithms. The package provides a flexible general-purpose set of tools for implementing genetic algorithm search in both the continuous and discrete case, whether constrained or not. Users can easily define their own objective function depending on the problem at hand. Several genetic operators are available and can be combined to explore the best settings for the current task. Furthermore, users can define new genetic operators and easily evaluate their performance.
Graphical Tools for Model-based Mixture Discriminant Analysis
Visualization and graphics can play an important role in understanding discriminant analysis. Fisher's canonical variates provide a graphical counterpart to linear discriminant analysis (LDA). For quadratic discriminant analysis (QDA) there is no standard graphical representation, although some dimension reduction methods have been discussed in the literature.
In this contribution we propose a graphical method to be used in conjunction with model-based mixture discriminant analysis. Depending on the number of mixture components for each class and the adopted intraclass covariance matrices, the estimated subspace is able to show the main geometric characteristics of the fitted mixture model. The proposal reduces to the usual Fisher's canonical variates when a single mixture component with common intraclass covariance matrix is used. If the intraclass covariance matrices are unconstrained, the estimated subspace is equivalent to that provided by SAVE, a graphical method proposed for use in QDA.
Graphical tools for model-based mixture discriminant analysis
The paper introduces a methodology for visualizing on a dimension reduced subspace the classification structure and the geometric characteristics induced by an estimated Gaussian mixture model for discriminant analysis. In particular, we consider the case of mixture of mixture models with varying parametrization which allow for parsimonious models. The approach is an extension of an existing work on reducing dimensionality for model-based clustering based on Gaussian mixtures. Information on the dimension reduction subspace is provided by the variation on class locations and, depending on the estimated mixture model, on the variation on class dispersions. Projections along the estimated directions provide summary plots which help to visualize the structure of the classes and their characteristics. A suitable modification of the method allows us to recover the most discriminant directions, i.e., those that show maximal separation among classes. The approach is illustrated using simulated and real data. © 2013 Springer-Verlag Berlin Heidelberg
Dimension reduction for model-based clustering
We introduce a dimension reduction method for visualizing the clustering structure obtained from a finite mixture of Gaussian densities. Information on the dimension reduction subspace is obtained from the variation on group means and, depending on the estimated mixture model, on the variation on group covariances. The proposed method aims at reducing the dimensionality by identifying a set of linear combinations of the original features, ordered by importance as quantified by the associated eigenvalues, which capture most of the cluster structure contained in the data. Observations may then be projected onto such a reduced subspace, thus providing summary plots which help to visualize the clustering structure. These plots can be particularly appealing in the case of high-dimensional data and noisy structure. The newly constructed variables capture most of the clustering information available in the data, and they can be further reduced to improve clustering performance. We illustrate the approach on both simulated and real data sets. © 2009 Springer Science+Business Media, LLC
