
    Single-linkage clustering for optimal classification in piecewise affine regression

    When performing regression with piecewise affine maps, the most challenging task is classifying the data points, i.e. correctly attributing each data point to the affine submodel that most likely generated it. In this paper, we consider a regression scheme similar to the one proposed in (Ferrari-Trecate et al., 2001, 2003) that reduces the classification step to a clustering problem in the presence of outliers. However, instead of the K-means procedure adopted in (Ferrari-Trecate et al., 2001, 2003), we propose the use of single-linkage clustering, which automatically estimates the number of submodels composing the piecewise affine map. Moreover, we prove that, under mild assumptions on the data set, single-linkage clustering can guarantee optimal classification in the presence of bounded noise.
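The key property mentioned in this abstract, that single-linkage clustering does not need the number of clusters in advance, can be illustrated with a minimal sketch. It assumes a hypothetical distance `threshold` at which the dendrogram is cut and plain 2-D points as input; the abstract does not describe the feature vectors or the stopping rule the authors actually use. Cutting a single-linkage dendrogram at a given height is equivalent to taking the connected components of the graph that joins all pairs closer than that height, which is what the union-find loop below does.

```python
from math import dist


def single_linkage_clusters(points, threshold):
    """Cut a single-linkage dendrogram at `threshold`.

    Equivalent to the connected components of the graph linking every
    pair of points whose Euclidean distance is <= threshold.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)  # merge the two components

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for i in range(len(points)):
        labels.append(seen.setdefault(find(i), len(seen)))
    return labels


# Toy data (an assumption): two tight groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
labels = single_linkage_clusters(points, threshold=0.5)
n_clusters = len(set(labels))  # found by the cut, not supplied by the caller
```

The isolated point ends up in a cluster of its own, which is one simple way outliers can be spotted and set aside after clustering.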

    Bagged ensembles of Support Vector Machines for gene expression data analysis

    Extracting information from gene expression data is a difficult task, as these data are characterized by very high-dimensional, small-sized samples and a large degree of biological variability. However, a possible way of dealing with the curse of dimensionality is offered by feature selection algorithms, while variance problems arising from small samples and biological variability can be addressed through ensemble methods based on resampling techniques. These two approaches have been combined to improve the accuracy of Support Vector Machines (SVMs) in the classification of malignant tissues from DNA microarray data. To assess the accuracy and confidence of the predictions, suitable measures have been introduced. The results presented show that bagged ensembles of SVMs are more reliable and achieve equal or better classification accuracy than single SVMs, whereas feature selection methods can further enhance classification accuracy.
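As a rough illustration of bagging applied to SVMs, the sketch below trains several linear SVMs on bootstrap resamples and combines them by majority vote. Everything specific here is an assumption rather than the authors' setup: the base learner is a bias-free linear SVM trained with a Pegasos-style stochastic subgradient method, the data are toy 2-D points rather than microarray profiles, and the vote fraction returned by `predict` is only a simple stand-in for the confidence measures the abstract mentions.

```python
import random


def train_linear_svm(samples, lam=0.1, steps=200, rng=None):
    """Pegasos-style stochastic subgradient descent for a linear SVM (no bias)."""
    rng = rng or random.Random(0)
    w = [0.0] * len(samples[0][0])
    for t in range(1, steps + 1):
        x, y = rng.choice(samples)
        eta = 1.0 / (lam * t)                       # decaying step size
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        w = [(1.0 - eta * lam) * wi for wi in w]    # shrink (regularisation)
        if margin < 1.0:                            # hinge loss is active
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def bagged_svms(samples, n_models=11, seed=0):
    """Train one SVM per bootstrap resample of the training set."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]  # with replacement
        models.append(train_linear_svm(bootstrap, rng=rng))
    return models


def predict(models, x):
    """Majority vote plus the fraction of models agreeing with it."""
    votes = [1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1 for w in models]
    label = 1 if sum(votes) >= 0 else -1
    return label, votes.count(label) / len(votes)


# Toy, linearly separable data (an assumption, not gene expression data).
samples = [((2.0, 1.5), 1), ((1.0, 2.5), 1), ((3.0, 2.0), 1),
           ((-2.0, -1.0), -1), ((-1.5, -2.5), -1), ((-3.0, -2.0), -1)]
models = bagged_svms(samples)
label, agreement = predict(models, (2.5, 2.0))
```

An odd number of models avoids tied votes in the two-class case.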

    Cancer recognition with bagged ensembles of Support Vector Machines

    Expression-based classification of tumors requires stable, reliable, variance-reducing methods, as DNA microarray data are characterized by small sample size, high dimensionality, noise and large biological variability. To address the variance and curse-of-dimensionality problems arising from this difficult task, we propose applying bagged ensembles of Support Vector Machines (SVMs) and feature selection algorithms to the recognition of malignant tissues. The results presented show that bagged ensembles of SVMs are more reliable and achieve equal or better classification accuracy than single SVMs, whereas feature selection methods can further enhance classification accuracy.
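The abstract does not say which feature selection algorithms were used. As one plausible illustration only, the sketch below ranks features (genes) with a simple signal-to-noise filter, scoring each feature by the absolute difference of class means divided by the sum of class standard deviations. The function name `rank_features`, the toy matrix `X` and the labels `y` are all hypothetical.

```python
from statistics import mean, pstdev


def rank_features(X, y, top_k):
    """Return indices of the top_k features by |mu_pos - mu_neg| / (sd_pos + sd_neg)."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        spread = (pstdev(a) + pstdev(b)) or 1e-12  # guard against zero spread
        scores.append((abs(mean(a) - mean(b)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)[:top_k]]


# Toy data: rows are samples, columns are features; only column 1
# separates the two classes clearly.
X = [[0.1, 5.0, 1.0], [0.3, 5.2, 0.0], [0.2, 4.8, 1.0],
     [0.2, 1.0, 0.0], [0.1, 1.1, 1.0], [0.3, 0.9, 0.0]]
y = [1, 1, 1, -1, -1, -1]
selected = rank_features(X, y, top_k=2)
```

A filter like this would typically be applied before training, so that each classifier in an ensemble sees only the selected columns.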

    Modeling gene expression data via positive Boolean functions

    In this work we propose an artificial model for the generation of biologically plausible gene expression data to be used in evaluating the performance of gene selection and clustering methods. The model allows the set of relevant genes and the functional classes involved in the problem to be fixed in advance; the input-output relationship is constructed by synthesizing a positive Boolean function. Despite its simplicity, it is sufficiently rich to take account of the specific peculiarities of gene expression data, including biological variability. Java code has been developed to let users choose the model parameters according to the characteristics of the experiment they want to simulate. This makes it possible to insert the artificial model into a distributed system for microarray analysis, in particular one based on a Grid infrastructure.
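A positive (monotone) Boolean function can be written as an OR of ANDs with no negated inputs, so switching any input from 0 to 1 can never switch the output from 1 to 0. The sketch below uses that form to label synthetic samples. It is a hedged illustration, not the paper's model: the `terms` encoding, the 0/1 base expression levels, the Gaussian noise standing in for biological variability, and all parameter names are assumptions, and it is written in Python rather than the Java mentioned in the abstract.

```python
import random


def positive_boolean(bits, terms):
    """OR of ANDs over the listed gene indices, with no negations."""
    return any(all(bits[i] for i in term) for term in terms)


def generate_samples(n_samples, n_genes, terms, noise=0.1, seed=0):
    """Draw random on/off gene states, add noise, label via the Boolean function."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        bits = [rng.random() < 0.5 for _ in range(n_genes)]
        expression = [(1.0 if b else 0.0) + rng.gauss(0.0, noise) for b in bits]
        data.append((expression, int(positive_boolean(bits, terms))))
    return data


# Relevant genes are 0, 1 and 2: class = (g0 AND g1) OR g2.
# Genes 3..9 are irrelevant by construction.
terms = [(0, 1), (2,)]
data = generate_samples(n_samples=20, n_genes=10, terms=terms)
```

Because the relevant genes are fixed by `terms`, the output of a gene selection method run on `data` can be compared against a known ground truth.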

    A New Learning Method for Piecewise Linear Regression

    A new connectionist model for the solution of piecewise linear regression problems is introduced; it is able to reconstruct both continuous and non-continuous real-valued mappings starting from a finite set of possibly noisy samples. The approximating function can assume a different linear behavior in each region of an unknown polyhedral partition of the input domain. The proposed learning technique combines local estimation, clustering in weight space, multicategory classification and linear regression in order to achieve the desired result. Through this approach piecewise affine solutions for general nonlinear regression problems can also be found.
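The pipeline named in this abstract can be sketched in one dimension under strong simplifying assumptions. The code below performs local estimation (a least-squares line over a small sliding window around each sample), groups the resulting (slope, intercept) vectors with a greedy leader clustering used here as a stand-in for whatever clustering the paper employs, discards small clusters, and refits one line per remaining cluster. The multicategory classification step that estimates the regions is omitted, and `window`, `threshold` and `min_size` are hypothetical parameters.

```python
from math import dist


def fit_line(pts):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(pts)
    xm = sum(x for x, _ in pts) / n
    ym = sum(y for _, y in pts) / n
    sxx = sum((x - xm) ** 2 for x, _ in pts)
    sxy = sum((x - xm) * (y - ym) for x, y in pts)
    a = sxy / sxx
    return a, ym - a * xm


def piecewise_fit(pts, window=3, threshold=0.5, min_size=3):
    pts = sorted(pts)
    n = len(pts)
    # 1. Local estimation: one (slope, intercept) vector per sample.
    weights = []
    for i in range(n):
        start = max(0, min(i - window // 2, n - window))
        weights.append(fit_line(pts[start:start + window]))
    # 2. Clustering in weight space (greedy leader clustering, a stand-in).
    clusters = []  # list of (leader_weight, member_indices)
    for i, w in enumerate(weights):
        for leader, members in clusters:
            if dist(leader, w) <= threshold:
                members.append(i)
                break
        else:
            clusters.append((w, [i]))
    # 3. Drop small clusters (windows straddling two regions), refit the rest.
    return [fit_line([pts[i] for i in members])
            for _, members in clusters if len(members) >= min_size]


# Noise-free toy data from a discontinuous map:
# y = 2x + 1 for x < 0 and y = -x + 4 for x >= 0.
xs = [-3.0 + 0.5 * i for i in range(12)]
pts = [(x, 2 * x + 1 if x < 0 else -x + 4) for x in xs]
models = piecewise_fit(pts)
```

Windows that straddle the discontinuity produce weight vectors far from both true submodels; they form tiny clusters and are filtered out by `min_size`.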