Increased expression of CD40 ligand in activated CD4+ T lymphocytes of systemic sclerosis patients.

Author: R. Bisogni
Annalisa Lamberti
Maria Fiammetta Romano
C. Naclerio
M. C. Turco
G. Valentini
S. Venuta
Publication date: 01/01/2000

*G. Valentini and M.F. Romano contributed equally to this work.

ePublications

Archivio della ricerca - Università degli studi di Napoli Federico II

A neural model for the prediction of pathogenic genomic variants in Mendelian diseases

Author: L. Cappelletti
A. Cuzzocrea
G. Valentini
Publication date: 2019

The detection of pathogenic genomic variants associated with genetic diseases or cancer is an open problem in Genomic Medicine. Detecting mutations in the non-coding regions of the human genome is an especially challenging machine learning problem, since neutral variants largely outnumber pathogenic ones, resulting in highly imbalanced classification problems. We applied neural networks to the detection of pathogenic regulatory genomic variants in Mendelian diseases and showed that, by leveraging imbalance-aware techniques and deep learning algorithms, we can obtain state-of-the-art results using a less complex model than those proposed in the literature for this challenging prediction task.

AIR Universita degli studi di Milano

Prediction of gene function using ensembles of SVMs and heterogeneous data sources

Author: Matteo Re
Giorgio Valentini
Publication date: 01/01/2009

The ever-increasing amount of biomolecular data available in public domain databases for a broad range of organisms, coupled with recent advances in machine learning research, has stimulated interest in computational approaches to gene function prediction. In this context, data integration from heterogeneous biomolecular data sources plays a key role. In this contribution we test the performance of several ensembles of SVM classifiers, in which each component learner is trained on a different type of data and the learners are then combined using different aggregation techniques. The combination methods compared are the widely adopted linear weighted combination, the logarithmic weighted combination, and the similarity-based decision templates approach. The results show that heterogeneous data integration through ensemble methods is a valuable line of research in gene function prediction.
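The difference between the linear and logarithmic weighted combinations named in this abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the forms below (a weighted arithmetic mean and a renormalised weighted geometric mean of positive-class probabilities) are assumptions, and the function names `linear_weighted` and `log_weighted` are hypothetical.

```python
import math


def linear_weighted(probs, weights):
    """Weighted arithmetic mean of per-classifier positive-class probabilities."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, probs)) / total


def log_weighted(probs, weights, eps=1e-12):
    """Weighted geometric mean of the probabilities, renormalised against
    the same geometric mean computed for the negative class (assumed form)."""
    total = sum(weights)
    pos = math.exp(sum(w * math.log(max(p, eps)) for w, p in zip(weights, probs)) / total)
    neg = math.exp(sum(w * math.log(max(1.0 - p, eps)) for w, p in zip(weights, probs)) / total)
    return pos / (pos + neg)


# Three hypothetical SVMs, each trained on a different data source.
probs = [0.9, 0.9, 0.01]
weights = [1.0, 1.0, 1.0]
linear_score = linear_weighted(probs, weights)
log_score = log_weighted(probs, weights)
```

In this toy case a single strongly negative classifier pulls the logarithmic score down further than the linear one, which is the usual qualitative difference between product-style and sum-style rules.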

Crossref

AIR Universita degli studi di Milano

DDAG K-TIPCAC: an ensemble method for protein subcellular localization

Author: M. Re
G. Valentini
G. Lombardi
A. Rozza
E. Casiraghi
Publication date: 2010

Protein subcellular location prediction is one of the most difficult multiclass prediction problems in modern computational biology. Many methods have been proposed in the literature to solve this problem, but all existing approaches have limitations. In this contribution we propose a novel method for protein subcellular location prediction that performs multiclass classification by combining kernel classifiers through a DDAG. Each base classifier, called K-TIPCAC, projects the points onto a Fisher subspace estimated on the training data by means of a novel technique. Experimental results indicate that DDAG K-TIPCAC performs as well as, if not better than, state-of-the-art ensemble methods for protein subcellular location prediction.
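How a decision DAG turns pairwise classifiers into a multiclass decision can be sketched with the standard list-elimination evaluation. This is an illustration only: the K-TIPCAC base classifier is not reproduced, and the nearest-centroid rule, the `centroids` values, and the function names are hypothetical stand-ins.

```python
def ddag_predict(classes, pairwise, x):
    """Evaluate a decision DAG: repeatedly test the first class against the
    last one and drop the loser until a single class remains."""
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if pairwise(first, last, x) == first:
            remaining.pop()      # 'last' is eliminated
        else:
            remaining.pop(0)     # 'first' is eliminated
    return remaining[0]


# Toy stand-in for a trained pairwise classifier: pick the class whose
# (hypothetical) 1-D centroid is closer to x.
centroids = {"nucleus": 0.0, "cytoplasm": 5.0, "membrane": 10.0}


def closer_centroid(a, b, x):
    return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b


predicted = ddag_predict(list(centroids), closer_centroid, 4.2)
```

With N classes, the DAG evaluates N - 1 pairwise classifiers per example, rather than all N(N - 1)/2 pairs as in one-vs-one voting.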

AIR Universita degli studi di Milano

Clusterv: a tool for assessing the reliability of clusters discovered in DNA microarray data

Author: G. Valentini
Publication date: 01/01/2006

We present a new R package for assessing the reliability of clusters discovered in high-dimensional DNA microarray data. The package implements methods based on random projections that approximately preserve distances between examples in the projected subspaces.
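The distance-preservation property mentioned here can be checked numerically with a small sketch. It is written in Python rather than R and does not use the package itself; the Gaussian projection matrix, the dimensions `d` and `k`, and all function names are assumptions chosen for illustration.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """k x d matrix with i.i.d. N(0, 1/k) entries, so that squared
    distances are preserved in expectation."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional point to k dimensions."""
    return [sum(r * v for r, v in zip(row, x)) for row in matrix]


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


rng = random.Random(0)
d, k = 500, 200                       # original and projected dimensions
x = [rng.random() for _ in range(d)]  # two synthetic "expression profiles"
y = [rng.random() for _ in range(d)]
R = random_projection_matrix(d, k, rng)

# Ratio of projected to original distance; it should stay close to 1.
ratio = euclidean(project(R, x), project(R, y)) / euclidean(x, y)
```

Clusterings computed in several such projected subspaces can then be compared with one another; that comparison step is not shown here.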

AIR Universita degli studi di Milano

Hierarchical ensemble methods for protein function prediction

Author: G. Valentini
Publication date: 2014

Protein function prediction is a complex multiclass multilabel classification problem, characterized by multiple issues such as the incompleteness of the available annotations, the integration of multiple sources of high-dimensional biomolecular data, the imbalance of several functional classes, and the difficulty of unambiguously determining negative examples. Moreover, the hierarchical relationships between functional classes that characterize both the Gene Ontology and FunCat taxonomies motivate the development of hierarchy-aware prediction methods, which have shown significantly better performance than hierarchy-unaware “flat” prediction methods. In this paper, we provide a comprehensive review of hierarchical methods for protein function prediction based on ensembles of learning machines. According to this general approach, a separate learning machine is trained to learn a specific functional term and the resulting predictions are then assembled in a “consensus” ensemble decision, taking into account the hierarchical relationships between classes. The main hierarchical ensemble methods proposed in the literature are discussed in the context of existing computational methods for protein function prediction, highlighting their characteristics, advantages, and limitations. Finally, open problems in this exciting research area of computational biology are considered, outlining novel perspectives for future research.

Crossref

AIR Universita degli studi di Milano

An experimental bias-variance analysis of SVM ensembles based on resampling techniques

Author: G. Valentini
Publication date: 2005

Recently, bias-variance decomposition of error has been used as a tool to study the behavior of learning algorithms and to develop new ensemble methods well suited to the bias-variance characteristics of base learners. We propose methods and procedures, based on Domingos' unified bias-variance theory, to evaluate and quantitatively measure the bias-variance decomposition of error in ensembles of learning machines. We apply these methods to study and compare the bias-variance characteristics of single support vector machines (SVMs) and ensembles of SVMs based on resampling techniques, and their relationships with the cardinality of the training samples. In particular, we present an experimental bias-variance analysis of bagged and random aggregated ensembles of SVMs in order to verify their theoretical variance reduction properties. The experimental bias-variance analysis quantitatively characterizes the relationships between bagging and random aggregating, and explains why ensembles built on small subsamples of the data work with large databases. Our analysis also suggests new directions for research to improve on classical bagging.
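For a single test point under 0/1 loss, the quantities in Domingos' decomposition can be computed as in the sketch below. It assumes a noise-free, two-class setting; the function names and the toy predictions are hypothetical, and the paper's own experimental procedure is not reproduced.

```python
from collections import Counter


def bias_variance_01(predictions, target):
    """Bias and variance at one test point under 0/1 loss.

    predictions: labels predicted by models trained on different training sets.
    target: the true label (noise-free setting assumed).
    """
    main = Counter(predictions).most_common(1)[0][0]   # main (modal) prediction
    bias = int(main != target)                         # 0 or 1
    variance = sum(p != main for p in predictions) / len(predictions)
    return main, bias, variance


def expected_loss_two_class(bias, variance):
    """Two-class, noise-free case: variance adds to the loss at unbiased
    points and subtracts from it at biased points."""
    return bias + (1 - 2 * bias) * variance


preds = [1, 1, 0, 1]                     # four hypothetical models, one test point
main, bias, variance = bias_variance_01(preds, target=1)
loss = expected_loss_two_class(bias, variance)
```

Averaging these per-point values over a test set gives the average bias and the unbiased and biased variance components that such an analysis would typically report.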

AIR Universita degli studi di Milano

True path rule hierarchical ensembles for genome-wide gene function prediction

Author: G. Valentini
Publication date: 2011

Gene function prediction is a complex computational problem, characterized by several items: the number of functional classes is large, and a gene may belong to multiple classes; functional classes are structured according to a hierarchy; classes are usually unbalanced, with more negative than positive examples; class labels can be uncertain and the annotations largely incomplete; to improve the predictions, multiple sources of data need to be properly integrated. In this contribution we focus on the first three items, and in particular on the development of a new method for hierarchical genome-wide and ontology-wide gene function prediction. The proposed algorithm is inspired by the “true path rule” that governs both the Gene Ontology and FunCat taxonomies. According to this rule, the proposed True Path Rule (TPR) ensemble method is characterized by a two-way asymmetric flow of information that traverses the graph-structured ensemble: positive predictions for a node recursively influence its ancestors, while negative predictions influence its descendants. Cross-validated results with the model organism S. cerevisiae, using 7 different sources of biomolecular data, and a theoretical analysis of the TPR algorithm show the effectiveness and the drawbacks of the proposed approach.
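The two-way asymmetric flow described in this abstract can be illustrated on a small tree. This is a simplified sketch, not the published TPR algorithm: the averaging rule, the fixed threshold, the tree-shaped hierarchy, and the names `tpr_consensus`, `children`, and `scores` are all assumptions.

```python
def tpr_consensus(children, scores, root, threshold=0.5):
    """Simplified true-path-rule style consensus on a tree.

    Bottom-up: a node's consensus score averages its own score with the
    consensus scores of its children that are predicted positive.
    Top-down: a node is labelled positive only if its parent is positive,
    so a negative prediction propagates to every descendant.
    """
    consensus = {}

    def bottom_up(node):
        positive = []
        for child in children.get(node, []):
            score = bottom_up(child)
            if score > threshold:
                positive.append(score)
        consensus[node] = (scores[node] + sum(positive)) / (1 + len(positive))
        return consensus[node]

    labels = {}

    def top_down(node, parent_positive):
        labels[node] = parent_positive and consensus[node] > threshold
        for child in children.get(node, []):
            top_down(child, labels[node])

    bottom_up(root)
    top_down(root, True)
    return consensus, labels


# Hypothetical four-term hierarchy with base-classifier scores.
children = {"root": ["a", "b"], "a": ["a1"]}
scores = {"root": 0.4, "a": 0.6, "a1": 0.9, "b": 0.2}
consensus, labels = tpr_consensus(children, scores, "root")
```

In this toy example the confident leaf `a1` raises the scores of `a` and `root` above the threshold, while `b` stays negative; lowering the score of `a` enough would make `a1` negative as well, whatever its own score.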

AIR Universita degli studi di Milano

Gene expression-based prediction of malignancies

Author: G. Valentini
Publication date: 01/01/2002

Molecular classification of malignancies can potentially stratify patients into distinct subclasses not detectable using traditional classification of tumors, opening new perspectives on the diagnosis and personalized therapy of polygenic diseases. In this paper we present a brief overview of our work on gene expression-based prediction of malignancies, from the dichotomic classification problem of normal versus tumoural tissues, to multiclass cancer diagnosis, to functional class discovery and gene selection problems. The last part of this work presents preliminary results on the application of ensembles of SVMs based on bias-variance decomposition of the error to the analysis of gene expression data of malignant tissues.

AIR Universita degli studi di Milano