Analysis of association football playing styles: An innovative method to cluster networks
Selecting the training set in classification problems with rare events
Binary classification algorithms are often used in situations where one of the two classes is extremely rare. A common practice is to oversample units of the rare class when forming the training set. For some classification algorithms, such as logistic classification, there are theoretical results that justify this approach. Similar results are not available for other popular classification algorithms, such as classification trees. In this paper, the use of balanced datasets for tree classifiers and boosting algorithms when dealing with rare classes is discussed, and results from analyzing a real dataset and a simulated dataset are reported.
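The oversampling practice mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `oversample_rare_class`, the toy data, and the choice to resample with replacement until both classes have equal size are all assumptions made for illustration.

```python
import random


def oversample_rare_class(rows, labels, rare_label=1, seed=0):
    """Balance a training set by resampling the rare class with replacement.

    Hypothetical helper for illustration only; equal class sizes are an
    assumed target, not a recommendation taken from the paper.
    """
    rng = random.Random(seed)
    rare = [i for i, y in enumerate(labels) if y == rare_label]
    common = [i for i, y in enumerate(labels) if y != rare_label]
    if not rare:
        raise ValueError("no units of the rare class to resample")
    # Draw extra rare units (with replacement) until the classes match in size.
    extra = [rng.choice(rare) for _ in range(max(0, len(common) - len(rare)))]
    idx = common + rare + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Toy data: 5 rare units (label 1) out of 100.
rows = [[float(i)] for i in range(100)]
labels = [1 if i % 20 == 0 else 0 for i in range(100)]
X_bal, y_bal = oversample_rare_class(rows, labels)
```

The balanced lists `X_bal` and `y_bal` could then be passed to any classifier. Predicted probabilities from a model fitted on resampled data generally no longer reflect the original class frequencies.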
Relationship Between Nasal Cycle, Nasal Symptoms and Nasal Cytology
Background: The nasal cycle is the spontaneous congestion and decongestion of the nasal mucosa that occurs during the day. Classically, 4 types of nasal cycle patterns have been described: (1) classic, (2) parallel, (3) irregular, and (4) acyclic. The hypothalamus is considered the central regulator, although several external factors may influence its activity. Objective: The aim of the study was to evaluate whether the nasal cycle pattern correlates with nasal cytology and nasal symptoms. Methods: Thirty healthy volunteers were enrolled in the study. All subjects completed a Sino-Nasal Outcome Test-22 questionnaire and a Visual Analog Scale (VAS) for nasal obstruction. The nasal cycle was studied by means of peak nasal inspiratory flow. Nasal cytology was used to evaluate the presence of local nasal inflammation. Results: Nineteen subjects showed a parallel nasal cycle pattern, while 11 showed a regular one. A parallel pattern was present in 60% of asymptomatic subjects and in 67% of symptomatic ones (P = 1). VAS scores for nasal obstruction did not differ significantly between the 2 nasal cycle patterns (P = .398). Seventeen subjects had a normal rhinocytogram, while 13 volunteers showed neutrophilic rhinitis; 53.8% of the subjects with neutrophilic rhinitis showed a parallel pattern, while the remaining 46.2% had a regular one. Among subjects with normal cytology, 70.6% had a parallel pattern and 29.4% had a regular one. Differences between the 2 groups were not statistically significant (P = .575). Conclusion: Rhinitis with neutrophils does not seem to influence the nasal cycle pattern. Based on the present results, the nasal cycle pattern does not influence the subjective sensation of nasal obstruction.
Enriched Pitman–Yor processes
Bayesian non-parametrics has evolved into a broad area encompassing flexible methods for Bayesian inference, combinatorial structures, tools for complex data reduction, and more. Discrete prior laws play an important role in these developments, and various choices are available nowadays. However, many existing priors, such as the Dirichlet process, have limitations if data require nested clustering structures. Thus, we introduce a discrete non-parametric prior, termed the enriched Pitman–Yor process, which offers higher flexibility in modeling such elaborate partition structures. We investigate the theoretical properties of this novel prior and establish its formal connection with the enriched Dirichlet process and normalized random measures. Additionally, we present a square-breaking representation and derive closed-form expressions for the posterior law and associated urn schemes. Furthermore, we demonstrate that several established models, including Dirichlet processes with a spike-and-slab base measure and mixture of mixtures models, emerge as special instances of the enriched Pitman–Yor process, which therefore serves as a unified probabilistic framework for various Bayesian non-parametric priors. To illustrate its practical utility, we employ the enriched Pitman–Yor process for a species-sampling ecological problem
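As background for the urn schemes mentioned in the abstract above, the following sketch simulates the standard (non-enriched) Pitman–Yor urn scheme. It is not the enriched process introduced in the paper. The function name `pitman_yor_partition` and the parameter names `discount` and `strength` are assumptions for illustration. With `i` units already allocated to `k` clusters, a new unit joins an existing cluster of size `n_j` with probability `(n_j - discount) / (strength + i)` and opens a new cluster with probability `(strength + discount * k) / (strength + i)`.

```python
import random


def pitman_yor_partition(n, discount, strength, seed=0):
    """Simulate cluster sizes from the standard Pitman-Yor urn scheme.

    Illustrative sketch only (not the enriched process); assumes
    0 <= discount < 1 and strength > -discount.
    """
    if not (0 <= discount < 1) or strength <= -discount:
        raise ValueError("need 0 <= discount < 1 and strength > -discount")
    rng = random.Random(seed)
    sizes = []
    for i in range(n):
        if i == 0:
            sizes.append(1)  # the first unit always opens a new cluster
            continue
        k = len(sizes)
        # Unnormalized weights: existing clusters first, then a new cluster.
        weights = [s - discount for s in sizes] + [strength + discount * k]
        j = rng.choices(range(k + 1), weights=weights)[0]
        if j == k:
            sizes.append(1)
        else:
            sizes[j] += 1
    return sizes


sizes = pitman_yor_partition(n=200, discount=0.5, strength=1.0)
```

Setting `discount=0` recovers the urn scheme of the Dirichlet process; larger discounts tend to produce more, smaller clusters.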
PLS for classification
Partial Least Squares regression (PLS) is a multivariate technique developed to perform regression in the case of multivariate responses when multicollinearity, redundancy and noise affect the predictors. Although several efforts have been made to extend PLS to classification problems, this is still an active field of research. In the present study, a new technique called PLS for classification is introduced to solve the general G-class problem. It is developed within a self-consistent framework based on linear algebra and on the theory of compositional data. After the introduction of the notion of probability-data vector, the space of the predictors and that of the conditional probabilities are linked, and a well-defined least squares problem, whose solution specifies the relationship between probabilities and predictors, is solved by a suitable reformulation of PLS2. The method directly estimates the conditional probability of class membership given the predictors. The score vectors are introduced only in a second step to improve model interpretation. The main properties of PLS for classification and its relationships with PLS-DA are discussed. One simulated and one real data set are investigated to show how the method works in practice.
Years of Potential Life Lost in the Sardinian population: Limitations and perspectives [Anni potenziali di vita persi nella popolazione Sarda: Limiti e prospettive]
- …
