Low-Rank Analysis of Topic Quality: Comparing LDA, CTM, and Fuzzy-LSA methods

Author: Antonio Calcagnì
Publication date: 01/01/2025

This study evaluates the quality of topic solutions generated by Latent Dirichlet Allocation (LDA), the Correlated Topic Model (CTM), and fuzzy Latent Semantic Analysis (fLSA). By introducing the CL, RL, and HO indices, it focuses on structural properties such as oversimplification, redundancy, and homogeneity, offering a novel approach that complements traditional metrics like coherence and perplexity. The resulting framework provides a nuanced perspective for assessing topic quality.

Archivio istituzionale della ricerca - Università di Padova

Assessing CO2 emissions from electricity generation: a methodological review and comparative analysis

Authors: Marina Bertolini, Pierdomenico Duttilo, Francesco Lisi
Publication date: 01/01/2025

Accurate estimation of greenhouse gas (GHG) emissions is essential to meet carbon neutrality targets, particularly through the calculation of direct CO2 emissions from electricity generation. This work reviews and compares emission factor-based methods for accounting for direct carbon emissions from electricity generation. The emission factor approach is commonly used worldwide. Empirical comparisons are based on emission factors computed using data from the Italian electricity market. The analyses reveal significant differences in the CO2 estimates across methods. This, in turn, highlights the need to select an appropriate method to obtain reliable emission estimates, which could support effective regulatory compliance and informed policy-making. With regard to the market zones of the Italian electricity market in particular, the results underscore the importance of tailoring emission factors to accurately capture regional fuel variations.

Archivio istituzionale della ricerca - Università di Padova

Predicting and Preventing gender-based violence: A strategic framework for long-term change

Authors: Alessia Forciniti, Emma Zavarrone
Publication date: 01/01/2025

Gender-based violence (GBV) remains a critical global issue, requiring proactive prevention strategies to mitigate its long-term impact. This study examines the evolving landscape of GBV prevention, highlighting a shift from reactive interventions to forward-looking strategies. Using the futures cone and three horizons framework, we developed a sustainable model for GBV mitigation. Through Natural Language Processing analysis of survivor narratives, we identified linguistic and semantic patterns that reveal resilience and opportunities for early intervention. Our data-driven approach provides policymakers and advocates with actionable insights to drive systemic change and reduce GBV prevalence.

Apeiron - IULM

Lost in Noise: When cleaning up clouds the picture. Fuzzy topic modeling and robust low-rank decomposition

Authors: Andrea Sciandra, Antonio Calcagnì, Arjuna Tuzzi
Publication date: 01/01/2025

This preliminary study assesses the impact of applying noise-removing techniques, such as Principal Component Pursuit (PCP), to the document-term matrix before topic modeling. Specifically, fuzzy Latent Semantic Analysis (fLSA) is applied to a benchmark dataset of Air France customer reviews to evaluate how different input representations – namely, the standard term-frequency matrix and its approximation obtained by low-rank decomposition – affect topic coherence and interpretability. Initial results indicate that while fLSA effectively extracts meaningful topics, noise removal via PCP introduces distortions that alter the topic structure.

Archivio istituzionale della ricerca - Università di Padova

Academic Resilience Among Low ESCS Students in Italy

Authors: Isabella Sulis, Mariano Porcu, Alessandra Pisu
Publication date: 01/01/2025

Archivio istituzionale della ricerca - Università di Cagliari

Functional Clustering for Survival Curves

Authors: Mariarita De Lucia, Elvira Romano, Fabrizio Maturo
Publication date: 01/01/2025

This paper investigates the underexplored area of clustering multiple survival curves, focusing on the advantages of Functional Data Analysis (FDA) for analyzing survival or hazard functions in a way that exploits their inherently continuous nature. We propose customized functional methods, in particular based on Functional Principal Component Analysis, and compare them with existing methods using two real datasets: the German Breast Cancer Study (GBCS) and the Lung Cancer dataset. The results show that FDA-based methods offer faster execution times and better overall clustering quality, which suggests that FDA is a more natural and efficient approach to clustering survival curves and a promising direction for future survival data analysis.

Archivio Istituzionale della Ricerca - Università degli Studi della Campania "Luigi Vanvitelli"

A Novel Metric for Enhancing Online Review Relevance in E-commerce

Authors: Patrizia Agati, Luisa Stracqualursi
Publication date: 01/01/2025

In e-commerce, online reviews are a crucial resource for consumers, yet their usefulness is often hindered by the overwhelming quantity and variability of information. This study proposes an innovative approach to balancing numerical ratings with the sentiment extracted from review texts, leveraging the VADER (Valence Aware Dictionary and sEntiment Reasoner) model. The proposed metric identifies atypical and incongruent reviews by evaluating the consistency between numerical ratings and the sentiment conveyed in textual content. Through the analysis of real-world review datasets, we demonstrate how this system enhances the relevance of information for consumers, enabling them to navigate reviews with greater ease. Tested on datasets comprising 3 million reviews, the results show that integrating this metric into e-commerce platforms can not only optimize the shopping experience but also give businesses an opportunity to increase transparency and foster customer loyalty. This work contributes to the ongoing discourse on the importance of AI-driven tools in supporting informed decision-making within digital marketing.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna
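
The consistency check between ratings and text sentiment described in the abstract above can be illustrated with a minimal sketch. This is not the authors' metric, whose definition is not given here. It assumes a 1-to-5 star scale and a sentiment score in [-1, 1] (the range of VADER's compound score), and the function names, the threshold, and the sample scores are all hypothetical.

```python
from typing import List, Tuple


def rating_to_unit(rating: float, lo: float = 1.0, hi: float = 5.0) -> float:
    """Linearly map a rating on [lo, hi] to [-1, 1]."""
    return 2.0 * (rating - lo) / (hi - lo) - 1.0


def incongruence(rating: float, compound: float) -> float:
    """Absolute gap between the rescaled rating and the sentiment score.

    `compound` is assumed to lie in [-1, 1], like VADER's compound score.
    The result ranges from 0 (fully consistent) to 2 (opposite extremes).
    """
    return abs(rating_to_unit(rating) - compound)


def flag_incongruent(
    reviews: List[Tuple[float, float]], threshold: float = 1.0
) -> List[Tuple[float, float]]:
    """Return the (rating, compound) pairs whose gap exceeds `threshold`."""
    return [r for r in reviews if incongruence(*r) > threshold]


# Hypothetical (rating, compound) pairs; the scores are made up for illustration.
reviews = [(5, 0.9), (5, -0.8), (1, -0.7), (2, 0.85)]
flagged = flag_incongruent(reviews)
```

In this toy input, a five-star review with strongly negative text and a two-star review with strongly positive text are the ones flagged. In practice the compound scores would come from a sentiment model rather than being typed in by hand.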

Segmenting the spatial distribution of the adjusted dissimilarity index to detect residential segregation of foreigners in Campania

Authors: Rosaria Simone, Federico Benassi
Publication date: 01/01/2025

Residential segregation of the foreign population can depend on several socioeconomic and demographic factors related to both the resident population and the territorial context. Choosing the adjusted dissimilarity index to assess the evenness of the spatial distribution of foreign residents with respect to the Italian population, we propose using conditional inference trees to identify the contextual variables, measured at two spatial domains, that are most strongly associated with the chosen measure of residential segregation in Campania, Italy. The analysis distinguishes between European and non-European foreigners to highlight differences in their settlement models.

Archivio della ricerca - Università degli studi di Napoli Federico II
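
For readers unfamiliar with the evenness measure named in the abstract above, the sketch below computes the classic (unadjusted) dissimilarity index, D = 0.5 * sum_i |a_i/A - b_i/B|, over a set of areal units. The adjustment applied in the paper is not described in this listing and is not reproduced here. The function name and the counts are hypothetical.

```python
from typing import Sequence


def dissimilarity_index(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Classic dissimilarity index between two groups across areal units.

    group_a[i] and group_b[i] are the counts of each group in unit i.
    Returns a value in [0, 1]: 0 means identical spatial distributions,
    1 means the two groups never share a unit.
    """
    if len(group_a) != len(group_b):
        raise ValueError("both groups must be counted over the same units")
    total_a = sum(group_a)
    total_b = sum(group_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each group needs a positive total count")
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# Hypothetical counts of foreign and Italian residents in two municipalities.
foreign = [10, 30]
italian = [20, 20]
d = dissimilarity_index(foreign, italian)
```

The value can be read as the share of one group that would have to move to a different unit for the two distributions to match.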

Improved prediction of 100-meter sprint records

Authors: Giovanni Fonseca, Federica Giummolè, Michele Lambardi Di San Miniato, Valentina Mameli
Publication date: 01/01/2025

In recent years, the prediction of sports records has received increased attention from the scientific community. In particular, evaluating the quality of a record is of great interest. The application of extreme value theory in this context is quite natural. In this work, we use the Gumbel model to analyze the annual speed records in men’s and women’s 100-meter sprint races from 2001 to 2024. We propose a new calibration procedure to correctly estimate the probability of future records and the expected time needed to break the current world record.

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari
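
The two quantities mentioned in the abstract above, the probability of a future record and the expected time to break it, can be sketched with a plain plug-in Gumbel calculation. This is only the uncalibrated baseline: the calibration procedure proposed in the paper is not described in this listing and is not implemented. The location `mu` and scale `beta` below are hypothetical values, not estimates from the paper, and annual maxima are assumed independent and identically distributed.

```python
import math


def gumbel_cdf(x: float, mu: float, beta: float) -> float:
    """CDF of the Gumbel distribution for maxima: exp(-exp(-(x - mu) / beta))."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return math.exp(-math.exp(-(x - mu) / beta))


def exceedance_probability(threshold: float, mu: float, beta: float) -> float:
    """Probability that one year's best speed exceeds `threshold`."""
    return 1.0 - gumbel_cdf(threshold, mu, beta)


def expected_waiting_years(threshold: float, mu: float, beta: float) -> float:
    """Mean number of years until the first exceedance (geometric waiting time).

    Assumes independent, identically distributed annual maxima.
    """
    return 1.0 / exceedance_probability(threshold, mu, beta)


# Hypothetical parameters for annual best average speeds, in metres per second.
mu, beta = 10.2, 0.08
record_speed = 100.0 / 9.58  # average speed of a 9.58-second run
p_break = exceedance_probability(record_speed, mu, beta)
years = expected_waiting_years(record_speed, mu, beta)
```

Working with speeds rather than times turns a faster run into a larger value, so the Gumbel model for maxima applies directly.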

Scoring ordinal variables for constructing composite indicators

Author: Marica Manisera
Publication date: 2013

To provide composite indicators of latent variables, for example of customer satisfaction, it is appropriate to identify the structure of the latent variable in terms of the assignment of items to the subscales that define it. Adopting the reflective model, we investigate the impact of four different methods of scoring ordinal variables on the identification of the true structure of latent variables. A simulation study composed of 5 steps is conducted: (1) simulation of population data with continuous variables measuring a two-dimensional latent variable with known structure; (2) drawing of a number of random samples; (3) discretization of the continuous variables according to different distributional forms; (4) quantification of the ordinal variables obtained in step (3) according to different methods; (5) construction of composite indicators and verification of the correct assignment of variables to subscales by the multiple group method and factor analysis. Results show that the considered scoring methods perform similarly in assigning items to subscales, and that, when the latent variable is multinormal, the distributional form of the observed ordinal variables is not decisive in suggesting the best scoring method to use.

Directory of Open Access Journals