A procedure for the three-mode analysis of compositions

Author: Gallo Michele
Simonacci V
Publication date: 01/01/2013

The Tucker3 model is one of the most widely used tools for factorial analysis of three-way data arrays. When orthogonal factors are extracted, this model can be seen as a three-way PCA (principal component analysis). The Tucker3 model is characterized by extreme flexibility, as it allows for the use of a different number of factors in each mode, and it yields non-unique results. This adaptability makes the Tucker3 model extremely effective for decomposition and compression of data in many applications and fields. When this model is applied to vectors of non-negative values with a sum constraint, all problems connected with the statistical analysis of compositions must be taken into consideration. Like other standard statistical techniques, this model cannot be applied directly to such data. The aim of this paper is to present the theory behind the correct application of the Tucker3 model to compositional data and to describe the TUCKALS3 algorithm.
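
As a rough illustration of the structure described in this abstract, the following NumPy sketch builds a Tucker3 reconstruction with a different number of components in each mode and shows one source of non-uniqueness: rotating a component matrix and counter-rotating the core leaves the fitted array unchanged. The array sizes, the random data and the helper name `tucker3` are assumptions made for the example. It is not the TUCKALS3 algorithm and it does not include the compositional preprocessing the paper discusses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: I units, J variables, K occasions,
# with P, Q, R components in the three modes.
I, J, K = 6, 5, 4
P, Q, R = 3, 2, 2

# Orthonormal component matrices (via QR) and an arbitrary core array.
A = np.linalg.qr(rng.normal(size=(I, P)))[0]
B = np.linalg.qr(rng.normal(size=(J, Q)))[0]
C = np.linalg.qr(rng.normal(size=(K, R)))[0]
G = rng.normal(size=(P, Q, R))

def tucker3(G, A, B, C):
    """Tucker3 structure: x_ijk = sum_pqr g_pqr * a_ip * b_jq * c_kr."""
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

X = tucker3(G, A, B, C)

# Non-uniqueness: rotate A by an orthogonal matrix T and
# counter-rotate the core along its first mode.
T = np.linalg.qr(rng.normal(size=(P, P)))[0]
A_rot = A @ T
G_rot = np.einsum("pqr,ps->sqr", G, T)
X_rot = tucker3(G_rot, A_rot, B, C)

print(np.allclose(X, X_rot))
```

The two parameter sets differ, yet they reproduce the same array, which is why Tucker3 solutions are usually reported up to such transformations.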

Archivio della ricerca - Università degli studi di Napoli Federico II

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

Università degli Studi di Napoli L'Orientale: CINECA IRIS

Improving PARAFAC-ALS performance by initialization

Author: Simonacci V
Gallo M
Publication date: 01/01/2018

The CANDECOMP/PARAFAC (CP) model (Carroll and Chang, 1970; Harshman, 1970) is a trilinear decomposition which provides a low rank approximation of a three-way array in a manner that preserves the multi-mode structure of the data. This is achieved by estimating three sets of parameters, one for each dimension of the array, namely observation units, variables and occasions. The CP model, however, due to an elevated number of degrees of freedom, can be quite challenging to estimate. The most commonly used algorithm to fit this model to the data is PARAFAC-ALS. Comparative studies (Tomasi and Bro, 2006) have shown that this procedure is, in general, more reliable and accurate than other algorithms proposed in the literature. Nonetheless, it presents some non-trivial issues: it can be slow at converging and may run into over-factoring and bad initialization degeneracies. With respect to these setbacks, some of the alternative estimating procedures are able to perform better than ALS, specifically the Alternating Trilinear Decomposition (ATLD) and Self-weighted Alternating Trilinear Decomposition (SWATLD) proposed by Wu et al. (1998) and Chen et al. (2000) respectively. These algorithms are faster and less likely to be affected by over-factoring and bad initial values. They present, however, difficulties connected to their non-least squares objective functions and for this reason they are seldom used in practice. In this work it is suggested that a successful way to improve on ALS performance with respect to the presented drawbacks is to initialize it with either ATLD or SWATLD steps, obtaining two integrated ALS procedures. The effectiveness of this methodology is demonstrated by comparing the results of standard ALS with the ones of the proposed integrated ALS variants in an extensive simulation design.
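
For readers unfamiliar with the alternating least squares idea mentioned in this abstract, here is a minimal NumPy sketch of a plain CP fit by ALS. The function names `khatri_rao` and `cp_als`, the stopping rule and the simulated data are assumptions for the example. The `init` argument is a hypothetical hook where starting values from another procedure could be supplied; the ATLD and SWATLD steps proposed in the paper are not implemented here, and random starting values are used instead.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: (m x F), (n x F) -> (m*n x F)."""
    F = U.shape[1]
    return np.einsum("if,jf->ijf", U, V).reshape(-1, F)

def cp_als(X, n_comp, init=None, n_iter=500, tol=1e-12, seed=0):
    """Plain CP-ALS sketch; `init` is an optional (B, C) pair of starting values."""
    I, J, K = X.shape
    if init is None:
        rng = np.random.default_rng(seed)
        B = rng.normal(size=(J, n_comp))
        C = rng.normal(size=(K, n_comp))
    else:
        B, C = init
    # Unfoldings whose column order matches the khatri_rao calls below.
    X1 = X.reshape(I, J * K)
    X2 = np.transpose(X, (1, 0, 2)).reshape(J, I * K)
    X3 = np.transpose(X, (2, 0, 1)).reshape(K, I * J)
    history = []
    for _ in range(n_iter):
        # Each update is a linear least squares problem with the other two sets fixed.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = X2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = X3 @ np.linalg.pinv(khatri_rao(A, B).T)
        loss = float(np.sum((X1 - A @ khatri_rao(B, C).T) ** 2))
        converged = bool(history) and history[-1] - loss <= tol * history[-1]
        history.append(loss)
        if converged:
            break
    return A, B, C, history

# Simulated noise-free array with two components (illustrative data only).
rng = np.random.default_rng(42)
A0, B0, C0 = (rng.normal(size=(n, 2)) for n in (8, 7, 6))
X = np.einsum("if,jf,kf->ijk", A0, B0, C0)

A, B, C, history = cp_als(X, n_comp=2)
print(len(history), history[-1] / np.sum(X ** 2))
```

Because every step solves a least squares subproblem exactly, the loss cannot increase from one sweep to the next; how fast it decreases depends heavily on the starting values, which is the sensitivity the abstract refers to.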

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

Principal balances for three-way compositions

Author: Simonacci V
Publication date: 01/01/2021

Orthonormal balances resulting from a sequential binary partition (SBP) are one of the preferred tools for transforming compositional data into real space coordinates. The interpretability of this approach, however, greatly depends on the relevance of the SBP. SBPs can be chosen with the help of expert knowledge or with data-oriented methods, such as Principal Balances analysis. The latter results in an SBP whose balances successively maximize the explained variance. Principal balances can be calculated exactly or approximated by using methods based on PCA for compositional data. In this work a method for the approximation of principal balances in the more complex case of three-way compositions is proposed, which deals with the additional difficulty introduced by third-mode variability. In particular, an algorithm based on the Tucker3 model is used, which allows the variability of the third dimension to be kept separate in the definition of principal balances.
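
To make the notion of balances from an SBP concrete, the sketch below computes orthonormal balance coordinates for a given sign matrix, using the standard formula in which a balance between r parts and s parts is sqrt(rs/(r+s)) times the log of the ratio of their geometric means. The four-part SBP, the function names `sbp_basis` and `balances`, and the sample composition are assumptions for the example. Choosing the SBP by maximizing explained variance (principal balances), and the three-way extension proposed in the paper, are not covered.

```python
import numpy as np

def sbp_basis(signs):
    """Orthonormal (clr-space) basis from an SBP sign matrix with entries +1, -1, 0."""
    signs = np.asarray(signs, dtype=float)
    r = (signs > 0).sum(axis=1, keepdims=True)  # parts in the numerator group
    s = (signs < 0).sum(axis=1, keepdims=True)  # parts in the denominator group
    pos = np.sqrt(s / (r * (r + s)))
    neg = -np.sqrt(r / (s * (r + s)))
    return np.where(signs > 0, pos, np.where(signs < 0, neg, 0.0))

def balances(X, signs):
    """Balance coordinates of compositions stored in the rows of X (all parts > 0)."""
    # Rows of the basis sum to zero, so log(X) can be used in place of clr(X).
    return np.log(X) @ sbp_basis(signs).T

# Hypothetical SBP for four parts: {1,2} vs {3,4}, then 1 vs 2, then 3 vs 4.
signs = [[1, 1, -1, -1],
         [1, -1, 0, 0],
         [0, 0, 1, -1]]

x = np.array([[0.1, 0.2, 0.3, 0.4]])
Psi = sbp_basis(signs)
b = balances(x, signs)
print(b)
```

The coordinates are unchanged if `x` is multiplied by any positive constant, which reflects the fact that balances carry only relative information.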

Archivio della ricerca - Università degli studi di Napoli Federico II

Algorithms for compositional tensors of third-order

Author: Simonacci V
Publication date: 01/01/2020

The PARAFAC-ALS procedure for estimating CP parameters on tridimensional tensors is sensitive to data collinearity. This inefficiency is especially problematic if collinearity is paired with other issues such as data of large dimensions and difficulties in establishing the correct model rank. When dealing with compositional data, i.e. positive values with a covariance bias, multicollinearity is inherent by definition, and it is also preserved if the data are transformed into log-ratios by means of the clr function. For this reason, alternative estimating procedures may be considered, such as INT and INT-2. These dual-step methods use the properties of the SWATLD and ATLD algorithms during initialization to overcome ALS inefficiency while still providing least squares results. Their comparative performance is tested in an extensive simulation study on collinear data.
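
The statement that collinearity survives the clr (centered log-ratio) transformation can be checked numerically. In the sketch below, the simulated four-part compositions and the helper names `closure` and `clr` are assumptions for the example; it only illustrates the zero-sum property of clr coordinates and the resulting singular covariance matrix, not the INT or INT-2 procedures.

```python
import numpy as np

def closure(X, axis=-1):
    """Rescale positive parts so that each composition sums to one."""
    return X / X.sum(axis=axis, keepdims=True)

def clr(X, axis=-1):
    """Centered log-ratio: log of each part minus the mean log over the parts."""
    L = np.log(X)
    return L - L.mean(axis=axis, keepdims=True)

# Simulated compositions: 50 observations, 4 parts (illustrative data only).
rng = np.random.default_rng(1)
X = closure(rng.uniform(0.1, 1.0, size=(50, 4)))

Z = clr(X)
row_sums = Z.sum(axis=1)            # every clr vector sums to zero
cov = np.cov(Z, rowvar=False)       # 4 x 4 covariance of clr coordinates
rank = np.linalg.matrix_rank(cov, tol=1e-10)

print(np.allclose(row_sums, 0.0), rank)
```

Because the clr coordinates of each observation sum to zero, the columns are linearly dependent and the covariance matrix has rank at most D - 1 for D parts. For a three-way array the same function can be applied along the variable mode through the `axis` argument.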

Archivio della ricerca - Università degli studi di Napoli Federico II

Three-way compositional analysis of energy intensity in manufacturing

Author: Todorov V
Simonacci V
Publication date: 01/01/2020

Both the scientific and political communities agree that significant reductions in CO2 emissions are necessary to limit the magnitude and extent of climate change, and energy efficiency is one of the most interesting issues analyzed by economists and policy makers within this debate. Different measures of energy efficiency in manufacturing can be defined, but broadly it is the ratio of production output to energy input, usually disaggregated by industry. We create a global data set of energy intensity in manufacturing and analyze its structure by country, time and industry by applying parallel factor analysis (CP). Since we are interested in the structure of the energy intensity, the absolute values are no longer relevant for the analysis and the data set is compositional in nature, which requires specific adaptation of the methodology and suitable software.

Archivio della ricerca - Università degli studi di Napoli Federico II

A compositional methodology with external information for free time allocation preferences

Author: Di Palma M. A.
Simonacci V.
Gallo M.
Publication date: 01/01/2017

The study of free-time activity preferences provides important information on the characteristics and inclinations of specific demographics. Correct modeling of these data can offer useful insight into the definition of service demand and thus help define effective social strategies. Two important aspects need to be considered when analysing individual preferences on free time. The first difficulty, typical of optimal resource allocation, concerns the constrained nature of the data. There is a sum limit given by the total amount of free time available and, as a consequence, assigned values are not free to vary independently. Statistically this translates into a biased covariance structure. In this perspective the problem can be seen as compositional, which means that by definition these data only carry relative information and should be treated with ad-hoc tools. A second challenge consists in discerning the role that external factors play in determining preferences without, however, forcing the assumption that all information can be explained in this manner. In other words, there could be specific characteristics of the respondents (such as gender, education, etc.) that influence part of the information, and should be considered, but are not able to explain the preference structure in its totality. This duality can be addressed with a methodology that combines regression and multivariate analysis, proposed in the literature as Principal Component Analysis with external information. The purpose of this work is thus to present an application that combines the compositional and external information approaches to study free time allocation.

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

Università degli Studi di Napoli L'Orientale: CINECA IRIS

Three–way compositional data: a multi–stage trilinear decomposition algorithm

Author: Di Palma M. A.
Simonacci V.
Gallo M.
Publication date: 01/01/2017

The CANDECOMP/PARAFAC model is an extension of bilinear PCA and has been designed to model three-way data by preserving their multidimensional configuration. The Alternating Least Squares (ALS) procedure is the preferred estimating algorithm for this model because it guarantees stable results. It can, however, be slow at converging and sensitive to collinearity and over-factoring. Dealing with these issues is even more pressing when data are compositional and thus collinear by definition. In this talk the proposed solution is based on a multistage approach: parameters are first optimized with procedures that cope better with collinearity and over-factoring, namely ATLD and SWATLD, and the results are then refined with ALS.

Archivio della ricerca - Università degli studi di Napoli Federico II

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

Università degli Studi di Napoli L'Orientale: CINECA IRIS

How to improve the Quality Assurance System of the Universities: a study based on compositional analysis

Author: Menini T
Simonacci V
Gallo M
Bertaccini B
Publication date: 01/01/2018

The National Agency for the Evaluation of Universities and Research (ANVUR) has for some decades defined the criteria for systematically evaluating student satisfaction. The analysis of these data presents various difficulties both in terms of data collection and analysis. The aim of this work is to propose Candecomp/Parafac for a compositional analysis, which is able to capture the multidimensional aspects of the phenomenon taking into account its ordinal nature and the temporal characteristics of data collection.

Archivio della ricerca - Università degli studi di Napoli Federico II

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

Università degli Studi di Napoli L'Orientale: CINECA IRIS

Statistical tools for student evaluation of academic educational quality

Author: Simonacci Violetta
Gallo Michele
Publication date: 01/01/2016

Measuring academic educational quality presents three major difficulties, typical of all customer satisfaction and service quality studies: the use of subjective scales; the ordinal nature of the data; and the multifold structure of satisfaction. In order to solve these problems, principal component analysis (PCA) of compositional data is proposed in this work. The core idea behind this methodology is to analyze by PCA the relative information within the data rather than focusing on absolute scores. This approach is discussed in comparison with a widely used Item Response Theory method (the Partial Credit Model) in order to assess its merits, e.g. always identifying a coherent preference structure. Both procedures were, thus, carried out on a real dataset collected with the 2013/14 ANVUR questionnaire by L’Università di Napoli-L’Orientale.

Archivio della ricerca - Università degli studi di Napoli Federico II

Crossref

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

Università degli Studi di Napoli L'Orientale: CINECA IRIS

Evaluation of Research Quality (VQR): a case study based on DINDSCAL for compositions

Author: Simonacci V
Gallo M
Di Palma MA
Trendafilov Nikolay
Publication date: 01/01/2017

The eValuation of Research Quality (VQR) is one of the most important assessment processes carried out by the National Agency for the Evaluation of Universities and Research Institutes (ANVUR). Its main task is to provide information on the status of the Italian research system by assessing the performance of universities in various scientific areas. The entities measured are made up of researchers, assistants, first and second band professors, fixed-term professors and researchers, and technology and research executives. For this purpose, "research products" such as journal contributions, volume contributions, and other types of scientific products are considered. The basic evaluation criteria were defined by groups of experts (GEV) according to the specific characteristics of each subject area and through a synthetic statement on the products. In this framework, differences between GEV groups on a differential set of quality judgments should be explained in terms of compositional dissimilarity matrices. In the literature the INDSCAL (Individual Differences Scaling) model is used to study individual differences in three-way data by doubly centering a set of matrices of squared dissimilarity measures between a range of stimuli. A direct approach, called DINDSCAL (Direct INDividual Differences SCALing), is preferred here in order to directly analyze simultaneous slices of dissimilarity matrices organized as compositional data. The compositional aspect of the data makes it possible to see, at a glance, which research product has the highest assessment compared to the remaining ones, irrespective of the role of the researchers and the type of institution to which they belong. Additionally, the DINDSCAL algorithm highlights the main divergences introduced by each GEV group in terms of research output classification.
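
The double centering step that this abstract attributes to the INDSCAL approach can be sketched as follows. The helper name `double_center` and the simulated two-dimensional configuration are assumptions for the example. The sketch shows only how a single matrix of squared Euclidean dissimilarities is converted into a scalar product matrix; it does not implement INDSCAL or DINDSCAL, and DINDSCAL is described in the abstract as working on the dissimilarities directly rather than through this step.

```python
import numpy as np

def double_center(D2):
    """Convert squared dissimilarities into scalar products: -1/2 * J @ D2 @ J."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return -0.5 * J @ D2 @ J

# Simulated configuration of 5 stimuli in 2 dimensions (illustrative data only).
rng = np.random.default_rng(2)
Y = rng.normal(size=(5, 2))

# Squared Euclidean distances between all pairs of stimuli.
D2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)

S = double_center(D2)
Yc = Y - Y.mean(axis=0)

print(np.allclose(S, Yc @ Yc.T))
```

For exact Euclidean distances the doubly centered matrix equals the scalar products of the column-centered coordinates. In a three-way setting the same operation would be applied to each slice of the dissimilarity array.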

Archivio della ricerca - Università degli studi di Napoli Federico II

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"