
    Official environmental statistical information in Italy

    In this paper, we describe the official sources of statistical information on the environment, with a focus on soil and water. The discussion centres on the two major providers of official statistical information in the field, ISTAT and ISPRA, describing their internal structures and their links to the corresponding EU institutions. We further review their latest production of statistical information on soil and water and offer several suggestions for the potential development of the field.

    Multi-source statistics on employment status in Italy, a machine learning approach

    In recent decades, National Statistical Institutes have started to produce official statistics by exploiting multiple sources of information (multi-source statistics) rather than a single source, usually a statistical survey. In this context, one of the research projects undertaken by the Italian National Statistical Institute (Istat) concerned methods for producing estimates of employment in Italy using survey data and administrative sources. The former are drawn from the Labour Force survey conducted by Istat, the latter from several administrative sources that Istat regularly acquires from external bodies. We use machine learning methods to predict individual employment status. This approach is based on decision tree and random forest techniques, which are frequently used to classify large amounts of data. We show how to construct a “new” response variable denoting agreement of the data sources: this approach is shown to maximise the information we may derive from a machine learning approach in some problematic cases. The methods have been applied using the R software.

    Upper bound estimators of the population size based on ordinal models for capture-recapture experiments

    Capture-recapture studies have attracted a lot of attention over the past few decades, especially in applied disciplines where a direct estimate of the size of a population of interest is not available. Epidemiology, ecology, public health, and biodiversity are just a few examples. The estimation of the number of unseen units has been a challenge for theoretical statisticians, and considerable progress has been made in providing lower bound estimators for the population size. In fact, it is well known that consistent estimators of the population size cannot be provided in the very general case. Considering a case where capture-recapture studies are summarized by a frequency of frequencies distribution, we derive a simple upper bound of the population size based on the cumulative distribution function. We introduce two estimators of this bound, without any specific parametric assumption on the distribution of the observed frequency counts. The behavior of the proposed estimators is investigated using several benchmark datasets and a large-scale simulation experiment based on the scheme discussed by Pledger.
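As an illustration of the data summary the abstract refers to, the sketch below builds a frequency of frequencies distribution and its empirical cumulative distribution function from per-unit capture counts. It does not reproduce the paper's upper bound or its two estimators, which the abstract does not specify; the function names and the example counts are hypothetical.

```python
from collections import Counter

def frequency_of_frequencies(capture_counts):
    """Map k -> f_k, the number of observed units captured exactly k times.

    capture_counts holds one positive integer per observed unit
    (units captured zero times are, by definition, unobserved).
    """
    return dict(sorted(Counter(capture_counts).items()))

def empirical_cdf(freqs):
    """Empirical CDF of the observed capture counts, evaluated at each observed k."""
    n = sum(freqs.values())  # number of observed units
    running = 0
    cdf = {}
    for k, f_k in sorted(freqs.items()):
        running += f_k
        cdf[k] = running / n
    return cdf

# Hypothetical example: six observed units and their capture counts
freqs = frequency_of_frequencies([1, 1, 1, 2, 2, 3])
cdf = empirical_cdf(freqs)
```

Any bound built on the cumulative distribution function would take a summary like `cdf` as input; the exact form is left to the paper itself.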

    Credit rationing and the financial structure of Italian small and medium enterprises

    Our aim is to analyze the effect of public subsidies on the development path of Italian small and medium enterprises (SMEs). Public subsidies to SMEs have often been used with the aim of favoring economic growth in less developed regions. The main theoretical arguments justifying this intervention are related to the idea that public subsidies can solve lack-of-capital problems deriving from asymmetric information. According to Stiglitz and Weiss (1981), public subsidies to rationed firms can reduce the informational gap, leading subsidized firms to reduce their financial constraints and to increase their investment levels. Results obtained by modelling leverage, performance and investment behaviour in a panel of around 1,900 enterprises over the years 1989 to 1994 seem to confirm the working hypotheses. However, they cannot be considered conclusive, and further research is needed in this context.

    Extending a Logistic Approach to Risk Modeling through Semiparametric Mixing

    Bankruptcy risk, logistic model, finite mixtures, nonparametric maximum likelihood,

    M-quantile regression for multivariate longitudinal data with an application to the Millennium Cohort Study

    Motivated by the analysis of data from the UK Millennium Cohort Study on emotional and behavioural disorders, we develop an M-quantile regression model for multivariate longitudinal responses. M-quantile regression is an appealing alternative to standard regression models; it combines features of quantile and expectile regression and may produce a detailed picture of the conditional response variable distribution, while ensuring robustness to outlying data. As we deal with multivariate data, we need to specify what is meant by M-quantile in this context, and how the structure of dependence between univariate profiles may be accounted for. Here, we consider univariate (conditional) M-quantile regression models with outcome-specific random effects for each outcome. Dependence between outcomes is introduced by assuming that the random effects in the univariate models are dependent. The multivariate distribution of the random effects is left unspecified and estimated from the observed data. Adopting this approach, we are able to model dependence both within and between outcomes. We further discuss a suitable model parameterisation to account for potential endogeneity of the observed covariates. An extended EM algorithm is defined to derive estimates under a maximum likelihood approach.
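To make the notion of an M-quantile concrete, here is a minimal univariate sketch that has no covariates or random effects and is not the paper's multivariate model or its EM algorithm. It assumes the commonly used tilted Huber influence function, a fixed unit scale, and the conventional tuning constant c = 1.345; the function names are hypothetical.

```python
def tilted_huber_psi(u, q, c=1.345):
    """Asymmetrically weighted Huber influence function for the q-th M-quantile."""
    huber = max(-c, min(c, u))           # Huber psi: linear in the middle, clipped at +/- c
    weight = q if u > 0 else (1.0 - q)   # tilt: positive residuals weighted by q
    return 2.0 * weight * huber

def m_quantile_location(y, q, c=1.345, iterations=100):
    """Solve sum_i psi_q(y_i - theta) = 0 for theta by bisection (scale fixed at 1)."""
    lo, hi = min(y), max(y)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        g = sum(tilted_huber_psi(y_i - mid, q, c) for y_i in y)
        if g > 0:        # estimating equation still positive: theta must move up
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical data: the q = 0.5 M-quantile of a symmetric sample is its centre
m50 = m_quantile_location([1, 2, 3, 4, 5], 0.5)
m90 = m_quantile_location([1, 2, 3, 4, 5], 0.9)
```

Letting c grow makes the function behave like an expectile estimating equation, while a very small c approaches the quantile case, which is the sense in which M-quantiles combine the two.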

    Two-part regression models for longitudinal zero-inflated count data

    Two-part models are well established in the economic literature, since they closely resemble a principal-agent type model, where homogeneous, observable, counted outcomes are subject to a (prior, exogenous) selection choice. The first decision can be represented by a binary choice model, using a probit or a logit link; the second can be analyzed through a truncated discrete distribution such as a truncated Poisson, negative binomial, and so on. Only recently has particular attention been devoted to the extension of two-part models to handle longitudinal data. The authors discuss a semi-parametric estimation method for dynamic two-part models and propose a comparison with other, well-established alternatives. Heterogeneity sources that influence the first-level decision process, that is, the decision to use a certain service, are assumed to also influence the (truncated) distribution of the positive outcomes. Estimation is carried out through an EM algorithm without parametric assumptions on the random effects distribution. Furthermore, the authors investigate the extension of the finite mixture representation to allow for unobservable transition between components in each of these parts. The proposed models are discussed using empirical as well as simulated data.
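The two-part structure described above can be sketched for a single observation as follows. This is a simplified cross-sectional illustration, assuming a zero-truncated Poisson for the positive counts; it omits the random effects, the longitudinal dynamics, and the EM estimation discussed in the abstract, and the function and parameter names are hypothetical.

```python
import math

def two_part_poisson_loglik(y, p_use, lam):
    """Log-likelihood of one count y under a two-part (hurdle) model.

    p_use: probability that the count is positive (first part, e.g. from a logit model)
    lam:   rate of the Poisson distribution that is truncated at zero (second part)
    """
    if y == 0:
        return math.log(1.0 - p_use)
    # Zero-truncated Poisson: P(Y = y | Y > 0) = exp(-lam) lam^y / (y! (1 - exp(-lam)))
    log_truncated = (
        -lam
        + y * math.log(lam)
        - math.lgamma(y + 1)                 # log(y!)
        - math.log(-math.expm1(-lam))        # log(1 - exp(-lam))
    )
    return math.log(p_use) + log_truncated

# Hypothetical values: the implied probabilities over all counts should sum to one
total = sum(math.exp(two_part_poisson_loglik(y, 0.3, 2.0)) for y in range(60))
```

Because the zero and positive parts have separate parameters, the likelihood factorises into a binary component and a truncated count component, which is what allows heterogeneity to be modelled in each part.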

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first author of a cited work, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
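The two counting schemes compared in the abstract can be sketched as follows. This is an illustrative reading rather than the study's exact procedure: it assumes each unordered author pair is counted once per citing paper, and that co-authors of the same cited work also count as co-cited under the multi-author scheme. The function name and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations across citing papers.

    citing_papers: one reference list per citing paper; each reference is the
                   ordered list of author names of a cited work.
    max_authors:   how many leading authors of each cited work to count
                   (1 = traditional first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Each unordered pair of cited authors is co-cited once by this paper
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

# Hypothetical toy data: two citing papers, each citing two works
papers = [
    [["A", "B"], ["C"]],
    [["A"], ["C", "D"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

On the toy data, first-author counting sees only the pair (A, C), while counting up to five authors also brings B and D into the co-citation matrix.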