1,720,960 research outputs found

    APPLICATIONS OF COMBINATORIAL OPTIMIZATION ARISING FROM LARGE SCALE SURVEYS

    Full text link
    Many difficult statistical problems arising in censuses or in other large scale surveys have an underlying Combinatorial Optimization structure and can be solved with Combinatorial Optimization techniques. These techniques are often more efficient than the ad hoc solution techniques already developed in the field of Statistics. This thesis considers in detail two relevant cases of such statistical problems, and proposes solution approaches based on Combinatorial Optimization and Graph Theory. The first problem is the delineation of Functional Regions, the second one concerns the selection of the scope of a large survey, as briefly described below. The purpose of this work is therefore the innovative application of known techniques to very important and economically relevant practical problems that the "Censuses, Administrative and Statistical Registers Department" (DICA) of the Italian National Institute of Statistics (Istat), where I am senior researcher, has been dealing with. In several economical, statistical and geographical applications, a territory must be partitioned into Functional Regions. This operation is called Functional Regionalization. Functional Regions are areas that typically exceed administrative boundaries, and they are of interest for the evaluation of the social and economical phenomena under analysis. Functional Regions are not fixed and politically delimited, but are determined only by the interactions among all the localities of a territory. In this thesis, we focus on interactions represented by the daily journey-to-work flows between localities in which people live and/or work. Functional Regionalization of a territory often turns out to be computationally difficult, because of the size (that is, the number of localities constituting the territory under study) and the nature of the journey-to-work matrix (that is, the sparsity). In this thesis, we propose an innovative approach to Functional Regionalization based on the solution of graph partition problems over an undirected graph called transitions graph, which is generated by using the journey-to-work data. In this approach, the problem is solved by recursively partitioning the transition graph by using the min cut algorithms proposed by Stoer and Wagner and Brinkmeier. %In the second approach, the problem is solved maximizing a function of the sizes and interactions of subsets identified by successions of partitions obtained via Multilevel partitioning approach. This approach is applied to the determination of the Functional Regions for the Italian administrative regions. The target population of a statistical survey, also called scope, is the set of statistical units that should be surveyed. In the case of some large surveys or censuses, the scope cannot be the set of all available units, but it must be selected from this set. Surveying each unit has a cost and brings a different portion of the whole information. In this thesis, we focus on the case of Agricultural Census. In this case, the units are farms, and we want to determine a subset of units producing the minimum total cost and safeguarding at least a certain portion of the total information, according to the coverage levels assigned by the European regulations. Uncertainty aspects also occur, because the portion of information corresponding to each unit is not perfectly known before surveying it. The basic decision aspect is to establish the inclusion criteria before surveying each unit. We propose here to solve the described problem using multidimensional binary knapsack models

    Il ruolo centrale della conoscenza quale fattore di produzione fondamentale nelle economia contemporanee

    No full text
    Definire la natura della conoscenza con l'intento di fare emergere la sua parziale divisibilità dal soggetto conoscente servirà a delineare i tratti distintivi del nuovo capitalismo. Gli autori si interrogano sulla compatibilita' della conoscenza quale forza propulsiva del sistema socio-economico con il permanere di un sistema di produzione imperniato sulla trasformazione della conoscenza in capitale

    A Combinatorial Optimization Approach to the Selection of Statistical Units

    Full text link
    In the case of some large statistical surveys, the set of units that will constitute the scope of the survey must be selected. We focus on the real case of a Census of Agriculture, where the units are farms. Surveying each unit has a cost and brings a different portion of the whole information. In this case, one wants to determine a subset of units producing the minimum total cost for being surveyed and representing at least a certain portion of the total information. Uncertainty aspects also occur, because the portion of information corresponding to each unit is not perfectly known before surveying it. The proposed approach is based on combinatorial optimization, and the arising decision problems are modeled as multidimensional binary knapsack problems. Experimental results show the effectiveness of the proposed approach

    A min-cut approach to functional regionalization, with a case study of the Italian local labour market areas

    Full text link
    In several economical, statistical and geographical applications, a territory must be subdivided into functional regions. Such regions are not fixed and politically delimited, but should be identified by analyzing the interactions among all its constituent localities. This is a very delicate and important task, that often turns out to be computationally difficult. In this work we propose an innovative approach to this problem based on the solution of minimum cut problems over an undirected graph called here transitions graph. The proposed procedure guarantees that the obtained regions satisfy all the statistical conditions required when considering this type of problems. Results on real-world instances show the effectiveness of the proposed approach

    Exploring copulas for the imputation of complex dependent data

    No full text
    In this work we introduce a copula-based method for imputing missing data by using conditional density functions of the missing variables given the observed ones. In theory, such functions can be derived from the multivariate distribution of the variables of interest. In practice, it is very difficult to model joint distributions and derive conditional distributions, especially when the margins are different. We propose a natural solution to the problem by exploiting copulas so that we derive conditional density functions through the corresponding conditional copulas. The approach is appealing since copula functions enable us (1) to fit any combination of marginal distribution functions, (2) to take into account complex multivariate dependence relationships and (3) to model the marginal distributions and the dependence structure separately. We describe the method and perform a Monte Carlo study in order to compare it with two well-known imputation techniques: the nearest neighbour donor imputation and the regression imputation by EM algorithm. Our results indicate that the proposal compares favourably with classical methods in terms of preservation of microdata, margins and dependence structure

    Going Beyond Counting First Authors in Author Co-citation Analysis

    Full text link
    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed

    Variations on the Author

    Full text link
    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer that rather than express opinions propels discourses; an interviewer that is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent form the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship

    Appropriate Similarity Measures for Author Cocitation Analysis

    Full text link
    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.information science;Pearson correlation;cosine;similarity measure;author cocitation analysis

    Dispelling the Myths Behind First-author Citation Counts

    Full text link
    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods
    corecore