
    Construct Validation by Hierarchical Bayesian Concept Maps: An Application to the Transaction Cost Economics Theory of the Firm

    A concept map is a diagram depicting relationships among concepts, used as a knowledge representation tool in many domains. In this paper, we build on the modeling framework of Hui et al. (2008) to develop a concept map suitable for testing the empirical evidence for theories. We identify a theory by a set of core tenets, each asserting that one set of independent variables affects one dependent variable; moreover, every variable can have several operational definitions. Data consist of a selected sample of scientific articles from the empirical literature on the theory under investigation. Our “tenet map” adds several complexities to the original version. First, the links are two-layer: first-layer links connect variables that are related in the test of the theory at issue; second-layer links represent connections that are found statistically significant. Second, in either layer, the matrix of link-formation probabilities is block-symmetric. Third, in addition to a form of censoring that resembles the Hui et al. pruning step, observed maps are subject to a further censoring related to second-layer links. Fourth, we perform a full Bayesian analysis instead of adopting the empirical Bayes approach. Lastly, we develop a three-stage model which accounts for dependence either of data or of parameters. The proposed methodology was motivated by the investigation of the empirical support for, and degree of consensus on, new economic theories of the firm. In this paper, the Transaction Cost Economics view is tested by a tenet map analysis. Both the two-stage and the multilevel models identify the same tenets as the most corroborated by empirical evidence, though the latter provides more comprehensive and complex insight into the relationships between constructs.

    Inequalities between expected marginal log-likelihoods, with implications for likelihood-based model complexity and comparison measures

    A multi-level model can be marginalized across levels in different ways, yielding more than one possible marginal likelihood. Since log-likelihoods are often used in classical model comparison, the question is which likelihood should be chosen for a given model. The authors employ a Bayesian framework to shed some light on the qualitative comparison of the likelihoods associated with a given model. They connect these results to related issues of the effective number of parameters, the penalty function, and a consistent definition of a likelihood-based model choice criterion. In particular, with a two-stage model they show that, very generally, regardless of hyperprior specification, how much data is collected, or what the realized values are, a priori, the first-stage likelihood is expected to be smaller than the marginal likelihood. A posteriori, these expectations are reversed, and the disparities worsen with increasing sample size and with increasing number of model levels.

    Capturing Distinctiveness: Transparent Procedures to Escape a Pervasive Black-Box Propensity

    In many quantitative linguistics applications, scholars are interested in identifying a set of linguistic features that proves distinctive for a text or a class of texts with reference to a corpus or a model. Ordinary features are lexical-based elements (words, multi-words, n-grams, lemmas), part-of-speech categories, or further phonetic and morphosyntactic phenomena. Moreover, in many applications a set of distinctive features is selected a priori in order to achieve a qualitative reading of the texts or to be exploited in text clustering, topic modelling or content mapping tasks. Classification based on supervised Machine Learning (ML) algorithms is commonly used to classify texts (test set) on the basis of training data (training set). Thanks to large amounts of available, mixed, undifferentiated, multilevel, multilayer, and multipurpose features, ML generally provides an effective way to discriminate among existing classes and then to ascribe each new text to one of them. Although the accuracy of classification is often highly satisfactory, the distinctive features of each class are only seldom explainable and transparent. The need to move from black-box procedures to explainable methods is at the basis of the distinction between ML and Statistical Learning (SL) approaches. Both SL and ML exploit data to make predictions, but SL aims at a more in-depth understanding of data structures and relations among variables. From this perspective, SL methods capable of identifying the distinctive features of each class should interact with the solutions offered by ML algorithms in order to achieve a description in terms of linguistic similarities and differences. The umbrella term keyness is often used in text analysis to refer to different measures that reveal to what extent a word can be considered distinctive of a text or a text class. Many measures have been developed to meet the requirements of different perspectives, e.g. term frequency-inverse document frequency (TFIDF), log-likelihood and odds ratios, p-values based on the hypergeometric model (to mention just a few), as well as methods for keyword extraction, distance-based measures, and solutions provided by Bayesian approaches and generative (topic) models. Starting from an established corpus of institutional speeches (the corpus of End-of-Year Addresses of the Italian Presidents of the Republic, 1949-2022) arranged by President-classes, this study explored the concept of keyness to highlight the strengths and weaknesses of different approaches, their consistency (overlap) and how they can be applied in practice, particularly when working with large corpora. As most procedures are grounded on the observation of the occurrences reported in a term-document matrix (TDM), where terms represent features and documents represent texts or classes, most measures should tackle data normalization and dispersion problems (e.g. a linguistic feature should not be considered distinctive of a text as a whole when it occurs only within a specific portion, or of an entire class when it occurs only in one or a limited number of its texts). This work also shows to what extent procedures that exploit samples of equal-sized text chunks and tailor-made normalizations of raw frequencies (with related diagnostic measures) play a fundamental role in improving results.
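
    As a rough illustration of two of the keyness measures named in this abstract, the sketch below computes TF-IDF and a log-likelihood score from class-level term counts. It is not the study's procedure: the toy corpus, the class labels, and the function names `tf_idf` and `log_likelihood` are all assumptions made for the example, and the log-likelihood uses the simplified two-term form common in corpus linguistics, comparing one class against all the others.

```python
import math
from collections import Counter

# Hypothetical toy corpus: tokens grouped by class (not data from the study).
corpus = {
    "A": "peace work peace europe work".split(),
    "B": "work economy economy crisis work".split(),
    "C": "youth future work youth europe".split(),
}

# One column of a term-document matrix per class.
counts = {label: Counter(tokens) for label, tokens in corpus.items()}
vocab = sorted(set().union(*counts.values()))


def tf_idf(term, label):
    """Relative frequency in the class times log inverse class frequency."""
    tf = counts[label][term] / sum(counts[label].values())
    df = sum(1 for c in counts.values() if c[term] > 0)
    if df == 0:
        return 0.0
    return tf * math.log(len(counts) / df)


def log_likelihood(term, label):
    """Simplified two-term log-likelihood (G2): target class vs. the rest."""
    a = counts[label][term]                                    # in target
    b = sum(c[term] for k, c in counts.items() if k != label)  # in rest
    n1 = sum(counts[label].values())
    n2 = sum(sum(c.values()) for k, c in counts.items() if k != label)
    if a + b == 0:
        return 0.0
    e1 = n1 * (a + b) / (n1 + n2)  # expected count in target
    e2 = n2 * (a + b) / (n1 + n2)  # expected count in rest
    g2 = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:
            g2 += observed * math.log(observed / expected)
    return 2 * g2


if __name__ == "__main__":
    # Rank the vocabulary by log-likelihood keyness for class "A".
    for term in sorted(vocab, key=lambda t: log_likelihood(t, "A"), reverse=True):
        print(term, round(tf_idf(term, "A"), 3), round(log_likelihood(term, "A"), 3))
```

    In this toy example a term that occurs in every class, such as "work", gets a TF-IDF of zero. Neither score accounts for dispersion within a class, which is the kind of problem the abstract says such measures need to address.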

    Bayesian decision models for environmental risks

    The statistical analysis of environmental risks is characterized by considerable complexity, partly owing to the dependence structures in the data. Moreover, it is often necessary to go beyond inference in order to evaluate the consequences of decisions and recommend strategies for resolution. Bayesian methodology has proved able to provide very flexible tools for both inference and decision analysis in the study of complex problems, and hierarchical models prove to be a suitable tool both for decision analysis at the local level and for policy evaluation at the general level. The problem of arsenic poisoning of water-supply wells, which affects vast areas of Bangladesh, is presented as an exemplary case requiring the construction of a flexible model and an analysis of risk-reduction policies.

    Labour force estimates for small geographical domains in Italy: problems, data and models

    One of the contexts where small area estimation techniques have proved their potential is the analysis of data collected in national labour force surveys to obtain estimates for small geographical domains. Applications of small area estimation methods to data from labour force surveys have recently been considered in Italy. This paper reviews the specific problems, data and opportunities involved in applying small area estimation models to produce reliable information on labour force aggregates at provincial and sub-provincial level in Italy. Some new developments stimulated by the application of small area estimation models to the analysis of labour force survey data are also discussed.