    Incremental Discovery of Imprecise Functional Dependencies

    Functional dependencies (FDs) are among the metadata used to assess data quality and to perform data cleaning operations. However, to remain robust to data errors, it has been necessary to devise imprecise versions of functional dependencies, yielding relaxed functional dependencies (RFDs). Among them is the class of RFDs relaxing on the extent, i.e., those admitting the possibility that an FD holds on only a subset of the data. Several algorithms to automatically discover RFDs from big data collections have been defined in the literature. They achieve good performance with respect to the inherent complexity of the problem. However, most of them can discover RFDs only by batch processing the entire dataset. This is not suitable in the era of big data, where the size of a database instance can grow at high velocity, and the insertion of new data can invalidate previously holding RFDs. Thus, it is necessary to devise incremental discovery algorithms capable of updating the set of holding RFDs upon data insertions, without processing the entire dataset. To this end, in this paper we propose an incremental discovery algorithm for RFDs relaxing on the extent. It manages the validation of candidate RFDs and the generation of possibly new RFD candidates upon the insertion of new tuples, while limiting the size of the overall search space. Experimental results show that the proposed algorithm achieves extremely good performance on real-world datasets.
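    The idea of an FD that holds on a subset of the data, re-validated as tuples are inserted, can be illustrated with a small sketch. This is not the algorithm proposed in the paper, which the abstract does not describe; it is a generic check for a single candidate dependency, assuming the extent is measured as the fraction of tuples that would have to be removed for the FD to hold exactly. The class name `ExtentFDChecker`, the `max_error` threshold, and the sample attributes are all hypothetical.

```python
from collections import Counter, defaultdict


class ExtentFDChecker:
    """Tracks one candidate FD lhs -> rhs while tuples are inserted.

    Assumption of this sketch: the FD "holds on the extent" when the
    fraction of tuples that must be removed to make it exact is at
    most max_error.
    """

    def __init__(self, lhs, rhs, max_error):
        self.lhs = tuple(lhs)
        self.rhs = rhs
        self.max_error = max_error
        # For each LHS value, how often each RHS value was seen.
        self.groups = defaultdict(Counter)
        self.total = 0
        # Tuples that can be kept: the most frequent RHS value per group.
        self.kept = 0

    def insert(self, row):
        """Update the counters for one new tuple without rescanning old data."""
        key = tuple(row[a] for a in self.lhs)
        counts = self.groups[key]
        old_best = max(counts.values(), default=0)
        counts[row[self.rhs]] += 1
        new_best = max(counts.values())
        self.kept += new_best - old_best
        self.total += 1

    def error(self):
        """Fraction of tuples violating the FD (0.0 on an empty dataset)."""
        if self.total == 0:
            return 0.0
        return 1 - self.kept / self.total

    def holds(self):
        return self.error() <= self.max_error


checker = ExtentFDChecker(lhs=["zip"], rhs="city", max_error=0.25)
for row in [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Boston"},
    {"zip": "02139", "city": "Cambridge"},
]:
    checker.insert(row)
holds_before = checker.holds()

# A new tuple can invalidate a dependency that previously held.
checker.insert({"zip": "02139", "city": "Boston"})
holds_after = checker.holds()
```

    Each insertion touches only the group of the new tuple, so the candidate is re-validated without reprocessing the whole dataset.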

    La famille maltraitante

    The volume analyses the psychological dynamics and family relationships connected with violence against children, and its consequences for the victim, through a theoretical discussion accompanied by clinical examples.

    A deep learning approach to classify country and value of modern coins

    The use of Artificial Intelligence (AI) to preserve and promote cultural heritage has experienced significant growth in recent years. Among the various areas of cultural heritage, numismatics has emerged as a particularly promising field where we can develop AI solutions. Numismatics refers to the study of coins, tokens, paper money, and medals, which play a critical role in understanding human history and culture. However, there are still limited resources available to help researchers and collectors in the identification of coins. This is due to the vast number of coins in circulation, which presents a significant challenge in developing smart tools for classification tasks. This paper aims to provide a contribution to this setting. In particular, we start by creating a new dataset called EURO-Coin, which consists of images showing the side of coins with reliefs and is designed to facilitate the training and testing of AI models for euro coin classification. Then, we propose two approaches that leverage Convolutional Neural Networks and self-attention layers to classify the country and value of the coins. In our experiments, we obtain an accuracy of 86.9% for country classification and an accuracy of 96.4% for value classification. Finally, we conduct an ablation study to evaluate the impact of the preprocessing activities and attention layers in our approaches.

    Discovery Multiple Data Structures in Big Data through Global Optimization and Clustering Methods

    In this paper, we propose an approach to Big Data visualization based on clustering techniques, in order to find structure in the data and to facilitate their visualization. However, the main problem of clustering is that it sometimes converges to a local minimum, showing only one solution, so an optimization of the K-means algorithm has been proposed with the aim of escaping from local minima and visualizing different solutions to the same problem. In particular, we use the K-means algorithm with multiple random starting points, in order to find several solutions to the same problem. The algorithm is applied to data on Italian calls for tenders, extracted through a crawling technique, and optimized through the proposed approach to obtain multiple solutions. These are used to build a repository of products that can be easily displayed and queried during the formulation of an offer by a bidder company willing to participate in a call for tenders. The case study results show the feasibility and validity of the proposed approach.
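    Running K-means from multiple random starting points and keeping the distinct solutions can be sketched as follows. This is a minimal illustration under stated assumptions, not the optimized variant from the paper: it uses plain Lloyd iterations on tuples of floats, and the function names (`kmeans`, `multi_start`), the rounding used to detect duplicate solutions, and the sample points are hypothetical.

```python
import random


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))


def kmeans(points, k, rng, iters=100):
    """One run of Lloyd's algorithm from a random start."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        new = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    inertia = sum(min(dist2(p, c) for c in centroids) for p in points)
    return centroids, inertia


def multi_start(points, k, starts=20, seed=0):
    """Collect the distinct solutions reached from several random starts."""
    rng = random.Random(seed)
    solutions = {}
    for _ in range(starts):
        centroids, inertia = kmeans(points, k, rng)
        # Round and sort so that the same solution is stored only once.
        key = tuple(sorted(tuple(round(x, 6) for x in c) for c in centroids))
        solutions[key] = inertia
    # Lowest inertia first; later entries are alternative local minima.
    return sorted(solutions.items(), key=lambda item: item[1])


points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
    (20.0, 0.0), (20.0, 1.0), (21.0, 0.0),
]
solutions = multi_start(points, k=3, starts=30, seed=1)
```

    Returning every distinct local minimum, rather than only the best one, reflects the stated goal of showing several solutions to the same problem.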

    Discovering Functional Dependencies: Can We Use ChatGPT to Generate Algorithms?

    The establishment of Large Language Models has allowed people to interact with tools capable of answering, in natural language, many kinds of questions on very large sets of topics. Although natural language generation processes have to address several issues (e.g., providing focused content w.r.t. queries, composing texts without ambiguities, and so forth), models and tools are becoming more and more capable of providing answers in a syntactically and semantically correct form, independently of both topic and language. This has enabled an algorithm to write algorithms together with their implementation, thus tackling an even more complex task, since programming languages are more rigid and precise, and the generated code should also embrace the reasoning underlying the methodologies used to solve problems at different levels of complexity. At present, the most representative example of such a tool is ChatGPT. Based on the GPT-3.5 model and trained on more than 300 billion tokens, ChatGPT has gained high notoriety and is starting to impact society due to its wide usage in people's daily lives. This paper aims at evaluating to what extent ChatGPT and its underlying model are capable of generating algorithms for the discovery of Functional Dependencies (FDs) from data. The latter represents a very complex problem to which the scientific literature has devoted much effort. The inference of a correct, minimal, and complete set of FDs holding on a given dataset defines the main constraints under which literature solutions are considered effective, which raises the question of whether solutions generated by ChatGPT can also satisfy them. In particular, by following a prompt-based approach, we enabled ChatGPT to provide 7 different solutions to the FD discovery problem and compared their results with those of the HyFD discovery algorithm, one of the most efficient solutions in the literature.

    An intelligent system for focused crawling from Big Data sources

    Nowadays, the proper management of data is a key business enabler and booster for companies, helping them increase their competitiveness. Typically, companies hold massive amounts of data within their servers, which might include previously offered services, proposals, bids, and so on. They rely on their expert managers to manually analyse them in order to make strategic decisions. However, given the huge amount of information to be analysed and the necessity of making timely decisions, managers often exploit only a small amount of the available data, which often does not yield effective choices. For instance, this happens in the e-procurement domain, where bids for new calls for tender are often formulated by looking at some past proposals from a company. Driven by extensive experience in the e-procurement domain, in this paper we propose an intelligent system to support organisations in the focused crawling of artefacts of interest (calls for tender, BIMs, equipment, policies, market trends, and so on) from the web, semantically matching them against internal Big Data and knowledge sources, so as to let company analysts make better strategic decisions. The novel contribution consists of an extension of the K-means algorithm used by a web crawler within the proposed system, and a semantic module exploiting search patterns to find relevant data within the crawled artefacts. The proposed solution has been implemented and extensively assessed in the e-procurement domain. It has subsequently been extended to other domains, such as robot programming and cloud providing. Since, to the best of our knowledge, no similar systems exist in the literature, in order to prove its effectiveness we have compared its crawling component against similar crawlers, by plugging them into our system.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
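    The difference between the two counting schemes can be made concrete with a short sketch. It is an illustration only, not the procedure used in the study: it assumes that a pair of authors is counted once per citing paper, and that co-authors of the same cited work also count as co-cited with each other. The function name `cocitation_counts`, the `max_authors` parameter, and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations over a collection of citing papers.

    citing_papers: one reference list per citing paper; each reference
    is the ordered author list of a cited work.
    max_authors: how many leading authors of each cited work to count
    (1 for first-author counting, 5 for the first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


citing_papers = [
    [["Smith", "Lee"], ["Garcia"]],
    [["Smith"], ["Garcia", "Lee"]],
]
first_author = cocitation_counts(citing_papers, max_authors=1)
first_five = cocitation_counts(citing_papers, max_authors=5)
```

    With first-author counting, Lee never appears in the co-citation data; counting the first five authors brings Lee into both pairs.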

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who propels discourses rather than expressing opinions; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.