    Preface

    Knowledge graph embedding for experimental uncertainty estimation

    Purpose: Experiments are the backbone of the development process of data-driven predictive models for scientific applications. The quality of the experiments directly impacts the model performance. Uncertainty inherently affects experiment measurements and is often missing in the available data sets due to its estimation cost. For similar reasons, experiments are very few compared to other data sources. Discarding experiments based on the missing uncertainty values would preclude the development of predictive models. Data profiling techniques are fundamental to assess data quality, but some data quality dimensions are challenging to evaluate without knowing the uncertainty. In this context, this paper aims to predict the missing uncertainty of the experiments. Design/methodology/approach: This work presents a methodology to forecast the experiments’ missing uncertainty, given a data set and its ontological description. The approach is based on knowledge graph embeddings and leverages the task of link prediction over a knowledge graph representation of the experiments database. The validity of the methodology is first tested in multiple conditions using synthetic data and then applied to a large data set of experiments in the chemical kinetic domain as a case study. Findings: The analysis results of different test case scenarios suggest that knowledge graph embedding can be used to predict the missing uncertainty of the experiments when there is a hidden relationship between the experiment metadata and the uncertainty values. The link prediction task is also resilient to random noise in the relationship. The knowledge graph embedding outperforms the baseline results if the uncertainty depends upon multiple metadata. Originality/value: The employment of knowledge graph embedding to predict the missing experimental uncertainty is a novel alternative to the current and more costly techniques in the literature. 
    This contribution permits better data quality profiling of scientific repositories and improves the development process of data-driven models based on scientific experiments.

    Extracting Large Scale Spatio-Temporal Descriptions from Social Media

    The ability to track large-scale events as they happen is essential for understanding them and coordinating reactions in an appropriate and timely manner. This is true, for example, in emergency management and decision-making support, where the constraints on both quality and latency of the extracted information can be stringent. In some contexts, real-time and large-scale sensor data and forecasts may be available. We are exploring the hypothesis that this kind of data can be augmented with the ingestion of semistructured data sources, like social media. Social media can diffuse valuable knowledge, such as direct witness accounts or expert opinions, while their noisy nature makes them non-trivial to manage. This knowledge can be used to complement and confirm other spatio-temporal descriptions of events, highlighting previously unseen or undervalued aspects. The critical aspects of this investigation, such as event sensing, multilingualism, selection of visual evidence, and geolocation, are currently being studied as a foundation for a unified spatio-temporal representation of multi-modal descriptions. The paper presents, together with an introduction to the topics, the work done so far on this line of research, along with case studies relevant to the posed challenges, focusing on emergencies caused by natural disasters.

    Educational Chatbots: A Sustainable Approach for Customizable Conversations for Education

    This paper proposes using chatbots as “tutors” in a learning environment: tutors who are not domain experts but helpers in guiding students through bodies of learning material. The most original contributions are the proposals that conversation should be content-independent (although chatbots speak about content) and that the production process should allow non-technical actors to customize chatbots and keep the costs of development and deployment low. We specifically discuss conversation customization, which is especially relevant for learning applications, where users might have specific needs or problems. We achieve the features introduced above via extensive “configuration” (rather than direct programming), making the underlying technology novel and original. Experiments with teachers and students have shown that chatbots in education can be effective and that customization of conversations is relevant and valued by users.

    CIME: Context-aware geolocation of emergency-related posts

    Information extracted from social media has proven to be very useful in the domain of emergency management. An important task in emergency management is rapid crisis mapping, which aims to produce timely and reliable maps of affected areas. During an emergency, the volume of emergency-related posts is typically large, but only a small fraction is relevant and helps rapid mapping effectively. Furthermore, posts are not useful for mapping purposes unless they are correctly geolocated and, on average, less than 2% of posts are natively georeferenced. This paper presents an algorithm, called CIME, that aims to identify and geolocate emergency-related posts that are relevant for mapping purposes. While native geocoordinates are most often missing, many posts contain geographical references in their metadata, such as texts or links that can be used by CIME to filter and geolocate information. In addition, social media creates a social network and each post can be enhanced with indirect information from the post’s network of relationships with other posts (for example, a retweet can be associated with other geographical references which are useful to geolocate the original tweet). To exploit all this information, CIME uses the concept of context, defined as the information characterizing a post both directly (the post’s metadata) and indirectly (the post’s network of relationships). The algorithm was evaluated on a recent major emergency event, demonstrating better performance with respect to the state of the art in terms of total number of geolocated posts, geolocation accuracy, and relevance for rapid mapping.