Methodology for assessment of linked data quality
With the expansion in the amount of data being produced as Linked Data (LD), the opportunity to build use cases has also increased. However, a crippling problem to the reliability of these use cases is the underlying poor data quality. Moreover, the ability to assess the quality of the consumed LD, based on the satisfaction of the consumers' quality requirements, significantly influences usability of such data for a given use case. In this paper, we propose a data quality assessment methodology specifically designed for LD. This methodology consists of three phases and six steps with specific emphasis on considering a use case.
DC Proposal: Towards Linked Data Assessment and Linking Temporal Facts
Since Linked Data is continuously growing on the Web, the quality of the overall data can rapidly degrade over time. The research proposed here deals with quality assessment in Linked Data and with temporal linking techniques. First, we conduct an in-depth study of appropriate dimensions and their respective metrics by defining a data quality framework that evaluates, along these dimensions, linked data published on the Web. Second, since the assessment and improvement of Linked Data quality, such as accuracy or the resolution of heterogeneities, is performed through record linkage techniques, we propose an extended technique that applies time in similarity computation, which can improve over traditional linkage techniques. This paper describes the core problem, presents the proposed approach, reports on initial results, and lists planned future tasks.
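To illustrate what "applying time in similarity computation" could look like in record linkage, here is a minimal Python sketch. It is not the technique proposed in the paper: the `Record` type, the exponential decay and the `half_life_years` parameter are all assumptions made for illustration. The idea shown is only that two records with similar labels are treated as less likely to match when their timestamps are far apart.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    """Hypothetical record: a textual label plus the year the fact refers to."""
    label: str
    year: int


def string_similarity(a: str, b: str) -> float:
    """Plain, time-agnostic similarity in [0, 1] (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def temporal_similarity(a: Record, b: Record, half_life_years: float = 5.0) -> float:
    """String similarity scaled by a decay on the time gap.

    Assumption: the score halves every `half_life_years` of distance
    between the two records' years.
    """
    gap = abs(a.year - b.year)
    decay = 0.5 ** (gap / half_life_years)
    return string_similarity(a.label, b.label) * decay


if __name__ == "__main__":
    r1 = Record("Mayor of Springfield", 2005)
    r2 = Record("Mayor of Springfield", 2010)
    print(temporal_similarity(r1, r1))  # identical label, no time gap
    print(temporal_similarity(r1, r2))  # identical label, five-year gap
```

A traditional linkage approach would use only `string_similarity`; the decay factor is one simple way, among many, to let the time dimension influence the match score.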
From Data Quality to Big Data Quality: A Data Integration Scenario.
Big data has made its appearance in many fields, including scientific research, business, public administration and so on. Although it is acknowledged that different aspects (e.g., acquisition of data, extraction, pre-processing, analysis modelling and functionality, interpretation, etc.) might affect the benefit of such data, several authors identify data quality as the most decisive one. More recently, a variety of data types have arisen from linguistic and visual information, used and diffused through social networks, the Internet of Things, enterprise and public sector information systems, as well as the Web. The big data phenomenon has deeply affected the diversity of data types. In our previous work, we provided an in-depth investigation of how data quality concepts can be extended to such a vast set of data types, encompassing, e.g., semi-structured texts, maps, images and linked data. In this work, we focus on Linked Data, a type of data that can be viewed as big data, and study the effect of data quality in a data integration scenario
Data Quality and Data Cleansing of Semantic Data
In this chapter, we first introduce the concepts of Linked Data quality and its dimensions and metrics. Then we provide definitions for 18 quality dimensions along with a total of 69 metrics to measure the dimensions. Thereafter, we provide an overview of tools currently available for Linked Data quality assessment, followed by an introduction to the W3C Data Quality Vocabulary. Finally, we discuss some of the open research challenges along with a few solutions already available
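As a rough illustration of what a metric for a quality dimension can look like, the Python sketch below computes a completeness-style score over a small set of triples. The metric definition, the function name `property_completeness` and the sample data are assumptions chosen for illustration; the chapter's own 69 metric definitions are not reproduced here.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def property_completeness(triples: Iterable[Triple], prop: str) -> float:
    """Fraction of distinct subjects that have at least one value for `prop`.

    Returns 0.0 for an empty dataset. This is a hypothetical metric,
    not one taken verbatim from the chapter.
    """
    triples = list(triples)
    subjects = {s for s, _, _ in triples}
    if not subjects:
        return 0.0
    with_prop = {s for s, p, _ in triples if p == prop}
    return len(with_prop) / len(subjects)


if __name__ == "__main__":
    data = [
        ("ex:Rome", "rdfs:label", "Rome"),
        ("ex:Rome", "ex:population", "2800000"),
        ("ex:Paris", "ex:population", "2100000"),
    ]
    # Two subjects, only one of which carries a label.
    print(property_completeness(data, "rdfs:label"))
```

A score like this would typically be one input among several when assessing a dataset along a given dimension.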
A Distributed Registry of Multi-perspective Data Services in Cyber Physical Production Networks
The advances in smart technologies, such as sensor networks, cloud computing, data management and artificial intelligence, enable production systems to communicate with each other and rapidly configure themselves to meet dynamic production needs. In this context, the adoption of service-oriented computing is aimed at enabling modular and standardised software infrastructures, platform-independent interactions between software components and information hiding for ensuring data sovereignty in a fully distributed environment. However, for a full-fledged exploitation of service-oriented computing capabilities in Industry 4.0 production systems, the existing service design solutions still lack a clear specification of which data the service relies on, what the business goal of the service is and when it is invoked within the information flow throughout the production network. In this paper, we propose the model of a registry of data-oriented services in an industrial production chain. The organisation of services in the registry is guided by multiple aspects of the production network, namely: (i) the business goal of a real production network; (ii) the perspective on production data that is managed through the service; (iii) the high-level action performed by the service. The modelling strategy has been conceived to properly guide service design against ad-hoc solutions, thus facilitating future service selection and composition to meet the business goals of collaborating actors. The resulting portfolio of services can be adapted by each actor of the production network, leading to a distributed registry that allows each actor to preserve control over the owned data. The application in a case study has been performed to demonstrate the feasibility of the data-oriented services
KGHeartBeat: A Knowledge Graph Quality Assessment Tool
This demo proposes KGHeartBeat, a community-shared open-source knowledge graph quality assessment tool to periodically perform quality analysis on all the freely available knowledge graphs registered on the LOD cloud and DataHub. As a proof of concept, we discuss the comparison of different linguistic versions of DBpedia via KGHeartBeat
KGHeartBeat: An Open Source Tool for Periodically Evaluating the Quality of Knowledge Graphs
Knowledge Graphs are an extraordinary source of data due to their vastness, the heterogeneity of their topics and the presence of sources curated by companies, research groups, volunteers, and dedicated communities. Identifying high-quality Knowledge Graphs requires supporting developers and end-users in comparing and assessing the data quality of publicly available Knowledge Graphs. However, no fully working and maintained Knowledge Graph quality assessment tool was found during the review of related research. This article fully describes KGHeartBeat, a community-shared open-source knowledge graph quality assessment tool designed to periodically perform quality analysis on a wide range of freely available knowledge graphs registered on the LOD Cloud and DataHub. Users can either visually explore the quality assessment report and compare knowledge graphs via a freely available web-based interface or download data analysis results for further analysis. Moreover, KGHeartBeat is also released as APIs so developers can easily integrate them into any quality management tool. As a proof of concept, we discuss different use cases to show KGHeartBeat in practice, demonstrating how it can be used to compare multiple Knowledge Graphs, assess quality dimensions over time, and report performance analysis in terms of execution time. Resource type: Community Shared Software Framework. License: MIT. Web-app: http://www.isislab.it:12280/kgheartbeat. Permanent URL: https://zenodo.org/records/10990547. PyPI package: https://pypi.org/project/kgheartbea
Multi-perspective Data Modelling in Cyber Physical Production Networks: Data, Services and Actors
In recent years, Cyber Physical Production Systems and Digital Threads opened the vision on the importance of data modelling and management to lead the smart factory towards a full-fledged vertical and horizontal integration. Vertical integration refers to the full connection of smart factory levels from the work centers on the shop floor up to the business layer. Horizontal integration is realised when a single smart factory participates in multiple interleaved supply chains with different roles (e.g., main producer, supplier), sharing data and services and forming a Cyber Physical Production Network. In such an interconnected world, data and services become fundamental elements in the cyberspace to implement advanced data-driven applications such as production scheduling, energy consumption optimisation, anomaly detection, predictive maintenance, change management in Product Lifecycle Management, process monitoring and so forth. In this paper, we propose a methodology that guides the design of a portfolio of data-oriented services in a Cyber Physical Production Network. The methodology starts from the goals of the actors in the network, as well as their requirements on data and functions. Therefore, a data model is designed to represent the information shared across actors according to three interleaved perspectives, namely, product, process and industrial assets. Finally, multi-perspective data-oriented services for collecting, monitoring, dispatching and displaying data are built on top of the data model, according to the three perspectives. The methodology also includes a set of access policies for the actors in order to enable controlled access to data and services. The methodology is tested on a real case study for the production of valves in deep and ultra-deep water applications. 
Experimental validation in the real case study demonstrates the benefits of providing methodological support for the design of multi-perspective data-oriented services in Cyber Physical Production Networks, both in terms of the usability of data navigation through the services and in terms of service performance in the presence of Big Data
A Distributed Registry of Multi-perspective Data Services for the Internet of Production
Service-oriented computing is one of the key technologies enabling the digital transformation of production systems, allowing them to communicate with each other and rapidly configure themselves to meet dynamic production needs. Service-oriented architectures (SOA) are also crucial to promote the horizontal integration of digital factories across multiple interleaved supply chains, forming the so-called Internet of Production (IoP). The increasing availability of services from multiple supply chains to be aggregated and composed into composite services is leading to a new service ecosystem, named Big Services. In this context, the traditional vision of service registries, with a flat organisation of services to match the mutual requirements of supply chain actors, is no longer feasible. A more structured model is required, taking into account the distinction between domain-oriented atomic services, at the single actor level, and demand-oriented composite services, at the supply chain and IoP levels. In this paper, we propose the model of a distributed registry of data-oriented services in an industrial production network. The organisation of services in the registry is guided by multiple perspectives of the production network, namely: (i) the business goal of a real production network; (ii) the perspective on production data that is managed through the services; (iii) the data flow stages implemented through the services (that is, data collection, monitoring, dispatch and display). The resulting portfolio of services is distributed over the production network, allowing each actor to preserve control over the owned data and enabling the dynamic composition of services at higher levels. A preliminary validation in a real case study has been performed to demonstrate the feasibility of the approach
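The three organising perspectives named in the abstract above (business goal, data perspective, data flow stage) can be pictured as keys of a per-actor index. The Python sketch below is only an illustration under stated assumptions: the class names `Service` and `ActorRegistry`, the function `find_across` and all sample values are hypothetical and are not taken from the paper's model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Service:
    """Hypothetical descriptor of a data-oriented service."""
    name: str
    goal: str         # business goal served by the service
    perspective: str  # perspective on production data
    stage: str        # data flow stage: collection, monitoring, dispatch, display


class ActorRegistry:
    """Registry kept by a single actor, so the actor retains control of its entries."""

    def __init__(self, actor: str) -> None:
        self.actor = actor
        self._index: Dict[Tuple[str, str, str], List[Service]] = defaultdict(list)

    def register(self, service: Service) -> None:
        key = (service.goal, service.perspective, service.stage)
        self._index[key].append(service)

    def find(self, goal: Optional[str] = None, perspective: Optional[str] = None,
             stage: Optional[str] = None) -> List[Service]:
        """Return services matching every criterion that is not None."""
        wanted = (goal, perspective, stage)
        return [
            svc
            for key, services in self._index.items()
            if all(w is None or w == k for w, k in zip(wanted, key))
            for svc in services
        ]


def find_across(registries: Iterable[ActorRegistry], **criteria) -> List[Tuple[str, Service]]:
    """Query several actor registries and tag each hit with its owning actor."""
    return [(reg.actor, svc) for reg in registries for svc in reg.find(**criteria)]


if __name__ == "__main__":
    supplier = ActorRegistry("supplier")
    producer = ActorRegistry("producer")
    supplier.register(Service("read-sensors", "process monitoring", "process", "collection"))
    producer.register(Service("show-dashboard", "process monitoring", "process", "display"))
    for actor, svc in find_across([supplier, producer], goal="process monitoring"):
        print(actor, svc.name)
```

Keeping one registry per actor and querying across them is a simple stand-in for the distributed organisation described in the abstract; a real deployment would also need access control and remote lookup, which are omitted here.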
A Multi-Perspective Data Model for Cyber Physical Production Networks
Recently, research on data management has been moving towards the design of data models for Cyber Physical Production Networks. They constitute ecosystems where cyber and engineered physical elements record data (e.g., using sensors), analyse them using connected services (e.g., over cloud computing infrastructures) and interact with human actors using multi-channel interfaces, going beyond the boundaries of a single enterprise and spanning the whole production network. In this paper, we propose a conceptual data model that relates the digital counterpart of a product with data collected over the phases of the product lifecycle, with special attention to the manufacturing process, and with data gathered to monitor the shop floor machines used during production. This enables the development of multi-perspective data services that implement advanced business goals at the production network level, such as production scheduling, energy efficiency, and product and process monitoring. We provide an architecture of the proposed solution and its preliminary validation in an ongoing industrial research project
