Data Warehouse Life-Cycle and Design
The term data warehouse life-cycle indicates the phases (and their relationships) that a data warehouse system goes through between when it is conceived and when it is no longer available for use. Regardless of the type of software, life cycles typically include the following phases: requirement analysis, design (including modeling), construction, testing, deployment, operation, maintenance, and retirement. However, life cycles differ in the relevance and priority given to each phase, which can vary according to the implementation constraints (e.g., economic and time constraints) and to the specificities and complexity of the software. In particular, the specificities of the data warehouse life-cycle derive from the presence of the operational database that feeds the system and from the extent of this kind of system, which must be considered in order to keep the cost and complexity of the project under control.
Although the design phase is only one step within the overall life cycle, the identification of a proper life-cycle model and the adoption of a correct design methodology are closely related, since each one influences the other.
Well-being in the digitalization era: Opportunities, challenges, and threats
By digital transformation we mean a set of cultural, social, creative, and organizational changes triggered by the progressive adoption of technological and digital solutions. Digital transformation rests on three pillars: (a) creation of a digital culture among citizens, (b) adaptation of procedures on a digital basis, (c) adaptation of infrastructures. It is therefore not a mere technological phenomenon, but a real transformation of the way of living and interacting. Digitization today seems an unstoppable process. However, as Pier Paolo Pasolini already noted, "development" and "progress" should not be considered synonymous. The first term concerns the progressive adoption of technological and scientific solutions essentially aimed at profit; the second refers to an effective improvement in the quality of life and social well-being. In companies, digital transformation is experienced as a virtuous process and, in many respects, a necessary and unavoidable step to guarantee competitiveness and survival in a globalized context. The impact of digitization on the personal sphere poses many more problems, because it is necessary to put development at the service of progress or, in other words, to ensure that digitization is functional to well-being. In this chapter we describe, without any ambition of completeness, some of the main areas of application of digital functions in the life of citizens and communities, highlighting the opportunities for progress they create and the risks associated with a distorted adoption of technology. The trade-off between these two factors determines the impact on well-being.
Design Issues in Social Business Intelligence Projects
With the term Social Business Intelligence we refer to a branch of Business Intelligence specialized in applying On-Line Analytical Processing analysis to User-Generated Contents collected from the Web and other sources of social information. The high dynamics of the domain, as well as the nature of the source data, which are textual rather than numerical, require specific techniques both for modeling data and for managing a project. Despite the increasing diffusion of Social Business Intelligence applications, only a few works in the academic literature have addressed such distinguishing features. In this paper we propose both a modeling technique and a methodology that enable a more dynamic and expressive design in Social Business Intelligence projects. We also present a set of experimental results on real data and real projects that show the effectiveness of our solutions.
OLAM
The term Online Analytical Mining, coined in 1997 by J. Han [9], refers to solutions that integrate online analytical processing (OLAP) with data mining functionalities so that mining can be performed on different portions of databases or data warehouses and at different levels of abstraction, at the user’s fingertips. In such a system, data mining techniques benefit from a higher level of integration, consistency, and cleanness, and data warehouse users are able to express more powerful queries directly from their user interface. Although no commercial tool makes available a complete and integrated set of OLAM features, many data mining techniques have been extended to deal with specific data warehouse features, and new algorithms that specifically address the OLAP user’s advanced requirements have been developed.
Proceedings of the ACM 15th International Workshop on Data Warehousing and OLAP
The ACM International Workshop on Data Warehousing and OLAP (DOLAP) is an annual event that provides an international forum where both researchers and practitioners can share their findings in theoretical foundations, current methodologies, practical experiences, and new research directions in the areas of data warehousing and online analytical processing.
Social BI to understand the debate on vaccines on the Web and social media: unraveling the anti-, free, and pro-vax communities in Italy
The debate on vaccines in Italy has greatly intensified in recent years. The promulgation of a law that makes a set of ten vaccines mandatory has pushed this formerly niche topic into the mainstream. The law itself is an answer to the progressive erosion of vaccine coverage. The debate has become a political topic with three main positions: supporters of the importance of vaccines, opponents who claim that vaccines are harmful to health, and the new position of those contesting only the mandatory nature of vaccinations. In this paper, we build on a Social Business Intelligence architecture to propose an in-depth analysis of the emerging social debate. Our analysis spans more than three years and covers the whole Web and social media. We adopt several techniques, including community detection and text analytics, to understand the evolution of the debate, the topics discussed, and the structure and peculiarities of the main social communities. The study reveals that the communities are well characterized, especially from a political perspective, and provides useful insights to official health organizations to improve their communication strategies.
Social business intelligence: OLAP applied to user generated contents
Social BI (SBI) is an emerging discipline that aims at applying OLAP analysis to textual user-generated content to let decision-makers analyze their business based on the trends perceived from the environment. Despite the increasing diffusion of SBI applications, only a few works in the academic literature have addressed the specificities of these applications. In this paper we report some of these distinguishing features and discuss possible solutions.
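As a rough illustration of what applying OLAP analysis to textual user-generated content can look like, the sketch below aggregates a handful of clips by topic and month and then rolls them up along a topic hierarchy. It is only a minimal sketch under stated assumptions: the clips, the `TOPIC_TO_CATEGORY` hierarchy, the sentiment scores, and the `aggregate` helper are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical topic hierarchy (topic -> category); not from the paper.
TOPIC_TO_CATEGORY = {"battery": "hardware", "screen": "hardware", "delivery": "service"}

# Hypothetical clips: each one is a piece of user-generated text that has already
# been tagged with a topic and a sentiment score in {-1, 0, 1}.
clips = [
    {"topic": "battery", "month": "2024-01", "sentiment": -1},
    {"topic": "battery", "month": "2024-01", "sentiment": 1},
    {"topic": "screen", "month": "2024-01", "sentiment": 1},
    {"topic": "delivery", "month": "2024-02", "sentiment": -1},
]

def aggregate(clips, level):
    """Group clips by (topic or category, month); return clip count and average sentiment."""
    cells = defaultdict(list)
    for clip in clips:
        if level == "topic":
            member = clip["topic"]
        else:  # roll up from topic to its category
            member = TOPIC_TO_CATEGORY[clip["topic"]]
        cells[(member, clip["month"])].append(clip["sentiment"])
    return {
        cell: {"clips": len(scores), "avg_sentiment": sum(scores) / len(scores)}
        for cell, scores in cells.items()
    }

by_topic = aggregate(clips, "topic")        # finest level of the hierarchy
by_category = aggregate(clips, "category")  # roll-up to the category level
```

A static dictionary is used for the hierarchy only to keep the sketch short; in a domain where topics change quickly, that fixed mapping is precisely the part that would need a more flexible model.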
From Star Schemas to Big Data: 20+ Years of Data Warehouse Research
Data Warehouses are the core of modern systems for decision making. They store integrated information extracted from various heterogeneous data sources, making it available in multidimensional form for analyses aimed at improving the users' knowledge of their business. Though the first use of the term dates back to the 80s, it was only during the late 90s that data warehousing emerged as a research area of its own, though in close correlation with several other research topics such as database integration, view materialization, data visualization, etc. This paper surveys more than 20 years of research on data warehouse systems, from their early relational implementations (still widely adopted in corporate environments), to the new architectures called for by Business Intelligence 2.0 scenarios during the last decade, and up to the exciting challenges now posed by the integration with big data settings. The timeline of research is organized into three interrelated tracks: techniques, architectures, and methodologies.
Schema Profiling of Document Stores
In document stores, schema is a soft concept and the documents in a collection can have different schemata; this gives designers and implementers greater flexibility but requires an extra effort to understand the rules that drove the use of alternative schemata when heterogeneous documents are to be analyzed or integrated. In this paper we outline a technique, called schema profiling, to explain the schema variants within a collection in document stores by capturing the hidden rules that explain the use of these variants; we express these rules in the form of a decision tree, called schema profile, whose main feature is the coexistence of value-based and schema-based conditions. Consistently with the requirements we elicited from real users, we aim at creating explanatory, precise, and concise schema profiles; to quantitatively assess these qualities we introduce a novel measure of entropy.
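To make the idea of schema variants and value-based conditions more concrete, the sketch below extracts the schema of each document as a set of field paths and measures how well a split on a field value separates the variants. This is a minimal sketch under stated assumptions: it uses plain Shannon entropy, not the novel entropy measure introduced in the paper, and the sample documents and the helper names `schema_of`, `entropy`, and `split_entropy` are hypothetical.

```python
import math
from collections import Counter

def schema_of(doc, prefix=""):
    """Return the schema of a document as a frozenset of dotted field paths."""
    paths = set()
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            paths |= schema_of(value, prefix=path + ".")
        else:
            paths.add(path)
    return frozenset(paths)

def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of hashable labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def split_entropy(docs, field):
    """Weighted entropy of the schema variants after grouping documents by the value of `field`."""
    groups = {}
    for doc in docs:
        groups.setdefault(doc.get(field), []).append(schema_of(doc))
    n = len(docs)
    return sum(len(group) / n * entropy(group) for group in groups.values())

# Hypothetical collection with two schema variants.
docs = [
    {"type": "bike", "user": "a", "speed": 21.5},
    {"type": "bike", "user": "b", "speed": 18.0},
    {"type": "run", "user": "a", "pace": 5.1},
    {"type": "run", "user": "c", "pace": 4.8},
]
variants = [schema_of(d) for d in docs]
before = entropy(variants)               # two equally frequent variants
by_type = split_entropy(docs, "type")    # "type" fully explains the variant used
by_user = split_entropy(docs, "user")    # "user" explains it only partially
```

In this toy collection, a decision tree built greedily would pick the condition on `type` as its first split, since it leaves no residual uncertainty about which schema variant a document uses.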
Streaming Approach to Schema Profiling
Schema profiling consists in producing key insights about the schema of data in a high-variety context. In this paper, we present a streaming approach to schema profiling, where heterogeneous data is continuously ingested from multiple sources, as is typical in many IoT applications (e.g., with multiple devices or applications dynamically logging messages). The produced profile is a clustering of the schemas extracted from the data, and it is computed and evolved in real time under the overlapping sliding window paradigm. The approach is based on two-phase k-means clustering, which entails pre-aggregating the data into a coreset and incrementally updating the previous clustering results without recomputing them in every iteration. Unlike previous proposals, the approach works in a domain where dimensionality is variable and unknown a priori, automatically selects the optimal number of clusters, and detects cluster evolution while minimizing the need to recompute the profile. The experimental evaluation demonstrated the effectiveness and efficiency of the approach against the naïve baseline and the state-of-the-art algorithms on stream clustering.
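The general shape of clustering schemas over a sliding window can be sketched as follows. This is a simplified illustration, not the algorithm of the paper: the pre-aggregation step simply counts distinct schemas as a stand-in for a coreset, the number of clusters `k` is fixed by hand rather than selected automatically, and the clustering is recomputed from scratch instead of being updated incrementally. The messages and all function names are hypothetical.

```python
from collections import Counter, deque

def coreset(window):
    """Pre-aggregate a window into (schema, weight) pairs, one per distinct schema."""
    return list(Counter(window).items())

def vectorize(schemas):
    """Encode schemas as binary vectors over the fields seen in the current window."""
    vocabulary = sorted(set().union(*schemas))
    return [[1.0 if field in schema else 0.0 for field in vocabulary] for schema in schemas]

def weighted_kmeans(points, weights, k, iterations=10):
    """Plain Lloyd iterations on weighted points; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: each centroid becomes the weighted mean of its members.
        for c in range(k):
            members = [i for i, a in enumerate(assignment) if a == c]
            total = sum(weights[i] for i in members)
            if total > 0:  # keep the old centroid if the cluster is empty
                centroids[c] = [
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(len(points[0]))
                ]
    return assignment, centroids

# Hypothetical IoT messages; the schema of a message is the set of its field names.
stream = [
    {"id": 1, "temp": 20.1},
    {"id": 2, "temp": 19.5},
    {"id": 3, "temp": 21.0},
    {"id": 4, "temp": 20.3, "hum": 40},
    {"id": 5, "lat": 44.5, "lon": 11.3},
    {"id": 6, "lat": 44.1, "lon": 12.2},
    {"id": 7, "lat": 44.2, "lon": 12.0, "alt": 54},
]

window = deque(maxlen=6)  # sliding window: the oldest schema is dropped when full
for message in stream:
    window.append(frozenset(message))

pairs = coreset(window)
schemas = [schema for schema, _ in pairs]
weights = [weight for _, weight in pairs]
assignment, centroids = weighted_kmeans(vectorize(schemas), weights, k=2)
```

Because the vocabulary is rebuilt from the fields present in the window, the dimensionality of the vectors changes as new fields appear or old ones expire, which is the variable-dimensionality setting the abstract refers to.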
