579 research outputs found

Extreme-Scale Model-Based Time Series Management with ModelarDB (Invited Talk)

Author: Torben Bach Pedersen
Publication date: 01/01/2021

To monitor critical industrial devices such as wind turbines, high-quality sensors sampled at a high frequency are increasingly used. Current technology does not handle these extreme-scale time series well [Søren Kejser Jensen et al., 2017], so only simple aggregates are traditionally stored, removing outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing extreme-scale time series that approximates the time series values using mathematical functions (models) and stores only model coefficients rather than data values. Compression is done both for individual time series and for correlated groups of time series. The keynote will present concepts, techniques, and algorithms from model-based time series management and our implementation of these in the open source Time Series Management System (TSMS) ModelarDB [Søren Kejser Jensen et al., 2018; Søren Kejser Jensen et al., 2019; Søren Kejser Jensen et al., 2021]. Furthermore, it will present our experimental evaluation of ModelarDB on extreme-scale real-world time series, which shows that, compared to widely used Big Data formats, ModelarDB provides up to 14× faster ingestion due to high compression, 113× better compression due to its adaptability, 573× faster aggregation by using models, and close to linear scale-out scalability. ModelarDB is being commercialized by the spin-out company ModelarData.
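As an illustration of the idea of storing model coefficients instead of data values, the following is a minimal sketch of error-bounded compression with a constant model per segment. It is not ModelarDB's implementation: the function names, the greedy segmentation, and the choice of the midrange as the single coefficient are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical helper names; this is an illustrative sketch, not ModelarDB code.
Segment = Tuple[int, float]  # (number of points covered, constant coefficient)


def compress_constant(values: List[float], error_bound: float) -> List[Segment]:
    """Greedily split `values` into segments that one constant can represent.

    A segment is extended while half of its value range stays within
    `error_bound`, so the midrange is at most `error_bound` away from
    every value in the segment.
    """
    segments: List[Segment] = []
    if not values:
        return segments
    lo = hi = values[0]
    length = 1
    for v in values[1:]:
        new_lo, new_hi = min(lo, v), max(hi, v)
        if (new_hi - new_lo) / 2 <= error_bound:
            # The constant model still fits: extend the current segment.
            lo, hi = new_lo, new_hi
            length += 1
        else:
            # Emit the finished segment and start a new one at v.
            segments.append((length, (lo + hi) / 2))
            lo = hi = v
            length = 1
    segments.append((length, (lo + hi) / 2))
    return segments


def decompress_constant(segments: List[Segment]) -> List[float]:
    """Reconstruct an approximation of the original values."""
    out: List[float] = []
    for length, coefficient in segments:
        out.extend([coefficient] * length)
    return out


def segment_sum(segments: List[Segment]) -> float:
    """Approximate SUM computed from the models alone, without decompressing."""
    return sum(length * coefficient for length, coefficient in segments)


values = [10.0, 10.2, 9.9, 10.1, 15.0, 15.1, 14.9]
segments = compress_constant(values, error_bound=0.5)
print(len(values), "values stored as", len(segments), "segments")
```

In this sketch, an aggregate such as a sum touches one coefficient per segment rather than every data point, which is the intuition behind aggregating on models.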

DROPS Dagstuhl Research Online Publication Server

VBN (Videnbasen) Aalborg Universitets forskningsportal

A foundation for spatio-textual-temporal cube analytics

Author: Matteo Lissandrini
Mohsin Iqbal
Torben Bach Pedersen
Publication date: 01/01/2021

Large amounts of spatial, textual, and temporal (STT) data are being produced daily. This is data containing an unstructured component (text), a spatial component (geographic position), and a time component (timestamp). Therefore, there is a need for a powerful and general way of analyzing STT data together. In this paper, we define and formalize the Spatio-Textual-Temporal Cube (STTCube) structure to enable combined effective and efficient analytical queries over STT data. Our novel data model over STT objects enables novel joint and integrated STT insights that are hard to obtain using existing methods. Moreover, we introduce the new concept of STT measures with associated novel STTOLAP operators. To allow for efficient large-scale analytics, we present a pre-aggregation framework for exact and approximate computation of STT measures. Our comprehensive experimental evaluation on a real-world Twitter dataset confirms that our proposed methods reduce query response time by 1-5 orders of magnitude compared to the No Materialization baseline and decrease storage cost between 97% and 99.9% compared to the Full Materialization baseline while adding only a negligible overhead in the STTCube construction time. Moreover, approximate computation achieves an accuracy between 90% and 100% while reducing query response time by 3-5 orders of magnitude compared to No Materialization.

Catalogo dei prodotti della ricerca Università degli Studi di Verona

VBN (Videnbasen) Aalborg Universitets forskningsportal

Example-Driven Exploratory Analytics over Knowledge Graphs

Author: Matteo Lissandrini
Katja Hose
Torben Bach Pedersen
Publication date: 01/01/2023

Due to their expressive power, Knowledge Graphs (KGs) have received increasing interest not only as a means to structure and integrate heterogeneous information but also as a native storage format for large amounts of knowledge and statistical data. Therefore, analytical queries over KG data, typically stored as RDF, have become increasingly important. Yet, formulating such queries is a difficult task for users who are not familiar with the query language (typically SPARQL) and the structure of the dataset at hand. To overcome this limitation, we propose Re2xOLAP, the first comprehensive interactive approach that allows users to reverse-engineer and refine RDF exploratory OLAP queries over KGs containing statistical data. Thus, Re2xOLAP enables KG exploratory analytics without requiring the user to write any query at all. We achieve this goal by first reverse-engineering analytical SPARQL queries from a small set of user-provided examples and then, given the reverse-engineered query, proposing intuitive and explainable exploratory query refinements to iteratively help the user obtain the desired information. Our experiments on real-world large-scale KGs show that Re2xOLAP can efficiently reverse-engineer analytical SPARQL queries solely based on a small set of input examples. Additionally, we demonstrate the expressive power of our interactive refinement methods by showing that Re2xOLAP allows users to navigate hundreds of thousands of different exploration paths with just a few interactions.

Catalogo dei prodotti della ricerca Università degli Studi di Verona

VBN (Videnbasen) Aalborg Universitets forskningsportal

Towards Exploratory OLAP over Linked Open Data: a Case Study

Author: Dilshod Ibragimov
Katja Hose
Esteban Zimányi
Torben Bach Pedersen
Publication date: 01/01/2015

Business Intelligence (BI) tools provide fundamental support for analyzing large volumes of information. Data Warehouses (DW) and Online Analytical Processing (OLAP) tools are used to store and analyze data. Nowadays more and more information is available on the Web in the form of Resource Description Framework (RDF), and BI tools have a huge potential of achieving better results by integrating real-time data from web sources into the analysis process. In this paper, we describe a framework for so-called exploratory OLAP over RDF sources. We propose a system that uses a multidimensional schema of the OLAP cube expressed in RDF vocabularies. Based on this information the system is able to query data sources, extract and aggregate data, and build a cube. We also propose a computer-aided process for discovering previously unknown data sources and building a multidimensional schema of the cube. We present a use case to demonstrate the applicability of the approach.

Crossref

VBN (Videnbasen) Aalborg Universitets forskningsportal

DI-fusion

Efficient Temporal Pattern Mining in Big Time Series Using Mutual Information

Author: Torben Bach Pedersen
Nguyen Thi Thao Ho
Van Long Ho
Publication date: 01/01/2022

Very large time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in different environments. Significant insights can be gained by mining temporal patterns from these time series. Unlike traditional pattern mining, temporal pattern mining (TPM) adds event time intervals into extracted patterns, making them more expressive at the expense of increased time and space complexities. Existing TPM methods either cannot scale to large datasets, or work only on pre-processed temporal events rather than on time series. This paper presents our Frequent Temporal Pattern Mining from Time Series (FTPMfTS) approach providing: (1) The end-to-end FTPMfTS process taking time series as input and producing frequent temporal patterns as output. (2) The efficient Hierarchical Temporal Pattern Graph Mining (HTPGM) algorithm that uses efficient data structures for fast support and confidence computation, and employs effective pruning techniques for significantly faster mining. (3) An approximate version of HTPGM that uses mutual information, a measure of data correlation, to prune unpromising time series from the search space. (4) An extensive experimental evaluation showing that HTPGM outperforms the baselines in runtime and memory consumption, and can scale to big datasets. The approximate HTPGM is up to two orders of magnitude faster and less memory consuming than the baselines, while retaining high accuracy.
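To make the mutual-information pruning idea concrete, here is a minimal sketch that computes the mutual information (in bits) between two already-discretized series and keeps only sufficiently correlated pairs. It is not the HTPGM algorithm: the function names, the symbolic input, and the threshold value are assumptions for this example, and the abstract does not say which normalization or threshold the authors use.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, Hashable, List, Sequence, Tuple


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Empirical mutual information I(X; Y) in bits of two symbol sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    if n == 0:
        return 0.0
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi


def correlated_pairs(
    series: Dict[str, Sequence[Hashable]], threshold: float
) -> List[Tuple[str, str]]:
    """Keep only series pairs whose mutual information reaches `threshold`.

    The threshold is a hypothetical tuning parameter for this sketch.
    """
    return [
        (name_a, name_b)
        for name_a, name_b in combinations(sorted(series), 2)
        if mutual_information(series[name_a], series[name_b]) >= threshold
    ]


# Hypothetical symbolic series (e.g. sensor states after discretization).
series = {
    "heater": ["On", "On", "Off", "Off"],
    "fan": ["On", "On", "Off", "Off"],
    "door": ["Open", "Closed", "Open", "Closed"],
}
print(correlated_pairs(series, threshold=0.5))
```

Pairs that fall below the threshold would simply not be considered when searching for patterns, which shrinks the search space at the cost of possibly missing some patterns.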

Archivio istituzionale della ricerca - Politecnico di Milano

VBN (Videnbasen) Aalborg Universitets forskningsportal

A design space for RDF data representations

Author: Matteo Lissandrini
Katja Hose
Tomer Sagi
Torben Bach Pedersen
Publication date: 01/01/2022

RDF triplestores’ ability to store and query knowledge bases augmented with semantic annotations has attracted the attention of both research and industry. A multitude of systems offer varying data representation and indexing schemes. However, as recently shown for designing data structures, many design choices are biased by outdated considerations and may not result in the most efficient data representation for a given query workload. To overcome this limitation, we identify a novel three-dimensional design space. Within this design space, we map the trade-offs between different RDF data representations employed as part of an RDF triplestore and identify unexplored solutions. We complement the review with an empirical evaluation of ten standard SPARQL benchmarks to examine the prevalence of these access patterns in synthetic and real query workloads. We find some access patterns to be both prevalent in the workloads and under-supported by existing triplestores. This shows the capabilities of our model to be used by RDF store designers to reason about different design choices and allow a (possibly artificially intelligent) designer to evaluate the fit between a given system design and a query workload.

Catalogo dei prodotti della ricerca Università degli Studi di Verona

VBN (Videnbasen) Aalborg Universitets forskningsportal

Optimizing EV flexibility for spot and mFRR market participation

Author: Pedersen Raymond Asoklis Kronborg
Jensen Andreas Ravnholt
Smedt Mikkel Müller
Publication date: 01/01/2025

VBN (Videnbasen) Aalborg Universitets forskningsportal

Authors: Dennis Pedersen

Author: Dennis Pedersen
Karsten Riis
Torben Bach Pedersen
Publication date: 01/01/2002

The changing data requirements of today's dynamic business environments are not handled well by current On-Line Analytical Processing (OLAP) systems. Physically integrating unexpected data into such systems is a long and time-consuming process, making logical integration, i.e., federation, the better choice in many situations. The increasing use of Extensible Markup Language (XML), e.g., in business-to-business (B2B) applications, suggests that the required data will often be available as XML data. This means that logical federations of OLAP and XML databases will be very attractive in many cases. However, for such OLAP-XML federations to be useful, effective optimization techniques for such systems are needed.

CiteSeerX

MAKER: Model- And chunK-based approach for Error-bound Regulation

Author: Harrington Teis Vognstoft
Bech Emil Laurits
Agneborn Frederik
Publication date: 01/01/2023

VBN (Videnbasen) Aalborg Universitets forskningsportal

STREAM: System for Trajectory Reference Encoding And Modeling.

Author: Vilslev Daniel
Lykkegaard Martin Opal
Publication date: 01/01/2023

VBN (Videnbasen) Aalborg Universitets forskningsportal