Distributed and stream data mining algorithms for frequent pattern discovery

Author: Silvestri Claudio
Publication date: 21/03/2006

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Multi-Line Customized Bus Planner for On-Demand Origin-Destination Travel Requests

Author: Silvestri Claudio
Giacomo Chiarot
Publication date: 01/01/2022

Replacing private transport in large cities with public and shared alternatives is increasingly relevant for reducing rush-hour congestion and air pollution. Activating customized bus services is one possible strategy toward that goal, and automatic bus route design is needed when the number of pickup addresses to manage is large. The approaches described in the literature are not suitable for real-world applications because they tend to generate many more lines than necessary, or they do not work if addresses do not form clusters. In this paper, we propose a novel bus line generation approach suitable for any address database

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Time series compression: a survey

Author: Chiarot Giacomo
Claudio Silvestri
Publication date: 20/09/2022

Smart objects are increasingly widespread, and their ecosystem, also known as the Internet of Things, is relevant in many different application scenarios. The huge amount of temporally annotated data produced by these smart devices demands efficient techniques for the transfer and storage of time series data. Compression techniques play an important role toward this goal. Although standard compression methods can be used with some benefit, several methods specifically address time series, exploiting their peculiarities to achieve more effective compression and, for lossy techniques, more accurate decompression. This paper provides a state-of-the-art survey of the principal time series compression techniques, proposing a taxonomy that classifies them by their overall approach and their characteristics. Furthermore, we analyze the performance of the selected algorithms by discussing and comparing the experimental results that were provided in the original articles. The goal of this paper is to provide a comprehensive and homogeneous reconstruction of the state of the art, which is currently fragmented across many papers that use different notations and in which the proposed methods are not organized according to a classification. Comment: 33 pages, author versio

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Privacy-preserving techniques for location-based services

Author: Damiani Maria L.
Bertino Elisa
Silvestri Claudio
Publication date: 01/01/2008

This paper outlines solutions to the problem of location privacy in mobile distributed applications. It then summarizes a novel approach that takes into account the personal privacy preferences of individuals. The approach is highly efficient, and experiments have shown that it can be deployed on small devices

ARCA (Univ. Ca'Foscari)

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Distributed approximate mining of frequent patterns

Author: SILVESTRI Claudio
ORLANDO Salvatore
Publication date: 01/01/2005

ARCA (Univ. Ca'Foscari)

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Approximate Mining of Frequent Patterns on Streams

Author: SILVESTRI Claudio
ORLANDO Salvatore
Publication date: 01/01/2007

Many critical applications, such as intrusion detection or stock market analysis, require a nearly immediate result based on a continuous and infinite stream of data. In most cases, finding an exact solution is not compatible with limited resources and real-time constraints, but an approximation of the exact result is enough for most purposes. This paper introduces a new algorithm for approximate mining of frequent itemsets from streams of transactions using a limited amount of memory. The proposed algorithm is based on the computation of frequent itemsets in recent data and an effective method for inferring the global support of previously infrequent itemsets. Both upper and lower bounds on the support of each pattern found are returned along with the interpolated support. An extensive experimental evaluation shows that APstream, the proposed algorithm, yields a good approximation of the exact global result considering both the set of patterns found and their supports
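
The idea of returning lower and upper bounds on support can be illustrated with a generic block-wise sketch. This is not APstream itself: it handles single items only, and the names `summarize_block` and `support_bounds` are hypothetical. It assumes the stream is cut into blocks, that only locally frequent items are kept per block, and that an item dropped from a block occurred there at most `threshold - 1` times.

```python
import math
from collections import Counter


def summarize_block(block, min_support):
    """Keep only items that are frequent inside this block.

    min_support is a fraction of the block size (an assumption of this sketch).
    """
    threshold = math.ceil(min_support * len(block))
    counts = Counter(item for transaction in block for item in set(transaction))
    kept = {item: c for item, c in counts.items() if c >= threshold}
    return kept, threshold


def support_bounds(summaries):
    """Combine per-block summaries into (lower, upper) global support bounds."""
    items = set()
    for kept, _ in summaries:
        items.update(kept)
    bounds = {}
    for item in items:
        lower = upper = 0
        for kept, threshold in summaries:
            if item in kept:
                # Exact count is known for this block.
                lower += kept[item]
                upper += kept[item]
            else:
                # Count was discarded: it lies between 0 and threshold - 1.
                upper += max(threshold - 1, 0)
        bounds[item] = (lower, upper)
    return bounds


blocks = [
    [{"a", "b"}, {"a"}, {"a", "c"}, {"b"}],
    [{"c"}, {"a", "c"}, {"c"}, {"b"}],
]
summaries = [summarize_block(b, min_support=0.5) for b in blocks]
bounds = support_bounds(summaries)
```

In this toy stream the true support of `a` is 4, which falls inside its computed interval `(3, 4)`. Memory use depends on the number of locally frequent items per block rather than on the number of transactions.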

ARCA (Univ. Ca'Foscari)

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

CCSM: an Efficient Algorithm for Constrained Sequence Mining

Author: R. Perego
ORLANDO Salvatore
SILVESTRI Claudio
Publication date: 01/01/2003

ARCA (Univ. Ca'Foscari)

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

gpuDCI: Exploiting GPUs in Frequent Itemset Mining

Author: Claudio Silvestri
Salvatore Orlando
Publication date: 01/01/2012

Frequent itemset mining (FIM) algorithms extract subsets of items that occur frequently in a collection of sets. FIM is a key analysis in several data mining applications, and FIM tools are among the most computationally intensive data mining tools. In this work we present a many-core parallel version of a state-of-the-art FIM algorithm, DCI, whose sequential version performed better, on most of the tested datasets, than FP-Growth, one of the most efficient algorithms for FIM. We propose two parallelization strategies for Graphics Processing Units (GPUs), suited to different levels of resource availability, and we present the results of several experiments conducted on real-world and synthetic datasets. © 2012 IEEE
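
To make the FIM problem statement concrete, here is a minimal brute-force sketch. It is neither DCI nor FP-Growth and involves no GPU code: it simply enumerates every itemset up to a small size and counts how many transactions contain it. The function name `frequent_itemsets` and the `max_size` cap are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: count} for itemsets found in at least min_support transactions.

    Brute force: enumerates all subsets up to max_size items per transaction,
    so it is only practical for tiny inputs.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
result = frequent_itemsets(transactions, min_support=2)
```

With `min_support=2`, every single item and every pair is frequent here, while the triple `{bread, butter, milk}` appears in only one transaction and is dropped. The exponential number of candidate subsets is why practical FIM algorithms prune the search space instead of enumerating it.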

ARCA (Univ. Ca'Foscari)

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Spatio-Temporal Aggregations in Trajectory Data Warehouses

Author: RAFFAETA' Alessandra
ORSINI Renzo
RONCATO Alessandro
SILVESTRI Claudio
ORLANDO Salvatore
Publication date: 01/01/2007

In this paper we investigate some issues related to the design of a simple Data Warehouse (DW) that stores several aggregate measures about trajectories of moving objects. First we discuss the loading phase of our DW, which has to deal with overwhelming streams of trajectory observations, possibly produced at different rates and arriving in an unpredictable and unbounded way. Then we focus on presence, the most complex measure stored in our DW. This measure returns the number of trajectories that lie in a spatial region during a given temporal interval. We devise a novel way to compute an approximate, but very accurate, presence aggregate function, which algebraically combines a bounded number of measures stored in the base cells of the data cube
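
The presence measure is a distinct count, which is why it cannot be aggregated by simply summing base cells. The sketch below computes presence exactly from raw observations and shows the overcount that a plain sum produces. It does not reproduce the approximation proposed in the paper. The function name `presence`, the rectangular regions, and the sampled-point data model are assumptions.

```python
def presence(observations, region, interval):
    """Count distinct trajectories with at least one observation in the
    rectangle region = (x0, y0, x1, y1) during interval = (t0, t1).

    Bounds are half-open; movement between sampled points is ignored.
    """
    x0, y0, x1, y1 = region
    t0, t1 = interval
    ids = {
        tid
        for tid, x, y, t in observations
        if x0 <= x < x1 and y0 <= y < y1 and t0 <= t < t1
    }
    return len(ids)


# (trajectory id, x, y, timestamp)
observations = [
    ("a", 0.5, 0.5, 1),
    ("a", 1.5, 0.5, 2),  # trajectory "a" crosses into the adjacent cell
    ("b", 1.5, 0.5, 2),
    ("c", 5.0, 5.0, 3),  # outside both cells
]

left = presence(observations, (0, 0, 1, 1), (0, 10))
right = presence(observations, (1, 0, 2, 1), (0, 10))
merged = presence(observations, (0, 0, 2, 1), (0, 10))
```

Trajectory `a` is counted in both cells, so `left + right` exceeds the presence of the merged region. Correcting for such double counting without storing every trajectory identifier is the kind of problem an approximate aggregate function has to address.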

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

RAT-CC: A Recurrent Autoencoder for Time-Series Compression and Classification

Author: Chiarot Giacomo
Silvestri Claudio
Ochoa Idoia
Vascon Sebastiano
Publication date: 01/01/2025

The growth of interconnected devices has led to an enormous volume of temporal data that requires specialized compression models for efficient storage. In addition, most applications need to classify these data efficiently, and reconstructing the original data from the compressed representation before classifying them is not optimal. For this reason, we propose a Recurrent Autoencoder for Time-series Compression and Classification, termed RAT-CC, that allows any classification task to be performed on the compressed representation without reconstructing the original time-series data. RAT-CC leverages a Long Short-Term Memory (LSTM) recurrent autoencoder with a dual-loss function: the standard reconstruction loss, which minimizes reconstruction error, and an embedding loss, which preserves relative distances in the compressed embedding space. This combined loss ensures that the learned embeddings remain meaningful for classification tasks while preserving the information needed for reconstruction. We assess the compression and classification performance of RAT-CC on four datasets taken from different domains. RAT-CC is implemented in Keras and freely available at (https://github.com/ChJ4m3s/RAT-CC)
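
A dual loss of this kind can be sketched in plain Python, without Keras. The abstract does not give the exact form of the embedding loss, so this sketch assumes one plausible choice: the mean squared difference between pairwise Euclidean distances in input space and in embedding space. The function names and the `weight` parameter are hypothetical and are not taken from the RAT-CC code.

```python
import math
from itertools import combinations


def reconstruction_loss(inputs, reconstructions):
    """Mean squared error between each series and its reconstruction."""
    total, n = 0.0, 0
    for x, r in zip(inputs, reconstructions):
        for xi, ri in zip(x, r):
            total += (xi - ri) ** 2
            n += 1
    return total / n


def embedding_loss(inputs, embeddings):
    """Assumed form: penalize changes in pairwise distances after encoding."""
    pairs = list(combinations(range(len(inputs)), 2))
    total = 0.0
    for i, j in pairs:
        d_in = math.dist(inputs[i], inputs[j])
        d_emb = math.dist(embeddings[i], embeddings[j])
        total += (d_in - d_emb) ** 2
    return total / len(pairs)


def combined_loss(inputs, embeddings, reconstructions, weight=1.0):
    """Reconstruction loss plus a weighted embedding loss (weight is hypothetical)."""
    return reconstruction_loss(inputs, reconstructions) + weight * embedding_loss(
        inputs, embeddings
    )


inputs = [[0.0, 0.0], [3.0, 4.0]]

# Embeddings that keep the original distance of 5 incur no penalty.
good = combined_loss(inputs, [[0.0], [5.0]], inputs)

# Collapsing both series onto one point loses the distance information.
collapsed = combined_loss(inputs, [[0.0], [0.0]], inputs)
```

Both cases use a perfect reconstruction, so the difference between `good` and `collapsed` comes entirely from the embedding term. That term is what keeps the compressed space usable for distance-based classification.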

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari