Anomaly Detection Through Unsupervised Federated Learning

Authors: Passarella Andrea; Valerio Lorenzo; Nardi Mirko
Publication date: 01/01/2023

Federated learning (FL) is proving to be one of the most promising paradigms for leveraging distributed resources, enabling a set of clients to collaboratively train a machine learning model while keeping the data decentralized. The explosive growth of interest in the topic has led to rapid advancements in several core aspects like communication efficiency, handling non-IID data, privacy, and security capabilities. However, the majority of FL works only deal with supervised tasks, assuming that clients' training sets are labeled. To leverage the enormous amount of unlabeled data on distributed edge devices, in this paper, we aim to extend the FL paradigm to unsupervised tasks by addressing the problem of anomaly detection (AD) in decentralized settings. In particular, we propose a novel method in which, through a preprocessing phase, clients are grouped into communities, each having similar majority (i.e., inlier) patterns. Subsequently, each community of clients trains the same anomaly detection model (i.e., autoencoders) in a federated fashion. The resulting model is then shared and used to detect anomalies within the clients of the same community that joined the corresponding federated process. Experiments show that our method is robust, and it can detect communities consistent with the ideal partitioning in which groups of clients having the same inlier patterns are known. Furthermore, the performance is significantly better than that of clients training models exclusively on local data, and comparable with that of federated models trained on the ideal community partition.
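
The per-community training step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which aggregation rule is used, so this assumes standard sample-weighted federated averaging (FedAvg); the community assignment is taken as given, and the client names, weights, and sample counts are hypothetical.

```python
from collections import defaultdict

def fedavg(updates):
    """Sample-weighted average of (weights, n_samples) pairs (assumed FedAvg rule)."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]

def aggregate_by_community(client_updates, community_of):
    """Run one aggregation step separately inside each community of clients."""
    grouped = defaultdict(list)
    for client, update in client_updates.items():
        grouped[community_of[client]].append(update)
    return {community: fedavg(group) for community, group in grouped.items()}

# Hypothetical data: flattened autoencoder weights and local sample counts.
client_updates = {
    "a": ([1.0, 2.0], 10),
    "b": ([3.0, 4.0], 30),
    "c": ([10.0, 10.0], 5),
}
# Hypothetical output of the preprocessing (community detection) phase.
community_of = {"a": 0, "b": 0, "c": 1}

models = aggregate_by_community(client_updates, community_of)
print(models)
```

Clients "a" and "b" share one model while "c" keeps its own, which is the point of grouping: clients with different inlier patterns never average their autoencoders together.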

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Service Provisioning through Opportunistic Computing in Mobile Clouds

Authors: Mascitti Davide; Conti Marco; Passarella Andrea; Ricci Laura
Publication date: 01/01/2014

Mobile clouds are a new paradigm enabling mobile users to access the heterogeneous services present in a pervasive mobile environment together with the rich service offers of the cloud infrastructures. In mobile computing environments, mobile devices can also act as service providers, using approaches conceptually similar to service-oriented models. Many approaches implement service provisioning between mobile devices with the intervention of cloud-based handlers, with mobility playing a disruptive role in the functionality offered by the system. In our approach, we exploit the opportunistic computing model, whereby mobile devices exploit direct contacts to provide services to each other, without necessarily going through conventional cloud services residing in the Internet. Conventional cloud services are therefore complemented by a mobile cloud formed directly by the mobile devices. This paper exploits an algorithm for service selection and composition in this type of mobile cloud environment that is able to estimate the execution time of a service composition. The model enables the system to produce an estimate of the execution time of the alternative compositions that can be exploited to solve a user's request and then choose the best one among them. We compare the performance of our algorithm with alternative strategies, showing its superior performance from a number of standpoints. In particular, we show how our algorithm can manage a higher load of requests without causing instability in the system, unlike the other strategies. When the load of requests is manageable for all strategies, our algorithm can reduce the average time spent solving requests by up to 75%.
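
The selection step in the abstract (estimate the execution time of each alternative composition, then pick the best) can be sketched as follows. The abstract does not give the estimation model, so the additive "wait for a contact, then execute" cost per step is an assumption, and all names and numbers are hypothetical.

```python
def estimated_time(composition):
    """Sum of per-step contact-wait and execution estimates (assumed, simplified model)."""
    return sum(step["wait"] + step["exec"] for step in composition)

def pick_composition(alternatives):
    """Return the name of the alternative with the lowest estimated completion time."""
    return min(alternatives, key=lambda name: estimated_time(alternatives[name]))

# Hypothetical alternatives for one user request (times in seconds).
alternatives = {
    # One provider offers the whole service, but contacts with it are rare.
    "direct": [{"wait": 12.0, "exec": 3.0}],
    # Two providers met more often, each running one component.
    "two_hop": [{"wait": 2.0, "exec": 4.0}, {"wait": 3.0, "exec": 2.0}],
}

best = pick_composition(alternatives)
print(best, estimated_time(alternatives[best]))
```

In this toy case the longer composition wins because its providers are encountered sooner, which is the kind of trade-off an execution-time estimate makes visible.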

Elsevier - Publisher Connector

Crossref

Archivio della Ricerca - Università di Pisa

Exploring the Impact of Disrupted Peer-to-Peer Communications on Fully Decentralized Learning in Disaster Scenarios

Authors: Conti Marco; Passarella Andrea; Boldrini Chiara; Palmieri Luigi; Valerio Lorenzo
Publication date: 01/01/2023

Fully decentralized learning enables the distribution of learning resources and decision-making capabilities across multiple user devices or nodes, and is rapidly gaining popularity due to its privacy-preserving and decentralized nature. Importantly, this crowdsourcing of the learning process allows the system to continue functioning even if some nodes are affected or disconnected. In a disaster scenario, communication infrastructure and centralized systems may be disrupted or completely unavailable, hindering the possibility of carrying out standard centralized learning tasks in these settings. Thus, fully decentralized learning can help in this case. However, transitioning from centralized to peer-to-peer communications introduces a dependency between the learning process and the topology of the communication graph among nodes. In a disaster scenario, even peer-to-peer communications are susceptible to abrupt changes, such as devices running out of battery or getting disconnected from others due to their position. In this study, we investigate the effects of various disruptions to peer-to-peer communications on decentralized learning in a disaster setting. We examine the resilience of a decentralized learning process when a subset of devices drop from the process abruptly. To this end, we analyze the difference between losing devices holding data, i.e., potential knowledge, vs. devices contributing only to the graph connectivity, i.e., with no data. Our findings on a Barabasi-Albert graph topology, where training data is distributed across nodes in an IID fashion, indicate that the accuracy of the learning process is more affected by a loss of connectivity than by a loss of data. Nevertheless, the network remains relatively robust, and the learning process can achieve a good level of accuracy
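
To make the connectivity side of this comparison concrete, here is a minimal sketch that grows a Barabasi-Albert graph by preferential attachment, drops a subset of nodes, and measures the largest surviving connected component. It covers only the graph, not the learning process; the graph size, the number of dropped nodes, and the two removal strategies are illustrative assumptions, not the paper's experimental setup.

```python
import random

def barabasi_albert(n, m, rng):
    """Grow a graph by preferential attachment: each new node links to m existing nodes."""
    adj = {i: set() for i in range(n)}
    targets = list(range(m))   # the first new node connects to the m seed nodes
    repeated = []              # node list with multiplicity equal to current degree
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        repeated.extend(targets)
        repeated.extend([new] * m)
        chosen = set()
        while len(chosen) < m:  # sample m distinct nodes, proportionally to degree
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return adj

def largest_component(adj, removed=frozenset()):
    """Size of the largest connected component after deleting the nodes in `removed`."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

rng = random.Random(0)
graph = barabasi_albert(50, 2, rng)
hubs = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:5]
random_nodes = rng.sample(sorted(graph), 5)

print("intact:", largest_component(graph))
print("after losing 5 hubs:", largest_component(graph, set(hubs)))
print("after losing 5 random nodes:", largest_component(graph, set(random_nodes)))
```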

Archivio della Ricerca - Università di Pisa

On the Joint Effect of Culture and Discussion Topics on X (Twitter) Signed Ego Networks

Authors: Tacchi Jack; Conti Marco; Passarella Andrea; Boldrini Chiara
Publication date: 01/01/2024

Humans are known to structure social relationships according to certain patterns, such as the Ego Network Model (ENM). These patterns result from our innate cognitive limits and can therefore be observed in the vast majority of large human social groups. Until recently, the main focus of research was the structural characteristics of this model. The main aim of this paper is to complement previous findings with systematic and data-driven analyses on the positive and negative sentiments of social relationships, across different cultures, communities and topics of discussion. A total of 26 datasets were collected for this work. It was found that, contrary to previous findings, the influence of culture is not easily "overwhelmed" by that of the topic of discussion. However, more specific and polarising topics do lead to noticeable increases in negativity across all cultures. These negativities also appear to be stable across the different levels of the ENM, which contradicts previous hypotheses. Finally, the number of generic topics being discussed between users seems to be a good predictor of the overall positivity of their relationships.

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Online Social Networks

Authors: Sala Alessandra; Passarella Andrea; Strufe Thorsten; Fu Xiaoming; Quercia Daniele
Publication date: 01/01/2016

This special issue of Computer Communications is devoted to Online Social Networks (OSN). OSN are one of the most disruptive communication platforms of the last 15 years. Nowadays, most people use OSN regularly, as a normal facet of their daily lives. Moreover, the widespread diffusion of mobile personal devices (smartphones and tablets) is boosting the use of OSN services in mobility. The statistics about these facts are just impressive. For example, as of now Facebook reached 1.49 billion monthly active users and 1.31 billion mobile monthly active users, while Twitter reached 316 million active users, with 80% of the users active on mobile devices. Beyond accessing OSN services from mobile, the penetration of OSN use through personal mobile devices is having very significant impacts also on the type of OSN services and, more in general, on people's behaviour, as we discuss in the following. In fact, OSN have been indicated as one of the hot topics for Computer Communication

Crossref

GRO.publications

GRO.publications (Univ. Göttingen)

PUblication MAnagement

Impact of network topology on the performance of Decentralized Federated Learning

Authors: Conti Marco; Passarella Andrea; Boldrini Chiara; Palmieri Luigi; Valerio Lorenzo
Publication date: 01/01/2024

Fully decentralized learning is gaining momentum for training AI models at the Internet’s edge, addressing infrastructure challenges and privacy concerns. In a decentralized machine learning system, data is distributed across multiple nodes, with each node training a local model based on its respective dataset. The local models are then shared and combined to form a global model capable of making accurate predictions on new data. Our exploration focuses on how different types of network structures influence the spreading of knowledge – the process by which nodes incorporate insights gained from learning patterns in data available on other nodes across the network. Specifically, this study investigates the intricate interplay between network structure and learning performance using three network topologies and six data distribution methods. These methods consider different vertex properties, including degree centrality, betweenness centrality, and clustering coefficient, along with whether nodes exhibit high or low values of these metrics. Our findings underscore the significance of global centrality metrics (degree, betweenness) in correlating with learning performance, while local clustering proves less predictive. We highlight the challenges in transferring knowledge from peripheral to central nodes, attributed to a dilution effect during model aggregation. Additionally, we observe that central nodes exert a pull effect, facilitating the spread of knowledge. In examining degree distribution, hubs in Barabási–Albert networks positively impact learning for central nodes but exacerbate dilution when knowledge originates from peripheral nodes. Finally, we demonstrate the formidable challenge of knowledge circulation outside of segregated communities, and discuss the impact of class cross-correlations
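
The dilution and pull effects mentioned in the abstract can be illustrated with a toy calculation. The abstract does not specify the aggregation rule, so this sketch assumes each node takes an unweighted average of its own model and its neighbours' models, and it uses a single scalar as a stand-in for a model; both are simplifying assumptions.

```python
def aggregate(own, neighbour_models):
    """Unweighted average of a node's model with its neighbours' models (assumed rule)."""
    models = [own] + neighbour_models
    return sum(models) / len(models)

# A scalar stands in for a model; 1.0 marks "knowledge" held by exactly one node.

# Hub with 9 neighbours, only one of which (a peripheral node) holds the knowledge.
hub_view = aggregate(0.0, [1.0] + [0.0] * 8)

# Leaf whose single neighbour is a hub that holds the knowledge.
leaf_view = aggregate(0.0, [1.0])

print(hub_view, leaf_view)
```

Under this rule a neighbour's contribution is weighted by 1/(degree + 1), so knowledge arriving at a high-degree node from one peripheral neighbour is diluted, while a low-degree node absorbs a large share of whatever its central neighbour holds.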

Archivio della Ricerca - Università di Pisa

Extending OpenStack Monasca for Predictive Elasticity Control

Authors: Bacciu Davide; Passarella Andrea; Lanciano Giacomo; Galli Filippo; Cucinotta Tommaso
Publication date: 01/01/2024

Traditional auto-scaling approaches are conceived as reactive automations, typically triggered when predefined thresholds are breached by resource consumption metrics. Managing such rules at scale is cumbersome, especially when resources require non-negligible time to be instantiated. This paper introduces an architecture for predictive cloud operations, which enables orchestrators to apply time-series forecasting techniques to estimate the evolution of relevant metrics and take decisions based on the predicted state of the system. In this way, they can anticipate load peaks and trigger appropriate scaling actions in advance, such that new resources are available when needed. The proposed architecture is implemented in OpenStack, extending the monitoring capabilities of Monasca by injecting short-term forecasts of standard metrics. We use our architecture to implement predictive scaling policies leveraging linear regression, autoregressive integrated moving average, feed-forward, and recurrent neural networks (RNN). Then, we evaluate their performance on a synthetic workload, comparing them to those of a traditional policy. To assess the ability of the different models to generalize to unseen patterns, we also evaluate them on traces from a real content delivery network (CDN) workload. In particular, the RNN model exhibits the best overall performance in terms of prediction error, observed client-side response latency, and forecasting overhead. The implementation of our architecture is open-source.
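
The difference between a reactive threshold rule and a predictive one can be shown with the simplest of the forecasters listed in the abstract, linear regression. This is a standalone sketch: it does not use any OpenStack or Monasca API, and the metric values, forecast horizon, and threshold are hypothetical.

```python
def linear_forecast(samples, horizon):
    """Fit y = a + b*t by ordinary least squares over t = 0..n-1, then extrapolate.

    Requires at least two samples; `horizon` is the number of steps past the last one.
    """
    n = len(samples)
    t_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(samples))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    intercept = y_mean - slope * t_mean
    return intercept + slope * (n - 1 + horizon)

def should_scale_out(samples, horizon, threshold):
    """Predictive rule: act when the forecast, not the latest sample, breaches the threshold."""
    return linear_forecast(samples, horizon) > threshold

# Hypothetical CPU utilisation samples (%), one per monitoring interval.
cpu = [40.0, 45.0, 50.0, 55.0, 60.0]

reactive = cpu[-1] > 70.0                                      # latest sample only
predictive = should_scale_out(cpu, horizon=3, threshold=70.0)  # forecast 3 steps ahead
print(linear_forecast(cpu, 3), reactive, predictive)
```

With the load still at 60%, the reactive rule stays idle while the predictive rule already triggers, leaving three intervals for the new resources to be instantiated.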

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Extracting Active "Ego Networks" of Words: Methodology, Robustness, and Cross-Domain Validation

Authors: Ollivier Kilian; Conti Marco; Passarella Andrea; Boldrini Chiara
Publication date: 01/01/2023

The "ego network of words" model captures structural properties in language production associated with cognitive constraints. While previous research focused on the layer-based structure and its semantic properties, this paper argues that an essential element, the concept of an active network, is missing. Drawing inspiration from social ego networks, where the active part includes relationships regularly nurtured by individuals, we establish the notion of an active ego network of words. We demonstrate that without the active network concept, an ego network becomes vulnerable to the amount of data considered, leading to the disappearance of the layered structure in larger datasets. To address this, we define a methodology for extracting the active part of the ego network of words and validate it using interview transcripts and tweets. The robustness of our method to varying input data sizes and temporal stability is demonstrated. In addition, our results are well-aligned with prior analyses of the ego network of words, where the limitation of the data collected led automatically (and implicitly) to approximately consider the active part of the network only. Moreover, the validation on the transcripts dataset (MediaSum) highlights the generalizability of the model across diverse domains and the ingrained cognitive constraints in language usage

Archivio istituzionale della Ricerca - Scuola Normale Superiore

The effect of network topologies on fully decentralized learning: a preliminary investigation

Authors: Passarella Andrea; Boldrini Chiara; Palmieri Luigi; Valerio Lorenzo
Publication date: 01/01/2023

In a decentralized machine learning system, data is typically partitioned among multiple devices or nodes, each of which trains a local model using its own data. These local models are then shared and combined to create a global model that can make accurate predictions on new data. In this paper, we start exploring the role of the network topology connecting nodes on the performance of a Machine Learning model trained through direct collaboration between nodes. We investigate how different types of topologies impact the "spreading of knowledge", i.e., the ability of nodes to incorporate in their local model the knowledge derived by learning patterns in data available in other nodes across the networks. Specifically, we highlight the different roles in this process of more or less connected nodes (hubs and leaves), as well as that of macroscopic network properties (primarily, degree distribution and modularity). Among others, we show that, while it is known that even weak connectivity among network components is sufficient for information spread, it may not be sufficient for knowledge spread. More intuitively, we also find that hubs have a more significant role than leaves in spreading knowledge, although this manifests itself not only for heavy-tailed distributions but also when "hubs" have only moderately more connections than leaves. Finally, we show that tightly knit communities severely hinder knowledge spread

Archivio della Ricerca - Università di Pisa

Communication Costs Analysis of Unsupervised Federated Learning: an Anomaly Detection Scenario

Authors: Passarella Andrea; Valerio Lorenzo; Nardi Mirko
Publication date: 01/01/2023

The rapid growth of distributed data across edge devices has prompted the development of decentralized machine learning techniques, such as Federated Learning (FL), to address privacy and data transfer concerns. Only a few recent works have focused on unsupervised FL approaches compared to their supervised counterparts, with the consequence that many aspects of these solutions, e.g., the communication cost, have not been thoroughly investigated. In this paper, we analyse the communication cost associated with unsupervised federated anomaly detection, focusing on a proposed method where clients are grouped into communities based on inlier patterns and subsequently train autoencoder models in a federated fashion. Our analysis quantifies the communication overhead introduced by the federated learning process and compares it to traditional centralized approaches for anomaly detection. We also explore potential trade-offs between communication cost, privacy, and model performance. Our findings reveal that the unsupervised federated approach can achieve a significant reduction in communication cost (up to 83.33%) with comparable performance, by selecting better-suited models. Furthermore, the adjustments we implement render the methodology independent of dataset size, offering notable privacy benefits and competitive accuracy performance, making it highly effective in industrial scenarios with large local datasets and a moderate number of clients
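
A back-of-the-envelope cost model shows why choosing a smaller model reduces federated communication and why the cost need not depend on local dataset size. The model below is an assumption (every client downloads and uploads the full model once per round, 4 bytes per parameter), and the client, round, and parameter counts are hypothetical rather than taken from the paper.

```python
def fl_cost_bytes(n_clients, n_rounds, n_params, bytes_per_param=4):
    """Total traffic when every client downloads and uploads the full model each round."""
    return 2 * n_clients * n_rounds * n_params * bytes_per_param

def reduction(baseline, candidate):
    """Fraction of the baseline traffic saved by the candidate configuration."""
    return 1 - candidate / baseline

# Hypothetical setups: same clients and rounds, two autoencoder sizes.
large = fl_cost_bytes(n_clients=10, n_rounds=50, n_params=200_000)
small = fl_cost_bytes(n_clients=10, n_rounds=50, n_params=50_000)

saving = reduction(large, small)
print(large, small, saving)
```

No term in this model involves the number of local samples: only model weights cross the network, so traffic scales with clients, rounds, and model size.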

Archivio istituzionale della Ricerca - Scuola Normale Superiore