
    Andreoletti, Davide


    Measurement and control of geo-location privacy on Twitter

    The widespread diffusion of Online Social Networks and Media (OSNEM) has generated a huge amount of users’ personal data. As this data is often publicly available, users’ privacy is at risk. To address this issue, users may control the release of their sensitive data on OSNEM. An example of data that users rarely publish is their location. Besides being privacy-sensitive information, location is business-relevant data that third parties, e.g., Location-Based Service (LBS) providers, may be interested in obtaining. It is, therefore, of paramount importance to understand to what extent the secrecy of location information can be violated. In this work, we investigate how users can measure the privacy of their geo-location on OSNEM and control the factors affecting it. We define the privacy of a target user as the geographical distance between her actual unexposed location and the location estimated by an attacker. To measure privacy, we propose a novel deep learning architecture that uncovers a target user’s position based only on the publicly available locations shared by users on Twitter. Results show that locations can be accurately unveiled for the majority of users, thus suggesting the need for countermeasures to improve their privacy. To control privacy, we propose data perturbation techniques that users can apply to tune the public exposure of their location, and we show the resulting privacy improvements. To shed light on the factors influencing privacy, we then propose a machine learning model that measures privacy based on several user features (e.g., social and behavioral characteristics). Unlike the aforementioned deep learning approach, this model also allows quantifying the impact that each feature has on privacy. We observe that features related to the history of users’ visited locations are the most relevant factors affecting privacy.
Finally, we explore potential side effects resulting from the application of data perturbation strategies. In particular, we examine, as a case study, the trade-off between users’ privacy and the effectiveness of a proximity marketing LBS. Results suggest that privacy can be guaranteed while not significantly lowering the effectiveness of the LBS.
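
The privacy metric defined in this abstract, the geographical distance between a user's actual location and the location estimated by an attacker, can be illustrated with a minimal sketch. The abstract does not say which distance formula is used; the haversine great-circle distance below is an assumption, and the function names `haversine_km` and `location_privacy_km` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def location_privacy_km(actual, estimated):
    """Privacy of a target user: distance between the actual and the estimated location.

    A larger value means the attacker's estimate is further off, i.e. more privacy.
    """
    return haversine_km(actual[0], actual[1], estimated[0], estimated[1])


# Illustrative coordinates only (not data from the paper).
actual_location = (0.0, 0.0)
attacker_estimate = (0.0, 1.0)
privacy = location_privacy_km(actual_location, attacker_estimate)
```

Under this reading, a perturbation strategy would be judged by how much it increases `location_privacy_km` for the attacker's best estimate.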

    Scalable and Privacy-Preserving Inter-AS Routing Through Machine-Learning-Based Graph Pruning

    The decentralized nature of traditional inter-domain routing protocols may lead to several problems, including convergence issues and proneness to misconfiguration. In response to these problems, alternative approaches that leverage the Software Defined Networking (SDN) paradigm to increase the control over routing operations have been recently proposed. In this context, Autonomous Systems (ASs) form a multi-domain network where routing tasks are delegated to an SDN controller. To perform inter-domain routing, each controller must learn how to reach any other node outside its domain. Thus, severe privacy concerns emerge, as the controllers need to access sensitive, business-critical information (e.g., link costs) of all the domains. Recently, protocols for computing the shortest path between a source and a destination (i.e., a common policy in routing tasks) in a privacy-preserving manner have been proposed. These protocols are based on Multi-Party Computation (MPC) schemes, which guarantee privacy at the cost of high computational and communication complexity, thus limiting scalability. In this paper, we exploit machine learning (ML) techniques to prune the network graph by removing the nodes with a low likelihood of being traversed by the shortest path. Privacy-preserving shortest path algorithms are then executed on the pruned graph, at a much lower complexity. Extensive experiments performed in multiple scenarios (varying topologies and number of nodes) indicate a major reduction of computational complexity (up to 75%) and communication complexity (up to 85%), at the expense of an acceptable increase in the average path cost (by at most 16%).
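
The prune-then-route idea can be sketched as follows. This is only an illustration under stated assumptions: the per-node scores are hard-coded stand-ins for the output of the ML model, the threshold is arbitrary, and plain Dijkstra stands in for the privacy-preserving MPC shortest-path protocol, which is not reproduced here. All names (`prune_graph`, `dijkstra_cost`, `scores`) are hypothetical.

```python
import heapq


def prune_graph(graph, scores, source, target, threshold):
    """Drop nodes whose traversal likelihood is below the threshold.

    The source and target are always kept, whatever their score.
    """
    keep = {n for n in graph if scores.get(n, 0.0) >= threshold} | {source, target}
    return {
        u: {v: w for v, w in nbrs.items() if v in keep}
        for u, nbrs in graph.items()
        if u in keep
    }


def dijkstra_cost(graph, source, target):
    """Cost of the shortest path, or infinity if the target is unreachable."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


# Toy undirected topology with illustrative link costs.
graph = {
    "A": {"B": 1.0, "C": 2.0, "E": 5.0},
    "B": {"A": 1.0, "D": 1.0},
    "C": {"A": 2.0, "D": 2.0},
    "D": {"B": 1.0, "C": 2.0, "E": 5.0},
    "E": {"A": 5.0, "D": 5.0},
}
# Hypothetical likelihoods that each node lies on the A-to-D shortest path.
scores = {"A": 1.0, "B": 0.9, "C": 0.4, "D": 1.0, "E": 0.05}

pruned = prune_graph(graph, scores, "A", "D", threshold=0.5)
full_cost = dijkstra_cost(graph, "A", "D")
pruned_cost = dijkstra_cost(pruned, "A", "D")
```

In this toy case the pruned graph has three nodes instead of five and the path cost is unchanged. If the scores had wrongly ranked node B low, the pruned path would be more expensive, which corresponds to the path-cost increase that the abstract reports as the price of pruning.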

    A Privacy-Preserving Reinforcement Learning Algorithm for Multi-Domain Virtual Network Embedding

    The problem of optimally deploying a virtual network onto a substrate physical network is referred to as Virtual Network Embedding (VNE). In general, this embedding is requested by a customer from an Internet Service Provider (ISP), which performs the VNE over its physical telecom network. In several situations, the physical substrate infrastructure is composed of multiple independent ISPs. In this scenario, ISPs are concerned about exposing to a third-party entity (e.g., the customer) sensitive infrastructural details that are needed to perform an effective embedding. Following a common privacy-preserving approach, known as Limited Information Disclosure (LID), the embedding may be performed by the customer based on a limited and abstracted view of the multi-domain infrastructure that ISPs accept to expose. With this approach, embedding is sub-optimal (e.g., embedding cost is not minimized) in comparison with the case where all information is available, i.e., Full Information Disclosure (FID). In this work, we propose a Reinforcement-Learning-based algorithm able to process data that the customer and ISPs cipher under the Shamir Secret Sharing (SSS) scheme. This approach guarantees total privacy to both the customer and the ISPs (e.g., details about a virtual function are only revealed to the ISP in charge of hosting it) and achieves an embedding cost comparable to that of an existing FID heuristic, as observed from extensive simulations. The main drawback of our algorithm is the high overhead of data that ISPs and the customer need to exchange with each other to execute it. Hence, we also explore the trade-off between embedding cost and data overhead resulting from the reduction of operations done by the RL. In general, intermediate embedding costs between the FID and LID heuristics can be obtained at a significant reduction of data overhead, while not sacrificing any privacy guarantees.
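
For readers unfamiliar with Shamir Secret Sharing, the sketch below shows the basic split and reconstruct operations and the fact that shares can be added without revealing the underlying values. It is not the paper's algorithm: the field modulus, the parameter values, and the function names (`split_secret`, `reconstruct`, `add_shares`) are illustrative assumptions, and the reinforcement-learning computation on shares is not reproduced.

```python
import secrets

PRIME = 2**61 - 1  # field modulus (a Mersenne prime, chosen here for illustration)


def split_secret(secret, n_shares, threshold):
    """Split `secret` into n_shares points of a random degree-(threshold-1) polynomial."""
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+)
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


def add_shares(shares_a, shares_b):
    """Share-wise addition: the result is a valid sharing of the sum of the secrets."""
    return [(xa, (ya + yb) % PRIME) for (xa, ya), (_, yb) in zip(shares_a, shares_b)]


# Example: two parties share private costs; the sum is computed on shares only.
cost_a = split_secret(10, n_shares=5, threshold=3)
cost_b = split_secret(32, n_shares=5, threshold=3)
total = reconstruct(add_shares(cost_a, cost_b)[:3])  # any 3 shares suffice
```

Each individual share reveals nothing about the secret on its own; only a set of at least `threshold` shares allows reconstruction. Exchanging such shares for every operation is one plausible source of the data overhead that the abstract names as the main drawback.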

    Vertical Split Learning-Based Identification and Explainable Deep Learning-Based Localization of Failures in Multi-Domain NFV Systems

    Automated failure management in Network Function Virtualization (NFV) systems continues to gain significant attention as it allows identifying and mitigating failures in a timely manner, ensuring continuous and stable operation of services. In multi-domain systems, where services are provisioned across multiple domains and each domain is managed by a unique single-domain orchestrator (SDO), the problem of automated NFV failure management takes on another dimension, as it requires a privacy-preserving collaboration among the SDOs. This is because SDOs are not willing to share private and business-critical information about their networks with other parties. In this paper, we focus on the problem of failure identification and localization in NFV systems in multi-domain networks where SDOs collaborate, in a distributed privacy-preserving learning scheme, to train a single neural network without sharing any raw data. To this end, we propose a Vertical Split Learning (VSL)-based approach with a client-server architecture for failure identification and localization over vertically partitioned data. Additionally, we utilize Explainable Deep Learning (XDL) frameworks, namely Integrated Gradients and DeepLIFT, on the failure identification server model to locate the failures without accessing the original data or features and without training a separate localization model. We compare our approach to centralized baseline approaches, and illustrative numerical results show that our proposed solution preserves a performance close to the one achievable with a centralized approach and localizes failures with an accuracy of 83% without the necessity of training a new localization model.
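
To give a feel for how an attribution method such as Integrated Gradients can point at the input responsible for a prediction, here is a minimal sketch on a toy logistic model. The model, its weights, the all-zero baseline, and the idea that each input corresponds to one domain are assumptions made purely for illustration; the paper applies the method to its trained server model, which is not reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, weights, bias):
    """Toy failure classifier: probability of failure given the input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def model_gradient(x, weights, bias):
    """Analytic gradient of the model output with respect to each input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients.

    Attribution for input i is (x_i - baseline_i) times the average gradient
    along the straight path from the baseline to x.
    """
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_gradient(point, weights, bias)
        totals = [t + g for t, g in zip(totals, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


# Hypothetical setup: one input per domain, zero baseline meaning "no anomaly".
WEIGHTS, BIAS = [2.0, -1.0, 0.0], 0.0
x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, WEIGHTS, BIAS)
# The input with the largest absolute attribution is the localization candidate.
most_suspect = max(range(len(x)), key=lambda i: abs(attributions[i]))
```

A useful sanity check is the completeness property: the attributions should sum to the difference between the model output at `x` and at the baseline. The appeal of this kind of localization, as the abstract notes, is that it reuses the identification model instead of training a second one.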

    Synthetic Data Generation using Diffusion Models for ML-based Lightpath Quality of Transmission Estimation Under Extreme Data Scarcity

    Generative diffusion models are gaining attention as a promising solution for synthetic data generation, offering a distinct advantage over traditional statistical methods and basic generative models. This work focuses on evaluating the effectiveness of such models in the context of estimating Lightpath Quality of Transmission (QoT) in optical networks, especially when real data availability is strongly limited. Numerical results demonstrate that leveraging diffusion models for data augmentation can significantly improve QoT classification accuracy and F1-score when available data are limited to a few dozen samples. These findings highlight the potential of generative diffusion models in improving data-driven tasks for optical network management under sparse data conditions.

    ML-based Network Pruning for Routing Data Overhead Reduction in Wireless Sensor Networks

    Routing in Wireless Sensor Networks (WSNs) is one of the tasks that heavily impact network lifetime: current routing protocols, such as Ad-hoc On-demand Distance Vector (AODV), generate excessive and largely unnecessary overhead for route discovery, which in turn contributes to depleting the limited power resources of sensors. In this work, we propose a novel machine learning-based approach to perform network pruning during route discovery, aiming at reducing data overhead. Our approach assumes that sensor nodes are aware of their locations and have processing capabilities to run lightweight machine learning algorithms. We perform extensive simulations considering WSNs consisting of different numbers of nodes. Results show that our proposed approach can reduce data overhead by 50% to 65%, depending on the number of nodes and pruning aggressiveness.

    Network-Based Contact Tracing for Detection of Covid-19 Contagions: A Privacy-Preserving Approach

    The outbreak of coronavirus disease 2019 (Covid-19) has imposed a worldwide lockdown, changing the way people live and work, and putting pressure on the healthcare systems of many countries. Contact tracing based on smartphone apps (app-based contact tracing) has emerged as a possible solution to trace contagions and enforce more sustainable and selective quarantines. However, these apps require a very high adoption rate to reach the critical mass for effective contact tracing. As an alternative, network-based contact tracing, which exploits geo-localization in next generation networks (e.g., 5G), can be used by mobile operators (MOs) to passively trace users' mobility and contacts, provided that a targeted localization accuracy of down to one meter can be achieved. To effectively trace contagions, the identities of positive individuals, which are known by governmental authorities (GAs), are also required. Hence, in network-based contact tracing, MOs and GAs must exchange users' privacy-sensitive data, such as geo-locations and infection status, to compute the likelihood that an individual has been infected. To address the privacy issues raised by network-based contact tracing, after presenting an overview of app-based vs. network-based contact tracing systems, we propose a protocol to make network-based systems compliant with stringent privacy requirements. From extensive simulations, we observe that the cost to guarantee privacy (evaluated in terms of data overhead introduced by the protocol) is acceptable. Finally, we elaborate on the advantages of the proposed system (e.g., more efficient monitoring of virus spread), as well as on the open issues that should be solved before its adoption (e.g., extensive research on potential privacy leakage is required).