1,721,209 research outputs found
A survey on privacy in human mobility
In the last years we have witnessed a pervasive use of location-aware technologies such as vehicular GPS-enabled devices, RFID based tools, mobile phones, etc which generate collection and storing of a large amount of human mobility data. The powerful of this data has been recognized by both the scientific community and the industrial worlds. Human mobility data can be used for different scopes such as urban traffic management, urban planning, urban pollution estimation, etc. Unfortunately, data describing human mobility is sensitive, because people's whereabouts may allow re-identification of individuals in a de-identified database and the access to the places visited by indi-viduals may enable the inference of sensitive information such as religious belief, sexual preferences, health conditions, and so on. The literature reports many approaches aimed at overcoming privacy issues in mobility data, thus in this survey we discuss the advancements on privacy-preserving mo-bility data publishing. We first describe the adversarial attack and privacy models typically taken into consideration for mobility data, then we present frameworks for the privacy risk assessment and finally, we discuss three main categories of privacy-preserving strategies: methods based on anonymization of mobility data, methods based on the differential privacy models and methods which protect privacy by exploiting generative models for synthetic trajectory generation.In the last years we have witnessed a pervasive use of location-aware technologies such as vehicular GPS-enabled devices, RFID based tools, mobile phones, etc which generate collection and storing of a large amount of human mobility data. The powerful of this data has been recognized by both the scientific community and the industrial worlds. Human mobility data can be used for different scopes such as urban traffic management, urban planning, urban pollution estimation, etc. Unfortunately, data describing human mobility is sensitive, because people’s whereabouts may allow re-identification of individuals in a de-identified database and the access to the places visited by individuals may enable the inference of sensitive information such as religious belief, sexual preferences, health conditions, and so on. The literature reports many approaches aimed at overcoming privacy issues in mobility data, thus in this survey we discuss the advancements on privacy-preserving mobility data publi..
Evaluating the privacy exposure of interpretable global and local explainers
During the last few years, the abundance of data has significantly boosted the performance of Machine Learning models, integrating them into several aspects of daily life. However, the rise of powerful Artificial Intelligence tools has introduced ethical and legal complexities. This paper proposes a computational framework to analyze the ethical and legal dimensions of Machine Learning models, focusing specifically on privacy concerns and interpretability. In fact, recently, the research community proposed privacy attacks able to reveal whether a record was part of the black-box training set or inferring variable values by accessing and querying a Machine Learning model. These attacks highlight privacy vulnerabilities and prove that GDPR regulation might be violated by making data or Machine Learning models accessible. At the same time, the complexity of these models, often labelled as “black-boxes”, has made the development of explanation methods indispensable to enhance trust and facilitate their acceptance and adoption in high-stake scenarios. Our study highlights the trade-off between interpretability and privacy protection. By introducing REVEAL, this paper proposes a framework to evaluate the privacy exposure of black-box models and their surrogate-based explainers, whether local or global. Our methodology is adaptable and applicable across diverse black-box models and various privacy attack scenarios. Through an in-depth analysis, we show that the interpretability layer introduced by explanation models might jeopardize the privacy of individuals in the training data of the black-box, particularly with powerful privacy attacks requiring minimal knowledge but causing significant privacy breaches
From depression to suicidal discourse on Reddit
Different mental health conditions may lead to
suicide, among which depression is one of the most common. One
in six people affected by major depression dies by suicide, with
suicide being the second leading cause of death among teenagers
and young adults. Symptoms of suicidal risk often remain latent
until the inevitable occurs, which highlights the significance of
prevention when it comes to such conditions. Language encodes
psychological aspects that reflect a s ubject’s p ersonality and
state of mind and could be used as a proxy to identify shifts
in a person’s mental health state. In this work we study how
psychometric traits of language vary when a person moves
from depression-related discourse to suicide-related discourse,
analyzing posts from Reddit dedicated to these themes. For each
user, we identify the changes in their psychometric profile when
the discourse shifts, and try to pinpoint groups of users with
the same behavior and psychometric evolution. Our findings
highlight several psychometric features as the most involved
in discourse shifting, setting the stage for easier identification
of high-risk individuals from the way they express themselves
through language
Privacy-Preserving Outsourcing of Data Mining
Data mining is gaining momentum in society due to the ever increasing availability of large amounts of data, easily gathered by a variety of collection technologies and stored via computer systems. Due to the limited computational resources of data owners and the developments in cloud computing, there has been considerable recent interest in the paradigm of data mining-as-a-service (DMaaS). In this paradigm, a company (data owner) lacking in expertise or computational resources outsources its mining needs to a third party service provider (server). Given the fact that the server may not be fully trusted, one of the main concerns of the DMaaS paradigm is the protection of data privacy. In this paper, we provide an overview of a variety of techniques and approaches that address the privacy issues of the DMaaS paradigm
Driving profiles computation and monitoring for car insurance CRM
Customer segmentation is one of the most traditional and valued tasks in customer relationship management (CRM). In this article, we explore the problem in the context of the car insurance industry, where the mobility behavior of customers plays a key role: Different mobility needs, driving habits, and skills imply also different requirements (level of coverage provided by the insurance) and risks (of accidents). In the present work, we describe a methodology to extract several indicators describing the driving profile of customers, and we provide a clustering-oriented instantiation of the segmentation problem based on such indicators. Then, we consider the availability of a continuous flow of fresh mobility data sent by the circulating vehicles, aiming at keeping our segments constantly up to date. We tackle a major scalability issue that emerges in this context when the number of customers is large-namely, the communication bottleneck-by proposing and implementing a sophisticated distributed monitoring solution that reduces communications between vehicles and company servers to the essential. We validate the framework on a large database of real mobility data coming from GPS devices on private cars. Finally, we analyze the privacy risks that the proposed approach might involve for the users, providing and evaluating a countermeasure based on data perturbation
- …
