
    Where do migrants and natives belong in a community: a Twitter case study and privacy risk analysis

    Today, many users actively use Twitter to express their opinions and share information. Thanks to the availability of the data, researchers have studied the behaviours and social networks of these users. International migration studies have also benefited from this social media platform to improve migration statistics. Although diverse types of social networks have been studied on Twitter, the social networks of migrants and natives have not been studied before. This paper aims to fill this gap by studying the characteristics and behaviours of migrants and natives on Twitter. To do so, we perform a general assessment of features, including profiles and tweets, and an extensive analysis of the network. We find that migrants have more followers than friends. They have also tweeted more, even though both groups have similar account ages. More interestingly, the assortativity scores show that users tend to connect based on nationality more than on country of residence, and that this holds more strongly for migrants than for natives. Furthermore, both natives and migrants tend to connect mostly with natives. The homophilic behaviours of users are also well reflected in the communities that we detected. Our additional privacy risk analysis shows that Twitter data can be used safely without exposing sensitive information about the users, minimising the risk of re-identification while respecting GDPR.

    PRIMULE: Privacy risk mitigation for user profiles

    The availability of mobile phone data has encouraged the development of different data-driven tools, supporting social science studies and providing new data sources for standard official statistics. However, this kind of data is subject to privacy concerns because it can enable the inference of personal and private information. In this paper, we address the privacy issues related to the sharing of user profiles derived from mobile phone data by proposing PRIMULE, a privacy risk mitigation strategy. The method relies on PRUDEnce (Pratesi et al., 2018), a privacy risk assessment framework that provides a methodology for systematically identifying risky users in a set of data. Extensive experimentation on real-world data shows the effectiveness of the PRIMULE strategy in terms of both the quality of mobile user profiles and the utility of these profiles for analytical services such as the Sociometer (Furletti et al., 2013), a data mining tool for classifying city users.

    Privacy by Design for Mobility Data Analytics

    Privacy is an ever-growing concern in our society and is becoming a fundamental aspect to take into account when one wants to use, publish and analyze data involving sensitive personal information, such as data on individual mobility. Unfortunately, it is increasingly hard to transform the data in a way that protects sensitive information: we live in the era of big data, characterized by unprecedented opportunities to sense, store and analyze social data describing human activities in great detail and resolution. This is especially true for mobility data, in which there is no longer a clear distinction between quasi-identifiers and sensitive attributes. Protecting privacy in this context is therefore a significant challenge, and privacy preservation cannot be accomplished by de-identification alone. In this chapter, we propose the Privacy by Design paradigm to develop technological frameworks for countering the threats of undesirable, unlawful effects of privacy violation, without obstructing the knowledge discovery opportunities of social mining and big data analytical technologies. Our main idea is to inscribe privacy protection into the knowledge discovery technology by design, so that the analysis incorporates the relevant privacy requirements from the start. We show three applications of the Privacy by Design principle to mobility data analytics. First, we present a framework based on a data-driven spatial generalization, which is suitable for the privacy-aware publication of movement data in order to enable clustering analysis. Second, we present a method for sanitizing semantic trajectories, using a generalization of visited places based on a taxonomy of locations. The sanitized data may then be used for extracting frequent sequential patterns. Lastly, we show how to apply the idea of Privacy by Design in a distributed setting in which movement data from individual vehicles is made private through differential privacy manipulations and is then collected, aggregated and analyzed by a centralized station.

    Fast estimation of privacy risk in human mobility data

    Mobility data are an important proxy to understand the patterns of human movements, develop analytical services and design models for simulation and prediction of human dynamics. Unfortunately, mobility data are also very sensitive, since they may contain personal information about the individuals involved. Existing frameworks for privacy risk assessment enable data providers to quantify and mitigate privacy risks, but they suffer from two main limitations: (i) they have a high computational complexity; (ii) the privacy risk must be re-computed for each new set of individuals, geographic areas or time windows. In this paper, we explore a fast and flexible solution to estimate privacy risk in human mobility data, using predictive models to capture the relation between an individual's mobility patterns and her privacy risk. We show the effectiveness of our approach by experimentation on a real-world GPS dataset and provide a comparison with traditional methods.

    A Data Mining Approach to Assess Privacy Risk in Human Mobility Data

    Human mobility data are an important proxy to understand human mobility dynamics, develop analytical services, and design mathematical models for simulation and what-if analysis. Unfortunately, mobility data are very sensitive, since they may enable the re-identification of individuals in a database. Existing frameworks for privacy risk assessment provide data providers with tools to control and mitigate privacy risks, but they suffer from two main shortcomings: (i) they have a high computational complexity; (ii) the privacy risk must be recomputed every time new data records become available and for every selection of individuals, geographic areas, or time windows. In this article, we propose a fast and flexible approach to estimate privacy risk in human mobility data. The idea is to train classifiers to capture the relation between individual mobility patterns and the level of privacy risk of individuals. We show the effectiveness of our approach through an extensive experiment on real-world GPS data in two urban areas and investigate the relations between human mobility patterns and the privacy risk of individuals.

    Privacy Preserving Multidimensional Profiling

    Recently, big data has become central to the analysis of human behavior and the development of innovative services. In particular, a new class of services is emerging that takes advantage of different sources of data in order to consider the multiple aspects of human beings. Unfortunately, these data can lead to re-identification problems and other privacy leaks, as widely reported in both the scientific literature and the media. The risk is even more pressing if multiple sources of data are linked together, since a potential adversary could know information related to each dataset. For this reason, it is necessary to accurately evaluate and mitigate the individual privacy risk before releasing personal data. In this paper, we propose a methodology for the first task, i.e., assessing privacy risk, in a multidimensional scenario, defining some possible privacy attacks and simulating them using real-world datasets.

    Cultural dimensions in online purchase behavior: Evidence from a cross-cultural study

    The objective of this research is to investigate how cultural differences affect consumers’ online purchase behavior. We reviewed the recent literature on cross-cultural studies of online behavior and, building on Hofstede’s theory of cultural dimensions and the theory of planned behavior (TPB), developed a conceptual model exploring how the dimensions of national culture influence perceptions of website usability, trust, and perceived risk, which in turn affect intention to use and online purchase behavior. A web-based questionnaire was distributed to a sample of 350 European and Asian consumers actively using Alibaba e-commerce platforms. The conceptual model was validated through a confirmatory factor analysis (CFA), while structural equation modelling (SEM) was used to empirically test the hypothesized relationships among variables. Results showed that culture significantly influenced website usability and perceived risk among European consumers and, in turn, their intention and behavior. By contrast, culture significantly influenced the trust of Asian consumers, as well as their intention and online behavior. With this study, we contribute to the literature on consumer online purchase behavior from a cross-cultural perspective. As culture emerged among the significant antecedents of the mechanisms explaining online purchase behavior, e-tailers should tailor digital marketing strategies according to consumer cultural differences.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first author of a cited work, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, when the same number of top-ranked authors is selected and analyzed, this picture represents fewer specialties in the research field being studied than the one produced through traditional first-author co-citation counting. Reasons for these effects are discussed.
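The difference between the two counting schemes can be illustrated with a minimal Python sketch. It is not the study's implementation: the function name `cocitation_counts`, the `max_authors` parameter, and the toy data are hypothetical, and the sketch assumes that a pair of authors is counted once per citing paper and that co-authors of the same cited work count as co-cited, details the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations over a collection of citing papers.

    citing_papers: list of reference lists; each reference is the ordered
                   author list of one cited work.
    max_authors:   how many leading authors of each cited work are credited
                   (1 = first-author counting, 5 = first-five counting).
    Assumption: each author pair is counted at most once per citing paper.
    """
    counts = Counter()
    for references in citing_papers:
        credited = set()
        for authors in references:
            # Credit only the leading authors of this cited work.
            credited.update(authors[:max_authors])
        # Every unordered pair of credited authors is co-cited by this paper.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, each citing two works.
citing_papers = [
    [["A", "B"], ["C"]],
    [["A", "D"], ["C", "B"]],
]

first_author = cocitation_counts(citing_papers, max_authors=1)
first_five = cocitation_counts(citing_papers, max_authors=5)
```

On this toy input, first-author counting sees only the pair (A, C), while first-five counting also picks up pairs involving B and D, which is why the second scheme yields denser co-citation data for the same set of citing papers.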