
    Filtering procedures for sensor data in basketball

    Big Data analytics helps team sports managers in their decisions by processing many different kinds of data. With the advent of information technologies, collecting, processing and storing large amounts of sports data in different forms became possible. A problem that often arises when using sports data is the need for automatic data cleaning procedures. In this paper we develop a data cleaning procedure for basketball based on players’ trajectories. Starting from a data matrix that tracks the movements of the players on the court at different moments in the game, we propose an algorithm that automatically drops inactive moments using the available sensor data. The algorithm also divides the game into sorted actions and labels them as offensive or defensive. The algorithm’s parameters are validated using robustness checks.
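The abstract does not state how inactive moments are detected or how actions are labelled, so the following is only a minimal sketch of the general idea. It assumes that a frame is "active" when the average player speed exceeds a threshold, and that an action is "offensive" when the players' mean x-position lies in one half of the court. The names `min_speed`, `court_length` and the example `frames` are hypothetical.

```python
from math import hypot

def mean_speed(prev, curr, dt):
    """Average player speed between two frames of (x, y) positions."""
    dists = [hypot(cx - px, cy - py) for (px, py), (cx, cy) in zip(prev, curr)]
    return sum(dists) / len(dists) / dt

def split_into_actions(frames, dt=1.0, min_speed=0.5, court_length=28.0):
    """Drop inactive frames, then group consecutive active frames into actions.

    frames: list of frames, each a list of (x, y) player positions in metres.
    The speed threshold and the half-court labelling rule are assumptions.
    """
    actions, current = [], []
    for i in range(1, len(frames)):
        if mean_speed(frames[i - 1], frames[i], dt) >= min_speed:
            current.append(i)          # frame i is active
        elif current:
            actions.append(current)    # an inactive frame closes the action
            current = []
    if current:
        actions.append(current)

    labelled = []
    for idx in actions:
        xs = [x for i in idx for (x, _) in frames[i]]
        centroid_x = sum(xs) / len(xs)
        label = "offensive" if centroid_x > court_length / 2 else "defensive"
        labelled.append({"frames": idx, "label": label})
    return labelled

# Hypothetical tracking data: two players, six time stamps.
frames = [
    [(2, 5), (3, 6)],
    [(4, 5), (5, 6)],      # players move: active
    [(4, 5), (5, 6)],      # players stand still: inactive
    [(4, 5), (5, 6)],      # inactive
    [(20, 5), (21, 6)],    # active
    [(22, 5), (23, 6)],    # active
]
result = split_into_actions(frames)
```

Here `result` contains two actions, in time order, separated by the dropped inactive frames.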

    On Clustering Daily Mobile Phone Density Profiles

    In the context of Smart Cities, local institutions face an increasing need to monitor the dynamics of people’s presence inside urban areas in order to plan the improvement and maintenance of urban infrastructure. Rectangular grid polygons reporting the density of people using mobile phones (Carpita, Simonetto, 2014) are a source of very large data. Telecom Italia Mobile (TIM), currently the largest operator in Italy in this sector, provided us, thanks to a research agreement with the Statistical Office of the Municipality of Brescia, with about two years (April 2014 to June 2016, n about 700) of Daily Mobile Phone Density Profiles (DMPDPs) for the Province of Brescia, in the form of a regular grid of 923 x 607 cells recorded every 15 minutes. In order to find regularities and detect anomalies in the flow of people’s presence, this work aims to cluster similar DMPDPs, where each DMPDP is characterized by both a 2-D spatial component (i.e. 923 x 607 dimensions, one for each cell of the grid) and a temporal component (i.e. each cell has repeated values in time, for a total of 96 daily dimensions per cell). So, while each DMPDP accounts for p of about 50 million (923 x 607 x 96) space-time dimensions, time and economic constraints prevent us from having a longer time series of DMPDPs. In these terms, grouping DMPDPs is a High Dimensional Low Sample Size (HDLSS) problem, since p >> n. We propose a mixed-approach procedure that we apply to the city of Brescia. First, borrowing the Histogram of Oriented Gradients (HOG) method from the image clustering discipline (Tomasi, 2012), we reduce the dimensionality of the DMPDPs by extracting their features. In doing so, we tune the HOG parameters to reduce the DMPDPs’ dimensionality as much as possible while preserving as much as possible of the information contained in the extracted features. With this approach we preserve both the spatial and the temporal components of the DMPDPs. Then, using the extracted HOG features, we group DMPDPs by applying, and testing the feasibility of, different clustering approaches for large data (Kaufman, Rousseeuw, 2009).
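To make the dimensionality argument concrete, the sketch below first computes p from the grid size quoted in the abstract, then shows a heavily simplified HOG-style feature extraction on a tiny grid. It is not the authors' implementation: it uses no block normalisation, treats a single time slice only, and the `cell` and `n_bins` parameters as well as the example `grid` are illustrative assumptions.

```python
from math import atan2, hypot, pi

# Space-time dimensions of one DMPDP: 923 x 607 cells, 96 quarter-hours per day.
p = 923 * 607 * 96

def hog_features(grid, cell=2, n_bins=4):
    """Magnitude-weighted histograms of unsigned gradient orientation per cell."""
    rows, cols = len(grid), len(grid[0])
    feats = []
    for r0 in range(0, rows - cell + 1, cell):
        for c0 in range(0, cols - cell + 1, cell):
            hist = [0.0] * n_bins
            for r in range(r0, r0 + cell):
                for c in range(c0, c0 + cell):
                    # Finite differences, clamped at the grid border.
                    gx = grid[r][min(c + 1, cols - 1)] - grid[r][max(c - 1, 0)]
                    gy = grid[min(r + 1, rows - 1)][c] - grid[max(r - 1, 0)][c]
                    angle = atan2(gy, gx) % pi  # unsigned orientation in [0, pi)
                    b = min(int(angle / pi * n_bins), n_bins - 1)
                    hist[b] += hypot(gx, gy)
            feats.extend(hist)
    return feats

# Hypothetical 4 x 4 density grid increasing from left to right.
grid = [[c for c in range(4)] for _ in range(4)]
feats = hog_features(grid)
```

With these settings a 4 x 4 grid is summarised by 4 cells times 4 bins. On the real data the reduction depends on the tuned HOG parameters, which the abstract does not report.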

    Modeling and forecasting traffic flows with mobile phone big data in flooding risk areas to support a data-driven decision making

    Floods are among the natural disasters with the worst human, social and economic impacts, to the detriment of both the public and private sectors. Today, public decision-makers can take advantage of data-driven systems that make it possible to monitor hydrogeological risk areas and that can be used for predictive purposes to deal with future emergencies. Flooding risk exposure maps traditionally assume that the number of people present is constant over time, although crowding is a highly dynamic process in metropolitan areas. Real-time monitoring and forecasting of people’s presence and mobility is thus relevant for metropolitan areas subject to flooding risk. In this respect, mobile phone network data have been used to obtain a dynamic measure of exposure risk in areas with hydrogeological criticality. In this work, we use mobile phone origin-destination signals on traffic flows of Telecom Italia Mobile (TIM) users to forecast the exposure risk and thus help decision-makers warn those who are transiting through the area. To model the complex seasonality of traffic flow data, we adopt a novel methodological strategy that introduces a Dynamic Harmonic Regression (DHR) component into a Vector AutoRegressive with eXogenous variable (VARX) model. We apply the method to the case study of the “Mandolossa”, an urbanized area subject to flooding located on the western outskirts of Brescia, using hourly data from September 2020 to August 2021. A cross-validation based on the hit rate and the mean absolute percentage error shows good forecasting accuracy.
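In a dynamic harmonic regression, seasonality is typically captured by sine and cosine (Fourier) terms that enter the model as exogenous regressors. The sketch below only builds such regressors for hourly data. The choice of a daily (24 h) and a weekly (168 h) period and the numbers of harmonics `K_daily` and `K_weekly` are assumptions for illustration, and the VARX fit itself is not shown.

```python
from math import sin, cos, pi

def fourier_terms(t, period, K):
    """Return [sin, cos] pairs of the first K harmonics of `period` at time t."""
    out = []
    for k in range(1, K + 1):
        angle = 2 * pi * k * t / period
        out += [sin(angle), cos(angle)]
    return out

def dhr_regressors(n_hours, K_daily=2, K_weekly=1):
    """One row of Fourier regressors per hour: daily terms, then weekly terms."""
    return [
        fourier_terms(t, 24, K_daily) + fourier_terms(t, 168, K_weekly)
        for t in range(n_hours)
    ]

# Two weeks of hourly regressors (hypothetical horizon).
X = dhr_regressors(336)
```

Each row of `X` has 2 * (K_daily + K_weekly) columns and repeats exactly after one week, which is what allows a small number of parameters to describe a long seasonal pattern.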

    Integration of flows and signals data from mobile phone network for statistical analyses of traffic in a flooding risk area

    In this paper, we present a robust spatiotemporal statistical methodology capable of accurately forecasting traffic in the flood-prone area of the Mandolossa in the Province of Brescia (Italy). An innovative combination of two sources of mobile phone data is proposed to obtain an extremely accurate representation of the flows of people passing through the streets directly linked to the at-risk area. Three types of flows have been considered: outflows (from the flood-prone area to the neighborhood), inflows (from the neighborhood to the flood-prone area), and internal flows (within the flood-prone area). The three flows are assumed to depend on each other and are modeled using a vector autoregressive approach. We found evidence of both weekly and daily seasonal components in the time series. To capture the seasonality, a dynamic harmonic regression component has been included, where the optimal number of Fourier bases in the periodic functions has been chosen according to a criterion based on the Akaike Information Criterion. On the other hand, the set of autoregressive parameters has been defined so as to represent the time period the mobile phone company needs to observe, process, and release the data. The forecasting ability of the model has been assessed using blocked k-fold cross-validation along with the mean absolute percentage error and the hit rate. Though the model performs better for non-summer days, we found that it satisfactorily forecasts both the number and the level of people moving.
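The evaluation scheme mentioned above can be sketched as follows, assuming that "blocked" means the series is split into contiguous blocks without shuffling, each used once as the test set. The function names are hypothetical. The hit rate is left out because the abstract does not define it.

```python
def blocked_kfold(n, k):
    """Split indices 0..n-1 into k contiguous blocks, preserving time order."""
    bounds = [i * n // k for i in range(k + 1)]
    folds = []
    for i in range(k):
        test = list(range(bounds[i], bounds[i + 1]))
        train = list(range(0, bounds[i])) + list(range(bounds[i + 1], n))
        folds.append((train, test))
    return folds

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)

# Hypothetical example: 10 time points, 5 blocks, and two forecasts.
folds = blocked_kfold(10, 5)
error = mape([100, 200], [110, 180])
```

Keeping each block contiguous avoids mixing neighbouring hours between training and test sets, which matters for strongly autocorrelated series such as traffic flows.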

    Measuring sport performances under pressure by classification trees with application to basketball shooting

    Measuring players' performance in team sports is fundamental, since managers need to evaluate players with respect to their ability to score during crucial moments of the game. Using Classification and Regression Trees (CART) and play-by-play basketball data, we estimate the probability of scoring a shot with respect to a selection of game covariates related to game pressure. We use scoring probabilities to develop a player-specific shooting performance index that takes into account the difficulty of scoring different types of shots. By applying this procedure to a large sample of 2016–2017 Basketball Champions League (BCL) and 2017–2018 National Basketball Association (NBA) games, we compare the factors affecting shooting performance in Europe and in the United States, and we evaluate a selection of players in terms of the proposed shooting performance index, with the final aim of providing useful guidelines for team strategy.
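As an illustration of how a classification tree turns a pressure covariate into scoring probabilities, the sketch below finds a single CART-style split by minimising the weighted Gini impurity. It is a one-split stump, not the full tree used in the paper. The covariate (seconds left on the shot clock) and the data are hypothetical.

```python
def gini(outcomes):
    """Gini impurity of a list of 0/1 outcomes."""
    if not outcomes:
        return 0.0
    p = sum(outcomes) / len(outcomes)
    return 2 * p * (1 - p)

def best_split(x, y):
    """Return (threshold, P(score | x <= thr), P(score | x > thr)) for one split."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2  # candidate threshold halfway between observed values
        left = [yi for xi, yi in zip(x, y) if xi <= thr]
        right = [yi for xi, yi in zip(x, y) if xi > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if best is None or score < best[0]:
            best = (score, thr, sum(left) / len(left), sum(right) / len(right))
    return best[1:]

# Hypothetical shots: seconds left on the shot clock and outcome (1 = made).
seconds_left = [1, 2, 3, 4, 10, 12, 14, 16]
made = [0, 0, 0, 1, 1, 1, 1, 0]
threshold, p_rushed, p_unrushed = best_split(seconds_left, made)
```

The two leaf proportions play the role of estimated scoring probabilities. A full tree would repeat the split recursively over several covariates.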

    A Spatio-Temporal Indicator for City Users Based on Mobile Phone Signals and Administrative Data

    Knowing the number of city users is essential, since it provides a large amount of useful information in the context of Smart City evaluations that traditional static measures, represented by the number of residents from census data, are not able to provide. In this paper we use spatiotemporal mobile phone data along with administrative data to develop a dynamic indicator of the number of city users. In doing so, we propose a multi-stage approach for high-dimensional data. The first part estimates the number of phone company users for different reference days, using Histogram of Oriented Gradients for data dimensionality reduction and a mix of k-means and Functional Data Analysis Model-Based Clustering methods for clustering days. The second part employs a method, based on matching mobile phone and administrative data, to estimate the phone company’s market share at small-area level, which is used to derive city users. Applying the method to the case study of the Municipality of Brescia, we find that our estimated market share outperforms its national-level counterpart. Moreover, we find that the number of city users reaches a peak of 270–280 thousand during the central hours of autumn to spring weekdays.
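The second stage can be illustrated with simple arithmetic, under two assumptions that the abstract does not confirm: that the market share of an area is the ratio of phone company users observed there at night to its registered residents, and that city users are obtained by dividing observed users by that share. All names and numbers are hypothetical.

```python
def market_share(night_phone_users, residents):
    """Assumed estimator: share = company users at night / registered residents."""
    return night_phone_users / residents

def city_users(observed_phone_users, share):
    """Scale the company's observed users up to all people present."""
    return observed_phone_users / share

# Hypothetical small areas with night-time users, residents and daytime users.
areas = {
    "area_1": {"night_users": 300, "residents": 1000, "day_users": 1200},
    "area_2": {"night_users": 250, "residents": 1000, "day_users": 900},
}

estimates = {}
for name, a in areas.items():
    share = market_share(a["night_users"], a["residents"])
    estimates[name] = city_users(a["day_users"], share)

total_city_users = sum(estimates.values())
```

Using an area-specific share rather than a single national figure changes the scaling factor area by area, which is why the small-area estimate matters for the final indicator.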

    Dynamic maps of human exposure to floods based on mobile phone data

    Floods are acknowledged as one of the most serious threats to people's lives and properties worldwide. To mitigate the flood risk, it is possible to act separately on its components: hazard, vulnerability, exposure. Emergency management plans can actually provide effective non-structural practices to decrease both human exposure and vulnerability. Crowding maps depending on characteristic time patterns, herein referred to as dynamic exposure maps, represent a valuable tool to enhance the flood risk management plans. In this paper, the suitability of mobile phone data to derive crowding maps is discussed. A test case is provided by a strongly urbanized area subject to frequent flooding located on the western outskirts of Brescia (northern Italy). Characteristic exposure spatiotemporal patterns and their uncertainties were detected with regard to land cover and calendar period. This novel methodology still deserves verification during real-world flood episodes, even though it appears to be more reliable than crowdsourcing strategies, and seems to have potential to better address real-time rescues and relief supplies