    Long-term-based road blackspot screening procedures by machine learning algorithms

    Screening procedures for road blackspot detection are essential tools that let road authorities quickly gather insights on the safety level of each road site they manage. This paper suggests a road blackspot screening procedure for two-lane rural roads, relying on five different machine learning algorithms (MLAs) and real long-term traffic data. The network analyzed is the one managed by the Tuscany Region Road Administration, mainly composed of two-lane rural roads. A total of 995 road sites where at least one accident occurred in 2012-2016 have been labeled as "Accident Case". Accordingly, an equal number of sites where no accident occurred in the same period have been randomly selected and labeled as "Non-Accident Case". Five different MLAs, namely Logistic Regression, Classification and Regression Tree, Random Forest, K-Nearest Neighbor, and Naïve Bayes, have been trained and validated. The output response of the MLAs, i.e., crash occurrence susceptibility, is a binary categorical variable. Therefore, such algorithms aim to classify a road site as likely safe ("Non-Accident Case") or potentially susceptible to an accident occurrence ("Accident Case") over five years. Finally, the algorithms have been compared by a set of performance metrics, including precision, recall, F1-score, overall accuracy, confusion matrix, and the Area Under the Receiver Operating Characteristic curve. Outcomes show that the Random Forest outperforms the other MLAs with an overall accuracy of 73.53%. Furthermore, none of the MLAs shows overfitting issues. Road authorities could consider MLAs to draw up a priority list of on-site inspections and maintenance interventions.
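    The performance metrics named in this abstract can be illustrated with a minimal sketch. It assumes the two labels from the abstract with "Accident Case" treated as the positive class; the function name `classification_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive="Accident Case"):
    """Compute confusion-matrix counts and the derived metrics for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical toy labels, for illustration only.
A, N = "Accident Case", "Non-Accident Case"
toy_metrics = classification_metrics([A, A, A, N, N, N], [A, A, N, A, N, N])
```

    Because the dataset described above is balanced by construction (995 sites per class), overall accuracy is a reasonable headline figure here; on imbalanced data, recall and precision for the positive class would carry more weight.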

    Handling Imbalanced Data in Road Crash Severity Prediction by Machine Learning Algorithms

    Crash severity is undoubtedly a fundamental aspect of a crash event. Although machine learning algorithms for predicting crash severity have recently gained interest in the academic community, there is a significant trend towards neglecting the fact that crash datasets are acutely imbalanced. Overlooking this fact generally leads to weak classifiers for predicting the minority class (crashes with higher severity). In this paper, in order to handle imbalanced accident datasets and provide a better prediction for the minority class, the random undersampling of the majority class (RUMC) technique is used. By employing an imbalanced and a RUMC-based balanced training set, we propose the calibration, validation, and evaluation of four different crash severity predictive models, including random tree, k-nearest neighbor, logistic regression, and random forest. Accuracy, true positive rate (recall), false positive rate, true negative rate, precision, F1-score, and the confusion matrix have been calculated to assess the performance. Outcomes show that RUMC-based models improve the reliability of the classifiers for detecting fatal crashes and those causing injury. Indeed, in imbalanced models, the true positive rate for predicting fatal crashes and those causing injury spans from 0% (logistic regression) to 18.3% (k-nearest neighbor), while for the RUMC-based models, it spans from 52.5% (RUMC-based logistic regression) to 57.2% (RUMC-based k-nearest neighbor). Organizations and decision-makers could make use of RUMC and machine learning algorithms in predicting the severity of a crash occurrence, managing the present, and planning the future of their works.
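    Random undersampling of the majority class can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes every class is reduced to exactly the size of the smallest class, and the function name `random_undersample_majority`, the `label_of` callback, and the toy labels "FI" and "PDO" are hypothetical.

```python
import random


def random_undersample_majority(rows, label_of, seed=0):
    """Return a class-balanced subset by randomly discarding majority-class rows.

    Assumption: every class is cut down to the size of the smallest class.
    """
    rng = random.Random(seed)

    # Group rows by class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)

    n_min = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        # Keep the smallest class whole; sample without replacement from the rest.
        balanced.extend(group if len(group) == n_min else rng.sample(group, n_min))

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 90 low-severity rows and 10 high-severity rows.
toy_rows = [("PDO", i) for i in range(90)] + [("FI", i) for i in range(10)]
balanced_rows = random_undersample_majority(toy_rows, label_of=lambda r: r[0])
```

    Undersampling would normally be applied to the training set only, so that the test set keeps the original class distribution and the reported true positive rates reflect realistic conditions.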

    Overfitting Prevention in Accident Prediction Models: Bayesian Regularization of Artificial Neural Networks

    In the present paper, we implemented the Bayesian regularization (BR) backpropagation algorithm for calibrating an artificial neural network (ANN) as an accident prediction model (APM) to be used on Italian four-lane divided roads. We chose the BR-ANN since it copes efficiently with small sample sizes and avoids overfitting by adding a regularization term to the objective function minimized during training. Moreover, BR-ANNs are rarely employed in road safety analyses, and their peculiarities deserve to be emphasized. In our work, the BR-ANN aims to predict the number of fatal and injury (FI) crashes across 236 road elements, for a total length of 78 km. The input features are road element length, horizontal and vertical alignment, cross-section geometry, operating speed, traffic flow, sight distance, and road area type (i.e., a categorical predictor accounting for the potential influence of merge and diverge influence areas). Training and test phases of the BR-ANN have been evaluated by determination coefficient (R²), root mean square error (RMSE), overfitting ratio (OR), scatterplots, residuals analysis, and by comparison with the same ANN architecture trained with the gradient descent (GD) with momentum and adaptive learning rate backpropagation algorithm (GD-ANN). Results demonstrate that the BR-ANN markedly outperforms the GD-ANN, which suffers from severe overfitting. Furthermore, the BR-ANN does not overfit the data (OR close to unity), reports a satisfactory R² (0.726), and shows a Gaussian residual distribution with zero mean. Therefore, road authorities could consider regularized ANNs for performing appropriate safety analyses, especially when dealing with small road sample sizes.

    Defining machine learning algorithms as accident prediction models for Italian two-lane rural, suburban, and urban roads

    Four Accident Prediction Models have been defined for Italian two-lane rural, suburban, and urban roads by exploiting different Machine Learning Algorithms. Specifically, a Classification and Regression Tree, a Boosted Regression Tree, a Random Forest, and a Support Vector Machine have been implemented to predict the number of Fatal and Injury (FI) crashes on a 905-km network, which experienced 5,802 FI crashes in 2008-2016. The dataset incorporates geometrical, functional, and environmental information. Several performance measures have been used, such as Determination Coefficient, Mean Absolute Error, Root Mean Square Error, and scatterplots. Outcomes suggest that the Support Vector Machine outperforms the other Machine Learning Algorithms for predicting Fatal and Injury crashes. In addition, the computation of Predictor Importance shows that traffic flow, the density of intersections, driveway density, and type of area are the factors with the greatest impact on crash likelihood. Road authorities may use these findings for conducting reliable safety analyses.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
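    The difference between the two counting schemes can be shown with a small sketch. It is a simplified illustration, not the study's procedure: it assumes each citing paper is given as a list of cited works, each cited work as an ordered author list, and it counts a pair of authors once per citing paper. It also counts co-authors of the same cited work as co-cited, which is a simplification. The function name `cocitation_counts` and the toy author names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations, considering the first `max_authors` of each cited work.

    max_authors=1 gives traditional first-author counting;
    max_authors=5 gives the "first 5 authors" variant.
    """
    counts = Counter()
    for references in citing_papers:
        # Collect the authors that count as "cited" by this citing paper.
        cited_authors = set()
        for author_list in references:
            cited_authors.update(author_list[:max_authors])
        # Each unordered pair of cited authors is co-cited once by this paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, each citing two works.
toy_papers = [
    [["Smith", "Lee"], ["Jones"]],
    [["Smith"], ["Jones", "Lee"]],
]
first_author_only = cocitation_counts(toy_papers, max_authors=1)
first_five = cocitation_counts(toy_papers, max_authors=5)
```

    In the toy data, "Lee" never appears as a first author, so first-author counting sees only the Jones-Smith pair, while the five-author variant also links Lee to both of the others.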

    Surface motion prediction and mapping for road infrastructures management by PS-InSAR measurements and machine learning algorithms

    This paper introduces a methodology for predicting and mapping surface motion beneath road pavement structures caused by environmental factors. Persistent Scatterer Interferometric Synthetic Aperture Radar (PS-InSAR) measurements, geospatial analyses, and Machine Learning Algorithms (MLAs) are employed for this purpose. Two single learners, i.e., Regression Tree (RT) and Support Vector Machine (SVM), and two ensemble learners, i.e., Boosted Regression Trees (BRT) and Random Forest (RF), are utilized for estimating the surface motion rate in mm/year over the Province of Pistoia (Tuscany Region, central Italy, 964 km²), in which strong subsidence phenomena have occurred. The interferometric processing of 210 Sentinel-1 images from 2014 to 2019 allows exploiting the average displacements of 52,257 Persistent Scatterers as output targets to predict. A set of 29 environmental-related factors is preprocessed by SAGA-GIS, version 2.3.2, and ESRI ArcGIS, version 10.5, and employed as input features. Once the dataset has been prepared, three wrapper feature selection approaches (backward, forward, and bi-directional) are used for recognizing the set of most relevant features to be used in the modeling. A random splitting of the dataset into 70% and 30% is implemented to identify the training and test sets. Through a Bayesian Optimization Algorithm (BOA) and a 10-Fold Cross-Validation (CV), the algorithms are trained and validated. Then, the Predictive Performance of the MLAs is evaluated and compared by plotting the Taylor Diagram. Outcomes show that SVM and BRT are the most suitable algorithms; in the test phase, BRT has the highest Correlation Coefficient (0.96) and the lowest Root Mean Square Error (0.44 mm/year), while the SVM has the lowest difference between the standard deviation of its predictions (2.05 mm/year) and that of the reference samples (2.09 mm/year). Finally, the algorithms are used for mapping surface motion over the study area. We propose three case studies on critical stretches of two-lane rural roads for evaluating the reliability of the procedure. Road authorities could consider the proposed methodology for their monitoring, management, and planning activities.
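    The data-partitioning and error-measurement steps described in this abstract (a random 70%/30% split, 10-fold cross-validation on the training part, and RMSE) can be sketched with the standard library alone. This is an illustrative outline under stated assumptions, not the authors' pipeline: the function names are hypothetical, the folds are formed by simple striding over shuffled indices, and the Bayesian optimization and the learners themselves are omitted.

```python
import math
import random


def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Randomly split the indices 0..n-1 into (train, test) index lists."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        # Training data for this round is every fold except the held-out one.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target (e.g. mm/year)."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical sample count, for illustration only.
train_idx, test_idx = train_test_split_indices(100)
cv_rounds = list(k_fold_indices(train_idx, k=10))
```

    Keeping the test indices out of the cross-validation loop is what allows the test-phase figures to be read as an estimate of performance on unseen data.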

    Time Modalities over Many-valued Logics

    Model checking has traditionally been concerned with verifying a (critical) system against its specification, which is generally expressed in temporal logic. Although this verification technique is mature, it becomes useless when the specification incorporates vagueness, especially in the temporal constraints. This is often the case when non-critical adaptive systems are considered. These systems may tolerate small violations or may need to be aware of the satisfaction degree of their specification for re-configuration purposes. We present FTL (Fuzzy-time Temporal Logic), an extension of LTL that relaxes the notion of time, and propose a verification technique to evaluate the truth degree of such vague temporal properties. Our verification technique has been implemented in a prototype and the experimental results are promising.

    Assessing resilience of infrastructures towards exogenous events by using ps-insar-based surface motion estimates and machine learning regression techniques

    Technologically advanced strategies in infrastructural maintenance are increasingly required in countries such as Italy, where recovery and rehabilitation interventions are preferred to new works. For this purpose, Interferometric Synthetic Aperture Radar (InSAR) techniques have been employed in recent years, achieving reliable outcomes in the identification of infrastructural instabilities. Nevertheless, using the InSAR survey exclusively, it is not feasible to recognize the reasons for such vulnerabilities, and further in-depth investigations are essential. The primary purpose of this paper is to predict infrastructural displacements connected to surface motion and the related causes by combining InSAR techniques and Machine Learning algorithms. A Regression Tree-based algorithm has been developed and applied for estimating the displacement of road pavement structures detected by the Persistent Scatterer InSAR technique. The study area is located in the province of Pistoia, Tuscany, Italy. Sentinel-1 images from 2014 to 2019 were used for the interferometric process, and a set of 29 environmental parameters was collected in a GIS platform. The database is randomly split into Training (70%) and Test (30%) sets. With the Training set, through a 10-Fold Cross-Validation, the model is trained and validated, and the Goodness-of-Fit is evaluated. With the Test set, the Predictive Performance of the model is assessed. Lastly, we applied the model to a stretch of a two-lane rural road that crosses the area. Results show that the suggested procedure can be used for supporting decision-making processes on planning road maintenance by National Road Authorities.