IFIP Open Access Digital Library
22614 research outputs found
An AI-Based Approach to Identify Financial Risks in Transportation Infrastructure Construction Projects
Part 2: Machine Learning. “In a perfect world, free from budgets, every piece of data that is collectable would be collected, and every byte would be analyzed [...]”. Big data analysis frameworks have already found their way into mainstream application and have seen widespread deployment in scientific communities as well as in organizations across different industry fields. Moreover, AI support is now a necessary requirement for modern (big data) analysis applications. An exemplary industrial application domain that highlights the necessity of AI-supported data exploration in a real-world big data analysis scenario is the analysis of economic risks in building and construction projects. Currently, economic risk analysis is largely based on so-called expert knowledge, an experience- and intuition-based analysis of the risk. Although this experience-based (manual) process can be formalized in different ways, “construction projects are characterized by carrying a high level of uncertainty and complexity”. With a strong focus on the project execution phase, this paper outlines how ML algorithms can be applied to automatically predict the financial outcome based on financial controlling data and thus potentially assist in mitigating financial losses.
Multi-dimensional Classification on Social Media Data for Detailed Reporting with Large Language Models
Part 1: Reinforcement/Natural Language. Every day, more and more people harness the power of social media platforms to express their thoughts, share information and personal experiences, and engage with others. All this knowledge can then be transformed into informative reports with the assistance of Large Language Models (LLMs), like ChatGPT, which leverage deep learning techniques to analyze data and generate comprehensive analyses. By effectively classifying user-generated posts based on dimensions such as topic, sentiment, and emotion, it is possible to create even more detailed reports by carefully condensing large amounts of data collected along the different dimensions considered. To tackle this challenge, we have developed an automated approach with two primary goals: (i) categorizing posts across different dimensions using ready-to-use and fine-tuned classifiers; and (ii) generating detailed reports via LLMs that summarize posts with similar characteristics along the defined dimensions. In our analysis, we examined a large and varied set of posts about COVID, classifying them along several dimensions, including topic, content type, expressed sentiment and emotions, and reliability of information. Specifically, by choosing to generate a report for the main discussion topics present in the dataset, such as allergic reactions or school issues, and using the remaining dimensions for post classification, we successfully created highly detailed and informative reports with ChatGPT. These reports outperformed those generated directly by ChatGPT, in both quantitative measures such as linguistic scores and qualitative evaluations by field experts.
Session Replication Attack Through QR Code Sniffing in Passkey CTAP Registration
Passkey is an authentication method that supplements passwords and leverages the open Fast Identity Online (FIDO) standard and public key cryptography to ensure security. In this study, we uncover vulnerabilities in the Passkey registration process when it is carried out with the FIDO Client to Authenticator Protocol (CTAP) using a PC and an authenticator. We emphasize two risks: unauthorized individuals can exploit vulnerabilities in Chromium-based browsers to initiate concurrent registration processes and register their own Passkeys instead of those of legitimate users, and the server sends no registration success acknowledgment to the authenticator. Considering these vulnerabilities, we implement a session replication attack, which is a local attack, through QR code sniffing during Passkey CTAP registration, and employ physical proximity and Wi-Fi jamming attacks within the Passkey registration process. We describe the methods that enable these attacks and categorize the attack scenarios based on the victim's smartphone. Our experimental results indicate a notable success rate for attackers: more than 87% for victims with Android phones and more than 67% for victims with iPhones. We disclosed the vulnerabilities identified in Chromium-based browsers to Google.
Cross-Validation for Detecting Label Poisoning Attacks: A Study on Random Forest Algorithm
The widespread adoption of machine learning (ML) algorithms has revolutionized various aspects of modern life. However, their susceptibility to data poisoning attacks remains a significant concern due to their potential to compromise model integrity and performance. This study examines the impact of two types of data poisoning attacks on the Random Forest algorithm. It highlights the vulnerability of ML systems, especially in continual learning settings. We propose a simple yet effective strategy for continual learning ML systems to detect potential label poisoning attacks. This involves observing significant performance changes during model retraining. Experimental evaluation with Random Forest algorithms confirms the efficacy of the strategy in detecting and mitigating label poisoning attacks in continual learning systems.
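The detection idea in this abstract (watching for significant performance changes at retraining time) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `detect_label_poisoning`, the `max_drop` threshold, and the `window` size are hypothetical, and the scores are assumed to be cross-validated accuracies recorded after each earlier retraining round.

```python
from statistics import mean


def detect_label_poisoning(score_history, new_score, max_drop=0.05, window=3):
    """Flag a retraining round whose score falls well below the recent baseline.

    score_history: validation scores from earlier retraining rounds (assumed
                   to be cross-validated accuracies in [0, 1]).
    new_score:     score measured after retraining on the newest data batch.
    max_drop:      hypothetical tolerance; a larger drop is treated as suspicious.
    window:        number of most recent rounds used to form the baseline.
    """
    if not score_history:
        return False  # nothing to compare against yet
    baseline = mean(score_history[-window:])
    return (baseline - new_score) > max_drop


history = [0.91, 0.92, 0.90]
print(detect_label_poisoning(history, 0.89))  # small fluctuation, not flagged
print(detect_label_poisoning(history, 0.78))  # large drop, flagged
```

In a continual learning loop, a flagged round would typically lead to the newest batch being quarantined for inspection rather than merged into the training set; the appropriate threshold depends on how noisy the scores normally are.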
Predicting the Execution Time of Secure Neural Network Inference
In the secure neural network inference (SNNI) problem, a service provider offers inference as a service with a pre-trained neural network (NN). Clients can use the service by providing an input and obtaining the output of the inference with the NN. For reasons of privacy and intellectual property protection, the service provider must not learn anything about the input or the output, and the client must not learn anything about the internal parameters of the NN. This is possible by applying techniques like multi-party computing (MPC) or homomorphic encryption (HE), although with a significant performance overhead. One way to improve the efficiency of SNNI is by selecting NN architectures that can be evaluated faster using MPC or HE. For this, it would be important to predict how long SNNI with a given NN takes. This turns out to be challenging. Traditional predictors for NN inference time, like the number of parameters in the NN, are poor predictors of SNNI execution time, since they ignore the characteristics of cryptographic protocols. This paper is the first to address this problem. We propose three different prediction methods for SNNI execution time, and investigate experimentally their strengths and weaknesses. The results show that the proposed methods offer different advantages in terms of accuracy and speed.
Obfuscating Code Vulnerabilities Against Static Analysis in Android Apps
In this paper, we investigate using obfuscation as a security-through-obscurity approach to hide code vulnerabilities in Android apps. Obfuscation refers to a set of techniques that change the syntax of the code but preserve its semantics. This way, the app maintains the same runtime behavior, but the obfuscated code is hard for a human to read. Here, we aim to empirically assess whether obfuscation could also negatively affect the vulnerability detection rate of SAST (i.e., Static Application Security Testing) tools. Such tools automatically reverse-engineer the app and look for vulnerability patterns in the code according to suitable heuristics. Our findings show that obfuscation reduces the detection rate of SAST tools, suggesting that investigating novel and vulnerability-focused obfuscation techniques in the future may reduce the probability of an attacker detecting vulnerabilities in obfuscated app code, both manually (due to unreadability) and automatically (by deceiving SAST tools).
A Novel Signature for Distinguishing Non-lesional from Lesional Skin of Atopic Dermatitis Based on a Machine Learning Approach
Part 1: Biomedical/Classification. Atopic dermatitis is a common inflammatory skin disease, characterized by great heterogeneity and complexity. Its underlying causes are not yet fully understood. As a result, current therapies do not always lead to satisfactory outcomes. Very few studies have addressed the potential use of transcriptomic data and machine learning algorithms in atopic dermatitis. In this paper, we present and detail the use of machine learning models over omics data to identify potential biomarkers for distinguishing non-lesional from lesional skin samples in patients with atopic dermatitis. In particular, we identified an optimal signature that includes eight genes (FUT3, STRIP2, SMPD3, ZNF285, BTC, SUSD2, HSD11B1 and FABP7) and obtained an AUC of 0.839 and an accuracy of 86.42%. We performed some functional analyses and concluded that some potential biomarkers interfere with the same molecular mechanisms and may be involved in atopic dermatitis. We expect to provide new insights for a deeper comprehension of the mechanisms behind the manifestation of the disease.
Exploration of Ensemble Methods for Cyber Attack Detection in Cyber-Physical Systems
Part 2: Cyber Security/Anomaly Detection. Cyber-physical systems (CPS) are prevalent in critical infrastructure, industrial settings, cybersecurity, healthcare, transportation and more. As applications and benefits of CPS continue to expand in all aspects of human existence, the number of cyber attacks increases exponentially. Although there exist myriad ensemble techniques for cyber attack detection, identifying the most suitable one for a given dataset can be challenging. This study presents a comparative analysis of ensemble methods for detecting binary and multiclass cyber attacks in a CPS, specifically a water distribution system. This research focuses on the application and efficacy of various ensemble learning techniques such as voting, bagging, boosting and stacking using the Water Distribution testbed (WDT) dataset. The results of the experiment demonstrated that the Bagging Decision Trees ensemble (BAGTREE) achieved high performance in both binary and multiclass classification. BAGTREE reached 98% accuracy in binary classification and 99% accuracy in multiclass classification.
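Of the ensemble techniques named in this abstract, hard voting is the simplest to show in a few lines. The sketch below is illustrative only and is not taken from the paper: the function name `majority_vote` and the toy labels are assumptions, and each inner list stands for one already-trained model's predictions over the same samples.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label list by hard majority vote.

    predictions_per_model: one list of predicted labels per model, all of the
    same length and in the same sample order.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) yields the label with the highest count; on a tie,
        # the label that appeared first among the votes is kept.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three detectors on three samples.
model_outputs = [
    ["attack", "normal", "normal"],
    ["attack", "attack", "normal"],
    ["normal", "attack", "normal"],
]
print(majority_vote(model_outputs))  # ['attack', 'attack', 'normal']
```

Bagging, the approach the abstract reports as strongest, applies the same vote to copies of one base learner, each trained on a bootstrap resample of the training data.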
Advanced Mortality Prediction in Adult ICU: Introducing a Deep Learning Approach in Healthcare
Part 1: Biomedical/Classification. Accurate mortality prediction in Intensive Care Units (ICUs) is crucial for optimizing patient care and resource allocation. Traditional prediction models such as Acute Physiology and Chronic Health Evaluation (APACHE) and Simplified Acute Physiology Score (SAPS) are essential in guiding clinical decision-making and resource allocation in critical care settings. While effective, they have limitations that restrict their adaptability and predictive efficacy. Recent advances in Machine Learning (ML), especially in deep learning, present promising opportunities for enhancing prediction accuracy. This study provides a comprehensive evaluation of ML algorithms, including deep learning, for predicting mortality in adult ICUs using certain clinical inputs. Various ML techniques underwent thorough examination, preprocessing, and hyperparameter optimization. An ensemble approach combining multiple models, such as CatBoost, LightGBM, Feedforward Neural Networks (FNNs) and Extra Trees, yielded higher performance, achieving an AUC of 0.873 and an accuracy of 81.82%, compared with the traditional APACHE IV model (AUC: 0.819). The study bridges gaps in current research by exploring advanced ML methodologies and demonstrates the potential of deep learning in ICU mortality prediction. In doing so, it makes a significant stride towards advancing predictive analytics in healthcare.
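Several abstracts in this listing report AUC values. As a reminder of what that metric measures, here is a minimal pure-Python sketch of the ROC AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative one (ties count half). The function name `roc_auc` and the toy data are assumptions for illustration and are unrelated to the clinical data used in the paper.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth outcomes (1 = positive class).
    scores: predicted risk scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise form is quadratic in the number of samples, so it is suited to explanation rather than to large datasets, where a rank-based computation is preferable.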
Optimization of Healthcare Process Management Using Machine Learning
Part 1: Biomedical/Classification. Healthcare management plays a crucial role in ensuring the efficient delivery of healthcare services. This responsibility encompasses strategic planning, resource management, and regulatory compliance to enhance patient care outcomes. In this paper, we delve into the multifaceted nature of healthcare management, highlighting the expertise required to optimize processes within dynamic healthcare environments. Furthermore, we explore the potential of machine learning in addressing operational challenges within healthcare. By examining various machine learning algorithms, we identify their advantages and limitations, proposing a structured method of application. Through this analysis, we aim to show how machine learning can reduce patient waiting times and optimize overall healthcare operations and logistics. By combining cutting-edge technologies with strategic methodologies, healthcare entities can leverage machine learning to enhance operational efficiency and improve the delivery of patient care.