
    Unleashing the Power of AI to Automate Cybersecurity

    The abstract is in the attachment.

    A field-measurements-based LoRa network planning tool

    Long range (LoRa) transmission technology allows energy-constrained devices, such as the tiny sensor systems used in internet-of-things applications, to be distributed over wide areas while still establishing appropriate connectivity. This has resulted in an exponentially increasing number of solutions and services based on LoRa, whether dedicated to the long-term monitoring of distributed plants and infrastructures or to human-centred applications such as safety-oriented sensor systems for use in the workplace. In dense LoRa networks, predicting the number of supported nodes in relation to their position and the propagation environment is essential for ensuring reliable and stable communication and minimising costs. In this paper, after comparing different path loss models based on a field measurement campaign for LoRa received signal strength indicator values within a university campus, two main modifications were made to the LoRa simulator tool. These were aimed at improving the accuracy of the predicted number of sustainable nodes in relation to the target data extraction rate. The simulations based on field measurements demonstrated that, through an improved path loss evaluation and the use of three gateways, the number of nodes could theoretically be increased from around 100 to around 6,000.
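
The abstract does not name the path loss models that were compared. As an illustration only, the sketch below fits the common log-distance model, rssi = rssi0 - 10 * n * log10(d / d0), to received signal strength readings by ordinary least squares. The function name, the reference distance `d0`, and the sample values are assumptions made for this example; the readings are synthetic and are not taken from the paper.

```python
import math


def fit_log_distance(distances_m, rssi_dbm, d0=1.0):
    """Least-squares fit of rssi = rssi0 - 10 * n * log10(d / d0).

    Returns (rssi0, n): the RSSI at the reference distance d0 and the
    path loss exponent. Assumes at least two distinct distances.
    """
    xs = [10.0 * math.log10(d / d0) for d in distances_m]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(rssi_dbm) / len(rssi_dbm)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rssi_dbm))
    slope = sxy / sxx            # slope of RSSI against 10*log10(d/d0)
    n = -slope                   # RSSI falls as distance grows, so n = -slope
    rssi0 = mean_y - slope * mean_x
    return rssi0, n


# Synthetic readings generated from rssi0 = -40 dBm and n = 3 (not measured data).
distances = [10.0, 100.0, 1000.0]
readings = [-70.0, -100.0, -130.0]
rssi0, exponent = fit_log_distance(distances, readings)
print(round(rssi0, 3), round(exponent, 3))
```

A fitted exponent of this kind could then be compared across candidate models or environments before feeding a network simulator.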

    BitsAndBites at SemEval-2025 Task 9: Improving Food Hazard Detection with Sequential Multitask Learning and Large Language Models

    Automatic and early detection of foodborne hazards is crucial for preventing foodborne outbreaks. Existing AI-based solutions often cannot handle the complexity and noise in food recall reports, and they struggle to overcome the dependency between product and hazard labels. We introduce a methodology for classifying reports on food-related incidents that addresses these challenges. Our approach leverages LLM-based information extraction to minimize report variability, along with a two-stage classification pipeline. The first model assigns coarse-grained labels that narrow the space of eligible fine-grained labels for the second model. This sequential process allows us to capture hierarchical label dependencies between products and hazards and between their respective categories. Additionally, we designed each model with two classification heads that rely on the inherent relations between food products and associated hazards. We validate our approach on two multi-label classification sub-tasks. Experimental results demonstrate the effectiveness of our approach, which achieves an improvement of +30% and +40% in classification performance compared to the baseline.
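
The narrowing step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the label hierarchy, the label names, and the scores are hypothetical, and the two "models" are replaced by plain score dictionaries.

```python
# Hypothetical hierarchy: coarse category -> eligible fine-grained labels.
HIERARCHY = {
    "biological": ["salmonella", "listeria"],
    "allergens": ["milk", "peanuts"],
}


def pick_labels(coarse_scores, fine_scores, hierarchy=HIERARCHY):
    """Stage 1 picks the best coarse label; stage 2 picks the best fine
    label among those the hierarchy allows for that coarse label."""
    coarse = max(coarse_scores, key=coarse_scores.get)
    eligible = hierarchy[coarse]
    fine = max(eligible, key=lambda label: fine_scores.get(label, float("-inf")))
    return coarse, fine


# Hypothetical scores standing in for the outputs of the two models.
coarse_scores = {"biological": 0.7, "allergens": 0.3}
fine_scores = {"salmonella": 0.2, "listeria": 0.35, "milk": 0.4, "peanuts": 0.05}

# "milk" has the highest raw fine score but is not eligible under "biological".
print(pick_labels(coarse_scores, fine_scores))  # ('biological', 'listeria')
```

Restricting the second stage to eligible labels is what keeps the coarse and fine predictions consistent with each other.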

    Exploring Temporal GNN Embeddings for Darknet Traffic Analysis

    Network Traffic Analysis (NTA) serves as a foundational tool for characterizing network entities and uncovering suspicious traffic patterns, thereby enhancing our understanding of network operations and security. As successfully done in other domains, due to the scarcity of labelled data, Deep Learning (DL)-based solutions for NTA have started adopting a 2-stage approach: (i) a self-supervised upstream task generates compact and information-rich representations (embeddings) of network data without the need for a ground truth; (ii) the embeddings serve as input to specialized models for downstream tasks (supervised or unsupervised), such as traffic classification or anomaly detection. Since graphs are intuitive representations of network traffic, in this work we explore the potential of temporal Graph Neural Networks (tGNNs) in generating intermediate embeddings in a self-supervised fashion. We assess the quality of such embeddings by solving a host classification problem in a darknet traffic scenario. We evaluate static and temporal GNNs over a month-long period of traffic traces. We find that the inclusion of node features and temporal aspects in the model, together with an incremental training approach, allows for an accurate description of host activity dynamics and enables the creation of 2-stage NTA pipelines.

    Detecting Edge and Node Anomalies with Temporal GNNs

    Computer and social networks can be effectively represented as complex temporal graphs where entities (nodes) keep interconnecting through various relationships (edges), forming evolving structures. Anomaly Detection (AD) in such networks consists of identifying patterns diverging from what is expected (or normal). In fact, computer and social networks lack common definitions of what is anomalous. The identification of anomalies is therefore fundamental for monitoring, management, and detection of potential threats, such as suspicious connections between nodes (edge AD) or compromised entities (node AD). However, the literature offers few solutions for detecting node anomalies. This work addresses three challenges by employing temporal Graph Neural Networks (tGNNs): fast-evolving graphs from communications networks, absence of ground truth, and simultaneous node and edge AD. For this, we propose the usage of a tGNN coupled with custom AD blocks that we train in a completely self-supervised way. We also embed an attention mechanism providing interpretability to the decision process. We extensively validate and test the tGNNs on synthetic and real-world datasets, showing that the proposed architectures successfully detect both node and edge anomalies (0.9 of average AUC).

    Dynamic Cluster Analysis to Detect and Track Novelty in Network Telescopes

    In the context of cybersecurity, tracking the activities of coordinated hosts over time is a daunting task because both participants and their behaviours evolve at a fast pace. We address this scenario by solving a dynamic novelty discovery problem with the aim of both re-identifying patterns seen in the past and highlighting new patterns. We focus on traffic collected by Network Telescopes, a primary and noisy source for cybersecurity analysis. We propose a 3-stage pipeline: (i) we learn compact representations (embeddings) of hosts through their traffic in a self-supervised fashion; (ii) via clustering, we distinguish groups of hosts performing similar activities; (iii) we track the cluster temporal evolution to highlight novel patterns. We apply our methodology to 20 days of telescope traffic during which we observe more than 8 thousand active hosts. Our results show that we efficiently identify 50-70 well-shaped clusters per day, 60-70% of which we associate with already analysed cases, while we pinpoint 10-20 previously unseen clusters per day. These correspond to activity changes and new incidents, of which we document some. In short, our novelty discovery methodology greatly simplifies the manual analysis that security analysts have to conduct to interpret novel coordinated activities.
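
The abstract does not say how clusters are matched across days. One simple way to illustrate stage (iii), offered here purely as an assumption, is to compare the host membership of each new cluster with the clusters of the previous day using Jaccard similarity, and to flag as novel any cluster with no sufficiently similar predecessor. The function names, the 0.5 threshold, and the host identifiers are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def split_known_and_novel(previous, current, threshold=0.5):
    """Match each current cluster to its most similar previous cluster.

    previous, current: dict mapping cluster id -> set of host identifiers.
    Returns (matches, novel): matches maps current id -> previous id,
    novel lists current ids with no previous cluster at or above threshold.
    """
    matches, novel = {}, []
    for cid, hosts in current.items():
        best_id, best_sim = None, 0.0
        for pid, old_hosts in previous.items():
            sim = jaccard(hosts, old_hosts)
            if sim > best_sim:
                best_id, best_sim = pid, sim
        if best_id is not None and best_sim >= threshold:
            matches[cid] = best_id
        else:
            novel.append(cid)
    return matches, novel


# Hypothetical clusters for two consecutive days.
day1 = {"A": {"h1", "h2", "h3"}, "B": {"h4", "h5"}}
day2 = {"X": {"h1", "h2", "h3", "h6"}, "Y": {"h7", "h8"}}
matches, novel = split_known_and_novel(day1, day2)
print(matches, novel)  # {'X': 'A'} ['Y']
```

Membership overlap only works when hosts persist across days; a method that compares cluster centroids in embedding space would be an alternative under the same caveat.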

    Incremental Federated Host Embeddings for Network Telescopes Traffic Analysis

    Network telescopes are ranges of IP addresses with nothing connected. They are contacted by botnets and scanners that look for possible victims. Each telescope exposes a partial view, and merging the information with that coming from other telescopes is fundamental. Machine learning allows us to build models to solve classification tasks automatically. However, the continuous evolution of traffic calls for a continuous update of such a model. This work explores applying collaborative Artificial Intelligence solutions via Federated Learning (FL) to build a global model without sharing the raw (and sensitive) data, also limiting data exchange. We leverage a two-stage pipeline: (i) a self-supervised upstream task generates and updates an incremental compact representation of the senders hitting the telescope; (ii) such embeddings serve as input for a downstream classification task to identify possible offenders. We compare the embedding that a single telescope generates with those obtained via FL from data collected by multiple telescopes and evaluate the benefits of the incremental approach. We show that FL can produce embeddings of better quality than a single network telescope can, increasing the model accuracy (+6%) and coverage (+12%) while limiting the amount of data exchanged (from GBs to MBs).
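
The abstract does not state which aggregation rule is used. As a generic illustration of how a global model can be built without exchanging raw data, the sketch below applies sample-weighted averaging of client parameters, in the style of federated averaging. The function name and the toy parameter vectors are assumptions for this example only.

```python
def federated_average(client_updates):
    """Sample-weighted average of client parameter vectors.

    client_updates: list of (num_samples, parameters) pairs, where
    parameters is a list of floats with the same length for every client.
    Only these parameters leave each client, never the raw traffic.
    """
    total = sum(n for n, _ in client_updates)
    if total == 0:
        raise ValueError("no samples reported by any client")
    dim = len(client_updates[0][1])
    averaged = [0.0] * dim
    for n, params in client_updates:
        for i, value in enumerate(params):
            averaged[i] += (n / total) * value
    return averaged


# Two hypothetical telescopes reporting 100 and 300 samples.
updates = [(100, [1.0, 2.0]), (300, [3.0, 6.0])]
global_params = federated_average(updates)
print(global_params)  # [2.5, 5.0]
```

Exchanging parameter vectors instead of traces is what makes a reduction in transferred data volume plausible in this kind of setup.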

    Sensors Characterization for a Calibration-Free Connected Smart Insole for Healthy Ageing

    The design of technological aids to assist older adults in their ageing process and to ensure proper attendance and care, despite the decreasing percentage of young people in the demographic profiles of many developed countries, requires the proper selection of sensing components, so that the resulting devices can be easily used and integrated into everyday life. This paper addresses the metrological characterization of pressure sensors to be inserted into smart insoles aimed at monitoring older adults' physical activity levels. Two types of sensing elements are evaluated and a recommendation is provided, based on the main requirement of designing a calibration-free insole: in this case, the pressure sensor should act as a switch, and the FSR 402 Short sensing element appears to be the proper solution to adopt.

    Explainable Stacking Models based on Complementary Traffic Embeddings

    Network security relies on effective measurements and analysis for identifying malicious traffic. Recent proposals aim at automatically learning compact and informative representations (i.e. embeddings) of network traffic that capture salient features. These representations can serve multiple downstream tasks, streamlining the machine learning pipeline. Researchers have proposed techniques borrowed from Natural Language Processing (NLP) and Graph Neural Networks (GNN) to learn such embeddings, with both lines delivering promising results. This paper investigates the benefits of combining complementary sources of information represented by embeddings learnt via different techniques and from different data. We rely on classifiers based on traditional feature engineering and on automatic embedding generation (borrowing from NLP and GNN) to classify hosts observed from darknets and honeypots. We then stack these base classifiers trained on each embedding through meta-learning to combine the complementary information sources to improve performance. Our results show that meta-learning outperforms each single classifier. Importantly, the proposed meta-learner provides explainability on the importance of the embedding types and the impact of each data source on the outcome. All in all, this work is a step forward in the search for more effective, general, understandable, and practical representations that could carry multiple traffic characteristics.

    Cross-network Embeddings Transfer for Traffic Analysis

    Artificial Intelligence (AI) approaches have emerged as powerful tools to improve traffic analysis for network monitoring and management. However, the lack of large labeled datasets and the ever-changing networking scenarios make a fundamental difference compared to other domains where AI is thriving. We believe the ability to transfer the specific knowledge acquired in one network (or dataset) to a different network (or dataset) would be fundamental to speed up the adoption of AI-based solutions for traffic analysis and other networking applications (e.g., cybersecurity). Here we propose and evaluate different options to transfer the knowledge built from a provider network, owning data and labels, to a customer network that desires to label its traffic but lacks labels. We formulate this problem as a domain adaptation problem that we solve with embedding alignment techniques and canonical transfer learning approaches. We present a thorough experimental analysis to assess the performance considering both supervised (e.g., classification) and unsupervised (e.g., novelty detection) downstream tasks related to darknet and honeypot traffic. Our experiments show which transfer techniques are suitable for using the models obtained from one network in a different network. We believe our contribution opens new opportunities and business models where network providers can successfully share their knowledge and AI models with customers.