Editorial: AI meets cybersecurity

Authors: Andresini G., Appice A.
Publication date: 01/01/2023

Cyberspace has revolutionised our ways of life, prompting important advances in science and technology. However, the benefits of cyberspace are now threatened by the systemic risk posed by the proliferation of offensive cyber-tools and cyber-operations. Cybersecurity is the practice of protecting networks, systems and any other digital infrastructure from malicious attacks. Traditional cybersecurity relies on the static control of cyberspace security, monitoring it according to pre-defined rules. However, this passive and reactive defence methodology is no longer useful for protecting cyberspace against new cybersecurity threats. Artificial Intelligence (AI) technologies, such as machine learning and deep learning, have recently been introduced in cybersecurity to build smart models for malware classification, intrusion detection, and vulnerability and threat discovery. On the other hand, AI can also be used by attackers to continually improve their techniques and refine their offensive capabilities. Therefore, studying the effectiveness of both defensive and offensive AI is crucial to ensure that modern cybersecurity is equipped to deal with emerging threats arising from malicious uses of AI.

Archivio istituzionale della ricerca - Università di Bari

Nearest cluster-based intrusion detection through convolutional neural networks

Authors: Malerba D., Andresini G., Appice A.
Publication date: 01/01/2021

The recent boom in deep learning has shown that deep neural networks are a valuable way to address network intrusion detection problems. This paper presents a novel deep learning methodology that uses convolutional neural networks (CNNs) to equip a computer network with an effective means of analysing traffic for signs of malicious activity. The basic idea is to represent network flows as 2D images and use this imagery representation of the flows to train a 2D CNN architecture. The novelty lies in deriving the imagery representation of the network flows by combining nearest neighbour search with clustering. The advantage is that the proposed data mapping method builds imagery data that express potential data patterns arising in neighbouring flows. The proposed methodology leads to better predictive accuracy than competitive intrusion detection architectures on three benchmark datasets.
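The abstract does not spell out how nearest neighbour search and clustering are combined, so the following is only a minimal sketch of one plausible reading. It assumes that flows of each class have already been clustered, and that each flow is stacked with its nearest normal centroid and its nearest attack centroid to form a small 2D array. The function names, the three-row layout and the toy centroids are all hypothetical, and the CNN itself is not shown.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest(point, candidates):
    """Return the candidate vector closest to `point`."""
    return min(candidates, key=lambda c: euclidean(point, c))


def flow_to_image(flow, normal_centroids, attack_centroids):
    """Build a 3 x len(flow) array: the flow, its nearest normal centroid,
    and its nearest attack centroid (an assumed layout, not the paper's)."""
    return [
        list(flow),
        list(nearest(flow, normal_centroids)),
        list(nearest(flow, attack_centroids)),
    ]


# Toy centroids standing in for the output of a clustering step.
normal_centroids = [[0.1, 0.2, 0.1], [0.3, 0.1, 0.2]]
attack_centroids = [[0.9, 0.8, 0.7], [0.6, 0.9, 0.9]]

image = flow_to_image([0.2, 0.2, 0.1], normal_centroids, attack_centroids)
```

Under this assumption, each image places a flow next to prototypes of both classes, so a convolution spanning the rows can compare the flow with its neighbourhood.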

Archivio istituzionale della ricerca - Università di Bari

Autoencoder-based deep metric learning for network intrusion detection

Authors: Malerba D., Andresini G., Appice A.
Publication date: 01/01/2021

Intrusion detection systems are now an essential weapon against the ever-increasing number of network cyber attacks. In this study we illustrate a new intrusion detection method that analyses the flow-based characteristics of network traffic data. It learns an intrusion detection model by leveraging a deep metric learning methodology that combines autoencoders and Triplet networks in an original way. In the training stage, two separate autoencoders are trained on historical normal network flows and attacks, respectively. Then a Triplet network is trained to learn an embedding of the feature vector representation of network flows. This embedding moves each flow close to its reconstruction by the autoencoder of the flow's own class, and away from its reconstruction by the autoencoder of the opposite class. The predictive stage assigns each new flow to the class whose autoencoder yields the closest reconstruction of the flow in the embedding space. In this way, the predictive stage takes advantage of the embedding learned in the training stage, achieving good performance in detecting new signs of malicious activity in network traffic. The proposed methodology leads to better predictive accuracy than competitive intrusion detection architectures on benchmark datasets.
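The training objective and the prediction rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the trained autoencoders are replaced by toy functions that pull a flow towards a class prototype, the learned embedding is replaced by the identity, and the margin value and all names are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss. Here the anchor is the embedded flow,
    the positive is its same-class reconstruction and the negative is its
    opposite-class reconstruction."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def predict(flow, embed, autoencoders):
    """Assign the flow to the class whose autoencoder reconstruction lies
    closest to the flow in the embedding space."""
    z = embed(flow)
    return min(
        autoencoders,
        key=lambda label: euclidean(z, embed(autoencoders[label](flow))),
    )


def make_toy_autoencoder(prototype, pull=0.8):
    """Hypothetical stand-in for a trained autoencoder: it moves the input
    most of the way towards a fixed class prototype."""
    def reconstruct(x):
        return [xi + pull * (pi - xi) for xi, pi in zip(x, prototype)]
    return reconstruct


def identity_embed(x):
    """Hypothetical stand-in for the learned Triplet embedding."""
    return list(x)


autoencoders = {
    "normal": make_toy_autoencoder([0.0, 0.0]),
    "attack": make_toy_autoencoder([5.0, 5.0]),
}

label_a = predict([0.5, 0.2], identity_embed, autoencoders)
label_b = predict([4.6, 5.1], identity_embed, autoencoders)
```

In a real system, `embed` and the two reconstruction functions would be neural networks trained on historical flows. Only the decision rule is meant to carry over.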

Archivio istituzionale della ricerca - Università di Bari

Clustering-Aided Multi-View Classification: A Case Study on Android Malware Detection

Authors: Malerba D., Andresini G., Appice A.
Publication date: 01/01/2020

Recognizing malware before its installation plays a crucial role in keeping an Android device safe. In this paper we describe a supervised method that can analyse multiple types of information (e.g. permissions, API calls and network addresses) retrieved through a broad static analysis of Android applications. In particular, we propose a novel multi-view machine learning approach to malware detection, which couples knowledge extracted via both clustering and classification. We evaluate the effectiveness of the proposed method using benchmark Android applications and established machine learning metrics.

Archivio istituzionale della ricerca - Università di Bari

SILVIA: An eXplainable Framework to Map Bark Beetle Infestation in Sentinel-2 Images

Authors: Malerba D., Andresini G., Appice A.
Publication date: 01/01/2023

Recent long spells of high temperatures and drought-hit summers have fostered the conditions for an unprecedented outbreak of bark beetles in Europe. This phenomenon has ruined vast swathes of European conifer forests, creating a need among forest managers for effective methods to map bark beetle infestation hotspots. Sentinel-2 data have recently been established as an alternative to field surveys for certain inventory tasks. Hence, this study explores the use of machine learning to perform the inventory mapping of bark beetle infestation hotspots in Sentinel-2 images. In particular, we investigate the accuracy of a spectral classifier that is learned for the study task by leveraging spectral vegetation indices and performing self-training. We use a dataset of Sentinel-2 images of non-overlapping forest scenes from the North-east of France, acquired in October 2018. The selected scenes host bark beetle infestation hotspots of different sizes, which originate from the mass reproduction of the bark beetle in the 2018 infestation. We perform a learning stage by accounting for the ground-truth bark beetle infestation masks of a subset of images in the study imagery dataset (training imagery set). The goal is to predict the bark beetle infestation masks for the remaining images in the study imagery dataset (working imagery set). Moreover, we use an explainable artificial intelligence technique to study the relevance of spectral information and explain the effect of both self-training and spectral vegetation indices on the mapping decisions.
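The abstract does not say which spectral vegetation indices are used. As an illustration only, the sketch below computes NDVI, a widely used index, from the Sentinel-2 red (B04) and near-infrared (B08) bands, and appends it to a pixel's feature vector. The choice of NDVI, the function names and the sample reflectance values are assumptions. The self-training step is not shown.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - red) / (NIR + red).
    Returns 0.0 when both reflectances are zero to avoid division by zero."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def pixel_features(bands):
    """Turn a dict of band reflectances into a feature list with NDVI appended.
    `bands` maps Sentinel-2 band names (here only B04 and B08) to reflectance."""
    red, nir = bands["B04"], bands["B08"]
    return [red, nir, ndvi(nir, red)]


# Hypothetical reflectance values for a densely vegetated pixel.
features = pixel_features({"B04": 0.1, "B08": 0.5})
```

An index such as this adds a derived channel to the raw bands, which a spectral classifier can then use alongside them.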

Archivio istituzionale della ricerca - Università di Bari

DIAMANTE: A data-centric semantic segmentation approach to map tree dieback induced by bark beetle infestations via satellite images

Authors: Recchia V., Ienco D., Andresini G., Appice A.
Publication date: 01/01/2024

Forest tree dieback inventory has a crucial role in improving forest management strategies. This inventory is traditionally performed by foresters through laborious and time-consuming human assessment of individual trees. On the other hand, the large amount of Earth satellite data that are publicly available through the Copernicus program, and that can be processed with advanced deep learning techniques, has recently been established as an alternative to field surveys for forest tree dieback tasks. However, to realize its full potential, deep learning requires a deep understanding of satellite data, since the data collection and preparation steps are as essential as the model development step. In this study, we explore the performance of a data-centric semantic segmentation approach to detect forest tree dieback events due to bark beetle infestation in satellite images. The proposed approach prepares a multisensor dataset collected using both the SAR Sentinel-1 sensor and the optical Sentinel-2 sensor, and uses this dataset to train a multisensor semantic segmentation model. The evaluation shows the effectiveness of the proposed approach in a real inventory case study of non-overlapping forest scenes from the North-east of France, acquired in October 2018. The selected scenes host bark beetle infestation hotspots of different sizes, which originate from the mass reproduction of the bark beetle in the 2018 infestation.

Archivio istituzionale della ricerca - Università di Bari

GAN augmentation to deal with imbalance in imaging-based intrusion detection

Authors: Malerba D., De Rose L., Andresini G., Appice A.
Publication date: 01/01/2021

Attacks on computer networks continue to advance at a rate that outpaces cyber defenders’ ability to write new attack signatures. This paper illustrates a deep learning methodology for the binary classification of network traffic. The basic idea is to represent network flows as 2D images and use this imagery representation of the network traffic to train a Generative Adversarial Network (GAN) and a Convolutional Neural Network (CNN). The GAN is trained to produce new images of unforeseen network attacks, which augment the training data used to learn a CNN-based intrusion detection model. The advantage is that the 2D data mapping technique builds images of the network flows, which allow us to take advantage of deep learning architectures with convolution layers. In addition, the GAN-based data augmentation allows us to deal with the possible imbalance of malicious traffic, which is commonly rarer than normal traffic. Specifically, it is used to simulate unforeseen attacks in order to train a robust intrusion detection model. The proposed methodology leads to better predictive accuracy than competitive intrusion detection architectures on four benchmark datasets.

Archivio istituzionale della ricerca - Università di Bari

GLORIA: A Graph Convolutional Network-Based Approach for Review Spam Detection

Authors: Gasbarro R., Malerba D., Andresini G., Appice A.
Publication date: 01/01/2023

Spam reviews contain untruthful content created with malevolent intent, to affect the overall reputation of a product, service or company. This content is commonly produced by malicious users or automated programs (i.e., bots) that mimic human behaviour. With the recent boom of online review systems, accurate review spam detection has become of primary importance for a review platform, to mitigate the effect of malicious users responsible for untruthful content. In this work, we propose a review spam classification approach, named GLORIA, that adopts a graph representation of review data and trains a graph convolutional neural network for edge classification as a review spam detection model. In particular, GLORIA represents both users (i.e., authors of reviews) and products (i.e., reviewed items) as nodes of a heterogeneous graph, while it represents reviews as graph edges that connect each author of a review to the reviewed item. Features of users and products are associated with nodes, while features of reviews are associated with edges. Experiments performed on publicly available review datasets show the effectiveness of the proposed approach compared with some state-of-the-art approaches.
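The graph representation described in this abstract can be sketched as a small data structure. This is an illustrative assumption about the layout, not GLORIA's actual code: the class name, the feature values and the naive per-edge concatenation are hypothetical, and the graph convolutional network that would propagate node features before edge classification is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewGraph:
    """Heterogeneous graph: users and products are nodes, reviews are edges."""
    user_features: dict = field(default_factory=dict)     # user id -> feature list
    product_features: dict = field(default_factory=dict)  # product id -> feature list
    edges: list = field(default_factory=list)              # (user id, product id, review features)

    def add_review(self, user, product, review_features):
        """Add a review edge between an existing user node and product node."""
        if user not in self.user_features:
            raise KeyError(f"unknown user: {user}")
        if product not in self.product_features:
            raise KeyError(f"unknown product: {product}")
        self.edges.append((user, product, list(review_features)))

    def edge_inputs(self):
        """Concatenate user, review and product features for every edge.
        A naive input for an edge classifier, with no message passing."""
        return [
            self.user_features[u] + r + self.product_features[p]
            for u, p, r in self.edges
        ]


graph = ReviewGraph()
graph.user_features["u1"] = [12.0]     # hypothetical: number of reviews written
graph.product_features["p1"] = [4.2]   # hypothetical: average rating
graph.add_review("u1", "p1", [5.0, 0.1])  # hypothetical: rating, text score
rows = graph.edge_inputs()
```

Each row corresponds to one review, so a spam or non-spam label attaches to an edge and not to a node.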

Archivio istituzionale della ricerca - Università di Bari

A two-step network intrusion detection system for multi-class classification

Authors: Malerba D., Andresini G., Appice A.
Publication date: 01/01/2021

A network intrusion detection system aims to discover any unauthorised access to computer networks by analysing the network traffic for signs of malicious activity. In this paper, we present a two-step system for network intrusion detection. The first step comprises a Triplet network that processes the flow-based characteristics of the historical network traffic data to learn an embedding space, where distances between samples labelled with opposite classes are greater than distances between samples labelled with the same class. We take advantage of this embedding space to separate the normal samples from the malicious ones. The second step uses a multi-class eXtreme Gradient Boosting classifier to recognize the attack family of the detected malicious flows. The experiments show the effectiveness of the proposed system, as it leads to higher accuracy than several recent competitors.

Archivio istituzionale della ricerca - Università di Bari

PANACEA: a neural model ensemble for cyber-threat detection

Authors: Malerba D., AL-Essa M., Andresini G., Appice A.
Publication date: 01/01/2024

Ensemble learning is a strategy commonly used to fuse different base models by creating a model ensemble that is expected to be more accurate on unseen data than the base models. This study describes a new cyber-threat detection method, called PANACEA, that uses ensemble learning coupled with adversarial training in deep learning, in order to gain accuracy with neural models trained on cybersecurity problems. The selection of the base models is one of the main challenges in training accurate ensembles. This study describes a model ensemble pruning approach based on eXplainable AI (XAI) to increase the ensemble diversity and gain accuracy in ensemble classification. We build on the idea that identifying base models that give relevance to different input feature sub-spaces may help improve the accuracy of an ensemble trained to recognise different signatures of different cyber-attack patterns. To this end, we use a global XAI technique to measure the ensemble model diversity with respect to the effect of the input features on the accuracy of the base neural models combined in the ensemble. Experiments carried out on four benchmark cybersecurity datasets (three network intrusion detection datasets and one malware detection dataset) show the beneficial effects of the proposed combination of adversarial training, ensemble learning and XAI on the accuracy of multi-class classifications of cyber-data achieved by the neural model ensemble.
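The idea of pruning an ensemble so that the retained base models rely on different input features can be sketched as follows. This is not PANACEA's procedure: the abstract does not name the XAI technique or the selection rule, so the sketch assumes that each base model already has a global feature-importance vector, measures similarity with cosine similarity, and selects models greedily. All names and values are hypothetical, and adversarial training is not shown.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity between two feature-importance vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def prune_by_diversity(importances, keep):
    """Greedy pruning (assumed rule): start from the first model, then keep
    adding the model whose importance vector is least similar to those
    already selected, until `keep` models are chosen."""
    names = list(importances)
    selected = [names[0]]
    while len(selected) < min(keep, len(names)):
        remaining = [n for n in names if n not in selected]
        best = min(
            remaining,
            key=lambda n: max(
                cosine_similarity(importances[n], importances[s]) for s in selected
            ),
        )
        selected.append(best)
    return selected


def majority_vote(predictions):
    """Return the most frequent class label among base-model predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical global importance vectors over three input features.
importances = {
    "m1": [0.9, 0.1, 0.0],
    "m2": [0.8, 0.2, 0.0],
    "m3": [0.0, 0.1, 0.9],
}
kept = prune_by_diversity(importances, keep=2)
decision = majority_vote(["dos", "dos", "normal"])
```

With these toy vectors, `m2` is dropped because it attends to nearly the same features as `m1`, while `m3` is kept because it relies on a different feature.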

Archivio istituzionale della ricerca - Università di Bari