1,720,965 research outputs found
Mathematical Methods for eXplainable and Reliable Machine Learning in Trustworthy Artificial Intelligence
The common thread in all my research has been the goal of finding a so-called safety region in the input space of an inference model, that is, a region that allows probabilistic guarantees on the output of the model and provides tools to control the prediction. The idea of a safety region fits well with the task of classification in machine learning: the goal is to classify instances into well-defined, closed envelopes while respecting some probabilistic performance guarantees. My research therefore started from a thorough review of the main classification algorithms in machine learning, from support vector machines to neural networks, including rule-based models. The algorithm best suited to my purpose turned out to be Support Vector Data Description (SVDD), an established algorithm for outlier detection whose main purpose is to enclose target data within a sphere whose center and radius are learned from the data distribution. The choice of this algorithm for defining the safety region is straightforward and well supported: SVDD defines a closed region in the input space and also provides a radius that can easily control the shape of the classification boundary, “inflating” or “deflating” it according to the performance objective. Starting from a totally data-driven definition of safety region, with only empirical (but effective) performance guarantees, I moved to a more mathematical definition, placing my idea of safety region within the framework of probabilistic scaling. This technique, rooted in order statistics, provides a clear and rigorous way to obtain probabilistic guarantees on the safety region.
Moreover, I applied the idea of safety region to a broader class of classifiers, called scalable classifiers, i.e., classification models that share a scalable parameter in the definition of the classifier’s predictor; this parameter can be adjusted to obtain the desired guarantees for the safety region. I also specialized these concepts to exponential distributions, which give safety regions special properties. This makes it possible to extend the concepts developed in Chapter 3 from SVDD to any kind of machine learning classifier. In particular, I introduced new algorithms both to control performance in classification and to obtain probabilistic guarantees on the safety region. Performance control was achieved by minimizing the misclassification error, reducing the number of false positives, false negatives, or both, depending on the application. The probabilistic guarantee, on the other hand, has been established mathematically. Both concepts can be applied to real-world problems to achieve safety in cyber-physical systems applications, such as vehicle platooning monitoring, DNS tunneling detection, and type-2 diabetes prediction, to name a few tested applications of my methods.
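The role of a scalable parameter can be illustrated with a minimal sketch. It assumes a spherical region with a fixed, already known center (a stand-in for a trained SVDD, not the thesis's algorithm) and picks the radius as an order statistic of the distances of unsafe calibration points, so that at most a fraction `epsilon` of them falls strictly inside. The names `scaled_radius`, `is_safe`, and `epsilon` are hypothetical, and the sketch gives only an empirical bound on the calibration set, not the probabilistic guarantee discussed above.

```python
import numpy as np

def scaled_radius(center, unsafe_points, epsilon):
    """Radius such that at most floor(epsilon * n) unsafe calibration
    points lie strictly inside the sphere (empirical bound only)."""
    dists = np.sort(np.linalg.norm(unsafe_points - center, axis=1))
    # Index of the order statistic; clamp so epsilon = 1 stays in range.
    k = min(int(np.floor(epsilon * len(dists))), len(dists) - 1)
    return float(dists[k])

def is_safe(x, center, radius):
    """A point is declared safe when it lies strictly inside the sphere."""
    return bool(np.linalg.norm(x - center) < radius)

# Toy calibration data (assumed): unsafe points at distances 1, 2, 3, 4.
center = np.zeros(2)
unsafe = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [0.0, 4.0]])
radius = scaled_radius(center, unsafe, epsilon=0.5)
n_inside = int(sum(is_safe(p, center, radius) for p in unsafe))
```

Lowering `epsilon` "deflates" the region and raising it "inflates" it, which is the kind of trade-off the scalable parameter is meant to expose.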
Before reaching these results, however, I explored several other approaches. Another line of research for defining the safety region was the use of conformal prediction, a recent but well-established theory for evaluating the conformity of machine learning predictions. The idea behind conformal prediction is that it is possible to calibrate an algorithm so as to obtain marginal probabilistic coverage, i.e., a guarantee that the desired output of the model is as expected. In this field, it is necessary to define a real-valued function, called score function, that encodes the characteristics of the model, and to calibrate the algorithm on the values this function takes on a calibration set. This line of research shows good prospects and is one of the lines I will follow in my future work.
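The calibration step described above can be sketched with standard split conformal prediction. This is a generic illustration, not the score function developed in the thesis: the score is assumed to be one minus the predicted probability of the true class, and the names `conformal_threshold` and `prediction_set` are hypothetical.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of the calibration scores."""
    n = len(cal_scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for this alpha: accept every label.
        return float("inf")
    return float(np.sort(cal_scores)[k - 1])

def prediction_set(probs, threshold):
    """Labels whose score 1 - p(label) does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]

# Toy calibration scores (assumed), one per calibration example.
cal_scores = np.array([0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6])
threshold = conformal_threshold(cal_scores, alpha=0.5)
labels = prediction_set([0.6, 0.3, 0.1], threshold)
```

Under exchangeability of calibration and test data, sets built this way contain the true label with probability at least `1 - alpha` on average over the data (marginal coverage).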
But reliability is not enough to make AI totally trustworthy: controllability is another crucial aspect to consider. From this point of view, I focused on studying and developing new techniques to control the output of a classification algorithm. This was done in the spirit of counterfactual explanation, a fairly new but already state-of-the-art eXplainable AI technique. The idea of counterfactual explanations is that it is possible to minimally change the input parameters of a machine learning algorithm so as to change the prediction results. As will become clear in the chapter dedicated to counterfactual explanations (Chapter 9), the expression “minimal change” refers to minimizing a specific cost function between the actual input and the desired one. My contribution to this topic lies in the development of a counterfactual approach based on SVDD, fully in line with the idea of safety region investigated in the first part of my research. I first attempted to solve the resulting problem completely analytically, but then, given the complexity of the task, developed a numerical solution based on random sampling techniques. The algorithm, again, was applied to real-world problems, such as crowd control in subways. This topic, however, allows for more exploration, for example by merging it with the conformal framework provided by safety regions.
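A random-sampling search for a counterfactual can be sketched as follows. This is an illustrative simplification, not the thesis's algorithm: the target class is assumed to be a sphere with known `center` and `radius` (standing in for a trained SVDD), the cost is assumed to be the Euclidean distance, and the function name and its parameters are hypothetical.

```python
import numpy as np

def sample_counterfactual(x, center, radius, n_samples=2000, scale=1.5, seed=0):
    """Return the sampled point inside the target sphere closest to x,
    or None if no sampled candidate falls inside the sphere."""
    rng = np.random.default_rng(seed)
    # Draw Gaussian perturbations around the original input.
    candidates = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    inside = np.linalg.norm(candidates - center, axis=1) <= radius
    if not inside.any():
        return None
    feasible = candidates[inside]
    # Keep the feasible candidate with the smallest Euclidean cost.
    costs = np.linalg.norm(feasible - x, axis=1)
    return feasible[np.argmin(costs)]

x = np.array([3.0, 0.0])   # input currently outside the target region
center = np.zeros(2)
cf = sample_counterfactual(x, center, radius=1.0)
```

The result is only an approximate minimizer: its quality depends on the number of samples and on the perturbation scale, which is the usual price of replacing an analytical solution with sampling.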
Finally, all the work presented in this thesis is set within the context of explainable AI, the field of study dedicated to making AI explainable and expressible through intelligible rules. In this regard, explainable AI can also be framed in terms of controllability and reliability, thus placing all my research fully in line with this theme.
In conclusion, my thesis covers three years of research in the field of artificial intelligence, mostly devoted to the problem of how to make a machine learning algorithm reliable, explainable and controllable, with the hope of having genuinely improved the body of knowledge on such a crucial aspect of science
CONFIDERAI: CONFormal Interpretable-by-Design score function for Explainable and Reliable Artificial Intelligence
The concept of trustworthiness has been interpreted in different ways in the field of artificial intelligence, but all its definitions agree on two main pillars: explainability and conformity. In this extended abstract, our aim is to outline how to merge these concepts by defining a new framework for conformal rule-based predictions. In particular, we introduce a new score function for rule-based models that leverages rule relevance and the geometrical position of points relative to rule classification boundaries
eXplainable Checker of Video Analytics Performance in Indoor Smart Mobility
Ensuring the performance of object detection systems in dynamic environments requires not only accurate predictions but also the ability to assess the certainty of those predictions. This paper proposes an interpretable framework for monitoring the Operational Design Domain (ODD) of real-time object detectors through visual feature-based certainty evaluation. Using a dual-path architecture, the system combines a standard object detection pipeline with a parallel branch that extracts visual features and classifies predictions as Certain or Uncertain using decision rules learned via Decision Trees (DTs). A Cumulative Feature Ranking (CFR) strategy ensures robust selection of discriminative features across perturbed and real-world datasets. Extensive experiments on the Pedestrian and Wheelchair object categories demonstrate the system’s ability to detect prediction uncertainty under a variety of visual conditions. The interpretable nature of the learned rules provides transparency, while the low false alarm rate demonstrates the effectiveness of the ODD checker in supporting safe and explainable perception for indoor smart mobility applications
On The Detection Of Adversarial Attacks Through Reliable AI
Adversarial machine learning manipulates datasets to mislead machine learning algorithm decisions. We propose a new approach able to detect adversarial attacks, based on eXplainable and Reliable AI. The results obtained show how canonical algorithms may have difficulty in identifying attacks, while the proposed approach is able to correctly identify different adversarial settings
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
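The difference between the two counting schemes can be sketched on toy data. The function name `cocitation_counts`, the parameter `max_authors`, and the data layout are hypothetical, and the sketch assumes that co-authors of a single cited work also count as co-cited, which may differ from the counting rules used in the study.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs that appear together in a reference list.

    citing_papers: one entry per citing paper, each a list of cited works,
    each cited work given as its ordered list of authors.
    max_authors: how many leading authors of each cited work are counted.
    """
    counts = Counter()
    for cited_works in citing_papers:
        authors = set()
        for work_authors in cited_works:
            authors.update(work_authors[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts

# Toy reference lists (assumed): two citing papers, two cited works each.
papers = [
    [["A", "B"], ["C"]],
    [["A"], ["C", "D"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting only the pair (A, C) appears, while counting up to five authors also brings B and D into the co-citation matrix.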
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance. Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
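One well-known difference between the two measures can be checked numerically: appending authors with whom neither profile is cocited (joint zeros) leaves the cosine unchanged but shifts the Pearson correlation. The sketch below uses made-up cocitation profiles and is not necessarily the argument or the example given in the paper.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two cocitation profiles."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pearson(u, v):
    """Pearson correlation between two cocitation profiles."""
    return float(np.corrcoef(u, v)[0, 1])

# Toy cocitation profiles of two authors (assumed data).
a = np.array([3.0, 0.0, 2.0, 1.0])
b = np.array([1.0, 2.0, 0.0, 3.0])

# The same profiles after adding six authors cocited with neither of them.
a_pad = np.concatenate([a, np.zeros(6)])
b_pad = np.concatenate([b, np.zeros(6)])

cos_before, cos_after = cosine(a, b), cosine(a_pad, b_pad)
r_before, r_after = pearson(a, b), pearson(a_pad, b_pad)
```

Here the cosine is the same before and after padding, while the Pearson correlation moves from negative to positive, so the Pearson value depends on which unrelated authors happen to be included in the analysis.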
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods
- …
