1,721,013 research outputs found

Tuning the transfer function : the reversed wedge and beyond

Author: BEX Geert Jan
Publication venue
Publication date: 01/01/1996
Field of study

Not available

Document Server@UHasselt

Topic modelling and text classification models for applications within EFSA

Author: Brecht Vandevoort, Frank Neven, Geert-Jan Bex, Jonas Crevecoeur
Publication venue
Publication date: 2023
Field of study

This report presents an overview of topic modelling and classification models in relation to four case studies in the EFSA project OC/EFSA/AMU/2020/02. As adequate document embeddings have a positive influence on the effectiveness of topic modelling as well as text classification, a wide range of options for word and document embeddings is discussed. It was found that a multitude of increasingly complex embeddings are readily available for off-the-shelf use. However, as they are trained on large but mostly general text corpora, their utility for domain-specific text varies. Fine-tuning or creating document embeddings from scratch is only feasible in the presence of enough data and has an associated computational cost. For some domains (like scientific articles), pretrained embeddings are available. For topic modelling, we discuss standard techniques like non-negative matrix factorization and latent Dirichlet allocation as well as more recent methods based on clustering of document embeddings like Top2Vec and BERTopic. For text classification, we consider hierarchical text classification approaches combined with established techniques for text classification via document embeddings. We propose a selection of techniques for each of the case studies, justify their choice, and present a plan for evaluation. Finally, we discuss our findings after having implemented and validated the selected techniques.

Crossref

Document Server@UHasselt (Universiteit Hasselt)

Discovering structure in semi-structured data

Author: Bex Geert Jan
Publication venue
Publication date: 01/01/2008
Field of study

Excerpt of introduction: Unfortunately, in spite of the above mentioned advantages, the presence of a schema is not mandatory and many XML documents are not accompanied by one. For instance, in a recent study, Barbosa et al. have shown that approximately half of the XML documents available on the web do not refer to a schema. In another study, we have noted that about two-thirds of XSDs gathered from schema repositories and from the web are not valid with respect to the W3C XML Schema specification, rendering them essentially useless for immediate application (see Chapter 6). A similar observation was made by Sahuguet concerning DTDs. Based on the lack of schemas in practice, it is essential to devise algorithms that can infer a schema for a given collection of XML documents when none, or no syntactically correct one, is present. This is also acknowledged by Florescu who emphasizes that in the context of data integration: “We need to extract good-quality schemas automatically from existing data and perform incremental maintenance of the generated schemas.” It should be noted that even when a schema is already available, there are situations where inference can be useful. One such situation is schema cleaning: sometimes a schema is too general with respect to the XML data that it is supposed to describe. In that case, it can be advantageous to infer a new schema based solely on the data at hand.... In general, schema inference can be used to restrict schemas to a relevant subset of data needed by the application at hand, thereby facilitating difficult tasks like schema matching and data integration. Indeed, as argued by Hinkelman [Hin05], industry-level standards are too loosely defined in general, which can result in XML schemas where many business structures are formally specified as being optional.... Based on the above observations, it is hence essential to devise algorithms that can automatically infer a DTD or XSD from a given corpus of XML documents...

Document Server@UHasselt (Universiteit Hasselt)

Author Instructions

Author: Instructions Author
Publication venue
Publication date: 04/11/2013
Field of study

Crossref

Cartographic Perspectives (E-Journal - North American Cartographic Information Society, NACIS)

Generalisation with neural networks

Author: BEX Geert Jan
Publication venue
Publication date: 01/01/1995
Field of study

Document Server@UHasselt (Universiteit Hasselt)

Trustworthy Artificial Intelligence Methods for Image Analysis and Benchmarking of Neural Network Interpretability

Author: ROUSSEAU Axel-Jan
Publication venue
Publication date: 2024
Field of study

The focus of this thesis is on Artificial Intelligence (AI) and on how AI can be presented to people in a way that is more explainable, intuitive and trustworthy. The field of AI is vast and encompasses many subdomains concerned with how machines and software think and act and how to build intelligent entities [3]. Broadly speaking, AI deals with a machine’s ability to learn and acquire knowledge, reason, solve problems and apply and adapt what is learned to new situations. Several definitions of AI have been used. Some define AI along two axes: how humanlike versus how rational it is, and how it acts versus how it thinks. For example, for a machine to pass the Turing test, it needs to act in a convincingly human way. The cognitive modelling approach tries to get machines to think humanly and deals more with cognitive science and psychology. Another approach is to make the machine think rationally, by enforcing logic and inference rules. This is different still from a machine that acts rationally, where the machine needs to do the right or optimal thing. This latter approach is the most common one as it is more easily captured in a mathematical formulation, for example optimizing a utility or loss function. Throughout the years, the field of AI has continued to grow, and the subfields under its umbrella are numerous. Throughout the history of modern AI, many different methods have been proposed and used, from artificial neurons [4], Hebbian learning [5], and reasoning as search and heuristics [6], later expanded to include logic programming such as Prolog [7], genetic programs [8], and expert systems (of which Dendral [9] is often considered the first). In the 1980s, hidden Markov models [10] became more popular, and Bayesian networks [11, 12] followed suit, leading to machine learning, Big Data, deep learning, and now large language models [3].
Presently, AI permeates our daily lives in various forms, from video games and selfie filters to personalised video recommendations and medical devices, even extending to autonomous vehicles. Large language models, such as the ones used in chatbots and smart assistants, are used in content generation for news articles, product descriptions, and academic theses. Such models are currently garnering significant attention and have become the next major breakthrough in AI.
In this thesis, the focus will mainly be on Deep Neural Networks (DNN) and their applications. The smallest building block of a neural network is the model of a neuron, which was first introduced by McCulloch and Pitts [4] and was inspired by the function of biological neurons. Consider a very high-level abstraction of a neuron: through dendrites and receptors, the neuron receives stimuli, and when the stimuli reach a threshold, the neuron fires an electrical signal through the axon. The mathematical equivalent of such a neuron is a non-linear element with inputs $x_i$ multiplied by weights $w_i$. After a bias term $b$ is added, the weighted sum is passed through a non-linear function $f$, also called the activation function. Originally, a neuron was a binary classifier, and the non-linearity used was the Heaviside step function or sign function. However, a variety of other functions have been used, such as the sigmoid and tanh. In current neural networks, the rectified linear unit (ReLU) is the most widely used. A single neuron is only able to learn linearly separable concepts. By combining several neurons in a layer and stacking several layers so that the units in one layer are fully connected to all the units in the previous layer (so that the output $h_l$ of layer $l$ is $h_l = f_l(W_l h_{l-1})$), we can build a multilayer perceptron (MLP) that can distinguish non-linearly separable data. This type of feed-forward network is a basic neural network, and can already achieve remarkable results. Hornik et al. [13] showed that an MLP with as few as one hidden layer is a universal approximator, meaning it can approximate any measurable function to any degree of accuracy, given enough hidden units. The networks are trained to optimize a loss function that quantifies the performance and serves as a proxy for the real objective. The gradients of this loss function with respect to the model’s weights can be efficiently calculated using the back-propagation algorithm [14]. Gradients are propagated layer by layer using the chain rule. The weights are then updated based on these gradients using gradient descent or variants thereof. Neural networks are discussed further in the next section, section 1.1.
The term deep in Deep Learning refers to the use of a larger number of consecutive layers in these networks. By adding more layers on top of each other, each layer is able to learn increasingly complex and meaningful features, enabling the model to more easily grasp complex interactions in the data and simplify the modelling of complex functions. In this way, Deep Learning performs a form of automated feature engineering by learning relevant features directly from the raw data, potentially capturing more intricate patterns and relationships that may not be easily identifiable or feasible with handcrafted features, especially for computer vision. However, fully connected layers have the disadvantage that they contain a huge number of learnable weights, making these models computationally expensive and prone to overfitting. Therefore, in computer vision applications, convolutional layers are used. The neurons in these layers are locally connected to a window of the input, as opposed to a fully connected layer. The window then slides over the whole input to process it. This reduces the number of weights in the layer and introduces a useful inductive bias: pixels that are spatially close are processed together, leveraging the spatial structure of images.
Together with convolutional layers, convolutional neural networks (CNN) use pooling layers to reduce the spatial resolution of the features, further reducing the number of weights needed. Still, these models can contain millions of parameters, making it an impossible task to comprehend the function of every single one. They are considered black-box models, as we often do not know which features exactly have been learned or how the decisions are being made. CNNs had been successfully applied in real-world applications before, but it took until 2011 for them to really take off, as computational power became more readily available due to efficient GPU implementations. In 2011, the model by Cireșan et al. [15] started winning image competitions, and in 2012, the AlexNet architecture [16] won the ImageNet Large Scale Visual Recognition Challenge. CNNs have since delivered state-of-the-art performance on many computer vision tasks. Additional background is given in section 1.2. While AI applications have become ubiquitous over the past few years and will undoubtedly become an even more prevalent part of our lives in the years to come, the black-box nature of AI models causes friction in AI uptake, especially where transparency, accountability and interpretability are critical. One example is healthcare, where life-and-death decisions are being made. Detecting a disease at an early phase is critical to prevent disease progression and massively improve patient outcomes. A wrong diagnosis can lead to harmful or fatal consequences. In autonomous vehicles, a wrong detection or a missed traffic sign can cause a fatal crash. In finance, explanations are needed to assess risks, to facilitate decision making, and for regulatory compliance. Moreover, a “right to explanation” is mandated by the GDPR [17, Articles 13-15, 22]. No matter the field, no model is perfect, and mistakes will happen.
Without a reasonable explanation of the decisions made, it is difficult for people to trust the AI and justify its use. Explanations are not just necessary to justify the decisions and predictions being made. They are also essential to debug the model in several ways. There are various ways in which biases can end up in the model, such as biased or skewed training data or algorithmic bias, and these need to be rooted out. By having the model explain predictions, it becomes possible to debug the biases and take action. Explanations help investigate the errors made by the model, allowing developers to understand the underlying causes or to detect known failure modes. They also make it easier to monitor the performance over time, as a perfectly working system may start to misbehave due to distribution shifts. Furthermore, they allow us to uncover new relations in the data previously unknown to domain experts, allowing them to formulate new hypotheses and create new knowledge. For these reasons, Explainable AI (XAI) is a rapidly evolving field, and new papers are published at a rapid pace. In section 1.3, we will expand on several commonly used XAI methods. Part II discusses feature attribution methods, a group of XAI methods specifically used for image classification models. These chapters cover a basic introduction to simple and common XAI methods, and the specific feature attribution methods used in the papers presented. We focus on methods that are local and model-centric, i.e. they explain a specific sample for a specific model. This is of course only a subset of possible XAI methods. A taxonomy of the different methods can be found in [18]. For an overview of the state-of-the-art methods, we refer to Minh et al. [19] and Linardatos et al. [20].
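
The neuron and multilayer perceptron described in the abstract above can be illustrated with a minimal sketch. It is not taken from the thesis: the function names and the hand-picked weights are assumptions chosen for illustration, and a bias term is added per layer as in the single-neuron description. The sketch uses the Heaviside step activation and shows that a network with one hidden layer can represent XOR, a concept that is not linearly separable and so cannot be represented by a single neuron.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def heaviside(z: float) -> float:
    """Step activation: 1 if the pre-activation is positive, else 0."""
    return 1.0 if z > 0 else 0.0


def neuron(x: Vector, w: Vector, b: float, f: Callable[[float], float]) -> float:
    """Single neuron: weighted sum of the inputs plus bias, passed through f."""
    return f(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)


def layer(h_prev: Vector, W: Matrix, b: Vector, f: Callable[[float], float]) -> Vector:
    """Fully connected layer: each row of W holds the weights of one neuron."""
    return [neuron(h_prev, w_row, b_j, f) for w_row, b_j in zip(W, b)]


def xor_mlp(x: Vector) -> float:
    """Two-layer MLP with hand-picked (hypothetical) weights that computes XOR."""
    # Hidden layer: the first unit fires for OR, the second for AND.
    hidden = layer(x, [[1.0, 1.0], [1.0, 1.0]], [-0.5, -1.5], heaviside)
    # Output unit fires when OR is true and AND is false.
    return layer(hidden, [[1.0, -1.0]], [-0.5], heaviside)[0]


if __name__ == "__main__":
    for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
        print(x, xor_mlp(x))
```

In practice the weights would be learned by gradient descent with a differentiable activation such as the sigmoid or ReLU; the step function is used here only because it keeps the arithmetic easy to follow by hand.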

Document Server@UHasselt (Universiteit Hasselt)

Going Beyond Counting First Authors in Author Co-citation Analysis

Author: Zhao Dangzhi
Publication venue
Publication date: 01/01/2005
Field of study

The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
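
The difference between the two counting rules described in this abstract can be sketched in a few lines. This is an illustrative sketch only, not the procedure used in the study: the data layout, the function name `cocitation_counts`, and the toy reference lists are hypothetical. It also assumes that each author pair is counted at most once per citing paper and that co-authors of the same cited work count as co-cited, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations
from typing import List

# A cited work is its ordered author list; a citing paper is a list of cited works.
CitedWork = List[str]
ReferenceList = List[CitedWork]


def cocitation_counts(papers: List[ReferenceList], max_authors: int) -> Counter:
    """Count author co-citations, keeping the first max_authors of each cited work.

    max_authors=1 gives first-author counting; max_authors=5 takes the
    first five authors of every cited work into account.
    """
    counts: Counter = Counter()
    for refs in papers:
        # Authors represented in this citing paper under the chosen rule.
        authors = set()
        for work in refs:
            authors.update(work[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers with two cited works each.
papers = [
    [["A", "B"], ["C"]],
    [["A"], ["C", "D"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)

if __name__ == "__main__":
    print(dict(first_author))
    print(dict(first_five))
```

On this toy data, first-author counting sees only the pair (A, C), whereas the five-author rule also brings the non-first authors B and D into the co-citation matrix.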

E-LIS