215 research outputs found
A design space for RDF data representations
RDF triplestores’ ability to store and query knowledge bases augmented with semantic annotations has attracted the attention of both research and industry. A multitude of systems offer varying data representation and indexing schemes. However, as recently shown for designing data structures, many design choices are biased by outdated considerations and may not result in the most efficient data representation for a given query workload. To overcome this limitation, we identify a novel three-dimensional design space. Within this design space, we map the trade-offs between different RDF data representations employed as part of an RDF triplestore and identify unexplored solutions. We complement the review with an empirical evaluation of ten standard SPARQL benchmarks to examine the prevalence of these access patterns in synthetic and real query workloads. We find some access patterns to be both prevalent in the workloads and under-supported by existing triplestores. This shows that our model can be used by RDF store designers to reason about different design choices, and that it allows a (possibly artificially intelligent) designer to evaluate the fit between a given system design and a query workload.
Visual cortex is sensitive to order-disorder phase transition
Initial stages of visual processing are well characterized in terms of band-limited oriented receptive filters. However, brain mechanisms underlying the integration of their outputs are much less understood. In the domain of texture perception, two types of mechanisms have been suggested: (A) first-order statistics and (B) autocorrelation function. In texture perception, considering local symmetry as a statistical property, we can employ the order parameter used in physics to analyze transitions between order and disorder. When the thermodynamic temperature (T) decreases monotonically, the order parameter changes monotonically from zero for disordered systems to one for symmetric systems. Recently, we have synthesized images corresponding to different T's and showed that human observers are sensitive to phase transition. Their sensitivity function is well approximated by an observer based on the order parameter. Here, we investigated the neural correlates of order-disorder perception using functional imaging combined with a phase-encoded paradigm. We hypothesized that BOLD response would depend monotonically on T if first-order statistics are involved. Conversely, the BOLD response would be larger for images around phase transition than for symmetric and disordered images if autocorrelation is involved, since correlations of all lengths are present only in these images. We presented the stimuli in 4 consecutive 16 s blocks: 1) disordered images, 2) images with continuous change of order parameter from disordered to symmetric, 3) symmetric images, 4) images with continuous change of the order parameter from symmetric to disordered. We found that the BOLD response in early visual areas as well as in lateral occipital complex (LOC) was highest for images close to the phase transition, thus supporting the autocorrelation hypothesis and rejecting first-order statistics as an underlying mechanism. 
These results may partially account for the weak activation of the LOC to both highly ordered and highly disordered textures compared to object shapes
Graph Neural Networks for Semantic Entity suggestion: Viability study of the use of Graph Neural Networks for Entity Suggestion via Dense Retrieval
The primary objective of this research project is to enhance the accuracy of entity linking in scientific table data by leveraging a knowledge base. This will be achieved by investigating the feasibility of employing machine learning techniques to generate multimodal embeddings for both entity linking and corpus embedding. In contrast to the current state-of-the-art methods that heavily depend on lexicographical features, this project aims to exploit the capabilities of a multimodal embedding approach to improve the suggestion of candidates for entity linking. The main focus is to understand how multimodal embedding can be used to extract relevant entities, considering the contextual data within the corpus for Entity Linking with a Knowledge Base. The structure of this thesis is organized into six chapters. Chapter 1 serves as an introduction to the subject, providing necessary background information and discussing related works that have influenced this project. Chapter 2 delves into the datasets used in the project, specifically focusing on the conversion of these datasets into mention datasets that cover both tabular data and text data for the mentions. This chapter also covers the target knowledge graphs. Chapter 3 presents the model architecture of the project, including the projection head, mention encoder, entity encoder, and the dual encoder architecture. It also discusses the scoring functions that will be utilized. Chapter 4 outlines the experiments that will be conducted in the project and the evaluation metrics that will be used to assess the results. Chapter 5 presents the results of the experiments and provides a thorough evaluation of these results. Finally, Chapter 6 and Chapter 7 conclude the project, summarizing the findings and suggesting potential avenues for future work. This research project provides an in-depth exploration of the methodologies employed in the project, focusing on the concepts of text embedding and mention embedding.
The input data for the project is categorized into text data and tabular data, each with its unique input structure in the BERT tokenizer. The text data input structure is based on the methodology proposed by Wu et al., where each mention in the corpus is encapsulated within a specific string format. On the other hand, the tabular data input structure is inspired by the work of Trabelsi et al., but with a simplified format due to the specific requirements and constraints of the project. The projection head, a fundamental component in machine learning, is utilized to transform input data, specifically embeddings, into a different space, thereby generating projected embeddings. This transformation is accomplished through a series of operations collectively known as projection layers. The projection head is essentially a simple feed-forward neural network that serves to project the entity and mention embeddings to the same dimensionality. It is equipped with a ReLU activation function and its primary function is to reduce the dimensionality of the embeddings, making them comparable. The Mention Encoder model is a sophisticated architecture that leverages the power of the BERT model with an additional projection head. The input data for this model is divided into two categories: text data and tabular data. Each mention in the corpus is encapsulated within a string and structured as per the methodology proposed by Wu et al. For tabular data, the project aims to perform Column Type Annotation, with the input into the BERT tokenizer inspired by the work of Trabelsi et al. The Mention Encoder’s architecture includes the BERT model and a projection head, which allows the model to project the output of the BERT model into a lower-dimensional space, enabling efficient computation and storage. The Entity Embedding model architecture is a crucial component of our research. The model architecture is divided into two main subsections: Input Encoding and Ontology Embedding Model.
The input format for the ontology embedding process is derived from various ontologies/knowledge graphs, as detailed in Section 2. The model architecture for ontology embedding is based on the work of Wu et al. and Louis et al. The model’s flexibility, particularly in the utilization of Graph Neural Network (GNN) layers, is a key feature that allows it to be tailored to specific requirements and scenarios. The dual encoding model architecture is another significant aspect of our research. As underscored by Dong et al., there exists a variety of dual encoder architectures, including but not limited to Siamese Dual Encoder (SDE), Asymmetric Dual Encoder (ADE), ADE with Shared Token Embedding (ADE-STE), ADE with Frozen Token Embedding (ADE-FTE), and ADE with a yet to be defined component (ADE-SPL). The primary focus of this project is the ADE model architecture, which is further bifurcated into two main components: the entity encoder and the mention encoder. This architecture facilitates simpler modifications to the different components of the dual encoder, thereby enabling each stack to adapt and better fit the conclusion. The loss function plays a critical role in steering both encoders to acquire identical representations. The score function must be adept at evaluating the similarity between mention and entity embeddings. Furthermore, it is essential for practical applications that the entity’s embedding can be computed offline. The inference should be achievable by calculating the mention embedding and retrieving the nearest k neighbors in less than a minute, even in extensive knowledge bases like DBpedia, which encompasses billions of entities. This study presents three subsections, each focusing on a different function relevant to the thesis. The first function, Cosine similarity/dot function, introduces the cosine embedding loss. This is a common scoring function that enables dense retrieval based on the angular similarity of embeddings representing entities.
The second function, Triplet margin loss, introduces the triplet loss function. This is a common loss function that allows for dense retrieval based on the Euclidean similarity of embeddings representing entities. From the perspective that embeddings are maps from higher dimensionality into a manifold in lower dimension, the idea behind triplet loss is to move similar entities closer together, and dissimilar entities farther away. The third function, cross-entropy (with slight abuse of notation), is a scoring function aimed at classification. However, the aim of the function is to maximize the value of the right "class" and minimize the value of the other negative anchors. This can be seen as an updated triplet margin loss where the positive anchor is pushed closer and the remainder is pushed farther away. These three functions are presented because they provide a comprehensive understanding of the scoring and loss functions used in machine learning, which is crucial to the thesis. The experimental setup for this study is detailed in Table 4.2. The subsequent sections provide an in-depth analysis of the results derived from these experiments, with a particular emphasis on the influence of the configuration on the final outcomes. Due to constraints in resources, each configuration is executed only once. For a more robust statistical understanding of the significance of each configuration, it would be necessary to conduct additional runs. However, due to time limitations, not all configurations of the model with the text encoder being bert-base-uncased were executed, with 4 runs remaining incomplete. The results of the executed runs are presented in Table 5.1. Table 7.1 indicates that while the model is unable to retrieve the correct entities, the suggested entities are more semantic in nature, as opposed to lexicographically similar. The model is capable of making suggestions based on semantic similarities, demonstrating the feasibility of entity suggestion based on dense retrieval.
However, the model requires further fine-tuning to achieve state-of-the-art performance
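As a rough illustration of the three scoring and loss functions this abstract describes (cosine similarity, triplet margin loss, and a cross-entropy-style objective), here is a minimal pure-Python sketch. It is not the thesis's implementation: every function name, the default margin of 1.0, and the toy two-dimensional embeddings are assumptions made for illustration, and a real system would operate on batched tensors produced by the mention and entity encoders.

```python
import math

# Hypothetical helper names; embeddings are plain lists of floats.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Angular similarity in [-1, 1]; 1 means the vectors point the same way."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def euclidean(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_margin_loss(mention, positive, negative, margin=1.0):
    """Zero once the positive entity is closer to the mention than the
    negative entity by at least `margin` (the margin value is an assumption)."""
    return max(0.0, euclidean(mention, positive) - euclidean(mention, negative) + margin)

def cross_entropy_loss(mention, candidates, target_index):
    """Softmax cross-entropy over dot-product scores: raises the score of the
    correct entity relative to all other candidates (the negatives)."""
    scores = [dot(mention, c) for c in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_norm - scores[target_index]

def top_k(mention, entity_embeddings, k):
    """Indices of the k entities most similar to the mention by cosine
    similarity. A brute-force stand-in for nearest-neighbour retrieval over
    entity embeddings that were computed offline."""
    ranked = sorted(
        range(len(entity_embeddings)),
        key=lambda i: cosine_similarity(mention, entity_embeddings[i]),
        reverse=True,
    )
    return ranked[:k]
```

With a toy mention `[1.0, 0.0]`, a nearby entity `[0.9, 0.1]`, and an orthogonal entity `[0.0, 1.0]`, the triplet loss is zero when the nearby entity is the positive, and `top_k` ranks it first. For a knowledge base of the size mentioned in the abstract, the linear scan in `top_k` would presumably be replaced by an approximate nearest-neighbour index.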
Improving camera motion classification for undersea coral videos
The health of the planet's oceans is in rapid decline; in particular, the world's coral reefs have seen significant reduction since 2009. 3D reconstructions of coral structures are vital methods for quantifying and monitoring the health of coral reefs, but such methods often require professionally obtained footage to be viable. However, there are great amounts of amateur footage available online which might be viable for use in 3D reconstruction, but identifying it is a time-consuming task, as coral structures require views from more than one angle. We therefore propose a model which might bridge a gap between public footage and scientific research by identifying sections of public videos which might be relevant for 3D reconstruction. In this work we present a model which identifies and isolates the desired camera motion by extracting motion vectors from video footage and converting them to HSI color images which are applied to a Swin transformer model. In order to train and validate this model we expanded upon a benchmark dataset containing amateur footage for coral 3D reconstruction. To validate our model, we test it against two other approaches: a Convolutional Neural Network (CNN) model, also trained and validated on HSI color images derived from motion vectors, and a heuristic model applied to the motion vectors. The CNN model and heuristic model both performed poorly, with F1 scores of 0.11 and 0.16 respectively. In contrast, the Swin transformer outperformed these approaches by scoring 0.19. However, simply applying the Swin transformer without data augmentation performed the best with a score of 0.26. The HSI Swin transformer performed significantly better on the validation set, meaning the approach might be prone to over-fitting, or might cause information loss for the model
Exploring the Efficacy of Specially-Trained Transformers on Geospatial Entity Matching of Historic Toponyms
Substantial effort has been put into digitizing and extracting information from historical and ancient manuscripts. These efforts often focus on a single civilization, its language, and culture, thereby isolating them and making it harder to collaborate and share knowledge between them. Some works have tried to connect these efforts and their data based on toponym matches, using traditional methods such as transliteration for toponym matching. However, results have been uneven. The advent of transformer-based language models such as BERT has brought about improved performance in many language-related tasks, including toponym matching. However, these language models are often trained over large corpora of modern text in English. Even multi-lingual models are often trained on modern texts collected on the web. Here, we examine whether creating specially-trained multi-lingual models over ancient texts matching the toponym languages can be beneficial for this task. In this paper, we examine several methods using ancient manuscripts to adapt BERT-based models to identify matching toponyms in Arabic and Hebrew, two related Semitic languages with historical dialects and sizeable corpora of ancient texts. We evaluated our methods on a historical toponym matching task comprising several datasets of toponyms extracted from Middle East scholars. The evaluation results were surprising in that the models presented in this work were outperformed by a multilingual model (mBERT) that was pre-trained on modern data
2B or not 2B and everything in between — novel evaluation methods for matching problems
Social Processes: Self-supervised Meta-learning Over Conversational Groups for Forecasting Nonverbal Social Cues
Free-standing social conversations constitute a yet underexplored setting for human behavior forecasting. While the task of predicting pedestrian trajectories has received much recent attention, an intrinsic difference between these settings is how groups form and disband. Evidence from social psychology suggests that group members in a conversation explicitly self-organize to sustain the interaction by adapting to one another’s behaviors. Crucially, the same individual is unlikely to adapt similarly across different groups; contextual factors such as perceived relationships, attraction, rapport, etc., influence the entire spectrum of participants’ behaviors. A question arises: how can we jointly forecast the mutually dependent futures of conversation partners by modeling the dynamics unique to every group? In this paper, we propose the Social Process (SP) models, taking a novel meta-learning and stochastic perspective of group dynamics. Training group-specific forecasting models hinders generalization to unseen groups and is challenging given limited conversation data. In contrast, our SP models treat interaction sequences from a single group as a meta-dataset: we condition forecasts for a sequence from a given group on other observed-future sequence pairs from the same group. In this way, an SP model learns to adapt its forecasts to the unique dynamics of the interacting partners, generalizing to unseen groups in a data-efficient manner. Additionally, we first rethink the task formulation itself, motivating task requirements from social science literature that prior formulations have overlooked. For our formulation of Social Cue Forecasting, we evaluate the empirical performance of our SP models against both non-meta-learning and meta-learning approaches with similar assumptions. 
The SP models yield improved performance on synthetic and real-world behavior datasets.
Generating Ontology-Learning Training-Data through Verbalization
Ontologies play an important role in the organization and representation of knowledge. However, in most cases, ontologies do not fully cover domain knowledge, resulting in a gap. This gap, often expressed as a lack of concepts, relations, or axioms, is usually filled by domain experts in a manual and tedious process. Utilizing large language models (LLMs) can ease this process; a fine-tuned LLM could receive as input natural-language text containing up-to-date and reliable domain knowledge and output a structured graph in OWL RDF/Turtle format, which is the standard format of ontologies. Thus, to fine-tune a model, a dataset of text-OWL sentence pairs must be acquired. Unfortunately, such a dataset does not exist in the literature or within the open-source community. Therefore, this paper introduces our LLM-assisted verbalizer to create such a dataset by converting OWL statements from existing ontologies into natural text. We evaluate the verbalizer on 322 classes from four different ontologies using two different LLMs, achieving precision and recall as high as 0.99 and 0.96, respectively
