South African Tuberculosis Vaccine Initiative
UCT Computer Science Research Document Archive
1270 research outputs found
Subword Segmental Language Modelling for Nguni Languages
Subwords have become the standard units of text in NLP, enabling efficient open-vocabulary models. With algorithms like byte-pair encoding (BPE), subword segmentation is viewed as a preprocessing step applied to the corpus before training. This can lead to sub-optimal segmentations for low-resource languages with complex morphologies. We propose a subword segmental language model (SSLM) that learns how to segment words while being trained for autoregressive language modelling. By unifying subword segmentation and language modelling, our model learns subwords that optimise LM performance. We train our model on the 4 Nguni languages of South Africa. These are low-resource agglutinative languages, so subword information is critical. As an LM, SSLM outperforms existing approaches such as BPE-based models on average across the 4 languages. Furthermore, it outperforms standard subword segmenters on unsupervised morphological segmentation. We also train our model as a word-level sequence model, resulting in an unsupervised morphological segmenter that outperforms existing methods by a large margin for all 4 languages. Our results show that learning subword segmentation is an effective alternative to existing subword segmenters, enabling the model to discover morpheme-like subwords that improve its LM capabilities
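The abstract does not say how the segmentation is learned. One common way to train over latent segmentations, assumed here purely for illustration, is to sum the probability of a word over all of its possible segmentations with a forward dynamic program. The sketch below uses a toy, context-free subword score table (`TOY`, `toy_logprob`, and `max_len` are hypothetical names and values, not taken from the paper); an actual SSLM would condition subword probabilities on the preceding text.

```python
import math


def log_marginal(word, subword_logprob, max_len=4):
    """Log-probability of `word` summed over all of its segmentations."""
    n = len(word)
    # alpha[i] holds the log of the total probability of all
    # segmentations of the prefix word[:i].
    alpha = [-math.inf] * (n + 1)
    alpha[0] = 0.0
    for i in range(1, n + 1):
        terms = []
        # The last subword is word[j:i], at most max_len characters long.
        for j in range(max(0, i - max_len), i):
            lp = subword_logprob(word[j:i])
            if alpha[j] > -math.inf and lp > -math.inf:
                terms.append(alpha[j] + lp)
        if terms:
            # Numerically stable log-sum-exp over the candidate splits.
            m = max(terms)
            alpha[i] = m + math.log(sum(math.exp(t - m) for t in terms))
    return alpha[n]


# Toy, hypothetical subword probabilities (illustrative only).
TOY = {"a": 0.5, "b": 0.25, "ab": 0.25}


def toy_logprob(piece):
    p = TOY.get(piece, 0.0)
    return math.log(p) if p > 0 else -math.inf


# "ab" can be segmented as "a"+"b" or as "ab"; both paths are summed.
total = math.exp(log_marginal("ab", toy_logprob))
```

Because every segmentation contributes to the training objective in such a setup, the subword inventory is shaped by the language modelling loss rather than fixed by a preprocessing step.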
A Sequence Modelling Approach to Question Answering in Text-Based Games
Interactive Question Answering (IQA) requires an intelligent agent to interact with a dynamic environment in order to gather information necessary to answer a question. IQA tasks have been proposed as means of training systems to develop language or visual comprehension abilities. To this end, the Question Answering with Interactive Text (QAit) task was created to produce and benchmark interactive agents capable of seeking information and answering questions in unseen environments. While prior work has exclusively focused on IQA as a reinforcement learning problem, such methods suffer from low sample efficiency and poor accuracy in zero-shot evaluation. In this paper, we propose the use of the recently proposed Decision Transformer architecture to provide improvements upon prior baselines. By utilising a causally masked GPT-2 Transformer for command generation and a BERT model for question answer prediction, we show that the Decision Transformer achieves performance greater than or equal to current state-of-the-art RL baselines on the QAit task in a sample efficient manner. In addition, these results are achievable by training on sub-optimal random trajectories, therefore not requiring the use of online agents to gather data
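A Decision Transformer conditions each action on the return still to be collected from that step onward (the "return-to-go"), which is what allows it to learn from logged, even random, trajectories. The following is a minimal sketch of that preprocessing step only; the reward values and the `build_sequence` token layout are illustrative assumptions and not the QAit reward scheme or the authors' code.

```python
def returns_to_go(rewards):
    """Undiscounted sum of rewards from each timestep to the episode end."""
    out = []
    running = 0.0
    for r in reversed(rewards):
        running += r
        out.append(running)
    out.reverse()
    return out


def build_sequence(rewards, states, actions):
    """Interleave (return-to-go, state, action) triples, one per timestep."""
    sequence = []
    for rtg, state, action in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("rtg", rtg), ("state", state), ("action", action)])
    return sequence


# Hypothetical three-step trajectory from a text-based game.
example = build_sequence(
    rewards=[0.0, 0.0, 1.0],
    states=["kitchen", "hallway", "pantry"],
    actions=["go east", "open door", "look"],
)
```

At inference time the first return-to-go would be set to the desired target return, and the model is asked to generate the action tokens that follow.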
Exploring the ontology of pandemic
Pandemics do take place. When exactly they begin and end, and why, is harder to determine, as demonstrated in early 2020 at the start of the Covid pandemic and by the many debates in 2022 on calling it over. To determine these points, one has to know which criteria have to be satisfied and which do not. This requires a clear definition of what a pandemic is, with at least its necessary and sufficient characteristics. There is no such crisp and clear definition, neither in the expert documentation nor in domain ontologies. In this paper, we assess mentions of ‘pandemic’ in domain ontologies, evaluate the argument that foundational ontologies may provide guidance, and examine the characteristics that domain experts have put forward for pandemics. The guidance from foundational ontologies is underwhelming when taken together, but tooling greatly simplified the alignment. The assessment of characteristics shows that a pandemic is not the bearer of all of them, as some belong to attendant entities; it elucidates which ones are dependent and which essential, and it demonstrates why one may compute more than one unique start and end of a pandemic. Considering the complexities, it may be of use to develop an ontology of pandemics
PowerQoPE: A Personal Quality of Internet Protection and Experience Configurator
Security configuration remains obscure for many Internet users, especially those with limited computing skills. This obscurity exposes such users to various Internet attacks. Recently, there has been an increase in cyberattacks targeted at individuals owing to the remote work imposed by the COVID-19 pandemic. These attacks have exposed the inefficiencies of the non-human-centric implementation of Internet security mechanisms and protocols. Security research usually positions users as the weakest link in the security ecosystem, leading system and protocol developers to exclude users from the development process. This stereotypical approach has negatively affected users’ security uptake. Most security systems are not comprehensible to an average user, which negatively affects performance and Quality of Experience and causes users to shun security mechanisms. Building on human-centric cybersecurity research, we present a tool that aids in configuring Internet Quality of Protection and Experience (referred to as PowerQoPE in this paper). We describe its architecture and design methodology and finally present evaluation results. Preliminary evaluation results show that user-centric and data-driven approaches in the design of Internet security systems improve users’ Quality of Experience. The controlled experiment results show that users are not really stupid; they know what they want and, given proper security configuration platforms with proper framing of components and information, they can make optimal security decisions
Generic Overgeneralization in Pre-trained Language Models
Generic statements such as “ducks lay eggs” make claims about kinds, e.g., ducks as a category. The generic overgeneralization effect refers to the inclination to accept false universal generalizations such as “all ducks lay eggs” or “all lions have manes” as true. In this paper, we investigate the generic overgeneralization effect in pre-trained language models experimentally. We show that pre-trained language models suffer from overgeneralization and tend to treat quantified generic statements such as “all ducks lay eggs” as if they were true generics. Furthermore, we demonstrate how knowledge embedding methods can lessen this effect by injecting factual knowledge about kinds into pre-trained language models. To this end, we source factual knowledge about two types of generics, minority characteristic generics and majority characteristic generics, and inject this knowledge using a knowledge embedding model. Our results show that knowledge injection reduces, but does not eliminate, generic overgeneralization, and that majority characteristic generics of kinds are more susceptible to overgeneralization bias
Low Resource, Post-processed Lecture Recording from 4K Video Streams
Many universities are using lecture recording technology to expand the reach of their teaching programs, and to continue instruction when face to face lectures are not possible. Increasingly, high-resolution 4K cameras are used, since they allow for easy reading of board/screen context. Unfortunately, while 4K cameras are now quite affordable, the back-end computing infrastructure to process and distribute a multitude of recorded 4K streams can be costly. Furthermore, the bandwidth requirements for a 4K stream are exorbitant, running to over 2 GB for a 45-60 minute lecture. These factors militate against the use of such technology in a low-resource environment, and motivated our investigation into methods to reduce resource requirements for both the institution and students. We describe the design and implementation of a low resource 4K lecture recording solution, which addresses these problems through a computationally efficient video processing pipeline. The pipeline consists of a front-end, which segments presenter motion and writing/board surfaces from the stream, and a back-end, which serves as a virtual cinematographer (VC), combining this contextual information to draw attention to the lecturer and relevant content. The bandwidth saving is realized by defining a smaller fixed-size, context-sensitive ‘cropping window’ and generating a new video from the crop regions. The front-end utilises computationally cheap temporal frame differencing at its core: this does not require expensive GPU hardware and also limits the memory required for processing. The VC receives a small set of motion/content bounding boxes and applies established framing heuristics to determine which region to extract from the full 4K frame. Performance results coupled with a user survey show that the system is fit for purpose: it is able to produce good presenter framing/context, over a range of challenging lecture venue layouts and lighting conditions, within a time that is acceptable for lecture video processing
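The two ideas named in the abstract, temporal frame differencing and a fixed-size cropping window, can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' pipeline: frames are plain nested lists of greyscale values, and the function names, the `threshold` value, and the simple centring rule are all hypothetical stand-ins for the paper's segmentation and framing heuristics.

```python
def motion_bbox(prev, curr, threshold=25):
    """Bounding box (top, left, bottom, right) of pixels that changed
    by more than `threshold` between two frames, or None if none did."""
    rows, cols = [], []
    for y, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for x, (a, b) in enumerate(zip(row_prev, row_curr)):
            if abs(a - b) > threshold:
                rows.append(y)
                cols.append(x)
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


def crop_window(bbox, frame_h, frame_w, win_h, win_w):
    """Fixed-size window centred on the box and clamped to the frame.
    Returns (y0, x0, y1, x1) with exclusive lower-right corner."""
    top, left, bottom, right = bbox
    cy = (top + bottom) // 2
    cx = (left + right) // 2
    y0 = min(max(cy - win_h // 2, 0), frame_h - win_h)
    x0 = min(max(cx - win_w // 2, 0), frame_w - win_w)
    return (y0, x0, y0 + win_h, x0 + win_w)


# Tiny 6x8 example: two pixels change between consecutive frames.
previous = [[0] * 8 for _ in range(6)]
current = [row[:] for row in previous]
current[2][3] = 200
current[3][5] = 200

box = motion_bbox(previous, current)
window = crop_window(box, frame_h=6, frame_w=8, win_h=4, win_w=4)
```

Only the previous and current frames need to be held in memory at once, which is consistent with the abstract's point that frame differencing keeps memory and compute requirements low.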
Carbohydrate Force Fields: The Role of Small Partial Atomic Charges in Preventing Conformational Collapse.
Although the quality of current additive all-atom force fields for carbohydrates has been demonstrated in many applications, occasional significant differences reported for the hydrodynamic behavior of specific polysaccharides modeled with different force fields are a cause for concern. In particular, irreversible conformational collapse has been noted for some polysaccharide simulations with the GLYCAM06j force field. Here, we investigate the cause of this phenomenon through comparative simulations of a range of saccharides with both the GLYCAM06j and the CHARMM36 carbohydrate force fields. We find that conformational collapse in GLYCAM06j occurs for saccharide chains containing the deoxy sugar α-l-rhamnose after relatively long simulation intervals. Further, we explore the mechanism of conformational collapse and show that this phenomenon arises because of the anomalously low energy in GLYCAM06j (as compared to quantum mechanical calculations) of a specific orientation of α-l-Rha to α-l-Rha glycosidic linkages, which are subsequently sustained by intramolecular interactions in the saccharide chain. We identify the lack of partial charges on aliphatic hydrogens in GLYCAM as the source of this anomaly, demonstrating that addition of small partial atomic charges on the aliphatic protons in rhamnose removes the conformational collapse phenomenon. This work reveals the large cumulative impact that small partial charges may have on the dynamic behavior of polysaccharides and indicates that future reparameterization of the GLYCAM06j force field should investigate the addition of partial charges on all aliphatic hydrogens
Deciphering the Mechanism of Binding Selectivity of Chlorofluoroacetamide-Based Covalent Inhibitors toward L858R/T790M Resistance Mutation.
Covalent modification of the oncogenic mutant epidermal growth factor receptor (EGFR) by small molecules is an efficient strategy for achieving an enhanced and sustained pharmacological effect in the treatment of non-small-cell lung cancer. NSP-037 (18), an irreversible inhibitor of the L858R/T790M double-mutant EGFR (EGFRDM) using α-chlorofluoroacetamide (CFA) as a novel warhead, has seven times the inhibition selectivity for EGFRDM over the wild type (EGFRWT), as compared to clinically approved osimertinib (7). Here, we employ multiple computational approaches to elucidate the mechanism underlying this improved selectivity, as well as the effect of CFA on the selectivity enhancement of inhibitor 18 over 7. We find that EGFRDM undergoes significantly larger conformational changes than EGFRWT upon binding to 18. The conformational stability of the diamine side chain and the CFA motif of 18 in the orthosteric site of EGFRDM is identified as key for the disparate binding mechanism and inhibitory prowess of 18 with respect to EGFRWT and EGFRDM and 18’s higher selectivity than 7. The binding free energy of the 18-bound complexes is −6.38 kcal/mol greater than that of the 7-bound complexes, explaining the difference in selectivity of these inhibitors. Further, free energy decomposition analysis indicates that the electrostatic contribution of key residues plays an important role in the 18-bound complexes. QM/MM calculations show that the most favored mechanism for the Cys797 alkylation reaction is the direct displacement mechanism through a CFA-based inhibitor, producing a reaction with the lowest energy barrier and most stable product
Streptococcus pneumoniae serotype 15B polysaccharide conjugate elicits a cross-functional immune response against serotype 15C but not 15A.
Protection conferred by pneumococcal polysaccharide conjugate vaccines (PCVs) is associated with PCV-induced antibodies against vaccine-covered serotypes that exhibit functional opsonophagocytic activity (OPA). Structural similarity between capsular polysaccharides of closely related serotypes may result in induction of cross-reactive antibodies with or without a cross-functional activity against a serotype not covered by a PCV, with the former providing an additional protective clinical benefit. Serotypes 15B, 15A, and 15C, in the serogroup 15, are among the most prevalent Streptococcus pneumoniae serotypes associated with invasive pneumococcal disease following the implementation of a 13-valent PCV; in addition, 15B contributes significantly to acute otitis media. Serological discrimination between closely related serotypes such as 15B and 15C is complicated; here, we implemented an algorithm to quickly differentiate 15B from its closely related serotypes 15C and 15A directly from whole-genome sequencing data. In addition, molecular dynamics simulations of serotypes 15A, 15B, and 15C polysaccharides demonstrated that while 15B and 15C polysaccharides assume rigid branched conformation, 15A polysaccharide assumes a flexible linear conformation. A serotype 15B conjugate, included in a 20-valent PCV (PCV20), induced cross-functional OPA serum antibody responses against the structurally similar serotype 15C but not against serotype 15A, both not included in PCV20. In PCV20-vaccinated adults (18–49 years), robust OPA antibody titers were detected against both serotypes 15B (the geometric mean titer [GMT] of 19,334) and 15C (GMTs of 1692 and 2747 for strains PFE344340 and PFE1160, respectively), but were negligible against serotype 15A (GMTs of 10 and 30 for strains PFE593551 and PFE647449, respectively). Cross-functional 15B/C responses were also confirmed using sera from a larger group of older adults (60–64 years)
Predicting diarrhoea outbreaks with climate change
Climate change is expected to exacerbate diarrhoea outbreaks across the developing world, most notably in Sub-Saharan countries such as South Africa. In South Africa, diseases related to diarrhoea outbreaks are a leading cause of morbidity and mortality. In this study, we modelled the impacts of climate change on diarrhoea with various machine learning (ML) methods to predict daily outbreaks of diarrhoea cases in nine South African provinces