
UCT Computer Science Research Document Archive


1270 research outputs found


Molecular modeling of Group B Streptococcus type II and III capsular polysaccharides explains low filter retention of type II and lack of cross-reactivity with type III

Author: Richardson N. I.
Berti F.
Ravenscroft N.
Kuttel M. M.
Publication venue: Elsevier
Publication date: 01/01/2025

Group B Streptococcus (GBS) is a bacterial pathogen associated with significant morbidity and mortality in pregnant women and infants, particularly in resource-limited settings. A hexavalent vaccine candidate in development incorporates the capsular polysaccharides (CPSs) from the most prevalent serotypes: Ia, Ib, II, III, IV, and V. Vaccine production is facilitated by a standardized CPS purification process. In the final purification step, a 30 kDa membrane filter gives high-yield recovery for five of the six CPSs, but <50 % for type II (GBSII), despite similar CPS structure and size. However, a smaller 10 kDa membrane improves recovery to about 90 %, suggesting that CPS conformation affects retention. Here, comparative molecular modeling – corroborated by through-space NMR correlations – reveals that GBSII forms compact, globular conformations, while type III (GBSIII) forms an elongated zig-zag. This explains GBSII's poor retention during filtration: GBSII's compact globules pass through the 30 kDa membrane more easily than GBSIII's elongated forms. Additionally, we identify distinct epitopes and compare their interactions with a GBSIII-specific fragment antibody to clarify the lack of cross-reactivity between GBSII and GBSIII. This work provides valuable mechanistic insight into physically observed behavior to inform development of multivalent GBS vaccines to reduce maternal and infant mortality.

Navigating Sociocultural Tensions in Prenatal Care: Opportunities for Digital Health from Diverse Stakeholders in Techiman, Ghana

Author: Dsane-Nsor Sarah
Gautam Aakash
Fuentes Carolina
Toffah Gideon Kwesi
Verdezoto Nervo
Joolay Yaseen
Densmore Melissa
Publication venue: Springer Nature
Publication date: 01/01/2025

Pregnant women in the Global South face significant challenges due to lack of resources and informational gaps. In this paper, we take an assets-based lens to examine the experiences of pregnant women in a low-resource setting in Ghana, focusing on the role of information and technology in prenatal care. Through interviews and co-design workshops, we sought to understand the perspectives of multiple stakeholders including pregnant women, their family members, and healthcare professionals. We highlight the complexities involved in making decisions during pregnancy, including the challenges arising from the tensions between traditional healthcare practices and modern Western health services. We discuss opportunities in digital maternal health where we argue for the importance of attending to local needs and values and advocate for recognising community strengths and integrating rural care practices as valuable assets in prenatal intervention design. Our work aims to bridge some of the gaps between the theoretical understanding of digital health and the practical realities of prenatal care in low-resource settings.

Deep-Learning Classifiers for Small Data Orthopedic Radiology

Author: Aslan B.
Kazaka W.
Slaven T.
Chetty S.
Kruger N.
Nitschke G.
Publication venue: IEEE Press
Publication date: 01/01/2025

Training deep-learning classifiers in orthopedic pathology is problematic because extensive datasets for training and testing are scarce, meaning most orthopedic image data is small, sparse and noisy. This study evaluates the efficacy of various state-of-the-art supervised Convolutional Neural Network (CNN) image classifiers, complemented by data augmentation and transfer-learning, versus various Neural Architecture Search (NAS) based deep-learning classifiers. These classifiers are comparatively evaluated on two (cervical spine and elbow) small, multi-label (with unbalanced data distribution) orthopedic radiographic (X-ray) datasets, with the objective of detecting multiple pathologies with high accuracy. To bypass the pervasive problem of small medical datasets, we implement preprocessing and layer freezing to boost all task performance metrics (accuracy, precision, recall, specificity, F1 score), with the ResNet CNN and EfficientNet classifiers yielding the best results overall. Results highlight the efficacy of applying specially tuned CNN and NAS classifiers to small, unbalanced and noisy datasets indicative of those used in orthopedic radiology, demonstrating the potential of such methods as automated prognostic and diagnostic tools to assist orthopedic practitioners.

Neural Morphological Tagging for Nguni Languages

Author: Marquard Cael
Mawere Simbarashe
Meyer Francois
Publication date: 01/01/2025

Morphological parsing is the task of decomposing words into morphemes, the smallest units of meaning in a language, and labelling their grammatical roles. It is a particularly challenging task for agglutinative languages, such as the Nguni languages of South Africa, which construct words by concatenating multiple morphemes. A morphological parsing system can be framed as a pipeline with two separate components, a segmenter followed by a tagger. This paper investigates the use of neural methods to build morphological taggers for the four Nguni languages. We compare two classes of approaches: training neural sequence labellers (LSTMs and neural CRFs) from scratch and finetuning pretrained language models. We compare performance across these two categories, as well as to a traditional rule-based morphological parser. Neural taggers comfortably outperform the rule-based baseline and models trained from scratch tend to outperform pretrained models. We also compare parsing results across different upstream segmenters and with varying linguistic input features. Our findings confirm the viability of employing neural taggers based on pre-existing morphological segmenters for the Nguni languages.

Herds From Video: Learning a Microscopic Herd Model From Macroscopic Motion Data

Author: Gong Xianjin
Gain James
Rohmer Damien
Lyonnet Sixtine
Pettre Julien
Cani Marie-Paule
Publication date: 01/01/2025

We present a method for animating herds that automatically tunes a microscopic herd model based on a short video clip of real animals. Our method handles videos with dense herds, where individual animal motion cannot be separated out. Our contribution is a novel framework for extracting macroscopic herd behaviour from such video clips, and then deriving the microscopic agent parameters that best match this behaviour. To support this learning process, we extend standard agent models to provide a separation between leaders and followers, better match the occlusion and field-of-view limitations of real animals, support differentiable parameter optimization and improve authoring control. We validate the method by showing that once optimized, the social force and perception parameters of the resulting herd model are accurate enough to predict subsequent frames in the video, even for macroscopic properties not directly incorporated in the optimization process. Furthermore, the extracted herding characteristics can be applied to any terrain with a palette and region-painting approach that generalizes to different herd sizes and leader trajectories. This enables the authoring of herd animations in new environments while preserving learned behaviour.

Paying the Price for Reach: Size-Dependent Emergence of Efficient Wiring in Cognitive Recurrent Neural Networks

Author: Aslan Bilal
Nitschke Geoff
Publication date: 01/01/2025

Artificial neural networks often neglect the physical wiring costs that are crucial to biological nervous systems. This study investigates how incorporating such a biologically inspired wiring constraint influences the structure and function of Recurrent Neural Networks (RNNs) during learning. We systematically varied network size (N) and the strength (λ) of a communicability-weighted spatial regularization penalty applied to RNNs performing a seasonal foraging task requiring short- and long-term memory and decision-making. Our results reveal that while all networks achieved high task accuracy, larger networks (N ≥ 100) exhibited different sensitivity patterns to higher penalties (λ) compared to smaller ones (N = 50). Increasing λ induced neural network topologies with similarities to biological brains, including sparsity, shorter connection lengths (while preserving crucial long-range connections), increased modularity, and enhanced small-world characteristics. We identify a size-dependent optimum (sweet spot) for λ ∈ [0.05, 0.10] that yields these efficient, brain-like structural properties without compromising functional performance. These results highlight the importance of physical network constraints in shaping adaptive systems, demonstrate how functional networks can self-organize towards efficient topologies under cost pressures, and offer design principles for developing neuromorphic systems.

Neural Morphological Tagging for Nguni Languages

Author: Marquard Cael
Mawere Simbarashe
Meyer Francois
Publication venue: Association for Computational Linguistics
Publication date: 01/01/2025

Towards Effective Communication: IsiXhosa Language Learning and the Need for Bilingual Dictionaries in Health Sciences Education

Author: Gambushe Wanga
Ngwendu Amandla
Marquard Cael
Publication venue: African Association for Lexicography
Publication date: 01/01/2025

This extended abstract describes a survey of dictionary culture in the Faculty of Health Sciences at UCT. It also proposes using the IsiXhosa.click dictionary as a way to mitigate the identified lack of dictionary usage.

Story Generation with Large Language Models for African Languages

Author: Essuman Catherine Nana Nyaah
Buys Jan
Publication venue: Association for Computational Linguistics
Publication date: 01/01/2025

The development of Large Language Models (LLMs) for African languages has been hindered by the lack of large-scale textual data. Previous research has shown that relatively small language models, when trained on synthetic data generated by larger models, can produce fluent, short English stories, providing a data-efficient alternative to large-scale pretraining. In this paper, we apply a similar approach to develop and evaluate small language models for generating children’s stories in isiZulu and Yoruba, using synthetic datasets created through translation and multilingual prompting. We train six language-specific models, all based on the GPT-2 architecture, that vary in dataset size and source. Our results show that models trained on synthetic low-resource data are capable of producing coherent and fluent short stories in isiZulu and Yoruba. Models trained on larger synthetic datasets generally perform better in terms of coherence and grammar, and also tend to generalize better, as seen by their lower evaluation perplexities. Models trained on datasets generated through prompting instead of translation generate similar or more coherent stories and display more creativity, but perform worse in terms of generalization to unseen data. In addition to the potential educational applications of automated story generation, our approach has the potential to be used as the foundation for more data-efficient low-resource language models.

Cross-Lingual Knowledge Augmentation for Mitigating Generic Overgeneralization in Multilingual Language Models

Author: Ralethe Sello
Buys Jan
Publication venue: Association for Computational Linguistics
Publication date: 01/01/2025

Generic statements like “birds fly” or “lions have manes” express generalizations about kinds that allow exceptions, yet language models tend to overgeneralize them to universal claims. While previous work showed that ASCENT KB could reduce this effect in English by 30-40%, the effectiveness of broader knowledge sources and the cross-lingual nature of this phenomenon remain unexplored. We investigate generic overgeneralization across English and four South African languages (isiZulu, isiXhosa, Sepedi, SeSotho), comparing the impact of ConceptNet and DBpedia against the previously used ASCENT KB. Our experiments show that ConceptNet reduces overgeneralization by 45-52% for minority characteristic generics, while DBpedia achieves 48-58% for majority characteristics, with combined knowledge bases reaching 67% reduction. These improvements are consistent across all languages, though Nguni languages show higher baseline overgeneralization than Sotho-Tswana languages, potentially suggesting that morphological features may influence this semantic bias. Our findings demonstrate that commonsense and encyclopedic knowledge provide complementary benefits for multilingual semantic understanding, offering insights for developing NLP systems that capture nuanced semantics in low-resource languages.

1,072 full texts

1,270 metadata records

Updated in last 30 days.
