Triplet entropy loss: improving the generalisation of short speech language identification systems

Author: Van Der Merwe Ruan Henry
Publication date: 2021

Spoken language identification systems form an integral part of many speech recognition tools today. Over the years many techniques have been used to identify the language spoken, given just the audio input, but in recent years the trend has been to use end-to-end deep learning systems. Most of these techniques involve converting the audio signal into a spectrogram, which is fed into a Convolutional Neural Network that then predicts the spoken language. This technique performs very well when the data being fed to the model originates from the same domain as the training examples, but as soon as the input comes from a different domain these systems tend to perform poorly. An example is a system trained on WhatsApp recordings but put into production in an environment where it receives recordings from a phone line. The research presented investigates several methods to improve the generalisation of language identification (LID) systems to new speakers and to new domains. These methods involve spectral augmentation, where spectrograms are masked in the frequency or time bands during training, and CNN architectures that are pre-trained on the ImageNet dataset. The research also introduces the novel Triplet Entropy Loss training method, which trains a network simultaneously using Cross Entropy and Triplet loss. Several tests were run with three different CNN architectures to investigate what effect these three methods have on the generalisation of an LID system. The tests were done in a South African context on six languages, namely Afrikaans, English, Sepedi, Setswana, Xhosa and Zulu. The two domains tested were data from the NCHLT speech corpus, used as the training domain, and the Lwazi speech corpus, used as the unseen domain. It was found that all three methods improved the generalisation of the models, though not significantly.
Even though the models trained using Triplet Entropy Loss showed a better understanding of the languages and higher accuracies, it appears that the models still memorise word patterns present in the spectrograms rather than learning the finer nuances of a language. The research shows that Triplet Entropy Loss has great potential and should be investigated further, not only in language identification tasks but in any classification task.
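
The abstract describes Triplet Entropy Loss only as training with Cross Entropy and Triplet loss at the same time. The sketch below shows one plausible way to combine the two for a single example. The simple sum, the `weight` and `margin` parameters, the squared Euclidean distance and all function names are assumptions for illustration, not the thesis's implementation.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross entropy for one example, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss on squared Euclidean distances between embeddings."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)

def triplet_entropy_loss(logits, label, anchor, positive, negative,
                         margin=1.0, weight=1.0):
    """Assumed combination: cross entropy plus a weighted triplet term."""
    return cross_entropy(logits, label) + weight * triplet_loss(
        anchor, positive, negative, margin)

# Hypothetical values: two-class logits and 2-D embeddings.
loss = triplet_entropy_loss([0.0, 0.0], 0, [0.0, 0.0], [1.0, 0.0], [0.0, 0.0])
```

In this reading, the cross-entropy term drives the classification output while the triplet term pulls same-language embeddings together and pushes different-language embeddings apart.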

Cape Town University OpenUCT

Object Detection and Size Determination of Pineapple Fruit at a Juicing Factory

Author: Harris Jessica
Publication date: 2022

The aim of this thesis is to develop a method for determining pineapple fruit size from images. This was achieved by first detecting pineapples in each image using a Mask Region-based Convolutional Neural Network (Mask R-CNN) and then extracting the pixel diameter and length measurements, and the projected areas, from the detected mask outputs. Various Mask R-CNNs were considered for the task of pineapple detection. The best-performing detector made use of MS COCO starting weights, a ResNet50 CNN backbone, and horizontal flipping data augmentation during the training process. This model (Model 4: COCO Fliplr Res50) achieved an average precision of 91.4% on the validation set and an average precision of 90.1% on the test set, and was used to predict masks for an unseen dataset containing images of pre-measured pineapples. The distributions of measurements extracted from the detected masks were compared to those of the manual measurements using two-sample Z-tests and Kolmogorov–Smirnov (KS) tests. There was sufficient similarity between the distributions, and it was therefore established that the reported method is appropriate for pineapple size determination in this context. All the data and code are available in a GitHub repository for reproducible research.
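
The abstract does not say how the pixel measurements were taken from the masks. As a rough sketch, assuming a binary mask and an axis-aligned bounding box, the projected area can be taken as the count of mask pixels, the length as the longer box side and the diameter as the shorter one. The function name and these definitions are assumptions made for illustration.

```python
def mask_measurements(mask):
    """Return pixel area, length and diameter for a binary mask (list of rows).

    Assumption: length is the longer side of the axis-aligned bounding box
    and diameter is the shorter side. The thesis may define them differently.
    """
    if not mask or not mask[0]:
        return {"area_px": 0, "length_px": 0, "diameter_px": 0}
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return {"area_px": 0, "length_px": 0, "diameter_px": 0}
    area = sum(1 for row in mask for value in row if value)
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return {
        "area_px": area,
        "length_px": max(height, width),
        "diameter_px": min(height, width),
    }

# Hypothetical 4x4 mask containing a 3x2 object.
example_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
result = mask_measurements(example_mask)
```

Converting these pixel values to physical units would need a known scale, which the abstract does not describe.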

Cape Town University OpenUCT

Unsupervised Machine Learning Application for the Identification of Kimberlite Ore Facie using Convolutional Neural Networks and Deep Embedded Clustering

Author: Langton Sean
Publication date: 2022

Mining is a key economic contributor to many regions globally, especially those in developing nations. The design and operation of the processing plants associated with each of these mines is highly dependent on the composition of the feed material. The aim of this research is to demonstrate the viability of implementing a computer vision solution to provide online information on the composition of material entering the plant, thus allowing the plant operators to adjust equipment settings and process parameters accordingly. Data is collected in the form of high resolution images captured every couple of seconds of material on the main feed conveyor belt into the Kao Diamond Mine processing plant. The modelling phase of the research is implemented in two stages. The first stage involves the implementation of a Mask Region-based Convolutional Neural Network (Mask R-CNN) model with a ResNet 101 CNN backbone for instance segmentation of individual rocks from each image. These individual rock images are extracted and used for the second stage of the modelling pipeline, which utilizes an unsupervised clustering method known as Convolutional Deep Embedded Clustering with Data Augmentation (ConvDEC-DA). The clustering stage provides a method to group feed material rocks into their respective types or facie using features developed from the auto-encoder portion of the ConvDEC-DA modelling. While this research focuses on the clustering of Kimberlite rocks according to their respective facie, similar implementations are possible for a wide range of mining and rock types.

Cape Town University OpenUCT

Hospital readmission risk

Author: Mugova Amos
Publication date: 2025

Hospital readmissions are a significant challenge in healthcare, as they lead to increased costs, higher risk of mortality, treatment complications, and patient distress. This minor dissertation, set within the South African healthcare framework, investigates the potential of both traditional clinical screening tools and advanced statistical learning methods for predicting hospital readmission risk. The methods considered include the LACE score, decision trees, logistic regression, random forests, gradient-boosting methods, and neural networks. The study uses data from South Africa's privately insured demographic, provided by a private insurer. It includes a comprehensive array of patient information such as demographics, prescribed medications, medical procedures undergone, and historical hospital usage. Feature selection methods were used to identify relevant variables for model training, and the effectiveness of these variables was assessed based on their ability to differentiate between patients at risk of hospital readmission within 30 days after discharge. The statistical learning methods' efficacy was measured using several performance indicators, such as prediction accuracy, F1 score, Area Under the Receiver Operating Characteristics Curve (AUC), Area Under the Precision-Recall Curve (AUC-PR), and the Matthews Correlation Coefficient (MCC). The study found that the neural network model outperformed the other statistical learning methods evaluated across various metrics. Moreover, the research extends the range of variables used to predict hospital readmissions beyond the traditional LACE score, incorporating critical factors such as the frequency and costs of previous hospital visits, expenses related to specialist services, patient age, and the primary diagnosis category.
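
Two of the performance indicators named above, the F1 score and the Matthews Correlation Coefficient, can be computed directly from a binary confusion matrix. The sketch below uses the standard definitions. The function name and the example counts are hypothetical and are not results from the dissertation.

```python
import math

def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) from confusion-matrix counts.

    Both metrics fall back to 0.0 when their denominator is zero.
    """
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0
    return f1, mcc

# Hypothetical counts: a perfect classifier and an uninformative one.
perfect = f1_and_mcc(tp=5, fp=0, fn=0, tn=5)
uninformative = f1_and_mcc(tp=1, fp=1, fn=1, tn=1)
```

MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside F1 when the positive class (here, readmission) is rare.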

Cape Town University OpenUCT

soMLier: A South African Wine Recommender System

Author: Redelinghuys Joshua
Publication date: 2023

Though several commercial wine recommender systems exist, they are largely tailored to consumers outside of South Africa (SA). Consequently, these systems are of limited use to novice wine consumers in SA. To address this, the aim of this research is to develop a system for South African consumers that yields high-quality wine recommendations, maximises the accuracy of predicted ratings for those recommendations and provides insights into why those suggestions were made. To achieve this, a hybrid system “soMLier” (pronounced “sommelier”) is built in this thesis that makes use of two datasets. Firstly, a database containing several attributes of South African wines such as the chemical composition, style, aroma, price and description was supplied by wine.co.za (a SA wine retailer). Secondly, for each wine in that database, the numeric 5-star ratings and textual reviews made by users worldwide were further scraped from Vivino.com to serve as a dataset of user preferences. Together, these are used to develop and compare several systems, the most optimal of which are combined in the final system. Item-based collaborative filtering methods are investigated first along with model-based techniques (such as matrix factorisation and neural networks) when applied to the user rating dataset to generate wine recommendations through the ranking of rating predictions. Respectively, these methods are determined to excel at generating lists of relevant wine recommendations and producing accurate corresponding predicted ratings. Next, the wine attribute data is used to explore the efficacy of content-based systems. Numeric features (such as price) are compared along with categorical features (such as style) using various distance measures and the relationships between the textual descriptions of the wines are determined using natural language processing methods. These methods are found to be most appropriate for explaining wine recommendations. 
Hence, the final hybrid system makes use of collaborative filtering to generate recommendations, matrix factorisation to predict user ratings, and content-based techniques to rationalise the wine suggestions made. This thesis contributes the “soMLier” system that is of specific use to SA wine consumers as it bridges the gap between the technologies used by highly developed existing systems and the SA wine market. Though this final system would benefit from more explicit user data to establish a richer model of user preferences, it can ultimately assist consumers in exploring unfamiliar wines, discovering wines they will likely enjoy, and understanding their preferences for SA wine.
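
To illustrate the item-based collaborative filtering step mentioned in the abstract, the sketch below predicts a user's rating for a wine as a similarity-weighted average of that user's ratings for other wines. Cosine similarity, the data layout, the function names and the wine names are all assumptions for illustration. The abstract does not state which similarity measure the thesis used.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two items' ratings (dicts of user -> rating)."""
    dot = sum(a[user] * b[user] for user in a if user in b)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_rating(user, target, ratings):
    """Similarity-weighted average of the user's ratings for other items.

    Returns None when the user has rated no item similar to the target.
    """
    numerator = denominator = 0.0
    for item, by_user in ratings.items():
        if item == target or user not in by_user:
            continue
        similarity = cosine_similarity(ratings[target], by_user)
        if similarity > 0:
            numerator += similarity * by_user[user]
            denominator += similarity
    return numerator / denominator if denominator else None

# Hypothetical 5-star ratings keyed by wine, then by user.
ratings = {
    "pinotage": {"u1": 5, "u2": 4},
    "chenin": {"u1": 5, "u2": 4, "u3": 3},
    "shiraz": {"u3": 1},
}
prediction = predict_rating("u3", "pinotage", ratings)
```

Ranking a user's unrated wines by such predictions gives a recommendation list. In the thesis, the rating predictions themselves come from matrix factorisation.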

Cape Town University OpenUCT

Word Sense Disambiguation in the domain of Sentiment Analysis through Deep Learning

Author: Baiju Vedanth
Publication date: 2023

Sentiment analysis is a major component of Natural Language Processing (NLP). Even though continuous improvements in NLP are being made, word disambiguation remains a complex problem within the domain of sentiment analysis (Navigli, 2009). Word Sense Disambiguation (WSD) is the problem of identifying the correct sense of ambiguous words in a sentence, since many words have multiple meanings depending on the context in which they are used. Although deep learning continues to advance within the NLP domain, WSD is still a task in which deep learning is yet to be fully explored. Whilst there is research on WSD as a whole, there is limited research on WSD within the domain of sentiment analysis (Seifollahi and Shajari, 2019). The proposed research explores the task of WSD in the domain of sentiment analysis through recent advances in deep neural networks, with a specific focus on 1D Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) algorithms. Sentiments expressed in text sourced from the Amazon product reviews data were analysed using 1D CNN and LSTM deep learning algorithms. The Amazon product reviews data is segmented by product category, which serves as a context category. The effectiveness of each algorithm was evaluated from a statistical performance and efficiency perspective. It was found that including context as a model input improves the model's out-of-sample performance compared to a model without context as an input. In addition, it was observed that including more context categories as an input improved the out-of-sample performance of both the 1D CNN and LSTM algorithms. Furthermore, the 1D CNN exhibited superior performance over the LSTM model from a statistical and efficiency standpoint.
Given that there has not been a considerable amount of research exploring the application of deep learning to the problem of WSD within sentiment analysis, the findings of this research provide a baseline of knowledge for future exploration and applications of WSD in sentiment analysis.

Cape Town University OpenUCT

Insurance recommendation engine using a combined collaborative filtering and neural network approach

Author: Pillay Prinavan
Publication date: 2021

A recommendation engine for insurance modelling was designed, implemented and tested using a neural network and collaborative filtering approach. The recommendation engine aims to suggest suitable insurance products for new or existing customers, based on their features or selection history. The collaborative filtering approach used matrix factorization on an existing user base to provide recommendation scores for new products to existing users. The content-based method used a neural network architecture which utilized user features to provide a product recommendation for new users. Both methods were deployed using the TensorFlow machine learning framework. The hybrid approach helps solve cold start problems, where users have no interaction history. The collaborative filtering method produced a root mean square error of 0.13 on an implicit feedback rating scale of 0-1, and an overall Top-3 classification accuracy (ability to predict one of the top 3 choices of a customer) of 83.8%. The neural network system achieved an accuracy of 77.2% on Top-3 classification. The system thus achieved good training performance and, given further modifications, could be used in a production environment.
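
The two metrics reported above can be sketched as follows. Top-3 accuracy is read here as the share of customers whose chosen product is among the three highest-scored products, which is one common interpretation and may differ from the thesis's exact definition. The function names and data are hypothetical.

```python
import math

def top_k_accuracy(scores, chosen, k=3):
    """Share of rows whose chosen product index is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, chosen):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(chosen)

def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical recommendation scores for two customers over four products.
scores = [
    [0.1, 0.5, 0.3, 0.9],  # product 0 is ranked last, so this row is a miss
    [0.8, 0.1, 0.2, 0.3],  # product 0 is ranked first, so this row is a hit
]
accuracy = top_k_accuracy(scores, chosen=[0, 0], k=3)
error = rmse([1.0, 0.0], [0.0, 0.0])
```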

Cape Town University OpenUCT

Predicting district level HIV prevalence in South Africa using medicine ordering data

Author: Liebenberg Juandre
Publication date: 2025

The Human Immunodeficiency Virus has been at the forefront of South Africa's public health challenges, placing the healthcare system under immense pressure. As a result of HIV planning by policymakers, more than 5.5 million People Living with HIV have access to antiretroviral treatment at present. Dynamic, mechanistic models such as the Thembisa and Naomi Bayesian models have been used to generate provincial and district-level estimates such as HIV prevalence, People Living with HIV, and the number of residents on antiretroviral treatment. An alternative methodology for estimating drug utilisation and predicting HIV estimates was explored by using medicine ordering data from 2020 to 2022 as the primary input for analysis. Two objectives were set out. The first was a drug utilisation analysis aimed at approximating the number of individuals per 1000 inhabitants per day taking antiretroviral drugs, to determine whether adequate stock was ordered at district and provincial levels. The second was to predict HIV prevalence by fitting panel data and spatial linear models to predict district prevalence and People Living with HIV; the estimates for People Living with HIV were converted to prevalence so that the directly estimated prevalence could be compared with the calculated prevalence. Results from the drug utilisation analysis suggested that district municipalities hold insufficient stock to meet the demands of those afflicted with the disease. In contrast, larger metropolitan municipalities hold excess medication, implying that people travel across district boundaries to receive treatment. The fitted spatial models generated better prevalence estimates than fixed-effect panel data models for the predicted and calculated prevalence, with root mean square error metrics of 0.009 (0.87%) and 0.012 (1.24%), compared with 0.012 (1.21%) and 0.015 (1.53%) for the fixed-effect panel data models.
The impact of high quantities of antiretroviral drugs ordered by metropolitan municipalities resulted in an underestimation of prevalence in those regions due to the negative relationship between the dependent variable Prevalence and the independent Quantity variable. From the spatial models fitted, the best performing spatial model accurately estimated the prevalence rates for 51 out of 52 districts, which fell within the acceptable range defined by the Naomi Model. The results of the study have shown that the use of ordering data to predict disease prevalence has the potential to serve as an alternative methodology in the absence of established models
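
The abstract does not name the drug utilisation metric. A common choice for "per 1000 inhabitants per day" figures is the WHO-style defined daily dose (DDD) rate, sketched below together with the conversion from People Living with HIV to prevalence. The use of DDDs, the function names and every number here are assumptions for illustration only.

```python
def ddd_per_1000_inhabitants_per_day(total_mg, ddd_mg, population, days):
    """Number of defined daily doses consumed per 1000 inhabitants per day."""
    total_ddds = total_mg / ddd_mg
    return total_ddds * 1000 / (population * days)

def prevalence_from_plhiv(plhiv, population):
    """Convert a People Living with HIV estimate to a prevalence proportion."""
    return plhiv / population

# Hypothetical district: 100 000 residents, one year of orders,
# and a drug with an assumed DDD of 300 mg.
utilisation = ddd_per_1000_inhabitants_per_day(
    total_mg=109_500_000, ddd_mg=300, population=100_000, days=365)
prevalence = prevalence_from_plhiv(plhiv=12_000, population=100_000)
```

Under these assumptions, a utilisation of 10 would mean that roughly 10 in every 1000 residents could take one daily dose each day from the ordered stock.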

Cape Town University OpenUCT

Cape Town Airbnb price prediction: an exploration of spatial statistic and machine learning methods

Author: Williams Courtney
Publication date: 2024

This thesis predicts the prices of Airbnb listings in Cape Town, South Africa, and in doing so investigates the price determinants in the market. Using data from InsideAirbnb, traditional, spatial and machine learning models are compared. The Cape Town Airbnb market has significant spatial correlation and heterogeneity. Traditional models such as OLS regression do not account for this spatial dependence, whereas spatial models do. Accounting for spatial effects does improve predictive performance, but not enough to outperform non-spatial, non-linear machine learning models. While Airbnb is a new and unique platform, the most important price determinants, such as property type, location and amenities, are consistent with those of traditional housing and accommodation markets.

Cape Town University OpenUCT

Modelling attrition in the Eastern Cape public health system using multilevel survival analysis and machine learning methods

Author: Perrie Cailin
Publication date: 2024

The size of South Africa's public health workforce is influenced by many factors including, but not limited to, inter-facility transfers, emigration, voluntary exits, illness, death and retirement. Understanding the rate at which public health workers exit or move within the public health system (i.e. the attrition rate), is essential for adequately formulating effective workforce policies and strategies. South Africa's public health system budget currently accounts for an annual 5% attrition rate for health facilities in general. This rate does not consider fluctuations in attrition rates between cadres, across facilities, or across districts. Presently, there are no guidelines or models for predicting attrition within the Eastern Cape (EC) public health care system from an individual, cadre, facility, or district level. As a result, staffing levels are determined entirely by the discretion of facility or departmental managers. The purpose of this investigation was, therefore, to explore and utilize human resource (HR) data within South Africa's public healthcare system, with specific focus on the EC province, to predict attrition rates within and across cadres, health facilities, and districts. The study places a large focus on using the findings of the study to improve budgeting and health care staffing levels. The study thus aims to develop predictive models that are capable of handling data that is hierarchical in nature, use these models to identify level specific factors that both negatively and positively impact annual attrition rates, and compare predictive models to determine the most effective model for predicting attrition rates in the EC public health sector. The study further aims to perform a historical data analysis on the HR data to identify areas of high concern regarding attrition. 
Based on a preliminary and historical exploratory data analysis (EDA) of the EC province's public health HR data, the annual attrition rates between 2010 and 2020 have consistently exceeded this budgeted 5%, with the annual attrition rate in some years reaching as high as 15.65%. The preliminary analysis further indicated that attrition rates are subject to high variation when computed at different levels (i.e. cadre and facility level groupings) as well as across different years. Consequently, the Eastern Cape Department of Health (EC DOH) has been historically and holistically under-budgeting for attrition. Additionally, by catering for attrition at a provincial level only, the department has been neglecting the effects that within-group and between-group variation in attrition has on budget formulation. The historical EDA further identified several cadres that consistently experienced high levels of attrition, namely the Medical Services, Nursing, and Primary Health Care cadres. The job titles that fall within these cadres (i.e. Medical Specialists, Clinic Specialists, and Nurses) are considered critical to the functioning of any health facility as they are responsible for providing medical care to patients. The historically high attrition levels obtained in these cadres are, therefore, alarming as they suggest that the EC province can expect to consistently see the same or a degrading level of patient care in the years to come. The findings from the historical EDA, and the potential risks associated with over-budgeting or under-budgeting for attrition, suggest that there is a financial incentive for the EC DOH to develop models capable of accurately predicting future attrition rates within and between multiple levels within the EC province.
The application of both statistical and machine learning (ML) modelling techniques was thus explored in this investigation. However, only one statistical modelling method (multilevel discrete-time event models) and three ML modelling methods (multi-layer perceptron neural networks, generalized linear mixed-model trees, and tree-based mixed effect models) were explored, owing to their potential ability to handle, and effectively model, the complex multilevel and longitudinal HR data available for use in this study. Unfortunately, all multilevel machine learning models explored failed to converge, required excessive computational time that forced an abort, or simply performed poorly when evaluated on unseen data. Based on these findings, and within the limitations of the study scope, it is accepted that these three modelling methods are unable to outperform traditional multilevel statistical methods at this time. The multilevel discrete-time event models, however, are able to handle the complex data used in this investigation. Based on model performance metrics, the best multilevel discrete-time event model developed in this investigation is considered feasible for use in attrition prediction for the EC DOH. The model can further be used to determine time-indicator and healthcare worker level variables influencing attrition. Overall, the insights gained from this investigation can be used to help guide intervention planning, optimize HR capacity planning processes and, in turn, improve overall budgeting for the EC health system. The findings and limitations of this investigation, however, open up opportunities for future work, both as improvements to, or extensions of, the data preparation processes and as model formulations and optimizations.
Such follow-up work may include exploring different attrition definitions and their impact on the investigation's findings, exploring methods for reducing HR healthcare data integrity issues, and provisioning or implementing re-sampling techniques, different cadre grouping strategies, or virtual machines to improve the performance of the machine learning models proposed.
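
As background to the discrete-time event models discussed above, the sketch below computes an empirical discrete-time hazard (exits divided by staff at risk in each period) and the implied retention curve. It is a single-level illustration with hypothetical counts. It has none of the multilevel structure of the models developed in the thesis, and the function names are assumptions.

```python
def discrete_time_hazards(at_risk, exits):
    """Empirical hazard per period: exits divided by staff at risk."""
    return [e / n if n else 0.0 for n, e in zip(at_risk, exits)]

def retention_curve(hazards):
    """Probability of still being in post after each period."""
    remaining = 1.0
    curve = []
    for hazard in hazards:
        remaining *= 1.0 - hazard
        curve.append(remaining)
    return curve

# Hypothetical cadre: 100 staff at risk in year 1, 90 in year 2.
hazards = discrete_time_hazards(at_risk=[100, 90], exits=[10, 9])
retained = retention_curve(hazards)
```

With these counts the annual hazard is 10% in both years, which is twice the 5% attrition rate that the abstract says the budget assumes.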

Cape Town University OpenUCT