Visual scene recognition with biologically relevant generative models
This research focuses on developing visual object categorization methodologies based on machine learning techniques and biologically inspired generative models of visual scene recognition. Modelling the statistical variability in visual patterns, in the space of features extracted from them by an appropriate low-level signal processing technique, is an important matter of investigation for both humans and machines. To study this problem, we have examined in detail two recent probabilistic models of vision: a simple multivariate Gaussian model as suggested by Karklin & Lewicki (2009) and a restricted Boltzmann machine (RBM) proposed by Hinton (2002). Both models have been widely used for visual object classification and scene analysis tasks. This research highlights that these models are not sufficient on their own to perform the classification task, and suggests the Fisher kernel as a means of inducing discrimination into these models for classification power. Our empirical results on standard benchmark data sets reveal that the classification performance of these generative models can be boosted significantly, close to state-of-the-art performance, by drawing a Fisher kernel from compact generative models that computes the data labels in a fraction of the total computation time. We compare the proposed technique with other distance-based and kernel-based classifiers to show how computationally efficient the Fisher kernels are. To the best of our knowledge, a Fisher kernel has not been drawn from the RBM before, so the work presented in the thesis is novel in terms of both its idea and its application to vision problems.
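For readers unfamiliar with the construction, the standard textbook definition of a Fisher kernel drawn from a generative model p(x | θ) is sketched below. The symbols U, F and K are notation assumed here for illustration; they are not taken from the thesis itself, which may use a different normalisation or an approximation of F.

\[
U_x = \nabla_{\theta} \log p(x \mid \theta), \qquad
F = \mathbb{E}_{x \sim p(\cdot \mid \theta)}\!\left[ U_x U_x^{\top} \right], \qquad
K(x, y) = U_x^{\top} F^{-1} U_y
\]

Here U_x is the Fisher score of a sample x, F is the Fisher information matrix, and K is the resulting kernel, which can be passed to any kernel-based discriminative classifier.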
Detecting moments of change and suicidal risks in longitudinal user texts using multi-task learning
This work describes the classification system proposed for the Computational Linguistics and Clinical Psychology (CLPsych) Shared Task 2022. We propose a multi-task learning approach with a bidirectional long short-term memory (Bi-LSTM) model for predicting changes in a user's mood (Task A) and their suicidal risk level (Task B). The two classification tasks have previously been solved either independently or in an augmented way, where the output of one task is leveraged for learning the other; this work instead proposes an 'all-in-one' framework that jointly learns the related mental health tasks. Our experimental results (ranked top for Task A) suggest that the proposed multi-task framework outperforms the alternative single-task frameworks submitted to the challenge, as evaluated via the timeline-based and coverage-based performance metrics shared by the organisers. We also assess the potential of various feature embedding schemes that could prove useful in initialising the Bi-LSTM model for better multi-task learning in the mental health domain.
Revisiting deep fisher vectors: using fisher information to improve object classification
Although deep learning models have become the gold standard for achieving outstanding results on a large variety of computer vision and machine learning tasks, kernel methods have not fallen out of favour, because they can outperform deep learning on a number of occasions. Given the potential of kernel techniques, prior works have also proposed hybrid approaches that combine deep learning with kernel learning to complement their respective strengths and weaknesses. This work develops this idea further by introducing an improved version of Fisher kernels derived from deep Boltzmann machines (DBM). Our improved deep Fisher kernel (IDFK) utilises an approximation of the Fisher information matrix to derive improved Fisher vectors. We show that IDFK can be utilised to retain a high degree of class separability, making it appropriate for classification and retrieval tasks. The efficacy of the proposed approach is evaluated on three benchmark data sets: MNIST, USPS and Alphanumeric, showing an improvement in classification performance over existing kernel approaches, and performance comparable to deep learning methods, but with much reduced computational costs. Using explainable AI methods, we also demonstrate why our IDFK leads to better classification performance.
Socio-technical trust for multi-modal hearing assistive technology
The landscape of opportunity is rapidly changing for audio-visual (AV) hearing assistive technology. While hearing assistive devices, such as hearing aids, have traditionally been developed for deaf and hard of hearing (DHH) communities, the ubiquitous use of in-ear technology and recent advances in edge computing are reformulating what drives research and development in this domain. With that come new challenges to consider from the perspective of multiple stakeholders. In this position paper, we elaborate on seven key socio-technical challenges that may impede the adoption of trustworthy multi-modal hearing assistive technologies. We also draw upon a recent survey being piloted in the UK to examine perceptions of trust for audio systems in the context of human rights. We strongly encourage the research community to consider trust as a factor in developing new AV assistive hearing technologies, as trust may ultimately drive adoption of this technology within broader society.
ConversationMoC: encoding conversational dynamics using multiplex network for identifying moment of change in mood and mental health classification
Understanding mental health conversation dynamics is crucial, yet prior studies have often overlooked the intricate interplay of social interactions. This paper introduces a unique conversation-level dataset and investigates the impact of conversational context in detecting Moments of Change (MoC) in individual emotions and classifying Mental Health (MH) topics in discourse. In this study, we differentiate between analyzing individual posts and studying entire conversations, using sequential and graph-based models to encode the complex conversation dynamics. Further, we incorporate emotion and sentiment dynamics with social interactions using a graph multiplex model driven by Graph Convolution Networks (GCN). Comparative evaluations consistently highlight the enhanced performance of the multiplex network, especially when combining reply, emotion, and sentiment network layers. This underscores the importance of understanding the intricate interplay between social interactions, emotional expressions, and sentiment patterns in conversations, especially within online mental health discussions. We are sharing our new dataset (ConversationMoC) and models with the broader research community to facilitate further research.
Predicting acute pain levels implicitly from vocal features
Evaluating pain in speech represents a critical challenge in high-stakes clinical scenarios, from analgesia delivery to emergency triage. Clinicians have predominantly relied on direct verbal communication of pain, which is difficult for patients with communication barriers, such as those affected by stroke, autism, and learning difficulties. Many previous efforts have focused on multimodal data, which does not suit all clinical applications. Our work is the first to collect a new English speech dataset wherein we have induced acute pain in adults using a cold pressor task protocol and recorded subjects reading sentences out loud. We report pain discrimination performance as F1 scores from binary (pain vs. no pain) and three-class (mild, moderate, severe) prediction tasks, and support our results with explainable feature analysis. Our work is a step towards providing medical decision support for pain evaluation from speech to improve care across diverse and remote healthcare settings.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
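The difference between the two counting schemes can be illustrated with a minimal Python sketch. This is not the study's implementation: the function name `cocitation_counts`, the `max_authors` parameter and the toy data are all hypothetical. The sketch also assumes that any two distinct authors appearing among a citing paper's counted authors are co-cited once by that paper, including co-authors of the same cited work, which the study may treat differently.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, max_authors=1):
    """Count author co-citations over a collection of citing papers.

    reference_lists: one entry per citing paper; each entry is a list of
        cited works, and each cited work is an ordered list of author names.
    max_authors: how many leading authors of each cited work are counted
        (1 = traditional first-author counting, 5 = first-five counting).
    """
    counts = Counter()
    for refs in reference_lists:
        # Authors credited by this citing paper under the chosen scheme.
        cited_authors = set()
        for authors in refs:
            cited_authors.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, each citing two works.
papers = [
    [["Smith", "Jones"], ["Lee", "Kim"]],
    [["Smith", "Jones"], ["Jones", "Lee"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting, Jones and Lee never appear as a co-cited pair in this toy data, whereas counting up to five authors links them in both citing papers. This mirrors the abstract's point that the counting rule changes the data on which all later mapping is based.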
Insights from explainable AI in oesophageal cancer team decisions
Background: clinician-led quality control of oncological decision-making is crucial for optimising patient care. Explainable artificial intelligence (XAI) techniques provide data-driven approaches to unravel how clinical variables influence this decision-making. We applied global XAI techniques to examine the impact of key clinical decision-drivers, when mapped by a machine learning (ML) model, on the likelihood of receiving different oesophageal cancer (OC) treatment modalities from the multidisciplinary team (MDT). Methods: a retrospective analysis of 893 OC patients managed between 2010 and 2022 at our tertiary unit used a random forests (RF) classifier to predict four possible treatment pathways as determined by the MDT: neoadjuvant chemotherapy followed by surgery (NACT + S), neoadjuvant chemoradiotherapy followed by surgery (NACRT + S), surgery alone, and palliative management. Variable importance and partial dependence (PD) analyses then examined the influence of targeted high-ranking clinical variables within the ML model on treatment decisions, as a surrogate model of the MDT decision-making dynamic. Results: alongside guideline variables known to determine treatments, such as Tumour-Node-Metastasis (TNM) staging, age also proved highly important to the RF model (16.1% of total importance) on variable importance analysis. PD subsequently revealed that predicted probabilities for all treatment modalities change significantly after 75 years (p < 0.001). The likelihood of surgery alone and of palliative therapies increased for patients aged 75–85 years, but decreased for NACT/NACRT. Performance status divided patients into two clusters, which influenced all predicted outcomes in conjunction with age. Conclusion: XAI techniques delineate the relationship between clinical factors and OC treatment decisions. These techniques identify advanced age as heavily influencing decisions based on our model, with a greater role in patients with specific tumour characteristics. This study methodology provides the means for exploring conscious/subconscious bias and interrogating inconsistencies in team-based decision-making within the era of AI-driven decision support.
