Treasures @ UT Dallas

7697 research outputs found

Error Category Recognition in Procedural Videos With Vision-language Models

Author: Gouripeddi Venkata Sesha Bhavya Alekhya
Publication date: 2025

This study evaluates vision-language models (VLMs) for error category recognition in procedural videos. We use CaptainCook4D (Peddi et al., 2023), an egocentric 4D dataset, to enhance AI systems’ understanding of procedural learning and error recognition in instructional videos. The dataset consists of 384 long-form egocentric videos recorded in multiple real-world kitchen settings by various participants. Unique in its inclusion of both error-free and error-prone videos, the dataset provides a robust resource for evaluating models on zero-shot error activity recognition. The data was collected using HoloLens2 and GoPro Hero 11 devices, capturing various sensory inputs, including depth and RGB video, hand and head tracking, and IMU data. Each recipe is represented as a task graph detailing step-wise instructions and dependencies, which aids in evaluating AI models’ comprehension capabilities. Two VLMs, Video-LLaVA (Lin et al., 2023) and TimeChat (Ren et al., 2024), were employed to assess the dataset’s utility in zero-shot error recognition tasks (Khattak et al., 2024). Video-LLaVA (Lin et al., 2023), an extension of the LLaVA model, integrates video processing for enhanced inference, while TimeChat (Ren et al., 2024) focuses on long video understanding through a time-sensitive multi-modal framework. Experiments involved prompt engineering using task graphs and a prompt-and-predict paradigm for zero-shot error recognition, testing the models’ ability to detect procedural mistakes. Evaluation metrics included precision, recall, and F1 scores, which measure how accurately the models classify errors. The study underscores the models’ potential to detect complex errors in procedural activities in egocentric videos.
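
The abstract names precision, recall, and F1 as the evaluation metrics. As a minimal sketch of how such per-category scores could be computed for single-label predictions, the Python below uses only the standard library. The function name `per_class_prf` and the category labels in the usage note are hypothetical and are not taken from the thesis or from CaptainCook4D.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions.

    Hypothetical helper for illustration; the thesis's actual evaluation
    code is not described in the abstract.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this category
        else:
            fp[pred] += 1    # predicted category was wrong
            fn[truth] += 1   # true category was missed
    scores = {}
    for lab in labels:
        predicted = tp[lab] + fp[lab]
        actual = tp[lab] + fn[lab]
        prec = tp[lab] / predicted if predicted else 0.0
        rec = tp[lab] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[lab] = (prec, rec, f1)
    return scores
```

For example, `per_class_prf(["order", "timing", "none", "order"], ["order", "none", "none", "timing"])` gives the made-up category `"order"` a precision of 1.0 and a recall of 0.5, because one of its two true instances was predicted as something else.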

Selecting the Number of Components for Two-table Multivariate Methods

Author: Moraglia Luke E 1997-
Publication date: 2025

Research in cognition and neuroscience often involves analyzing relationships between two sets of variables collected on the same set of observations. These “two-table” relationships are commonly analyzed using three related component-based methods—partial least squares correlation (PLSC), canonical correlation analysis (CCA), and redundancy analysis (RDA). However, selecting the appropriate number of components to retain in these methods remains a challenge. Several stopping rules—rules that determine the number of components to keep—have been developed for these two-table methods, but their performances have not been thoroughly evaluated. Further, many stopping rules have only been applied to one of the two-table methods despite their relevance for all three methods, and there has been little exploration into modifications that might improve the performance of these stopping rules. Additionally, many rules do not have easily accessible software implementations. To address these gaps, this dissertation evaluated four existing stopping rules and several new modifications to these rules by using simulated data with a known number of true components to estimate the Type I error rates and the power of the stopping rules. Out of 34 variations of these rules, four or five best rules were identified for each two-table method. The Type I error and power of these best rules were further examined in terms of various characteristics of the data, including the number of observations, variables, true components, and the strength of the relationships between the tables, in order to identify one or two rules with superior performance that are recommended for future use. Additionally, the most popular stopping rule—a permutation test using the singular values as test statistics—is not supported by this study because it showed high Type I error across the simulated data. 
As an illustrative example, a PLSC analysis was included for a real dataset (a subset of the publicly available LEMON dataset). This analysis explored relationships between participants’ cognitive performance and physiological measurements on two components selected by several of the best stopping rules. To facilitate future applications, an R package called componentts was developed. The package implements the stopping rules and data simulation so that researchers can use and test the stopping rules with additional simulated data beyond the data in this study, or test new stopping rules and easily compare their results to the stopping rules evaluated here.
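
The permutation test on singular values mentioned in the abstract can be sketched roughly as follows. This is an illustrative NumPy version under stated assumptions, not the dissertation's implementation and not the API of its R package: the function names, the z-scoring step, the simulated loadings, and the `(count + 1) / (n_perm + 1)` p-value convention are all choices made here for the example.

```python
import numpy as np


def plsc_singular_values(X, Y):
    """Singular values of the cross-correlation matrix between two tables."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Xz.T @ Yz / (X.shape[0] - 1)
    return np.linalg.svd(R, compute_uv=False)


def permutation_pvalues(X, Y, n_perm=500, seed=0):
    """Permute the rows of Y to break the X-Y link; compare singular values."""
    rng = np.random.default_rng(seed)
    observed = plsc_singular_values(X, Y)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        exceed += plsc_singular_values(X, Y[perm]) >= observed
    # Add-one correction keeps p-values strictly positive (an assumption here).
    return observed, (exceed + 1) / (n_perm + 1)


def simulate_two_tables(n=200, seed=1):
    """Hypothetical simulation: one shared latent component plus noise."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, 1))
    X = latent @ np.ones((1, 4)) + rng.normal(size=(n, 4))
    Y = latent @ np.ones((1, 3)) + rng.normal(size=(n, 3))
    return X, Y
```

Calling `permutation_pvalues(*simulate_two_tables())` returns one p-value per component. Note that the abstract reports high Type I error for this rule, so the sketch illustrates the procedure being evaluated rather than a recommended rule.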

Shipboard Power Systems Load Monitoring and Energy Management for Resilience Enhancement

Author: Senemmar Soroush 1993-
Publication date: 2025

With the increasing electrification and complexity of next-generation shipboard power systems (SPS), ensuring real-time decision-making for operational resilience and fault detection has become critical. This thesis presents a comprehensive framework that integrates non-intrusive load monitoring (NILM), fault detection, and autonomous reconfiguration of SPS using advanced machine learning techniques. The NILM system employs a discrete wavelet transform for signal processing and a convolutional neural network (CNN) for real-time load status monitoring in a four-zone medium voltage direct current (MVDC) SPS. The model achieves over 98% accuracy in monitoring, including the identification of pulsed loads, and maintains functionality under extreme conditions such as cyber/physical attacks and noisy inputs. Furthermore, a Wavelet Graph Neural Network (WGNN) is introduced for non-intrusive fault detection, demonstrating accuracies above 99% for intrusive faults and 97% for non-intrusive scenarios. The WGNN model’s robustness to pulse loads and noise is validated through hardware-in-the-loop testing, ensuring high fidelity and low latency in real-world applications. Additionally, the framework incorporates an autonomous reconfiguration system using graph-based reinforcement learning (RL), which models the SPS reconfiguration as a Markov decision process. A graph convolutional network (GCN) is employed within the RL policy network to optimize the switching control policies, ensuring maximum power availability to loads during faults. The proposed approach effectively enhances the operational resilience and autonomy of shipboard power networks, ensuring real-time performance in both normal and emergency conditions.
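
To illustrate the discrete wavelet transform step, here is a minimal NumPy sketch of a Haar decomposition. The abstract does not say which wavelet family, decomposition depth, or features the thesis uses, so the Haar wavelet and both function names are assumptions made for this example.

```python
import numpy as np


def haar_dwt(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficients, each half the input length.
    """
    x = np.asarray(signal, dtype=float)
    if x.size % 2:
        raise ValueError("signal length must be even")
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)  # local average
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)  # local change
    return approx, detail


def haar_features(signal, levels):
    """Multi-level decomposition: [detail_1, ..., detail_levels, approx].

    The signal length must be divisible by 2**levels.
    """
    feats = []
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        feats.append(detail)
    feats.append(approx)
    return feats
```

A step in a measured signal, such as a load switching on, appears as a non-zero detail coefficient at the position of the step, which is the kind of localized feature a downstream classifier could consume.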

The Disordered Spirit: A Portrait of Francisco Amighetti as Seen by Laura Goldstein

Author: Goldstein Laura
Publication date: 2025

“The Disordered Spirit: Francisco Amighetti as seen by Laura Goldstein” addresses the paradox of the fragment and the whole inherent in the ideas of “essence” or “truth” through the work of Costa Rican artist and poet Francisco Amighetti, and offers alternative perspectives on the anxiety of loss in literary translation. This investigation highlights the overuse of discussions of loss in translation and argues for the value of the fragment, particularly in Amighetti’s work, which falls in the Modernist period, when poets and artists turned to the fragment through style, technique, and a confrontation with the past. More broadly, it argues that the fragment has value through the choices of creative processes, language, and expression; the multiplicity of subjective experiences; order and disorder; and the nature of memory and our universe. The dissertation also analyzes the creative work of Romanian-Brazilian writer Ștefan Baciu, who wrote poems responding to fragments of Amighetti’s poems, letters, and prose. Finally, it includes creative work by the author of the dissertation in the form of original poetry, poetry in translation, visual art (prints), and memoir, proposing that a translator can reveal the multiplicity of subjective experiences through the inclusion of their original creative work, especially when the translated poet is excluded from the canon, as Francisco Amighetti and other Costa Rican poets have been.

Hexaarylbenzene-based Monomers for Porous Polymers

Author: Afghani Michael Bijan 1999-
Publication date: 2025

Porous polymers are a type of porous material in which monomers form 2D or 3D polymers containing angstrom- to nanometer-scale pores. Key to formation of porous polymers is the use of monomers with specific symmetries such that they connect in a repeating 2D or 3D pattern containing pores. The three main types of crystalline porous polymer are covalent organic frameworks (COFs), metal-organic frameworks (MOFs), and hydrogen-bonded organic frameworks (HOFs). All three have shown promising applications resulting from their porosity, including gas storage, gas separation, use in organic electronics, catalysis, water harvesting, and removal of environment pollutants. However, large-scale application is limited by scalability of synthesis, processability, and expense of synthesizing monomers. To address limitations of scalability and expense of monomers, one goal is to synthesize monomers that start from cheap building blocks and to use synthetic methods allowing larger scale purification, such as recrystallization instead of column chromatography. Limitations of processability can be addressed with HOFs, which are highly reprocessable due to the reversibility of hydrogen bonds. Another way to improve processability is by investigating covalent organic macrocycles and metal-organic macrocycles, which possess many porosity properties of COFs and HOFs with the advantage of easier processability due to solubility of macrocycles in organic solvents. Chapter 1 of this thesis is a literature review of the three most common types of crystalline porous polymers: covalent organic frameworks (COFs), metal-organic frameworks (MOFs), and hydrogen-bonded organic frameworks (HOFs). Then metal-organic macrocycles and covalent organic macrocycles are introduced, which form materials analogous to MOFs and COFs, with permanent porosity properties such as gas sorption, but with the added advantage of solution processability. 
Chapter 2 introduces various organic reactions used in the synthesis of monomers for porous polymers. Since porous polymers can be seen as extensions of the trend from atoms, to molecules, to porous polymers, extensive knowledge of organic chemistry is needed to assemble atoms into molecules before advancing to assembling those molecules into porous polymers. Chapter 3 then brings those organic reactions together to introduce synthetic pathways towards hexaarylbenzene- and hexa-peri-hexabenzocoronene-based monomers for porous polymers. Our group has been interested in these types of monomers for their cheap, scalable synthesis, versatility in adjusting appended functional groups, and excellent CO2 sorption when used in porous polymers. Chapter 3 also introduces our work on pathways for synthesizing hexaarylbenzene-based monomers, using sequential Suzuki reactions to a central mixed halobenzene. Chapter 4 gives synthetic procedures I have improved upon and developed for synthesis of HPB-1,2-2A, and reports a method I found to grow single crystals of carboxylic acid-functionalized hexaarylbenzenes by vapor phase diffusion.

Fabrication and Characterization of Metallic Nanowires Integrated Microfluidic Channel

Author: Theeda Sumanth 1997-
Publication date: 2025

The rapid advancements in nano-manufacturing have accelerated the miniaturization of devices, giving rise to microfluidics, the science of manipulating fluids within channels at sub-millimeter scales. While early microfluidic devices focused on lab-on-chip systems, they were quickly implemented in a wide range of applications such as sensors, drug delivery, organ-on-chip systems, and microreactors. The integration of nanostructures, particularly nanowires, within microfluidic channels has further improved their functionality by leveraging increased surface area to optimize processes such as cell sorting, microfluidic mixing, and thermal management. However, traditional fabrication techniques for nanowires, such as Chemical Vapor Deposition and Vapor-Liquid-Solid growth, are complex and involve repetitive steps, which limits their broader adoption. In this study, we present a novel technique for fabricating nanowires from a platinum (Pt) based metallic glass (Pt57.5Cu14.7Ni5.3P22.5) via thermoplastic drawing and integrating them into a microfluidic channel. Metallic glasses are amorphous alloys, known for their fluid-like behavior when heated above their glass transition temperature. A custom-built thermo-mechanical setup was utilized to produce metallic nanowires from the cavities on a silicon substrate (mold). To selectively place the metallic nanowires on the mold, two distinct approaches were employed. In the first approach, metal masks made from aluminum foil and laser-cut stencils were used to pattern nanowires in selective regions. In the second approach, maskless lithography was employed to create molds with cavities only in desired areas. The metallic nanowire-patterned mold was then bonded with a Polydimethylsiloxane (PDMS) block to form a sealed microfluidic channel. Flow characterization was conducted to understand the effect of metallic nanowires on the fluid flow within the channel. 
Our findings revealed that long nanowires offered higher resistance at increased flow rates, while shorter cone-shaped nanowires offered greater resistance at lower flow rates. These results emphasize that nanowire length and morphology play a key role in determining the flow behavior. In addition to the flow studies, the potential for decorating these metallic nanowires with other materials, such as carbon nanotubes and gold-palladium nanoparticles, was explored using spin coating and deposition techniques. This opens new possibilities for functionalization of nanowires without the need for sophisticated fabrication processes. To address scalability, new fixtures were designed to extend the usable mold area, enabling the fabrication of metallic nanowire-integrated microfluidic channels for large-scale applications.

Toward Generalizable Models for Medical Imaging: Leveraging Topological Data Analysis and Deep Learning

Author: Yadav Ankur Rajendra 1996-
Publication date: 2025

The rapid advancement in designing machine learning and artificial intelligence software for biomedical research necessitates the development of models that are not only accurate but also generalizable across various data sets. This dissertation addresses the challenge of creating such generalizable models by leveraging topological data analysis (TDA) and deep learning techniques. We propose a novel framework that integrates TDA with deep learning to enhance the interpretability and performance of biomedical image analysis models. Traditional histopathological image analysis is labor-intensive and subjective, leading to variability in diagnoses. To overcome these challenges, we develop a methodology and software that utilize Persistent Homology (PH) from TDA to capture topological features of digital biomedical images, providing a detailed representation of tissue morphology that conventional approaches might overlook. The topological features are used with deep learning models to improve classification accuracy and robustness. In addition to PH, we employ Transfer Learning from pre-trained models, adapting them to specific biomedical imaging tasks. This approach addresses the scarcity of labeled data in medical imaging, enhancing the performance and efficiency of our models. Furthermore, continuous learning techniques enable our models to adapt to new data while retaining previously learned information, ensuring long-term relevance and effectiveness. The experimental results demonstrate significant classification accuracy and interpretability improvements across multiple biomedical imaging tasks, including histopathology and medical image segmentation. Specifically, our models achieved a classification accuracy of 94.87% in histopathological image analysis, outperforming traditional methods by a substantial margin. 
Notably, in classifying breast cancer and prostate cancer histopathological images, our models achieved accuracies of 95.3% and 93.8%, respectively, using relatively small models that leverage PH. The integration of PH and Transfer Learning proved remarkably effective, with models trained using these techniques achieving a 15% increase in accuracy over baseline models. This research contributes to the field by offering a novel methodology for integrating topological features into deep learning, paving the way for more effective and versatile biomedical imaging solutions. The developed models provide a scalable and adaptable approach to biomedical image analysis, with potential applications in cancer diagnosis and treatment planning. By enhancing the accuracy, efficiency, and scalability of diagnostic tools, this dissertation aims to improve patient outcomes and advance biomedical research

Evaluating the Performance of Multi-Hop Wireless Networks Employing Collision-Free Binary Countdown MAC

Author: Mahana Charlie Brent 2003-
Publication date: 2025

This thesis presents the development of a collision-free binary countdown MAC protocol for multi-hop wireless networks designed to ensure reliable communication while making stochastic performance guarantees. Our performance analysis of this protocol reveals that the performance of individual wireless nodes can only be meaningfully tuned by modifying the network topology. To address this challenge, we develop two modified versions of our binary countdown MAC protocol: node-weighted and flow-weighted binary countdown. In node-weighted binary countdown, it is possible to weight the transmission probability of individual wireless nodes. In flow-weighted binary countdown, it is possible to weight the transmission probability of individual flows at individual wireless nodes. These modifications provide flexible methods for tuning the performance of individual wireless nodes and flows in multi-hop networks employing our collision-free binary countdown MAC protocol.
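
For readers unfamiliar with binary countdown, the sketch below shows the classic textbook arbitration in a single collision domain: contenders send their ID bits from most to least significant, the channel acts as a logical OR, and a node that sent 0 but hears 1 withdraws. This is an assumption-laden illustration only. The thesis's multi-hop protocol and its node-weighted and flow-weighted variants are not specified in the abstract and are not modeled here, and the function name is hypothetical.

```python
def binary_countdown(contenders, id_bits):
    """Return the contender that wins classic binary countdown arbitration.

    contenders: iterable of unique non-negative integer IDs < 2**id_bits.
    The channel is modeled as a wired-OR of the bits sent in each slot.
    """
    active = set(contenders)
    if not active:
        raise ValueError("at least one contender is required")
    for bit in reversed(range(id_bits)):  # most significant bit first
        channel = any((node >> bit) & 1 for node in active)
        if channel:
            # Nodes that sent 0 but heard 1 drop out of this round.
            active = {node for node in active if (node >> bit) & 1}
    # Unique IDs guarantee exactly one survivor, so no collision occurs.
    (winner,) = active
    return winner
```

With contenders `0b0010`, `0b0100`, `0b1001`, and `0b1010` and `id_bits=4`, the node with ID `0b1010` wins, since the highest ID always survives every bit slot.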

Depression and Associative Recognition Memory: The Effects of Depression Symptoms

Author: Meldrum Sheila Renee 1988-
Publication date: 2025

Depression is a leading cause of disability globally, impacting millions from adolescence through adulthood. Major Depressive Disorder (MDD), characterized by persistent sadness and anhedonia, also commonly presents with cognitive impairments, notably difficulties in memory and concentration. While traditional recall-based tasks reveal memory deficits in groups of depressed individuals, they frequently fail to capture the effects of individual differences in memory deficits due to depression severity. However, when recognition tasks are sensitive enough to detect memory impairments in depressed participants, their performance on these tasks remains high enough to reveal inter-individual differences. These recognition-based tasks can capture individual differences by detecting when lower-effort, automatic cognitive strategies, such as a familiarity-based one, are being used to complete the task. We hypothesized that low-effort, familiarity-based strategies are utilized more by those who are experiencing higher levels of depression symptoms. The current study used an associative recognition task to better characterize memory impairments in individuals with depression symptoms by assessing differences in performance based on depression symptom severity. Using an associative memory task, specifically a face-name recognition task, we were able to evaluate both associative and simple recognition performance. We hypothesized that participants with greater depression severity would rely more on inefficient familiarity-based strategies, leading to impaired performance on associative tasks (which require greater effort and more efficient strategies) but preserved simple recognition abilities (for which low-effort, inefficient strategies are sufficient). Results showed that individuals with severe and moderate depression exhibited deficits in associative recognition. 
However, their associative memory performance did not significantly differ from the non-depressed group, indicating that differences in depression symptom severity were not detected by performance on the associative recognition task. In contrast, participants’ simple recognition performance was higher than associative recognition performance for all three groups, indicating that simple recognition was not impaired by depression symptom severity as we predicted. We were also able to replicate previous research on depression and comorbid anxiety, depression and decreased attentional control, and depression and self-reported episodic memory impairments. These findings underscore the potential value of associative recognition tasks in identifying cognitive deficits linked to depression severity and highlight the intricate relationship between mood disorders and memory functioning

Making Active Learning Work in the Real World

Author: Beck Nathan A 1999-
Publication date: 2025

Over the last decade, deep learning methods have achieved remarkable feats within the space of machine learning. Modern deep learning methods, however, require the use of immense quantities of data for training, which presents an immediate data acquisition cost. Of particular interest is active learning, a label-efficient paradigm, as a large barrier to entry for data acquisition is data annotation, which requires a tremendous amount of human effort. In active learning, one seeks to select a budget-constrained number of worthwhile unlabeled instances from a large source of unlabeled data that, when annotated, produce the largest gain in some performance metric after subsequent supervised training. To date, deep active learning methods have made tremendous progress in reducing annotation costs; however, an extremely large collection of active learning methods has only been proven on standard academic datasets, which tend to be relatively simplistic. Indeed, there are nuances in real-world data and real-world labeling that change how effective certain active learning strategies are. Hence, there is a need to better understand how to apply active learning in realistic scenarios. In this work, we seek to develop label-efficient labeling strategies for real-world complications in the data environment. We first perform an evaluation of deep active learning on image classification tasks to elicit realistic facets that affect existing active learning methods. Motivated by the surprising finding that state-of-the-art active learning methods tend to no longer dominate long-standing and very simple active learning methods when a number of common training techniques are applied, we turn our focus towards methods that mitigate various complications in real-world data such as rare classes, data redundancy, streaming environments, and so forth, frequently utilizing the recently proposed submodular information measures. 
To better understand when utilizing submodular information measures will be effective, we derive bounds on multiple selection characteristics of submodular information measures to theoretically validate their use as mechanisms for performing active learning selection. With these theoretical connections, we then utilize them to handle rare classes, data redundancy, out-of-distribution classes, and non-i.i.d. streaming environments. Additionally, the class of submodular mutual information functions provides useful weak labeling capabilities, serving as a powerful component in cold-start settings and in real-world labeling pipelines wherein the cost of labeling can be reduced with weak label suggestions. We conclude with an active learning toolkit that can be and has been used in real-world active learning pipelines.
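
As a rough illustration of budget-constrained selection with a submodular objective, the sketch below runs greedy maximization of the facility-location function, a standard submodular function. It is a simpler stand-in chosen for this example, not the submodular information measures studied in the dissertation and not the API of its toolkit. The function names and the RBF similarity are assumptions, and the similarity matrix is assumed to be non-negative.

```python
import numpy as np


def rbf_similarity(features):
    """Pairwise similarity exp(-||a - b||^2); values lie in (0, 1]."""
    diffs = features[:, None, :] - features[None, :, :]
    return np.exp(-(diffs ** 2).sum(axis=-1))


def greedy_facility_location(similarity, budget):
    """Greedily pick up to `budget` column indices maximizing
    f(A) = sum_i max_{j in A} similarity[i, j]."""
    n = similarity.shape[0]
    selected = []
    best_cover = np.zeros(n)  # current max similarity of each point to A
    for _ in range(min(budget, n)):
        # Marginal gain of each candidate j over the current coverage.
        gains = np.maximum(similarity - best_cover[:, None], 0.0).sum(axis=0)
        if selected:
            gains[selected] = -np.inf  # never re-pick a chosen point
        j = int(np.argmax(gains))
        selected.append(j)
        best_cover = np.maximum(best_cover, similarity[:, j])
    return selected
```

On unlabeled points that form two tight clusters, a budget of two picks one point from each cluster, which shows how a diminishing-returns objective discourages redundant labeling requests.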

2 full texts, 7,697 metadata records. Updated in last 30 days.
