Long-Context Sequence Models for Image Retrieval
Image retrieval is an important problem in computer vision with many applications. Retrieval is usually cast as a metric learning problem in which a model is trained under a distance or similarity objective to compare pairs of inputs. In this thesis, we introduce the Extractive Image Re-ranker (ExtReranker), a solution that takes as input local features corresponding to an image query and a group of gallery images, and outputs a refined ranking list through a single forward pass. This model can be used for image retrieval, where a query image is typically compared to a large database of images using global features, and the retrieved gallery of images is then re-ranked based on more refined local features. ExtReranker formulates the re-ranking problem as a span extraction task analogous to the text span extraction problem in natural language processing. In contrast to pair-wise correspondence learning, our approach leverages long-context sequence models to effectively capture the list-wise dependencies between query and gallery images at the local-feature level. Our approach achieves superior performance compared with other re-rankers on established image retrieval benchmarks (CUB-200, SOP, and In-Shop). ExtReranker also achieves state-of-the-art re-ranking performance relative to alternative methods on ROxford and RParis while using 10X fewer local descriptors and having 5X lower forward latency.
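The two-stage pipeline described above (a global-feature shortlist followed by local-feature re-ranking) can be sketched roughly as follows. This is a minimal illustration only: the function names, the toy vectors, and the hand-written "mean best match" local score are all assumptions made for this sketch, and the local score is a simple stand-in, not the learned ExtReranker model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def shortlist(query_global, gallery_globals, k):
    """Stage 1: rank gallery images by global-feature similarity and keep the top k."""
    ranked = sorted(
        gallery_globals,
        key=lambda name: cosine(query_global, gallery_globals[name]),
        reverse=True,
    )
    return ranked[:k]


def local_score(query_locals, image_locals):
    """Hypothetical stand-in score: mean, over query descriptors, of the best match."""
    best = [max(cosine(q, d) for d in image_locals) for q in query_locals]
    return sum(best) / len(best)


def rerank(query_locals, candidates, gallery_locals):
    """Stage 2: reorder the shortlist using local descriptors."""
    return sorted(
        candidates,
        key=lambda name: local_score(query_locals, gallery_locals[name]),
        reverse=True,
    )


# Toy features invented for illustration, not benchmark data.
gallery_globals = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
gallery_locals = {
    "a": [[1.0, 0.0]],
    "b": [[1.0, 0.0], [0.0, 1.0]],
    "c": [[0.0, 1.0]],
}
candidates = shortlist([1.0, 0.0], gallery_globals, k=2)
final_order = rerank([[1.0, 0.0], [0.0, 1.0]], candidates, gallery_locals)
```

In this toy case the global stage prefers image "a", while the local stage promotes "b" because it matches both query descriptors. A list-wise model such as the one in the abstract would score all candidates jointly rather than one image at a time.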
A Career Stage Perspective on the Impact of Fully Remote and Hybrid Work: Pathways to Engagement and Exhaustion
Fully remote and hybrid work have increased in recent years; however, the extent to which these work arrangements impact employees based on their career stage remains relatively unexplored. The majority of research has examined hybrid employees, leaving much to be understood about the experience of fully remote workers. Drawing on job demands-resources theory (Demerouti et al., 2001) and career stage theory (Super, 1957; Super et al., 1988), this study assessed the extent to which fully remote employees, compared to hybrid employees, perceived role ambiguity and learning opportunities. Additionally, this study examined the extent to which role ambiguity and perceived learning opportunities related to exhaustion and engagement directly, in addition to examining the moderating effect of a person’s career stage (operationalized via occupational tenure). A survey-based study of 520 working adults with an average age of 39 years (SD = 10.2) recruited from Prolific (258 hybrid workers, 262 fully remote workers) was conducted to test the hypotheses. The results failed to show a significant relationship between a person’s work arrangement and perceived learning opportunities, and while a significant relationship was found between work arrangement and role ambiguity, the result was in the opposite direction than expected. Both perceived learning opportunities and role ambiguity were positively related to engagement and exhaustion, respectively, as hypothesized. Moreover, although work arrangement was not indirectly related to engagement via perceived learning opportunities, it was related to exhaustion via role ambiguity such that fully remote workers tended to be less exhausted as a result of experiencing less role ambiguity. Additionally, no evidence was found for a moderating effect of career stage when operationalized as occupational tenure. However, exploratory analyses revealed a significant interaction between age and role ambiguity. 
Finally, in addition to examining the psychometric properties of the Adult Career Concerns Inventory – Short Form (Perrone et al., 2003), exploratory analyses suggested that fully remote workers may experience less role ambiguity due to lower levels of job complexity. This study contributes to the literature by providing insight into the differences between fully remote and hybrid workers and examining the reliability and validity of an alternative to time-based proxies of career stage.
Data-driven detection of mild traumatic brain injury from noninvasive sensors
Mild traumatic brain injuries (mTBIs) are the most common type of brain injury, accounting for the vast majority of head injuries. However, diagnosing mTBI remains a challenge, as clear indicators may not be present in symptom questionnaires or structural imaging modalities, which are not feasible for deployment at the site of the injury. Therefore, this study explores the development of a diagnostic algorithm that incorporates novel sets of biomarkers derived from noninvasive sensors, specifically electroencephalography (EEG) and electrocardiography (ECG). We explore the relationship between the neurophysiological sensor data, the clinical symptoms, and the doctor’s diagnosis. By exploring these relationships, we aim to improve upon existing methods for detecting mTBI, uncover novel biomarkers that may indicate dysfunction, and establish a new benchmark performance on publicly available mTBI data. The dataset used in this study was collected by Baylor College of Medicine and previously uploaded to the Federal Interagency Traumatic Brain Injury Research (FITBIR) Informatics System. We process this multivariate sensor data and extract features in the temporal, spectral, and spatial domains, extending advanced methods in network science and multivariate signal analysis. We evaluate an ensemble model of linear and nonlinear classifiers using cross-validation and further demonstrate the model’s performance on unseen holdout data. Integrating features from the noninvasive sensor data with clinical symptom questionnaires yields greater than 0.90 area under the receiver operating characteristic curve (AUROC) on holdout data, demonstrating the method’s utility as a diagnostic aid. Furthermore, our results indicate that EEG features alone achieve greater than 0.70 AUROC, an improvement over existing EEG-only approaches on this dataset.
The proposed work advances existing knowledge on detecting mild traumatic brain injuries by establishing a new benchmark performance on publicly available data and examining novel noninvasive biomarkers that may indicate cognitive or neurological dysfunction.
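For readers unfamiliar with the AUROC figures quoted above, the metric can be computed as the fraction of (positive, negative) pairs that a classifier ranks correctly. The sketch below shows that calculation and a simple score-averaging ensemble. The averaging rule, the function names, and the toy scores are assumptions made for illustration; the abstract does not say how the study combines its linear and nonlinear classifiers.

```python
def auroc(labels, scores):
    """AUROC via the pairwise (Mann-Whitney) identity; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_ensemble(score_lists):
    """Average per-example scores across models (a simple soft vote, assumed here)."""
    n_models = len(score_lists)
    return [sum(column) / n_models for column in zip(*score_lists)]


# Hypothetical toy scores, not data from the study.
labels = [1, 1, 1, 0, 0, 0]
linear_scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]
nonlinear_scores = [0.8, 0.7, 0.3, 0.4, 0.3, 0.2]
ensemble_scores = average_ensemble([linear_scores, nonlinear_scores])
```

On these toy numbers each model misranks a different case, so the averaged scores separate the classes better than either model alone. That is the usual motivation for combining classifiers with different inductive biases.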
Feedback-Regulated Cell Factories for Enhanced Therapeutic Protein Manufacturing
The production of protein therapeutics is critical for maintaining a continuous supply of life-saving medications available in clinics. The production of large quantities of protein therapeutics, however, remains a major challenge in biomanufacturing. Overexpression of secretory proteins results in the accumulation of unfolded and misfolded off-pathway intermediates in the endoplasmic reticulum (ER), causing proteotoxic stress and, thus, affecting cell viability and protein productivity. Proteotoxic stress leads to the activation of the Unfolded Protein Response (UPR), a series of signal transduction pathways regulating protein quality control mechanisms aimed at restoring homeostasis. Sustained UPR activation culminates with the induction of apoptosis. Current strategies for enhancing the production of therapeutic proteins have focused on the deregulated modulation of key components of the UPR. These strategies have resulted in limited and often protein-specific improvements as they may lead to metabolic burden, disruption of homeostatic systems, and cell adaptation. Deregulated modulation of the UPR also does not account for the natural population heterogeneity characteristic of high protein expression systems. To address these limitations, I developed feedback-regulated cell factories that can sense proteotoxic stress and, in response, modulate the UPR to enhance stress attenuation and delay cell death. This work describes my efforts to engineer sophisticated genetic circuits that can interface with the innate signal transduction pathways of the UPR. To explore strategies for modulating the UPR in response to stress induced by overexpression of therapeutic proteins, I first investigated the dynamics of activation of the UPR signaling pathways mediating stress attenuation and apoptosis upon expression of different levels of a model secretory protein. 
I then developed a two-module system for modulating the UPR that consists of a UPR sensor and an actuator component. The UPR sensor was developed by genetically engineering cells to link the activation of an early marker of UPR stress to the expression of an orthogonal transcription factor, which translates the detection of UPR induction into activation of a genetic circuit mediating user-defined modulation of the UPR. I leveraged this cell engineering approach to generate three sense-and-respond systems designed to (1) enhance protein folding and secretion by amplifying the stress attenuation pathway of the UPR, (2) delay cell death by silencing the UPR-mediated pro-apoptotic response, and (3) combine amplification of stress attenuation and delay of apoptosis. I demonstrate that this cell engineering approach enabled dynamic UPR modulation upon induction of ER stress. I also show that combining stress attenuation with apoptosis delay enhanced the production of the therapeutic enzyme tissue plasminogen activator and the bispecific antibody blinatumomab. The feedback-responsive cell factories reported in this study are an innovative strategy to dynamically adjust the innate cellular capacity to cope with proteotoxic stress for enhancing therapeutic protein manufacturing.
Beyond Dollars and Cents: Exploring Budgeting, Saving, and Financial Security in the Houston Area
This study explores Harris County residents’ financial security, looking at their budgeting and saving practices, the barriers they face to budgeting and saving, and how these practices relate to their ability to withstand economic shocks.
Taming Data and Transformers for Audio Generation
Generating ambient sounds is a challenging task due to data scarcity and often insufficient caption quality, making it difficult to employ large-scale generative models for the task. In this work, we tackle this problem by introducing two new models. First, we propose AutoCap, a high-quality and efficient automatic audio captioning model. By using a compact audio representation and leveraging audio metadata, AutoCap substantially enhances caption quality, reaching a CIDEr score of 83.2, a 3.2% improvement over the best available captioning model, with four times faster inference. Second, we propose GenAu, a scalable transformer-based audio generation architecture that we scale up to 1.25B parameters. Using AutoCap to caption clips from existing audio datasets, we demonstrate the benefits of data scaling with synthetic captions as well as model size scaling. When compared to state-of-the-art audio generators trained at similar size and data scale, GenAu obtains significant improvements of 4.7% in FAD score, 22.7% in IS, and 13.5% in CLAP score, indicating higher quality of generated audio than previous works. Moreover, we propose an efficient and scalable pipeline for collecting audio datasets, enabling us to compile 57M ambient audio clips, forming AutoReCap-XL, the largest available audio-text dataset, at 90 times the scale of existing ones. Our code, model checkpoints, and dataset will be made publicly available upon acceptance.
Towards Network-aware Sharing for Performance Interference Mitigation in Data Center Networks
In today’s data centers, cloud network resources are shared among different applications, services, and tenants. The sharing of the network is subjected to increasingly stringent performance requirements for high throughput, ultra-low latency, and large-scale deployment. However, different traffic can interfere with each other and lead to unpredictable network performance. There are two significant aspects of network issues resulting from traffic interference. On the one hand, partial or entire networks can be blocked; on the other hand, the network bottleneck link can be unfairly shared. The susceptibility of a network to traffic interference and the corresponding network issues depend on several factors, such as the design of the network, the protocols in use, and how traffic shares the network resources.
Network blocking, often caused by deadlocks, can severely degrade performance. Lossless Ethernet deployments, which aim to eliminate packet loss by using flow control protocols to pause data transmission and preserve buffer space, are particularly prone to deadlocks. To address the problem seamlessly, this thesis presents ITSY, a data plane system designed to detect and resolve deadlocks efficiently using initial triggers. It can detect deadlocks instantaneously with minimal overhead and prevent the recurrence of the same deadlock. Furthermore, ensuring fair network sharing among different traffic in data centers is challenging. Different traffic contributes to network congestion in varying degrees. Unfortunately, the network cannot differentiate these contributions and customize rate control for different traffic. As a result, malicious or selfish traffic can monopolize bandwidth and interfere with others, causing well-behaved traffic to suffer. In this thesis, we envision dealing with network unfairness in a two-pronged approach: providing bandwidth isolation among different traffic and mitigating burst occurrence to reduce latency interference. Specifically, we design Augmented Queue (AQ), a scalable in-network abstraction that provides precise bandwidth guarantees at the application, transport, and link layers. In addition, we propose Sentinel, a proactive and agile management mechanism to mitigate the adverse effects of microbursts in multi-tenant networks, thereby improving application-level latency.
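To make the deadlock condition concrete: in a lossless network, a deadlock corresponds to a cycle of paused queues, each waiting on the next. The sketch below finds such a cycle in a wait-for graph with a depth-first search. It is a generic offline illustration, not ITSY's data plane mechanism, and the graph encoding and function name are assumptions of this sketch.

```python
def find_pause_cycle(waits_on):
    """Return one cycle in a wait-for graph as a list of nodes, or None.

    waits_on maps a paused queue to the queues it is waiting on
    (a hypothetical encoding chosen for this sketch).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in waits_on.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so the path closes a cycle.
                return path[path.index(nxt):]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in list(waits_on):
        if color.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


# Three switch queues pausing each other in a ring: a cyclic buffer dependency.
ring = {"A": ["B"], "B": ["C"], "C": ["A"]}
chain = {"A": ["B"], "B": ["C"]}
```

Here `find_pause_cycle(ring)` reports the ring, while `find_pause_cycle(chain)` returns `None` because the last queue in the chain can still drain. The sketch uses recursion and is intended for small graphs only.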
Understanding the Crucial Role of Middle School Counselors in Providing Computer Science Opportunities to Rural Students: A Research-Practice Partnership's Journey Toward Leveraging the Expertise of RPP Team Members
Leveraged ETFs: The Key to Early Retirement?
In modern-day retirement planning, many people manage their investments themselves, investing in index funds that track broad market indices such as the S&P 500. This project investigates whether investing in leveraged Exchange-Traded Funds (ETFs) can provide a viable strategy for retirement investing. The study evaluates different leverage strategies to determine the optimal ratio to balance risk and reward. To measure portfolio performance, backtesting, bootstrapping, and Monte Carlo simulations are performed on historical stock price data from 1965 to 2024. This project aims to determine if leveraged ETFs can amplify returns without excessive risk, offering insights into their potential role in retirement portfolios. The results show that highly leveraged portfolios are risky for the average investor, but with a longer investment timeframe, slightly leveraging up an investment portfolio would give increased returns with limited risk.
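The bootstrapping approach mentioned above can be illustrated with a small simulation of a daily-rebalanced leveraged fund. This is a minimal sketch under stated assumptions: the function names, the flat `daily_cost` term, the i.i.d. resampling of daily returns, and the placeholder return series are all hypothetical and are not taken from the project.

```python
import random


def leveraged_growth(daily_returns, leverage, daily_cost=0.0):
    """Growth of 1 unit in a fund that resets to the target leverage every day.

    Each day the fund earns leverage * r minus a flat daily cost, which is
    an assumed stand-in for fees and borrowing costs.
    """
    value = 1.0
    for r in daily_returns:
        value *= 1.0 + leverage * r - daily_cost
        if value <= 0.0:
            return 0.0  # the position is wiped out
    return value


def bootstrap_final_values(history, leverage, horizon_days, n_paths, seed=0):
    """Resample daily returns with replacement and record each path's final value."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        path = [rng.choice(history) for _ in range(horizon_days)]
        finals.append(leveraged_growth(path, leverage))
    return finals


# Placeholder returns for illustration only, not historical market data.
placeholder_history = [0.010, -0.008, 0.004, -0.012, 0.007]
finals_2x = bootstrap_final_values(placeholder_history, 2.0, horizon_days=250, n_paths=100)
```

A two-day example shows why more leverage is not simply more return: after +10% and then -10%, an unleveraged holding is worth 0.99, a 2x fund 0.96, and a 3x fund 0.91. Independent resampling also discards any clustering of volatile days, so block resampling may be a better fit for real return series.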
Microfluidic Investigation of Surfactant Foam Flow in Porous Media
Foams are ubiquitous in everyday life, with applications ranging from detergents and beverages to firefighting. They are also essential in industrial applications, including enhanced oil recovery in the oil industry and carbon sequestration in hydraulic fracking sites. Therefore, there is significant interest in understanding the fundamental physicochemical processes governing foam to predict its behavior in natural porous media environments. Microfluidics have proven to be effective in visualizing small-scale events and processes that are otherwise challenging to observe in natural confined systems. The work presented in this thesis investigates liquid surfactant foam flow in microfluidic porous media, focusing on foam transport dynamics and stability.
In the first part of the thesis, the relationship between phase mobility and foam strength was investigated. By combining high-speed imaging and image processing, we provide an in-depth understanding of foam texture in relation to foam quality and flow. Additionally, the study probes the role of pore size in foam generation in porous media. The next body of work examines the interfacial viscoelasticity of different surfactant formulations and its role in foam generation in a single constriction device. Finally, carbon dioxide foam is studied at elevated pressure to assess the effect of gas solubility on foam stability, and surfactants with cationic, anionic, and nonionic head groups are screened for optimal foam strength. Overall, this work offers valuable insight into the generation, stability, and mobility of surfactant foam flow in porous media.