Efficient and Interpretable Representations: From Medical Representation Learning to Vision-Language Multimodal Representation Engineering
Visual representation learning has achieved remarkable progress on natural image benchmarks, but faces critical challenges when deployed in specialized domains like medical imaging. This thesis addresses two interconnected problems: developing efficient architectures that maintain performance while aligning with domain expertise, and creating scalable frameworks for understanding what foundation models learn across different architectures.
We first investigate Vision Mamba architectures for medical applications. For histopathology, we adapt Vision Mamba within the DINO self-supervised learning framework, achieving an 8.21 AUC point improvement over Vision Transformers with comparable parameters on lymph node metastasis detection. Explainability analysis reveals that Vision Mamba focuses on diagnostically relevant cellular features, suggesting better alignment with clinical workflows. For breast ultrasound classification, we demonstrate through transfer learning that Mamba-based architectures achieve statistically significant improvements, with comprehensive analysis showing they are never significantly outperformed by traditional CNN or Vision Transformer baselines.
Our interpretability analysis of pathology foundation models using sparse autoencoders reveals a fundamental scalability problem: each model produces incompatible latent spaces that require separate expert analysis, creating exponential scaling in interpretability effort as foundation models proliferate. To address this limitation, we develop SPARC, a unified framework that enables interpretability analysis across multiple models simultaneously. SPARC introduces a Global TopK mechanism ensuring identical latent dimensions activate across models, and cross-reconstruction loss enforcing semantic consistency. Our evaluation demonstrates substantial improvements, achieving 84.4% neurons active across all streams compared to 43.6% with traditional approaches, and enabling new capabilities like text-guided spatial attention in vision-only models.
This work contributes efficient architectures for medical applications, identifies fundamental limitations in current interpretability paradigms, and provides a scalable solution that transforms cross-model interpretability from an exponentially scaling manual process into a systematic, unified approach. The results have implications for both medical AI deployment and broader interpretability research as foundation models continue to proliferate across specialized domains.
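The Global TopK idea described above (the same latent dimensions activating in every model) can be illustrated with a small sketch. This is not the SPARC implementation: the function names, the toy values, and in particular the choice to sum scores across streams before selecting indices are assumptions made for illustration, since the abstract does not say how scores are aggregated.

```python
from typing import List


def global_topk(pre_acts: List[List[float]], k: int) -> List[List[float]]:
    """Keep the same k latent dimensions in every stream.

    pre_acts[s][j] is the pre-activation of latent j in stream s
    (for example, one stream per model). Scores are summed across
    streams (an assumed aggregation rule), the k highest-scoring
    indices are kept for all streams, and every other latent is zeroed.
    """
    n_latents = len(pre_acts[0])
    scores = [sum(stream[j] for stream in pre_acts) for j in range(n_latents)]
    keep = set(sorted(range(n_latents), key=lambda j: scores[j], reverse=True)[:k])
    return [[v if j in keep else 0.0 for j, v in enumerate(stream)]
            for stream in pre_acts]


def per_stream_topk(pre_acts: List[List[float]], k: int) -> List[List[float]]:
    """Baseline: each stream independently keeps its own top-k latents."""
    out = []
    for stream in pre_acts:
        keep = set(sorted(range(len(stream)), key=lambda j: stream[j], reverse=True)[:k])
        out.append([v if j in keep else 0.0 for j, v in enumerate(stream)])
    return out


# Hypothetical pre-activations for two streams over four latents.
streams = [
    [0.9, 0.1, 0.8, 0.0],  # stream A
    [0.2, 0.7, 0.6, 0.1],  # stream B
]
shared = global_topk(streams, k=2)        # both streams keep latents 0 and 2
separate = per_stream_topk(streams, k=2)  # stream B keeps latents 1 and 2 instead
```

With independent selection, stream B activates a different pair of latents than stream A, so the two latent spaces cannot be read side by side; the shared selection forces the active indices to coincide.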
Towards Reliable Image Classification: A Systematic Robustness Analysis of CNN and Classical Models Under Natural Corruptions
Machine learning models, particularly Convolutional Neural Networks (CNNs), dominate image classification tasks in critical domains such as medical imaging, autonomous driving, and insurance. However, despite high accuracy on clean benchmark datasets, these models often exhibit significant performance degradation under real-world corruptions like noise, blur, occlusion, or compression artifacts, leading to safety risks and operational failures. Existing robustness evaluations remain limited, focusing predominantly on deep neural networks, using narrow accuracy-based metrics, and overlooking classical machine learning approaches, uncertainty quantification, prediction stability, and computational efficiency.
This thesis presents a comprehensive evaluation of seven model families, ranging from classical (Logistic Regression, SVM, K-NN, Random Forest, MLP) to deep learning (LeNet-5, ResNet-18), on MNIST and Fashion-MNIST. We propose a unified, multi-metric framework assessing accuracy, robustness (flip rate, label variation), uncertainty (Gini index, max probability), and efficiency (parameter count, training time) under clean, corrupted, and mixed-noise conditions.
Our findings offer practical insights into model reliability and highlight the trade-offs between performance, stability, and computational cost, supporting more informed choices in real-world deployments.
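Some of the metrics named above have common textbook definitions that are easy to state in code. The sketch below uses those usual definitions (flip rate as the fraction of predictions that change under corruption, Gini index as one minus the sum of squared class probabilities); the thesis may define them differently, and the function names and sample values are hypothetical.

```python
from typing import Sequence


def flip_rate(clean_preds: Sequence[int], corrupted_preds: Sequence[int]) -> float:
    """Fraction of samples whose predicted label changes under corruption."""
    if len(clean_preds) != len(corrupted_preds):
        raise ValueError("prediction lists must have the same length")
    flips = sum(1 for c, d in zip(clean_preds, corrupted_preds) if c != d)
    return flips / len(clean_preds)


def gini_index(probs: Sequence[float]) -> float:
    """Gini impurity of a predicted distribution: 0 for a one-hot prediction."""
    return 1.0 - sum(p * p for p in probs)


def max_probability(probs: Sequence[float]) -> float:
    """Confidence of the predicted class."""
    return max(probs)


# Hypothetical predictions for five images, before and after adding noise.
clean = [3, 1, 4, 1, 5]
noisy = [3, 7, 4, 1, 9]
rate = flip_rate(clean, noisy)        # two of five labels changed

confident = gini_index([1.0, 0.0])    # no uncertainty
uncertain = gini_index([0.5, 0.5])    # maximal uncertainty for two classes
```

A low flip rate together with a low Gini index on corrupted inputs would indicate a model that is both stable and confident; a low flip rate with a high Gini index would indicate stable but hesitant predictions.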
A Ghost Letter
Amid the convulsions of China’s Cultural Revolution, A Ghost Letter follows Yisun, a sixteen-year-old in the coastal town of Huigang, Fujian Province, whose youth is marked by political violence, betrayal, and the silences of survival. His father, once a proud literature teacher, has been reduced to writing letters for illiterate villagers, the only link to relatives long stranded abroad. When Red Guards close Yisun’s school, he begins secretly writing unsent letters to his imprisoned mentor, Mr. Ding, pouring into them his grief over family tragedies and his father’s quiet complicity.
Through his father’s work, Yisun meets Sister Hong, a Malaysian exile who entrusts him with a letter home — one he leaves unfinished until her suicide fixes her forever in his memory. Seeking escape, he joins the “Down to the Countryside” campaign, as his father is later imprisoned for “enemy collaboration.” On his deathbed, Mr. Guo confesses to signing the false testimony that condemned Mr. Ding; Yisun cannot forgive him.
After the Cultural Revolution, Yisun becomes a legal document writer for the poor, the powerless, and for Sister Hong’s ghost, still unreturned to her homeland. Decades later, as his ancestral home falls to the machinery of a new era, he comes to understand his father’s silence, and at last forgives him. Despite enduring loss and the persistence of political power, he tends to a small, enduring flame of hope.
Class Imbalance and Time-To-Detection in the Performance Analysis of Machine Learning-Based Intrusion Detection Systems
The increasing reliance on Industrial Control Systems (ICS) and Supervisory Control and Data Acquisition (SCADA) systems has raised critical concerns regarding their vulnerability to cyberattacks. While machine learning (ML) methods have emerged as effective tools for detecting such intrusions, their real-world applicability is challenged by two major issues: the imbalance in cybersecurity datasets and the limited focus on the time required to detect attacks—referred to as Time-To-Detection (TTD). To address these two issues and suggest better practices for ML-based IDS researchers, this thesis examines the gaps in the literature and, through two respective case studies, aims to suggest practices toward a more precise and practical performance assessment of ML-based intrusion detection systems (IDS).
First, the thesis examines the class imbalance problem prevalent in popular Information Technology (IT) and Operational Technology (OT) cybersecurity datasets, where normal traffic often vastly outnumbers attack instances. This imbalance leads to biased model performance and inflated accuracy scores, which can over- or under-assess a model’s ability to identify the minority classes correctly. Through a case study with several ML models on a realistic dataset, we demonstrate how imbalanced classes should be considered in the performance evaluation, and how imbalance learning techniques like resampling should be properly utilized for robust performance of ML-based IDS.
Second, this thesis examines TTD, a crucial but often underemphasized performance indicator that measures how promptly an ML-based detection system identifies the onset of an attack. In addition to traditional metrics that focus solely on classification performance in ML communities, this thesis proposes a TTD model based on real-world responsiveness in OT systems by defining various stages of the detection process for ML researchers to quantify and measure temporal overheads accordingly. We also demonstrate how the proposed TTD model can be applied to OT datasets through a case study, thereby suggesting it as a best practice for a more comprehensive evaluation of ML-based intrusion detectors in practical OT use cases.
Through the two studies above, the thesis offers a more comprehensive and practical approach to evaluating ML-based IDS. It demonstrates how thoughtful consideration and integration of class imbalance and detection timeliness in the development and assessment of ML-based IDS is essential to deploying trustworthy and efficient cybersecurity solutions in critical infrastructure systems.
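Both evaluation concerns can be made concrete with a toy example. The sketch below shows how a detector that never raises an alert still scores 99% accuracy on a 99:1 dataset, and computes a single-stage time-to-detection. The class ratio, the timestamps, and the function names are hypothetical, and the thesis's TTD model breaks detection into several stages that this simplification does not capture.

```python
from typing import Dict, Optional, Sequence


def evaluate_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, attack recall, and balanced accuracy (label 1 = attack)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "attack_recall": recall,
        "balanced_accuracy": (recall + specificity) / 2,
    }


def time_to_detection(attack_onset: float, alert_times: Sequence[float]) -> Optional[float]:
    """Delay between attack onset and the first alert at or after it.

    Returns None if the attack is never flagged. Alerts raised before
    the onset are treated as unrelated and ignored.
    """
    later = [t for t in alert_times if t >= attack_onset]
    return min(later) - attack_onset if later else None


# Hypothetical dataset: 990 normal records, 10 attack records.
y_true = [0] * 990 + [1] * 10
always_normal = [0] * 1000              # a detector that never alerts
scores = evaluate_binary(y_true, always_normal)

# Hypothetical timeline in seconds: attack starts at t=120, alerts at 45, 127.5, 140.
ttd = time_to_detection(attack_onset=120.0, alert_times=[45.0, 127.5, 140.0])
```

Here accuracy is 0.99 even though attack recall is 0, while balanced accuracy (0.5) exposes the failure; the detection delay of 7.5 seconds is a property that neither metric reflects.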
Machine Learning for EV Charging: Imputation, Disaggregation and Forecasting
The growing adoption of Electric Vehicles (EVs) presents new challenges to power management due to their high-power and irregular charging loads. These unpredictable demands often coincide with peak usage periods, causing sharp fluctuations in energy consumption that can strain grid infrastructure and complicate demand-side planning. As such, effectively managing EV charging behavior has become a critical component of modern energy systems.
High-quality, uninterrupted data is essential for energy management tasks such as load disaggregation and short-term forecasting. Disaggregation separates total consumption into appliance-level usage, including EVs, while forecasting enables proactive energy planning. However, data collected from IoT-based systems is often incomplete due to sensor faults or communication failures. Missing values can distort the underlying distribution and introduce bias, significantly degrading the performance of downstream models. To address this, we propose ResiDualNet, a dual-path sequence-to-sequence model that reconstructs missing EV charging data using convolutional layers for local patterns and bidirectional LSTMs for long-term trends. Compared to common imputation methods, ResiDualNet achieves superior reconstruction accuracy. Importantly, forecasting models trained on ResiDualNet-imputed data yield results that are close to those trained on complete, uncorrupted data and significantly outperform models trained on data imputed by other approaches. Building on this, we propose MCD-NILM, a multi-scale clustering and decoding framework for appliance and EV energy disaggregation. It employs a soft clustering mechanism to group temporal features into three appliance categories (long-cycle, short-cycle, and seasonal) and assigns each cluster to a dedicated decoder. This approach improves the separation of overlapping patterns, particularly in the presence of EVs. Our evaluation demonstrates that MCD-NILM outperforms several state-of-the-art NILM models on benchmark datasets.
Lastly, we present BiGRU-CNN-KAN, a hybrid model for short-term EV load forecasting. The architecture integrates bidirectional GRUs and convolutional layers with a Kolmogorov–Arnold Network (KAN)-based fusion mechanism to learn complex temporal patterns and nonlinear dependencies. The model is evaluated over 6-, 12-, and 24-hour forecasting horizons and consistently demonstrates superior performance compared to several state-of-the-art baseline models across multiple real-world datasets.
Together, these three components form a robust pipeline for EV-aware energy management.
“Lean on me”: An international comparison of social support, subjective social status, and adolescent health
Perceived social support and subjective social status (SSS) are key social determinants of child and adolescent health. However, little is known about the extent to which they relate to one another or interact to influence health outcomes among youth. Additionally, the role of broader social context in shaping these associations remains underexplored. This dissertation addressed these research gaps via four manuscripts:
Manuscript 1 summarized the association between perceived social support and SSS in youth via meta-analysis. A small, positive association (r = .15) was found, consistent across support sources, measurement tools, and levels of cultural individualism. Stronger associations were observed in countries with lower income inequality. However, substantial heterogeneity remained.
Manuscript 2 investigated the associations between perceived social support and SSS among early adolescents using data from the 2013/14 Health Behaviour in School-aged Children (HBSC) Survey. It addressed limitations from Manuscript 1 and replicated findings of a small, positive association between social support and SSS. This association was robust regardless of demographic adjustment and moderation. Country-level income inequality, cultural individualism, and power distance were not significant moderators.
Manuscript 3 assessed the interactive effects of social support and SSS on self-reported health using the HBSC dataset. Higher social support buffered against the negative health effects of SSS for self-rated health, general health symptoms, life satisfaction, breakfast consumption, and substance use (but not physical activity).
Manuscript 4 examined country-level economic and cultural moderators of the buffering effect of social support against SSS on self-reported health outcomes and lifestyle behaviours. High social support was generally more protective against SSS in countries with higher GDP, lower income inequality, and more collectivistic, egalitarian cultures; direction varied across health measures.
Taken together, results of this dissertation offer novel insights about the associations between social support, SSS, and adolescent health across countries. Importantly, our findings point to the interactions between individual- and country-level social determinants of health. Specifically, individual social support and SSS interact to impact health, which is further moderated by country-level economic and cultural indicators. Results can be used to inform culturally sensitive biopsychosocial models of health among adolescents.
Transformer-Based Models for Identifying Customer Needs in User-Generated Content: Performance Gaps, Unintended Bias, and Broader Implications
This thesis reviews and evaluates intelligent methods for identifying customer needs in user-generated content (UGC). It first surveys prior work and shows that many studies share generic goals yet overlook the complexity and taxonomy of needs in their evaluation setups. To clarify scope, the thesis distinguishes between using Machine Learning (ML) as a tool to support marketing workflows and treating customer-needs identification itself as a Natural Language Processing (NLP) task with clear definitions and constructs. Building on this perspective, a large experimental study assesses Transformer-based models for generalizability, robustness, fairness, and sample efficiency across varied settings. Results indicate competitive accuracy, with gains in F1 up to 18% over baselines, but also consistent limitations: shared error patterns, difficulty with rare or unseen needs, reliance on lexical cues that weakens cross-domain performance, and no guaranteed gains in sample efficiency from larger models. Cross-domain results benefit most from richer, diverse domain training, while adding more in-domain data does not improve transfer. Beyond technical metrics, the thesis highlights adoption barriers, costs, data constraints, task complexity, and ethical considerations and argues for evaluation frameworks that reflect taxonomy, transparency, and fairness. It concludes with practical guidance that bridges marketing theory and NLP practice to support responsible, reproducible deployment.
Homonationalism after Homonationalism: Queer Politics under US Exceptionalism
This theory-based thesis presents an in-depth study of Jasbir Puar’s concept of homonationalism. This term illuminates how the selective inclusion of (some) queer subjects within the nationalist project happens at the expense of racialized populations. In positioning homonationalism within the alternate critical genealogies of queer of color critique and women of color feminism, I demonstrate that these theoretical contributions were essential to Puar’s theorization of homonationalism through their early critiques of the appropriation of LGBTQIA+ rights for nationalist ends. I explore the origins and manifestations of homonationalism through the discourses of sexual exceptionalism and the semiotic construction of the “monster-terrorist-fag” in relation to normative national queer subjects. In an effort to theorize our present moment, marked by the rise of fascism and authoritarianism, I investigate phenomena that signal the persistence of homonationalism despite new anti-LGBTQIA+ legislation, a condition I call homonationalism after homonationalism. Paying attention to the resonances between the present, the War on Terror, and Trump’s first presidency, I attempt to identify the shifting figure of the “monster-terrorist-fag” defined through present-day processes of detention and deportation as well as through deployments of US sexual exceptionalism. Ultimately, this thesis is an effort to evaluate the salience of the concept of homonationalism in a shifting sociopolitical context.
The Zeros of Dirichlet Series of Cubic Gauss Sums over Function Fields
Since Gauss sums are not multiplicative, the Dirichlet series of nth-order Gauss sums do not have an Euler product. Therefore, they are not expected to satisfy the Riemann Hypothesis. Over function fields, the Dirichlet series of cubic Gauss sums are polynomials in q^{-s}, which reduces the task of finding the zeros of the series to computing the roots of a polynomial. In this work, we will discuss the challenges that arise when computing these roots and present numerical data on them.
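Because the problem reduces to polynomial root finding, a generic numerical root finder gives a feel for the computation. The sketch below is not the method used in the thesis: it applies the Durand-Kerner iteration to a made-up cubic, and it assumes the polynomial variable u stands for q^{-s}, so that each root u fixes the real part of the corresponding zeros s. The coefficients and the choice q = 7 are hypothetical.

```python
import math
from typing import List, Sequence


def durand_kerner(coeffs: Sequence[complex], iterations: int = 500,
                  tol: float = 1e-12) -> List[complex]:
    """Approximate all roots of a polynomial at once.

    coeffs are given from the highest degree down; the leading
    coefficient must be nonzero.
    """
    n = len(coeffs) - 1
    monic = [c / coeffs[0] for c in coeffs]

    def p(x: complex) -> complex:
        acc = 0j
        for c in monic:          # Horner evaluation of the monic polynomial
            acc = acc * x + c
        return acc

    # Standard starting points: distinct powers of a non-real number.
    roots = [complex(0.4, 0.9) ** i for i in range(n)]
    for _ in range(iterations):
        max_step = 0.0
        for i in range(n):
            denom = 1 + 0j
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            step = p(roots[i]) / denom
            roots[i] -= step
            max_step = max(max_step, abs(step))
        if max_step < tol:       # all estimates have stopped moving
            break
    return roots


def real_part_of_s(u: complex, q: int) -> float:
    """Re(s) for a zero u, assuming u = q^{-s}; Im(s) is only fixed modulo 2*pi/log(q)."""
    return -math.log(abs(u)) / math.log(q)


# Hypothetical polynomial (u - 1)(7u^2 - 1) = 7u^3 - 7u^2 - u + 1 with q = 7.
q = 7
roots = sorted(durand_kerner([7, -7, -1, 1]), key=lambda z: z.real)
real_parts = [real_part_of_s(u, q) for u in roots]
```

In this toy case two roots have modulus 7^{-1/2}, which corresponds to Re(s) = 1/2, and one root does not; checking how far computed roots sit from that modulus is one simple way to inspect numerical data of this kind.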
The Role of the PFC in Stress-Induced Relapse to Heroin Seeking Following Voluntary Abstinence in Rats
The opioid crisis remains a critical public health concern, with relapse presenting a major challenge in the treatment of opioid use disorder. Stress during abstinence, such as hunger from caloric restriction, increases relapse vulnerability, yet the neural mechanisms underlying stress-induced relapse, particularly in the context of food deprivation, remain unclear. This thesis investigated behavioral and neural contributors to heroin use and relapse in rats, focusing on the orbitofrontal cortex (OFC) and prelimbic (PrL) cortex. First, we evaluated whether a 5-minute seek-take protocol effectively models intermittent access and compulsive heroin use in both male and female rats. After validating the model, we asked two key research questions: whether chemogenetic inhibition of (1) the OFC or (2) the PrL reduces stress-induced heroin seeking following punishment-imposed abstinence. Using an established intravenous heroin self-administration paradigm followed by punishment-imposed abstinence (via footshock), rats were tested for heroin seeking in either a sated or food-deprived state. Chemogenetic inhibition was achieved by expressing inhibitory Designer Receptors Exclusively Activated by Designer Drugs (DREADDs) in the OFC or PrL and activating them with systemic deschloroclozapine dihydrochloride (DCZ). We hypothesized that inhibiting these cortical regions would attenuate stress-induced relapse. Our results show that the 5-minute seek-take protocol successfully models intermittent and compulsive heroin use in both sexes. However, chemogenetic inhibition of the OFC or PrL cortex did not reduce heroin seeking during stress-induced relapse. These findings suggest that the OFC and PrL might not be critically involved in stress-induced relapse to heroin seeking following punishment-imposed abstinence.