University of Waterloo

University of Waterloo's Institutional Repository


21090 research outputs found


Label-free optical microscopy: Photon Absorption Remote Sensing (PARS) and other methods for label-free histopathological imaging of tissues

Author: Ecclestone Benjamin
Publication venue: University of Waterloo
Publication date: 04/01/2026

Emerging label-free microscopy methods offer promising new avenues to view cells and tissues in their native environment, minimizing external influences. These label-free techniques are an exciting departure from gold standard methods for visualizing microscopic cellular and tissue structures, which rely on centuries-old chemical staining processes. In current practice, chemical labelling can unavoidably interfere with specimens’ physical and biochemical integrity. As a result, samples are effectively consumed by staining, with only a single stain set normally applied to each sample. This limitation is especially impactful in applications such as clinical oncology and medical histopathology. In these settings, irreversible staining processes can severely limit the diagnostic utility of samples, especially when there is limited sample volume (e.g., brain tumor biopsies). As an alternative, label-free imaging techniques offer a potential avenue to visualize subcellular tissue anatomy while preserving samples in their entirety. Consequently, label-free microscopy methods have significant potential to increase the diagnostic utility of each specimen, thereby enhancing patient outcomes. This thesis focuses on developing new methods for label-free microscopy, specifically emphasizing techniques for label-free histopathology. As a starting point, the targeted objective is to develop a label-free analog to chemical hematoxylin and eosin (H&E) staining. This objective is chosen as H&E represents the gold standard contrast applied in effectively every clinical diagnostic case.
Subsequent developments in this thesis can be broken into three major sections, which focus on (1) developing label-free microscopy methods for H&E-like imaging, (2) exploring the biomolecular specificity of developed methods to validate the label-free H&E-like contrast, and (3) producing a label-free microscopy architecture capable of meeting the imaging requirements necessary for clinical adoption. The first collection of works explores the development of a range of label-free microscopy methods. These studies establish new variations and combinations of optical absorption and scattering microscopes to visualize microscopic tissue anatomy label-free. These efforts ultimately resulted in the development of a new optical absorption microscopy modality, Photon Absorption Remote Sensing (PARS). This comprehensive technique provides biomolecule-specific visualizations characterizing the dominant photophysical effects caused when photons are absorbed by a biomolecule. As a direct result, novel PARS specific contrasts are developed as the total absorption (TA) and quantum efficiency ratio (QER). These PARS measurements may provide unique views into biomolecules’ excited state dynamics, accessing characteristics related to the quantum yield. By specifically probing specimens’ response to the absorption of deep ultraviolet light, PARS is shown to provide label-free contrast directly reminiscent of gold standard chemical H&E staining methods. As a proof of concept, the initial PARS architecture is applied to capture submicron resolution images of key H&E-like diagnostic markers across a variety of human and animal tissue specimens. The second section of this thesis expands the basis for PARS histopathology by validating PARS capacity to produce H&E-like visualizations. Two main avenues of exploration are pursued in this effort. The first endeavor explores the underlying biomolecular contrast of the PARS measurements. 
Established statistical methods are applied to develop characteristic PARS profiles for biomolecules. These PARS signatures are then applied to map the abundance of molecules label-free inside complex specimens. As a proof of concept, key diagnostic features including nuclei, red blood cells, and connective tissues are directly characterized and unmixed label-free. Resulting statistical abundance mappings are directly validated against chemically stained ground truth counterparts. The second endeavor introduces an end-to-end pipeline which uses deep learning-based image-to-image transforms to emulate chemical H&E visualizations from label-free PARS data. Resulting PARS emulated H&E-like visualizations are validated against chemical H&E staining through a clinical concordance study. In this diagnostic validation study, statistical analysis is applied to determine if pathologists produce the same diagnoses on both PARS and chemical H&E images. In this preliminary test, the PARS-based virtual staining method achieves > 90% concordance with very high statistical confidence (Kappa > 0.7) across all measured diagnostic tests. The final thesis section develops a new PARS architecture which achieves pragmatic imaging performance, nearing the requirements for clinical diagnostic settings. The presented system features a hybrid opto-mechanical scanning architecture which allows for high-speed MHz rate imaging. This results in imaging speeds which are more than an order of magnitude faster than earlier PARS embodiments developed in the PhotoMedicine Labs (at the University of Waterloo). This work simultaneously develops an end-to-end control system and imaging workflow which enables fully automated PARS imaging of whole specimens. Deep learning methods are applied to the resulting PARS images to produce virtual H&E-like visualizations. Qualitative and quantitative methods are applied to validate the imaging performance across a range of human and animal tissue samples. 
Results indicate the PARS virtual H&E images are largely indistinguishable from chemically H&E-stained ground truth images. Notably, the presented system forms the basis for a commercially available, clinically ready prototype for label-free PARS histopathology imaging. In total, the findings presented across this thesis encompass the development of a new variation of microscopy technique (PARS). This method provides unique views into the absorption and scattering characteristics of specimens, opening a new avenue of label-free contrast. For the presented histopathology application, PARS can provide powerful H&E-like images which may circumvent key challenges of chemical staining. In clinical histopathology, this method could enhance the diagnostic utility of tissue specimens, directly improving patient outcomes. Beyond histopathology, the principles of PARS may be directly applicable to a wide range of imaging applications spanning material science, biological research, and clinical diagnostics. Overall, the methods developed in this thesis lay the groundwork for new label-free optical absorption microscopy techniques, which are already achieving real-world commercial and clinical success in histopathology applications.
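
The concordance study in the abstract above reports agreement as a Kappa statistic. The abstract does not say which kappa variant was used, so the following is only a minimal sketch assuming Cohen's kappa for two sets of categorical diagnoses on the same cases. The function name `cohens_kappa` and the example labels are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses for the same ten cases (illustrative data only).
chemical = ["malignant"] * 5 + ["benign"] * 5
virtual = ["malignant"] * 4 + ["benign"] * 6
kappa = cohens_kappa(chemical, virtual)
```

Kappa corrects raw percent agreement for the agreement expected by chance, which is why a study can report both a concordance percentage and a kappa value.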

Influence of Boundary Conditions on the Sheared Edge Fracture Limits of a 3rd Generation Advanced High Strength Steel.

Author: Advaith Narayanan
Publication venue: University of Waterloo
Publication date: 07/01/2026

A fundamental trade-off between strength and ductility exists in advanced high strength steels (AHSS), particularly for sheared edge splitting in automotive forming operations. The widely used ISO 16630 conical hole expansion test for edge stretchability is known to be a poor representation of the in-plane deformation modes that are the primary source of edge splitting in stamping, leading to an overestimation of formability in virtual tryouts. Additionally, virtual experiments rely upon the input of a single fracture strain value to predict edge cracking in stamped parts, disregarding the effects of deformation mode and element size. An efficient and reliable modeling approach for edge failure is required without having to simulate the shear cutting process. The present work addresses some of these challenges through four interrelated tasks aimed at developing guidelines to efficiently characterize the anisotropic plasticity behavior and edge fracture limits, to support reliable experimental assessment and finite-element modelling of sheared edge fracture in practical forming applications. There is a need to develop efficient strategies for anisotropic plasticity characterization of sheet materials to be able to accurately simulate the various tensile edge stretching modes ranging from splitting without necking to potential localization before fracture. To this end, the baseline plasticity characterization of four approximately pressure-independent aluminum alloys (AA) and steels with varying ductilities and anisotropy levels (AA5182-O, AA7075-T6, DC04, and 980GEN3) was performed using uniaxial tensile tests in multiple orientations. Using digital image correlation (DIC), the area strain at the neck center was monitored to measure the flow stress response to strain levels more than twice the uniform elongation, with the added advantage of probing anisotropic hardening effects.
A hybrid inverse analysis procedure was further developed and applied to notch tensile tests to obtain the major stress under plane strain tension while constraining the minor-to-major principal stress ratio to remain near 1:2. Anisotropic yield functions were subsequently calibrated using data from a range of stress states with emphasis on plane strain tension. The calibrated yield functions and hardening responses were shown to accurately reproduce both the local and global behavior in flat punch hole expansion tests, which activate a wide range of tensile-dominated stress states. Flat punch hole expansion simulations using yield functions calibrated without plane strain data consistently deviated from the DIC in-plane strain magnitudes with absolute differences of up to 15% for DC04 steel. The proposed methods provide general guidelines for efficient calibration of anisotropic constitutive models for approximately pressure-independent materials that are accurate to large deformation levels. Next, the mechanics of the conical hole expansion test were examined to assess the role of necking and anisotropy and to develop methodologies for fracture strain estimation. Finite-element (FE) models of the test were created in LS-DYNA software for two AHSS grades with differing plastic strain anisotropies using hexahedral solid elements. An analysis of through-thickness stress and strain gradients from the numerical models revealed that localization is suppressed until a hole expansion ratio of 200%, with the outer hole edge exhibiting a proportional uniaxial tensile stress state. Any non-uniformity in hole shape or thickness around the circumference of the extruded hole was found to be a manifestation of the tensile plastic strain anisotropy distribution and not necking. The hole expansion ratio was found to be suboptimal for quantifying edge stretchability since the inner hole edge undergoes a non-linear strain path transitioning from compression to uniaxial tension. 
Furthermore, when using the HER as a fracture metric, the local outer hole edge element strains from FE simulations were underpredicted with absolute differences of up to 10%. An analytical technique was proposed to obtain the local major fracture strain from conical hole expansion using the outer hole diameter measured at the crack location, with the equivalent failure strain then obtained using plastic work equivalence. The strains obtained using the proposed method were in excellent agreement with the elemental strains from numerical models with a maximum difference of 4% for the highly anisotropic CP800, confirming its suitability for fracture strain measurement from the test. Subsequently, a novel four-point fixture and specimen geometry that promotes failure under the deformation mode of in-plane bending was developed to characterize the uniaxial fracture limits of moderate ductility materials. The in-plane bending mode is also representative of edge splitting at peripheral regions of stamped parts. Techniques applicable to a wide range of material ductilities were proposed to detect the onset of fracture and accurately measure the edge strains from the in-plane bend tests. The uniaxial fracture strain measured in the in-plane bend test conducted with a machined edge was found to agree closely with the conical hole expansion true fracture strain of 0.68 for a 3rd generation 980GEN3 advanced high strength steel. The in-plane bend test also showed promise for plastic strain anisotropy characterization under uniaxial tension and compression to strain levels much larger than the material uniform elongation. A gauge height-to-thickness ratio of 4.0 or lower is recommended as a specimen design guideline to mitigate buckling based on a comprehensive experimental study conducted on multiple materials and thicknesses.
Finally, the influence of loading conditions on the sheared edge fracture limits of 980GEN3 steel punched with a 5.0 mm hole and 12% clearance was investigated using five different test methods that imposed different stress and strain gradients in the vicinity of the sheared edge. A convergent fracture strain value of approximately 0.30 was observed across the in-plane edge fracture tests, with the conical hole expansion test exhibiting a higher strain of 0.45 due to out-of-plane deformation and fracture being defined at through-thickness cracking. Differences in fracture strains between the in-plane tests were also magnified by the choice of DIC lengthscale or virtual strain gauge length, reflecting each test’s varying sensitivity to DIC strain averaging. Global stretchability metrics were proposed for each deformation mode, enabling edge crack assessment in industrial applications without the need for DIC. The global edge stretch metrics were also found to inform the appropriate choice of DIC lengthscale for design and FE modelling. Finally, FE simulations of the edge fracture tests were conducted using multiple mesh sizes, revealing that a boundary condition dependence can also manifest in simulations with the added influence of lengthscale sensitivity. The predicted major strains at the experimental fracture instant varied with mesh size, suggesting that a single strain value may be insufficient to describe edge fracture. The elemental thinning strain showed reduced dependence on mesh size, making it a more reliable parameter for assessment of edge fracture in simulations. Importantly, the simulations indicated that the edge fracture strain cannot be represented by a unique value but is rather a function of the imposed loading condition. The in-plane stretching mode exhibited the lowest engineering thinning strain limit of 8.8%, making it the critical deformation mode for edge crack initiation in 980GEN3 steel. 
A key outcome of this work is the quantitative understanding of the effect of boundary conditions and lengthscale on the edge fracture limits. Prediction of sheared edge fracture must account for both the imposed loading and numerical lengthscale, with thinning strain offering a more robust metric for use in simulations. The developed methodologies provide practical and efficient guidelines that can be implemented in industrial environments for edge crack assessment and prediction in stamping simulations.

Practically Efficient Protocols for Private Computation using Homomorphic Encryption

Author: Akhavan Mahdavi Rasoul
Publication venue: University of Waterloo
Publication date: 20/01/2026

Digital services have become an indispensable part of our daily lives, particularly services that interact with our most private and sensitive data. With the abundance of such services, users are left to make the difficult choice: can I safely use digital services and products, or does it necessarily come at the cost of my privacy? Private computation techniques empower service providers to perform computation over private data, without the need to observe the data. This not only provides privacy for clients while the data is being used but reduces the risk of incidents such as data leaks for service providers. One commonly used tool for private computation is Homomorphic Encryption (HE), which is a form of encryption that allows computation on data in encrypted form. While homomorphic encryption in theory permits arbitrary computation over encrypted data, in practice, a naive implementation of a desired functionality rarely yields a practical result. For example, one common obstacle when using homomorphic encryption is the high computation time and the large ciphertexts that incur high network costs. However, communication and computation costs are not the only metrics that need to be considered. In this work, we describe problems that arise when homomorphic encryption is used in applications and address these limitations by proposing new techniques and novel protocols. In these new constructions, we not only improve the performance compared to prior work in terms of communication and computation costs but also address additional problems that arise in the deployment of these protocols. Throughout the process, we draw insights on how to design protocols that can be applicable for developers, practitioners, and future researchers. For example, we enable homomorphic comparison of encrypted numbers with higher precision than previous work, using a novel representation of numbers that is more suitable for homomorphic encryption.
Using this and other building blocks, we propose efficient protocols for decision tree evaluation and private set intersection. Moreover, through our work on private information retrieval, we identify the challenges of using such a protocol in practice and propose novel protocols that are suited for deployment in real-world applications.
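
To make the idea of "computation on data in encrypted form" concrete, the sketch below implements a toy version of the Paillier cryptosystem, which is additively homomorphic. This is a swapped-in textbook scheme chosen for brevity; the abstract does not say which HE scheme the thesis uses, and Paillier supports only addition and multiplication by a known constant, not arbitrary computation. The primes are tiny and insecure, and all function names are hypothetical.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Toy Paillier key generation. The small fixed primes are for illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lam (valid because g = n + 1)
    return n, (n, lam, mu)  # public key, private key


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under public key n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(private_key, ciphertext):
    n, lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


def scale_encrypted(n, c, k):
    """Raising a ciphertext to the power k multiplies the plaintext by k (mod n)."""
    return pow(c, k, n * n)


public_n, private_key = keygen()
encrypted_sum = add_encrypted(public_n, encrypt(public_n, 20), encrypt(public_n, 22))
encrypted_triple = scale_encrypted(public_n, encrypt(public_n, 7), 3)
```

The server-side operations (`add_encrypted`, `scale_encrypted`) never see the private key or the plaintexts. Note that each ciphertext lives modulo n squared, so it is roughly twice the size of the key modulus, a small-scale instance of the ciphertext expansion the abstract mentions.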

Counterfactual Data Augmentation for Regression

Author: Mohebbi Hossein
Publication venue: University of Waterloo
Publication date: 20/01/2026

Data-driven modeling in real-world regression tasks often suffers from limited training samples, high collection costs, and noisy observations. While data augmentation has revolutionized fields such as computer vision and natural language processing by leveraging domain-specific symmetries, effective techniques for tabular regression remain elusive. Existing approaches, ranging from geometric interpolation to deep generative models, often fail to preserve the underlying noise structure of the data, leading to the generation of unrealistic samples that can degrade predictive performance. This thesis proposes a novel framework called Counterfactual Residual Data Augmentation (CRDA). Our method is founded on the theoretical principle of Residual Invariance, which posits that once a regressor has modeled the systematic component of the data, the remaining residual noise often remains stable under small perturbations of carefully selected features. We exploit this invariance to synthesize valid counterfactual samples, which are data points with perturbed features but preserved residual noise. We formalize this process through the lens of structural causal models, establishing conditions under which the residual is conditionally independent of specific feature subsets. We provide a practical, model-agnostic algorithm that integrates feature selection heuristics and statistical safety checks to ensure augmentation is applied only when empirically beneficial. Through extensive evaluation across diverse benchmark datasets, we demonstrate that CRDA consistently reduces test error in data-scarce regimes. Specifically, our method reduces the Mean Squared Error (MSE) of Multi-Layer Perceptrons by an average of 22.9% and XGBoost regressors by 6.4%. 
Furthermore, comparisons against state-of-the-art baselines, including Mixup variants and diffusion-based generative models, reveal that CRDA offers a more robust and statistically grounded remedy for noise-prone, small-sample regression tasks. Finally, we provide a production-ready, open-source implementation of our framework to encourage applications in real-world tabular regression tasks.
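
The core idea described above (perturb selected features, keep the residual) can be sketched in a few lines. This is not the thesis's released implementation: the linear regressor, the Gaussian perturbation, the `scale` parameter, and the function names are all assumptions made for illustration, and the feature selection heuristics and statistical safety checks mentioned in the abstract are omitted.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares linear regressor with intercept; returns a predict function."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([Z, np.ones(len(Z))]) @ coef


def augment_with_residuals(X, y, predict, feature_idx, scale=0.1, seed=0):
    """Create counterfactual samples: perturbed features, unchanged residuals."""
    rng = np.random.default_rng(seed)
    residuals = y - predict(X)  # noise left over after the systematic fit
    X_cf = X.copy()
    # Assumed perturbation: Gaussian noise scaled to each chosen feature's spread.
    spread = scale * X[:, feature_idx].std(axis=0)
    X_cf[:, feature_idx] += rng.normal(0.0, spread, size=(len(X), len(feature_idx)))
    # New target = model prediction at the perturbed point + original residual.
    y_cf = predict(X_cf) + residuals
    return np.vstack([X, X_cf]), np.concatenate([y, y_cf])


# Synthetic data for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)
predict = fit_linear(X, y)
X_aug, y_aug = augment_with_residuals(X, y, predict, feature_idx=[0])
```

The augmented set doubles the sample count while each synthetic point carries the residual of the real point it was derived from. This is only valid under the invariance assumption the abstract states, namely that the residual is stable under small perturbations of the chosen features.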

Pushing the Limit of Language-Agnostic Program Reduction

Author: Xu Zhenyang
Publication venue: University of Waterloo
Publication date: 16/01/2026

Program reduction is a widely used technique for testing and debugging language processors. Given a program that triggers a bug in a language processor, program reduction searches for a canonical and minimal program that triggers the same bug, thereby facilitating bug deduplication and simplifying debugging. Among various reduction approaches, language-agnostic reducers (AGRs) have emerged as an important class of techniques because they do not rely on language-specific knowledge and can thus be applied across a wide range of programming languages. This generality makes AGRs especially valuable for languages lacking specialized reduction tools. However, previous AGRs support only a limited set of program transformations, which restricts their minimization and canonicalization capability and results in a substantial performance gap compared to language-specific reducers (SPRs). This thesis aims to enhance both the canonicalization and minimization capabilities of AGRs, thereby narrowing the performance gap between AGRs and SPRs. It comprises the following three contributions. The first work aims to improve the reduction capability of AGRs by enabling them to integrate more transformations in an efficient way. As noted above, previous AGRs support only a limited set of transformations. Once a 1-minimal result is obtained and no further transformation can reduce the program, the reduction process terminates. However, such a 1-minimal result may still contain excessive bug-irrelevant program elements. To address this limitation, this work proposes a framework named Vulcan. Vulcan employs an AGR as the main reducer and introduces a set of auxiliary reducers that perform diverse program transformations. When the main reducer can no longer make progress, Vulcan invokes one of its auxiliary reducers to create new reduction opportunities, and then re-applies the main reducer to further minimize the program.
In addition to the framework, this work also presents three example program transformations: Identifier Replacement, Subtree Replacement, and Tree-Based Local Exhaustive Enumeration. Evaluation on a multilingual benchmark suite (referred to as Benchmark-Reduce), which includes C, Rust, and SMT-LIBv2 programs, demonstrates that Vulcan outperforms the state-of-the-art AGR, Perses, in terms of minimization. On average, Vulcan produces results with 33.55%, 21.61%, and 31.34% fewer tokens than Perses on C, Rust, and SMT-LIBv2 benchmarks, respectively. The second work focuses on enhancing the canonicalization capability of AGRs. A reducer with strong canonicalization capability can minimize differences among programs that trigger the same bug, thereby greatly facilitating bug deduplication. However, prior AGRs exhibit poor canonicalization capability, primarily because they treat tokens as atomic and irreducible units. To address this limitation, this work proposes T-Rec, a fine-grained, lexical syntax-guided program reduction technique that can effectively reduce and canonicalize each token in a program. Evaluation results show that integrating T-Rec into Vulcan enables the elimination of 1,315 additional duplicates in a benchmark suite containing 3,796 programs that expose 46 unique bugs (referred to as Benchmark-Cano). Moreover, T-Rec further reduces the size of Vulcan’s results on Benchmark-Reduce by up to 53.73% in terms of bytes. The third work aims to further enhance both the minimization and canonicalization performance of AGRs by introducing additional program transformations. Specifically, this work proposes SFC, a novel syntax-guided transformation technique that has been overlooked by prior syntax-guided AGRs. To apply SFC effectively and efficiently in program reduction, three SFC-based reduction methods are designed: Smaller Structure Replacement, Identifier Elimination, and Structure Canonicalization.
Evaluation results show that integrating these SFC-based methods into Vulcan yields an average 8.2% reduction in output size on Benchmark-Reduce. Moreover, when combined with T-Rec, the SFC-based methods enable Vulcan to eliminate an additional 435 duplicates in Benchmark-Cano. Collectively, these studies significantly advance the effectiveness of language-agnostic program reduction in both minimization and canonicalization. By integrating the proposed approaches, the prior state-of-the-art AGR, Perses, can produce results that are on average 43% smaller on Benchmark-Reduce and eliminate 1,750 additional duplicates in Benchmark-Cano.
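
The notion of a 1-minimal result used in the abstract above can be illustrated with a deliberately simple reducer that deletes one token at a time until no single deletion preserves the bug. This is not Perses, Vulcan, T-Rec, or SFC; it is a generic sketch, and the function name, the token list, and the bug predicate are hypothetical.

```python
def reduce_to_1_minimal(tokens, still_triggers_bug):
    """Greedily delete single tokens until no single deletion keeps the bug."""
    current = list(tokens)
    changed = True
    while changed:  # repeat passes until a full pass removes nothing
        changed = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if still_triggers_bug(candidate):
                current = candidate  # keep the smaller program, retry same index
                changed = True
            else:
                i += 1  # this token is needed, move on
    return current


# Hypothetical bug predicate: the "bug" fires whenever both tokens are present.
def triggers(candidate):
    return "crash" in candidate and "b" in candidate


program = "int a ; int b ; crash ( b ) ;".split()
reduced = reduce_to_1_minimal(program, triggers)
```

The output is 1-minimal with respect to single-token deletion only. As the abstract notes, such a fixpoint can still contain bug-irrelevant elements, which is the point at which Vulcan is described as invoking auxiliary transformations before re-running the main reducer.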

Time stepping methods for coupled fluid-rigid body simulation

Author: Gurditta Rikin
Publication venue: University of Waterloo
Publication date: 12/01/2026

Interaction between fluids and solid objects is ubiquitous in everyday life, yet the resulting motion is too intricate for visual effects artists and animators to realistically depict by hand. Instead, artists turn to computer graphics applications that employ physics-based animation to simulate these complex phenomena. Some of these applications solve the incompressible Euler equations coupled with the rigid body equations to compute the motion of an incompressible fluid interacting with undeformable solids. Of particular interest is two-way coupling, in which the fluid and solids both affect each other’s motion. Many methods have been developed to improve the realism of fluid simulations, allowing them to simulate more compelling scenarios. There are several time stepping schemes for fluid simulation in the literature, presenting ways to evolve the motion of the fluid over time that may generate more energetic or more accurate results. In particular, we focus on the BDF2 and Advection-Reflection families of schemes due to their accuracy and their improved ability to preserve the kinetic energy of the fluid. Our goal in this thesis is to extend these time stepping schemes to two-way coupled fluid-rigid body simulation, to yield more compelling simulations of the interactions between these two types of materials. We catalogue some of the popular time stepping schemes for fluid simulation, and explain their relations to methods of solving ordinary differential equations. Then, taking as our starting point the popular method of Batty et al., we re-derive the time stepping scheme originally proposed for coupled systems, and derive new schemes for coupled systems corresponding to the previously discussed fluid schemes, along the way comparing to the coupled time stepping scheme proposed by Gibou and Min. 
We measure the accuracy, energy-preservation, and computational cost properties of each scheme implemented within a 2D simulation, presenting quantitative and qualitative results. We hope our work encourages further investigation into the theoretical basis as well as the qualitative properties of coupled fluid-rigid body simulation.
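
Since the abstract relates fluid time stepping schemes to methods for ordinary differential equations, a small ODE example may help show what BDF2 looks like in isolation. The sketch below applies backward Euler and BDF2 to the scalar test equation y' = λy, where the implicit solve has a closed form. The test equation, step size, and function names are assumptions for illustration; the thesis applies these ideas to coupled fluid-rigid body systems, which this sketch does not attempt.

```python
import math


def backward_euler(lam, y0, h, steps):
    """Backward Euler for y' = lam * y: y_{n+1} = y_n / (1 - h * lam)."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 - h * lam)
    return y


def bdf2(lam, y0, h, steps):
    """BDF2 for y' = lam * y.

    The scheme (3*y_{n+1} - 4*y_n + y_{n-1}) / (2*h) = lam * y_{n+1}
    rearranges to y_{n+1} = (4*y_n - y_{n-1}) / (3 - 2*h*lam).
    The first step is bootstrapped with backward Euler.
    """
    if steps == 0:
        return y0
    y_prev = y0
    y_curr = y0 / (1.0 - h * lam)
    for _ in range(steps - 1):
        y_prev, y_curr = y_curr, (4.0 * y_curr - y_prev) / (3.0 - 2.0 * h * lam)
    return y_curr


lam, y0, h, steps = -1.0, 1.0, 0.1, 10
exact = y0 * math.exp(lam * h * steps)
error_backward_euler = abs(backward_euler(lam, y0, h, steps) - exact)
error_bdf2 = abs(bdf2(lam, y0, h, steps) - exact)
```

BDF2 uses the two previous states instead of one, which raises the formal order of accuracy from first to second; on this decaying test problem its error at the final time is smaller than that of backward Euler for the same step size.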

GASTON: Graph-Aware Social Transformer for Online Networks

Author: Wloch Olha
Publication venue: University of Waterloo
Publication date: 12/01/2026

Online communities have become essential digital third places for socialization and support, yet they also harbor toxicity, echo chambers, and misinformation. Mitigating these harms requires computational models that can understand the nuance of online interactions to accurately detect harmful content such as toxicity and norm violation. This is difficult because the meaning of an individual post is rarely self-contained; it is dynamically constructed through the interplay of what is written (textual content) and where it is posted (social structure). We require models that effectively fuse these two signals to generate representations for online entities such as posts, users, and communities. Current approaches often treat these different signals in isolation: text-only models analyze content but miss the local social norms that define acceptable behavior, while structure-only models map relationships but ignore the semantic content of discussions. Recent hybrid approaches attempt to bridge this gap but some rely on simple text averaging mechanisms to represent a user and a community, and in so doing flatten the rich, norm-defining identity. To address this limitation, this thesis proposes GASTON (Graph-Aware Social Transformer for Online Networks), a graph learning framework designed to capture the essence of online social networks. It does so by modeling connections between all online entities, such as users, communities, and text. This makes it possible to ground user and text representations in their local norms, providing the necessary context to accurately classify behaviour in downstream tasks. The heart of our solution is a contrastive initialization strategy which pre-trains community representations based on user membership patterns, effectively capturing the unique signature of a community's user base before the model processes any text. This allows GASTON to distinguish between communities (e.g., a support group vs. a hate group) based on who interacts there, even if they share similar vocabulary. We evaluate GASTON across a diverse set of socially-aware downstream tasks, including mental health stress detection, toxicity scoring, and norm violation detection. Our experiments demonstrate that GASTON outperforms state-of-the-art baselines, particularly in tasks where social context is critical for classification, such as detecting norm violations. Furthermore, we illustrate that these learned representations provide interpretable insights, offering a path toward user-empowered transparency in online spaces.

Pan-Arctic Weekly Sea-Ice Forecasts: A Large-Scale Baseline Study and Meta-Learning Ensemble Model

Author: McGuigan Kiernan
Publication venue: University of Waterloo
Publication date: 20/01/2026

This thesis introduces a standardized benchmark for a diverse set of architectures for weekly Pan-Arctic sea-ice forecasts with a 13-week (91-day) horizon. This benchmark is introduced in combination with a meta-learning ensemble model which consistently produces improved ice forecasts. The diverse suite of deep learning baselines is trained and evaluated under a common protocol to predict sea-ice concentration (SIC), sea-ice thickness (SIT), and sea-ice presence (SIP) on a Pan-Arctic grid. Building on these baselines, the proposed Meta-Learner employs stacked generalization within a sequence-to-sequence forecasting setup, learning to fuse model outputs with spatio-temporal and lead-time context to improve robustness and reliability. Utilizing atmospheric reanalysis from ERA5 and oceanographic reanalysis from GLORYS12, the study assesses skill across short-, medium-, and long-lead regimes up to the 13-week horizon. Results demonstrate that the Meta-Learner outperforms the best individual baseline, with reductions in ice concentration MAE of 11% and a reduction in ice thickness MAE of 21% while reducing the cross-entropy in ice presence classification by 5%. Improvements are most pronounced in lower variance across random initializations for all tracked metrics, indicating enhanced stability. The benchmark and ensemble framework provide a reproducible foundation for Pan-Arctic weekly forecasting and highlight learned ensembling as a practical pathway to more accurate and dependable SIC/SIT/SIP predictions at operationally relevant horizons.
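
Stacked generalization, as named in the abstract above, means training a second-level model on the outputs of the base forecasters. The sketch below uses a linear meta-learner fitted by least squares on two synthetic base forecasts plus a lead-time feature. The thesis describes a sequence-to-sequence deep learning Meta-Learner, so the linear model, the synthetic data, and all names here are simplifying assumptions rather than the author's method.

```python
import numpy as np


def _design(base_preds, lead_time):
    """Meta-features: base model outputs, lead time, and an intercept column."""
    return np.column_stack([base_preds, lead_time, np.ones(len(lead_time))])


def fit_stacker(base_preds, lead_time, target):
    """Fit linear stacking weights by least squares."""
    weights, *_ = np.linalg.lstsq(_design(base_preds, lead_time), target, rcond=None)
    return weights


def apply_stacker(weights, base_preds, lead_time):
    return _design(base_preds, lead_time) @ weights


# Synthetic stand-ins for two base forecasts of a quantity in [0, 1].
rng = np.random.default_rng(0)
n = 400
lead_time = rng.integers(1, 14, size=n).astype(float)  # weeks 1..13
truth = rng.uniform(0.0, 1.0, size=n)
model_a = truth + rng.normal(0.0, 0.05, size=n)
model_b = truth + 0.01 * lead_time + rng.normal(0.0, 0.05, size=n)  # lead-time bias
base_preds = np.column_stack([model_a, model_b])

weights = fit_stacker(base_preds, lead_time, truth)
stacked = apply_stacker(weights, base_preds, lead_time)


def mse(pred):
    return float(np.mean((pred - truth) ** 2))
```

Because each base forecast is itself one of the meta-features, the fitted combination can never have a higher squared error than the best base model on the data it was fitted on. In practice the meta-learner should be fitted on held-out base predictions so that this advantage is not an artifact of overfitting.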

On the Generalizability of AI-Generated Text Detection

Author: David Amir
Publication venue: University of Waterloo
Publication date: 14/01/2026
Field of study

As large language models (LLMs) become ubiquitous, reliably distinguishing their outputs from human writing is critical for academic integrity, content moderation, and preventing model collapse from synthetic training data. This thesis examines the generalizability of LLM-text detectors across evolving model families and domains. We compiled a comprehensive evaluation dataset from commonly used human corpora and generated corresponding samples using recent OpenAI and Anthropic models spanning multiple generations. Comparing the state-of-the-art zero-shot detector (Binoculars) against supervised RoBERTa/DeBERTa classifiers, we arrive at four main findings. First, zero-shot detection fails on newer models. Second, supervised detectors maintain high TPR in-distribution but exhibit asymmetric cross-generation transfer. Third, commonly reported metrics such as AUROC can obscure poor performance at deployment-relevant thresholds: detectors achieving high AUROC yield near-zero TPR at low FPR, and existing low-FPR evaluations often lack statistical reliability due to small sample sizes. Fourth, through tail-focused training and calibration, we reduce FPR by up to 4× (from ~1% to ~0.25%) while maintaining 90% TPR. Our results suggest that robust detection requires continually re-calibrated, model-aware pipelines rather than static universal detectors.
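
The third finding turns on measuring TPR at a fixed low FPR rather than AUROC. A minimal sketch of that measurement follows; it is not taken from the thesis. It assumes higher detector scores mean "more likely machine-generated", and the function name `tpr_at_fpr` and the synthetic scores are hypothetical.

```python
import numpy as np


def tpr_at_fpr(human_scores, machine_scores, target_fpr):
    """Return (tpr, achieved_fpr, threshold) for a false-positive budget.

    A text is flagged when its score is strictly greater than the threshold.
    The threshold is chosen from the human scores so that at most
    target_fpr of the human texts are flagged.
    """
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    human = np.sort(np.asarray(human_scores, dtype=float))
    machine = np.asarray(machine_scores, dtype=float)
    # Number of human texts we may flag; the small tolerance guards
    # against floating-point round-off in the product.
    k = int(np.floor(target_fpr * human.size + 1e-9))
    # The (k+1)-th largest human score: at most k human scores exceed it.
    threshold = human[human.size - k - 1]
    achieved_fpr = float(np.mean(human > threshold))
    tpr = float(np.mean(machine > threshold))
    return tpr, achieved_fpr, threshold


if __name__ == "__main__":
    # Synthetic scores only: overlapping distributions with a heavy human tail.
    rng = np.random.default_rng(0)
    human = rng.standard_t(df=3, size=20_000)
    machine = rng.normal(loc=2.0, scale=1.0, size=20_000)
    for budget in (0.05, 0.01, 0.0025):
        tpr, fpr, thr = tpr_at_fpr(human, machine, budget)
        print(f"budget={budget:.4f} achieved_fpr={fpr:.4f} tpr={tpr:.3f}")
```

With n human samples the smallest nonzero FPR that can be resolved is 1/n, so a 0.25% operating point estimated from a few hundred human texts rests on at most one or two false positives, which is the small-sample reliability concern the abstract raises.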

Development of Hybrid Non-Enveloped Viral Vectors Using the Bacterial Miniphagemid Platform

Author: Hosseinali Mehraveh
Publication venue: University of Waterloo
Publication date: 30/01/2026
Field of study

Gene therapy holds significant promise for treating various diseases, with adeno-associated virus (AAV) vectors being among the most widely used delivery systems. However, current standard AAV production methods relying on costly and inefficient mammalian cell culture limit scalability and clinical accessibility. A similar human virus, Torque teno virus (TTV), also holds great potential for gene therapy but likewise suffers from production problems. To address this manufacturing bottleneck, this study aimed to develop a novel, cost-effective platform for hybrid viral vector production entirely within Escherichia coli. This work advances research on the use of miniphagemids, phages that package a minimal vector genome, to achieve in-bacterial assembly of novel hybrid AAV serotype 2 (AAV2)-based and TTV genotype 19 (TTV19)-based vectors. The hypothesis being tested is that the co-production of single-stranded DNA (ssDNA) using miniphagemid technology and key AAV or TTV proteins in Escherichia coli can result in AAV-based or TTV-based vectors. The key objectives were therefore: 1) showing recombinant expression of heterologous capsid proteins AAV2 VP1/VP2/VP3 and TTV19 ORF1 in E. coli; 2) producing ssDNA minigenomes flanked by either AAV2 inverted terminal repeat (ITR) or TTV19 untranslated terminal repeat (UTR) sequences; and 3) showing that co-producing protein(s) and ssDNA results in AAV2- or TTV19-based vectors. Results confirmed that AAV2 VP2 and VP3 could be produced in E. coli, albeit expressed primarily as insoluble inclusion bodies. Transformation of cells with a plasmid encoding VP1 resulted in reduced growth and no VP1 was recovered. Expression of TTV19 ORF1 in E. coli produced two histidine-tagged protein products approximately half the size of the expected protein (as previously reported). The ssDNA minigenomes were successfully produced and purified, exhibiting high purity (Objective 2).
The central finding was the successful in-bacterial production and purification of functional hybrid vectors, termed AAV-based and TTV-based (Objective 3). Iodixanol gradient ultracentrifugation confirmed particle assembly and density separation. Subsequent qPCR quantification demonstrated high genomic titers in the purified fractions, providing strong evidence of successful ssDNA encapsulation by the heterologous capsids within the E. coli host. The study further found that the TTV19 UTRs likely enhance packaging efficiency in the TTV-based hybrid vector system. In conclusion, this research establishes a robust and scalable E. coli-based platform for producing non-enveloped hybrid viral vectors. This achievement represents a significant step toward revolutionizing gene therapy vector manufacturing, offering a pathway to highly purified, consistent, and affordable therapeutic vectors

17,602 full texts, 21,090 metadata records. Updated in last 30 days.
