Virginia Tech - Wake Forest University School of Biomedical Engineering & Sciences
VTech Works (Virginia Tech)
RailEstate: An Interactive System for Metro Linked Property Trends
Access to metro systems plays a critical role in shaping urban housing markets by enhancing neighborhood accessibility and driving property demand. We present RailEstate, a novel web-based system that integrates spatial analytics, natural language interfaces, and interactive forecasting to analyze how proximity to metro stations influences residential property prices in the Washington metropolitan area. Unlike static mapping tools or generic listing platforms, RailEstate combines 25 years of historical housing data with transit infrastructure to support low-latency geospatial queries, time-series visualizations, and predictive modeling. Users can interactively explore ZIP-code-level price patterns, investigate long-term trends, and forecast future housing values around any metro station. A key innovation is our natural language chatbot, which translates plain-English questions (e.g., “What is the highest price in Falls Church in the year 2000?”) into executable SQL over a spatial database. This unified and interactive platform empowers urban planners, investors, and residents to derive actionable insights from metro-linked housing data without requiring technical expertise. A demonstration video of the system is available at https://www.youtube.com/watch?v=ZLiz8S1UXsc.
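To make the chatbot's question-to-SQL step concrete, here is a minimal sketch of the kind of query the example question might translate to. Everything in it is an assumption for illustration: the abstract does not describe RailEstate's schema, so the table `zip_prices`, its columns, the helper `highest_price`, and the prices are all hypothetical, and an in-memory SQLite database stands in for the spatial database.

```python
import sqlite3

# Hypothetical schema and made-up prices, for illustration only;
# the abstract does not describe RailEstate's actual tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zip_prices (city TEXT, zip_code TEXT, year INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO zip_prices VALUES (?, ?, ?, ?)",
    [
        ("Falls Church", "22046", 2000, 250000.0),
        ("Falls Church", "22042", 2000, 210000.0),
        ("Falls Church", "22046", 2001, 275000.0),
        ("Arlington", "22201", 2000, 300000.0),
    ],
)


def highest_price(conn, city, year):
    """Return the maximum price for a city and year, or None if no rows match."""
    row = conn.execute(
        # Parameter binding keeps user-supplied values out of the SQL text.
        "SELECT MAX(price) FROM zip_prices WHERE city = ? AND year = ?",
        (city, year),
    ).fetchone()
    return row[0]


# "What is the highest price in Falls Church in the year 2000?"
print(highest_price(conn, "Falls Church", 2000))
```

In a sketch like this the language model would only need to produce the query text and its parameters; binding the parameters separately is one way to keep generated SQL from embedding raw user input.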
The Quest to Humanize Policy Interventions: Exploring the Relationship between Neighborhood Revitalization Programs and Subjective Well-being of Low-Income Communities in the United States
This dissertation explores policy interventions used in many low-income American cities in their revitalization efforts and examines the impact of these programs on residents' overall quality-of-life outcomes. This study uses an exploratory sequential mixed methods research design to explore the relationships between best practices, state aid incentives, neighborhood revitalization, and residents' subjective well-being (SWB) in distressed neighborhoods across multiple low-income communities. The use of state aid, such as tax credits for neighborhood revitalization programs, has become a popular strategy for U.S. federal, state, and local governments seeking to revitalize disadvantaged neighborhoods. Many scholars have explained its popularity by noting that both they and policymakers frequently expect that such aid incentives will have trickle-down effects that improve the quality of life of neighborhood residents. Yet neither scholars nor policymakers typically have examined whether residents agree. That is, do neighborhood residents perceive that such programs have relatively positive influences on their quality of life (that is, their subjective well-being)? Residents in neighborhoods with some form of revitalization program may still experience little or no change in their perceived quality of life. Moreover, do municipalities experience improvement in their Best Practices (BP) scores following a reduction in state aid incentives during a fiscal year?
Findings suggest that individual perceptions and evaluations of neighborhood characteristics play a key role in residents' quality-of-life experience. The results further highlight that socio-economic factors and respondent locations have direct and indirect effects on reported subjective well-being. This dissertation concludes by discussing how these findings might help scholars and policymakers in understanding how residents' interaction with public policy at the local level shapes their quality-of-life experience, and most importantly, it suggests areas for further research that challenges traditional methods of measuring the impacts of policy interventions.
Doctor of Philosophy
Federal, state, and local governments have used tax credit policies to combat economic inequality and high levels of poverty across neighborhoods in low-income communities in the United States. Many social science researchers have contended that the use of tax credits will benefit the people living in those communities. Based on this expectation, government officials have continued to use state aid incentives, such as tax credits, to help poor neighborhoods revive their economies. Despite the extensive use of these policies in poor communities, most residents living in these areas have yet to see significant improvements in their quality of life. Prior studies have neglected the role of subjective well-being (that is, residents' perceived quality of life) when examining program outcomes. This study examines the relationship between best practices, state aid incentives, and neighborhood residents' subjective well-being. Findings from the study indicate that socio-economic and spatial disparities have direct and indirect effects on reported subjective well-being. The dissertation concludes by discussing how these findings might help policymakers and scholars in understanding how residents perceive local policies and by suggesting topics for future research.
Journal of Composites Science
Ultra-high-performance concrete (UHPC) is an advanced cementitious composite material with high durability and strength properties exceeding those of conventional concrete. This paper presents the results of experimental testing assessing the freeze–thaw durability of UHPC specimens with varying fiber types (13 mm straight microfibers and 30 mm hooked-end fibers) and fiber percentages, as well as pre-existing cracks. The performance of all specimens was evaluated by measuring resonant frequency at intervals during testing and residual flexural strength after the completion of 350 freeze–thaw cycles. None of the specimens showed degradation of resonant frequency over time. However, the pre-cracked specimens showed an increase in resonant frequency over the course of testing. The uncracked straight-fiber specimens exposed to freeze–thaw cycles had the highest flexural strength, but the flexural resistance of the pre-cracked straight-fiber specimens increased compared to the control specimens after 350 freeze–thaw cycles. The pre-cracked hooked-fiber specimens showed higher first cracking strength and similar ultimate strength to the uncracked specimens after freeze–thaw exposure.
A Framework for Improving Hydrologic and Water Quality Prediction in Urbanized Watersheds through Stakeholder Co-Design and Multi-Model Integration
Urban watersheds present a unique modeling challenge due to the complex interplay between natural and hydrologic processes, engineered infrastructure, and the diverse decision-making of multiple stakeholder groups. These interactions span multiple spatial and temporal scales, making it difficult for any single modeling approach to represent the system's full complexity or adequately address diverse stakeholder needs. Many existing modeling frameworks fail to align with stakeholder decision processes, reducing their relevance for applied watershed management. As the first objective, this dissertation introduces a stakeholder-driven collaborative design process for developing the Occoquan Watershed Modeling Framework (OWMF), a multi-model, co-designed watershed modeling framework for simulating water quantity and quality applied within the Occoquan Watershed in Northern Virginia, USA. The framework represents a novel advancement in watershed modeling that addresses persistent design limitations in existing approaches by: (1) supporting multi-functional design objectives across hydrologic and water-quality domains, (2) embedding stakeholder priorities from the outset through an iterative, user-centered co-design process, (3) integrating scientifically rigorous, high-fidelity models, and (4) applying competency-based evaluation criteria to quantify performance, feasibility, and decision relevance. These design principles were operationalized through a structured, iterative co-design process that translated stakeholder priorities and model competencies into an implementable framework incorporating models such as GR4J-CemaNeige, SWAT, WARMF and StormWise. The model selection was further validated by site-specific prototyping and performance evaluation. Following this, the finalized framework development plan was obtained through the co-design process.
The analysis presented here provides a structured methodology to build stakeholder-driven, multi-model frameworks that can predict the short- and long-term impacts of natural and anthropogenic drivers that influence watershed resilience. The conclusions aim to bridge the gap between hydrologic modeling and watershed management, enabling a transparent, adaptive and transferable approach for enhancing watershed resilience.
Understanding how future land use – land cover (LULC) and climate change (CC) can alter watershed hydrology and water quality is critical for effective long-term watershed management and planning. With the second objective, this dissertation incorporated a multi-model approach for improving the watershed-scale impact assessment under rapid urbanization and climate change. First, high-resolution LULC and CC projections were developed for the year 2040. Second, the watershed-scale dynamics under baseline (present) and future (2040) scenarios were simulated using three models: SWAT, HSPF, and WARMF. Third, an inter-model comparison was conducted that related the differences in model architecture, spatial discretization, process characterization and calibration strategy to watershed responses under future scenarios. The differences in simulated streamflow, pollutant loads (e.g., nitrogen, phosphorus), and sediment loads were quantified across the three models and three future scenarios. Despite using the same forcing, the three models produced different magnitudes of change in streamflow, sediments and nutrient loading, reflecting the influence of model structure on processes such as simulated runoff generation, sediment detachment, subsurface flow partitioning, phosphorus transport and nitrogen cycling. Moreover, the simulation timestep (hourly vs daily), calibration timestep (hourly vs daily vs monthly) and input data resolution directly impacted the sensitivity of these models to LULC change and climate variability. The inter-model comparison concluded that, in addition to the model structure, the calibration methodology impacted how the models projected into the future. Whether the calibration was biased or unbiased towards extremes and which objective functions (streamflow, ET, nutrients etc.) were chosen for calibrating the baseline models had a profound impact on predicting the future watershed responses across the three models.
The study showed that multi-model assessments should be the standard methodology for improving confidence in future watershed-scale hydrologic and water quality predictions under the influence of future variability in LULC and climate.
Seasonally shifting hydro-meteorological conditions can introduce substantial variability in watershed response, yet the majority of rainfall–runoff models rely on fixed parameter sets that do not adjust to these changes. The third objective incorporated a multi-pronged approach for improving seasonality representation by improving model parameterization and coupling it with data-driven modeling of hydrologic systems. This study leveraged both these approaches for representing seasonality in hydrologic models for improved streamflow prediction. Using the GR4J-CemaNeige model for the Occoquan Watershed in Northern Virginia, this study tested the application of a predefined four-season parameterization, followed by univariate and multivariate clustering to identify data-driven hydro-climatic patterns. Insights from these analyses informed the development of a hybrid dynamic parameterization in which model parameters varied continuously with time and with local-scale potential evapotranspiration observations. Results showed that traditional four-season parameterization improved hydrologic performance only when seasonal boundaries coincided with actual hydrometeorological behavior. The univariate clustering analysis showed that temperature and evapotranspiration followed repeatable annual cycles, whereas precipitation and streamflow displayed irregular and highly variable seasonal behavior, including transitional months without consistent cluster identity. The multivariate clustering further demonstrated that the combined hydro-climatic variables did not align reliably with fixed patterns, reflecting the irregular timing of hydrologic conditions in the study basins. The dynamic formulation generated continuously evolving parameter trajectories and produced more consistent performance across evaluation periods.
Collectively, the stepwise progression, from seasonal calibration to clustering-based diagnostics and dynamic parameterization, provided a systematic framework for diagnosing seasonal hydrologic behavior and enhancing the temporal adaptability of conceptual hydrologic models.
Doctor of Philosophy
Urban watersheds are shaped by shifting weather patterns, population growth, and impervious land development. Because all these factors overlap in complex ways, predicting how a watershed will behave today or in the future is difficult. Many modeling tools capture only part of the picture or are not designed around what decision-makers actually need.
This dissertation is the foundation of a state-of-the-art 'intelligent' and 'future-informed' watershed modeling framework that is being built for the Occoquan Watershed in Virginia. The first study introduces a collaborative, stakeholder-driven co-design approach to develop a "multi-model" watershed framework. Instead of relying on a single tool, the framework combines several specialized models so that each can contribute its strengths. The selection of these models was guided by "competency questions" that emerged directly from the stakeholder collaboration process, questions that clarified what the framework needed to be able to answer. Because of this, the system is intentionally designed to evolve. The current model set reflects today's priorities and available tools, but it is not fixed; as needs and technologies change, new models may be added and others phased out. Stakeholders helped shape this adaptable design from the beginning, ensuring that the framework is transparent, practical, and aligned with real management decisions.
A subsequent phase of the research evaluates how urbanization and climate variability will affect the watershed by 2040. High-resolution future projections were developed and then tested across three different watershed models. Even when given the same future scenarios, the models produced different estimates of streamflow, sediment, and nutrient pollution. Comparing these models highlighted how differently they represent key hydrologic and biogeochemical processes, clarifying which processes each model emphasizes, how strongly they respond to future change, and where their ability to project long-term conditions is more limited.
The final part examines how seasonal weather patterns influence watershed behavior. Many hydrologic models use fixed parameters that do not adjust as conditions shift from winter to spring, or from dry spells to intense storms. To address this, the study tested a four-season calibration and used data-driven clustering to reveal hydro-climatic patterns. These analyses showed that some variables follow predictable seasonal cycles, while others shift in irregular ways. Using these insights, a dynamic approach was developed in which model parameters change continuously over time. This approach produced more stable performance and a clearer picture of how the watershed transitions between hydrologic states throughout the year, offering a more flexible way to represent seasonality in conceptual models.
Collectively, this research shows a practical way to better understand and manage a complex urban watershed. Integrating results from different models, long-term scenarios, and seasonal patterns creates a more nuanced understanding of watershed behavior. This helps decision-makers understand how the watershed might behave in the years ahead and plan in ways that remain effective as conditions evolve.
Engineering Tumor-Targeting Bacteria and Characterizing Their Interactions with Tumor Cells in Therapy-Resistant Breast Cancers
Breast cancer (BC) accounts for one-third of malignancies among women in 157 countries and ~15% mortality among diagnosed cases, a burden projected to reach 1.1 million deaths per year by 2050. Five-year survival drops from >90% in early-stage BC to ~32% for therapy-resistant subtypes such as triple-negative (TNBC), hormone-receptor–variable or resistant ER+, and Luminal B tumors. For these high-risk subtypes, molecular heterogeneity undermines targeted therapies, and clinical management still relies on maximum tolerable doses of systemic chemotherapy, causing severe dose-limiting toxicities. Moreover, the dense collagen-rich extracellular matrix (ECM) of solid tumors restricts intratumoral drug transport, motivating strategies that function across BC subtypes, overcome ECM barriers, and inform lowered clinical dosing. Bacteria-based cancer therapy (BBCT) with cancer-selective bacteria combines motility, self-replication, and on-board biosynthesis with the programmability of synthetic biology, enabling local release of therapeutic factors within the tumor microenvironment. Attenuated Salmonella Typhimurium VNP20009 (ST) exhibits ~10³–10⁴-fold tumor selectivity with respect to liver and spleen. It has a favorable clinical safety profile but remains inefficacious due to poor colonization. Our lab previously showed that ECM-targeting ST with collagenase secretion improved tumor penetration without gross collagen disruption, but at the cost of reduced bacterial fitness and motility. In this dissertation, we hypothesized that a fitness-restored, ECM-targeting ST enhances bacterial intratumoral transport and colonization, as well as chemotherapy penetration. We evaluated the engineered strains in perfused 3D tumor models that represent in vivo intratumor transport properties and investigated cancer cell–neutrophil interactions in the presence of bacterial factors.
First, we developed a high-motility, fitness-improved collagenase-expressing strain (HM-CEST ΔydcP) that preserves 100% motility under sub-cytotoxic chemotherapy. This strain improves intratumoral transport and reduces spheroid viability and tumor migration relative to chemotherapy alone while maintaining tumor specificity and safety in preclinical murine models in vivo. Second, we validated a perfusion-enabled microfluidic spheroid platform that supports at least 14-day culture of murine TNBC and ER+ spheroids, enabling long-term BBCT screening. We demonstrated that perfused spheroids require lower chemotherapy doses than static cultures. Third, we biophysically characterized the crosstalk between BBCT, neutrophils, and Luminal B BC cells, demonstrating that neutrophils in the presence of bacterial supernatant suppress cancer cell growth, viability, and migration. Collectively, this work delivers a fitness-restored ECM-targeting Salmonella chassis, validates a perfusion-enabled 3D tumor-spheroid microphysiological platform, and develops a quantitative framework for neutrophil–bacteria–cancer cell interactions, contributing to the design pipeline for next-generation BBCT and supporting the long-term goal of safer, more affordable, lower-dose, and more broadly applicable therapies for therapy-resistant breast cancers.
Doctor of Philosophy
Breast cancer remains one of the most common and deadly cancers worldwide, with about 650,000 deaths reported in 2024. "Breast cancer" is not a single disease: some subtypes are much harder to treat than others. The hardest-to-treat forms often stop responding to hormone or targeted therapies and must be treated with toxic, high-dose chemotherapy that damages healthy cells in hair follicles, bone marrow, and the gut.
This leads to hair loss, weakened immunity, gastrointestinal injury, and a high risk of infection, as well as long-term problems such as chronic fatigue, nerve damage, early menopause, infertility, and increased risk of heart disease and secondary cancers. Despite aggressive chemotherapy, cancers are not always completely eliminated due to biological barriers present within tumors. Most solid cancers contain a dense, collagen-rich matrix that limits the penetration of drugs. Additionally, an immune cell population called neutrophils is "recruited" to the tumor to help it grow and shield cancer cells from chemotherapy. These barriers motivate new strategies that can work across breast cancer subtypes, overcome both physical and immune protection, and ultimately allow lower effective chemotherapy doses. Bacteria possess numerous anti-cancer properties. This doctoral dissertation builds on a clinically safe, cancer-targeting strain of Salmonella Typhimurium, engineering the bacteria to self-propel in collagen-rich cancer environments while secreting a collagen-targeting enzyme. We show that in three-dimensional breast cancer models that mimic the dense tumor microenvironment, these bacteria penetrate more deeply and, when used briefly before chemotherapy, help a standard chemotherapeutic reduce tumor-cell survival and migration more effectively than chemotherapy alone, pointing toward the possibility of using lower, less toxic doses. To better reflect how tumors behave in the body, this dissertation validates a microfluidic "tumor-on-a-chip" device in which miniature breast tumors are contained and continuously supplied with nutrients and drugs. Through experiments on this platform, we show that improved treatment effects can be achieved with lower chemotherapy doses than in traditional no-flow cultures.
Finally, this work examined how engineered bacteria influence neutrophils in a fast-growing breast cancer model, showing that cancer cells exposed to neutrophils plus bacterial effectors grow and move more slowly than untreated cancer cells. Together, this work shows how tumor-targeting bacteria and the body's own immune cells can be harnessed against therapy-resistant breast cancers, providing early evidence that such approaches could reduce our dependence on high-dose chemotherapy.
Statistical Evaluation of Deep Learning for Event Detection in Time Series: Quantifying Uncertainty, Efficiency, and Adaptation with Applications to Seismic Data
Rapid developments in deep learning have led to its widespread use in domains that rely on time series, largely because of its strong performance and flexibility. Yet evaluation practices have not kept pace. Deep learning models are often assessed using a few performance metrics computed on benchmark datasets, which ignores important questions about how predictive performance varies with data availability, how uncertainty is communicated in both predictions and aggregate metrics, and how shifting data distributions impact model reliability.
Presented as three studies, this dissertation develops principled statistical approaches for deep learning model evaluation that address these challenges in the context of time-series-based scientific problems. The first study introduces an evaluation framework for seismic deep learning models where I assess learning efficiency while mitigating data leakage and quantify benchmark uncertainty by attributing variation to both training stochasticity and data sampling through an expansive design of experiments. The second study compares meta-learning techniques across data regimes and analyzes how consistently they perform under data shift. As part of this study, I contribute SeisTask, a semi-synthetic benchmark dataset with controlled, physically meaningful sources of shift for future study on adaptive learning approaches. The third study provides an empirical comparison of meta-learning and hierarchical Bayesian modeling and highlights their theoretical connection. I compare these methods in terms of interpretability, performance under shift, and predictive uncertainty.
In combination, these studies offer statistically grounded evaluations of deep learning models for event detection in time series and show how uncertainty, data requirements, and distributional shift influence model behavior in physical science applications.
Doctor of Philosophy
Deep learning models are widely used in the physical sciences to detect important events in time series, such as earthquakes recorded by seismic sensors. Although these models often perform well, it is not always clear how reliable they are, how much data they truly need, or how they behave when data change in unforeseen ways. These concerns matter in many scientific fields and have practical implications for earthquake monitoring, infrastructure safety, and other applications where accurate event detection is important.
This dissertation studies how to evaluate deep learning models in more careful and informative ways. Here, evaluation refers to checking how well a model works, how stable its predictions are, and whether it can be trusted when conditions change. I examine how much model performance varies and why, how efficiently models learn from different amounts of data, and how consistently they perform when the test data no longer resemble what they saw during training. For example, a model trained on signals from one region may struggle when applied to another region with different noise conditions. I also build new datasets and tools that allow these questions to be explored in controlled and meaningful ways. Through a set of case studies in seismology, I compare modern modeling approaches and show which ones are more reliable and stable under different conditions.
Overall, this work shows principled ways to compare deep learning models across relevant criteria and highlights how a single performance number hides much of what matters. Since deep learning models form the core of many modern artificial intelligence systems, their evaluation must reflect that nuance. By accounting for uncertainty, data requirements, and changes in the environment, we can better understand when these models can be trusted and how they can be improved for scientific applications.
Carbon Balance and Management
Background: To understand how genetic variation among varieties and stand density affect carbon (C), we assessed C stocks, fluxes, and partitioning in Pinus taeda L. plantations in Southeast Brazil. We measured the annual C balance in two consecutive years (from 7 to 9 years after planting) in four different clonal varieties with distinct crown structures (C1-medium, C2-broad, C3-narrow, and C4-broad) and an OP (open-pollinated) family. From age 7 to 8 years, the C balance was assessed for all five varieties at a stand density of 1894 trees ha−¹. From age 8 to 9 years, the C balance was assessed for three varieties (C2, C3, and OP) at two stand densities (low density (LD): 613 trees ha−¹ and high density (HD): 1894 trees ha−¹).
Results: At age 7–8, the total C stock (above- and belowground plus the litter layer) among varieties ranged from 168 Mg C ha−¹ (C3) to 186 Mg C ha−¹ (C1), with the bole as the largest pool (68%). Aboveground net primary production (ANPP) ranged from 1.9 to 3.1 kg C m−² year−¹, and total belowground carbon flux (TBCF) from 2.0 to 2.9 kg C m−² year−¹. The partitioning of GPP (gross primary production) to ANPP and TBCF reached maximum values of 35% and 41%, respectively. At age 8–9 years, the C stock was greater in the HD stands than in the LD stands across all varieties. Overall, C stock reached between 103.5 and 184.6 Mg C ha−¹. ANPP under HD was 1.9 kg C m−² year−¹ compared with 0.62 kg C m−² year−¹ under LD. There were no significant differences in TBCF between the HD and LD stands. The partitioning of GPP to ANPP was lower and to TBCF was higher under LD compared with HD.
Conclusion: The relationship between crown structure and C stock, fluxes, and partitioning is not clear, so crown structure should be used with caution for management prescriptions related to C sequestration. Also, no differences in the bole C stock and sequestration were found across varieties within the same planting density. Finally, the genetic variation among varieties and stand density significantly affected stand productivity, with stand density showing the greater effect.
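The abstract above reports stocks per hectare (Mg C ha−¹) and fluxes per square metre (kg C m−² year−¹), and expresses partitioning as a share of GPP. The sketch below shows the unit conversion and the partitioning ratio. The function names are hypothetical, and the GPP value is an assumption chosen only for illustration; the abstract does not report GPP itself.

```python
def kg_m2_to_mg_ha(value_kg_m2):
    """Convert kg C m^-2 to Mg C ha^-1.

    1 ha = 10,000 m^2 and 1 Mg = 1,000 kg, so the factor is 10.
    """
    return value_kg_m2 * 10.0


def partition_fraction(flux, gpp):
    """Fraction of GPP allocated to a component flux (both in the same units)."""
    if gpp <= 0:
        raise ValueError("GPP must be positive")
    return flux / gpp


# ANPP is the upper end of the range reported in the abstract;
# GPP is a hypothetical value, not a reported result.
anpp = 3.1  # kg C m^-2 year^-1
gpp = 9.0   # kg C m^-2 year^-1 (assumed for illustration)

print(kg_m2_to_mg_ha(anpp))           # ANPP expressed in Mg C ha^-1 year^-1
print(partition_fraction(anpp, gpp))  # share of GPP going to ANPP
```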
Explainable AI for Social Good: Applications in Mental Health, Public Health Risk, and Environmental Traceability
The ubiquitous use of machine learning and AI technology in human-centered domains such as social networks, public health, sustainable trade, and environmental forensics indicates a significant need for an adaptive, interpretable, and generalizable approach to predictive modeling. With the increasing availability of user-generated data, environmental samples, and public health records, AI-driven tools have played a significant role in predictive analysis. However, a persistent challenge remains: in domains with significant societal implications, the available data are often inconsistent, unstructured, and lacking fine-grained labels. Furthermore, in these application areas, understanding the prediction becomes as important as the prediction itself, as it guides a more informed intervention strategy. Most of the existing work in this domain struggles to meet this requirement, taking either a one-size-fits-all modeling approach or a very problem-specific, fine-tuned algorithm that fails to learn inter-task dependencies and does not particularly focus on explainability. Therefore, in real-world scenarios, these tools carry increased risk in practical application due to their black-box nature, which leads to a lack of intuitive interpretability for domain experts. These methods can often neglect underlying conditions such as spatial dynamics, socioeconomic disparities, and uncertainty. In fields like population health management and stable isotope forensics, such limitations hinder practical deployment and erode trust.
Compounding this issue is the widespread adoption of large language models (LLMs), which, despite their power, are prone to hallucinations and toxicity, undermining their reliability in sensitive domains. This thesis employs active learning, multi-task learning, ante-hoc explainability, post-hoc explanations, and probabilistic Gaussian process modeling to tackle several domains of social computing, ranging from population mental health and epidemiological outbreaks to forensic environmental traceability analysis.
The first work introduces AMMNet, a multi-task active learning model for detecting depression and anxiety from Reddit data. It combines topic-based embeddings and joint task training to improve interpretability and data efficiency over conventional LLM-based classifiers. The main contributions are: 1. It tackles the lack of a fine-grained labeled dataset for Reddit that extends beyond topic-specific subreddits by first curating a labeled dataset and then employing an active learning strategy to help with the training; 2. It proposes a novel multi-task learning model, AMMNet, that outperforms baseline models in the prediction of mental health conditions; 3. It provides a novel model-level explanation of its predictions through a task-specific feature selector introduced in the task-specific module; and 4. It shows through extensive experiments that, for domain-specific classification tasks such as this, a combination of document-level embedding and topic distribution gives the best performance across all the tasks.
In the second work, DeMHeM, a multi-task model for identifying bipolar disorder and its comorbidities, is introduced. Through soft parameter sharing and focal loss, the model robustly detects nuanced mental health states and facilitates deeper community-level insight via keyphrase analysis. The main contributions are: 1. development of a novel multi-task learning framework for mental health predictions; 2. implementation of a novel and effective multi-task optimization algorithm; and 3. exploration of post-hoc analysis using the trained model for a more fine-grained understanding of bipolar disorder and its comorbidities.
The third work proposes GC-Explainer, an explainable Graph Neural Network for forecasting COVID-19 outbreak severity using only static population features. The model integrates explainability directly into its architecture, enabling transparency without post-hoc methods, and avoids reliance on real-time or temporal data. The main contributions of the work are: 1. Unlike post-hoc methods for GNN explanation, this work proposes a novel framework, Graph-COVID-Explainer, that simultaneously gives predictions for high-risk areas and insights about the most important features during the training of the model; 2. It introduces a novel problem setting that tackles the paucity of historical data for identifying high-risk areas during the initial outbreak, which can help authorities better prepare for future crises; and 3. It applies Graph-COVID-Explainer (GCExplainer) to real-world COVID-19 data to show that static features about mobility, socioeconomic status, and spatial dependency among regions can be used to make an explainable prediction of the varying degree of severity during the early part of the outbreak, without using historical pandemic data as features.
The fourth work proposes to deliver a deployed pipeline that combines Stable Isotope Ratio Analysis (SIRA) and environmental variables using a multi-task Gaussian Process framework. It provides origin tracing for timber samples with predictive uncertainty, significantly improving upon traditional spatial regression approaches. The main contributions are: (1) It presents a comprehensive multi-task Gaussian process modeling framework that supports the incorporation of auxiliary data, such as climate layers, to support origin determination.
This enables the incorporation of environmental factors, the attachment of uncertainty to predictions, and multimodal feature integration; (2) This work is a deployed machine learning pipeline in which physical samples are collected, subjected to tests, and fed into our model to help European enforcement agencies combat illegal timber trade by demonstrating that a claimed harvest location other than Russia is not viable; and (3) It demonstrates accuracy profiles of our approach in a controlled experiment that illustrates the interplay between SIRA values and atmospheric variables and how they affect our ability to reveal harvest location misrepresentation. This goes beyond traditional ML pipelines that only predict isotope values, providing an end-to-end approach that supports decision-making by enforcement agencies.
The final work combines matrix sparsification, used to extract feature importance, with the epistemic uncertainty arising from Gaussian processes to make explainable spatial predictions of a health outcome, type-2 diabetes. Through a rigorous experimental design, the novel end-to-end machine learning framework Deep Graph Gaussian Health Net (DDHG-Net) demonstrates its effectiveness relative to state-of-the-art methods across different metrics while also providing feature-level insights and uncertainty-aware predictions, making it more suitable for real-world use. Our case study on Virginia illustrates this by identifying a cluster of high-prevalence counties more accurately and with high confidence, while uncertain predictions indicate which geographical areas warrant more careful data collection. Overall, the methodological approach laid out in this dissertation promises to be effective in different real-world application domains where explainability is paramount, and the immediate impact of these works lies in greater community welfare.
Doctor of Philosophy
With the rapid digitization of the world around us and the proliferation of Artificial Intelligence, we stand at a crossroads between using Artificial Intelligence for good and using it for the commodification of human experience. While artificial intelligence has the potential to help society tackle many of its problems in mental health, community health, and environmental sustainability, we remain limited by a lack of labeled data and the proliferation of black-box solutions that largely disregard explainability. However, in many real-world applications, transparency and explanation of a prediction are often as important as the prediction itself.
In this dissertation, we explore how explainability can be incorporated into the predictive modeling of different real-world data through a more specific focus on feature-level importance and uncertainty estimation
Viscoelastic Fluid Modeling for Geophysical Applications in an Eulerian Framework
Geological processes in the Earth's lithosphere and upper mantle exhibit elastic behavior on short timescales and viscous behavior on long timescales, motivating the use of viscoelastic models for intermediate regimes. Such models are essential for capturing both stress accumulation and permanent deformation but are highly nonlinear and numerically challenging. Classical approaches in geophysical viscoelastic modeling often favor Lagrangian or mixed Eulerian--Lagrangian frameworks, which can provide high physical fidelity at the cost of increased computational complexity.
In this work, we develop viscoelastic models within a fully Eulerian framework, derived from continuum mechanics principles under the assumptions of incompressible, creeping flow. The resulting system of nonlinear partial differential equations resembles a Stokes-like formulation coupled to a Maxwell viscoelastic constitutive relation. Material objectivity is enforced through the Jaumann (Corotational) time derivative, whose numerical behavior is examined and compared with related formulations from classical viscoelastic fluid modeling. Material failure is incorporated into the models through nonlinear yielding via the von Mises yield criterion.
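For orientation, one common way to write a system of the kind described above is sketched here. The notation (velocity $\mathbf{u}$, pressure $p$, deviatoric stress $\boldsymbol{\tau}$, density $\rho$, gravity $\mathbf{g}$, viscosity $\eta$, shear modulus $G$, yield stress $\tau_y$) and the specific form are assumptions taken from standard geodynamics practice, not from the thesis itself, which may use a different formulation.

```latex
% Incompressible, creeping (Stokes-like) flow
\begin{align}
  -\nabla p + \nabla \cdot \boldsymbol{\tau} + \rho\,\mathbf{g} &= \mathbf{0}, &
  \nabla \cdot \mathbf{u} &= 0 .
\end{align}

% Maxwell viscoelastic constitutive relation: viscous and elastic
% strain rates add up to the total deviatoric strain rate
\begin{align}
  \dot{\boldsymbol{\varepsilon}}
    &= \frac{1}{2\eta}\,\boldsymbol{\tau}
     + \frac{1}{2G}\,\overset{\nabla}{\boldsymbol{\tau}},
  &
  \dot{\boldsymbol{\varepsilon}}
    &= \tfrac{1}{2}\left(\nabla\mathbf{u} + (\nabla\mathbf{u})^{T}\right).
\end{align}

% Jaumann (corotational) rate, with spin tensor W and the
% convention (\nabla u)_{ij} = \partial u_i / \partial x_j
\begin{align}
  \overset{\nabla}{\boldsymbol{\tau}}
    &= \frac{\partial \boldsymbol{\tau}}{\partial t}
     + \mathbf{u}\cdot\nabla\boldsymbol{\tau}
     - \mathbf{W}\boldsymbol{\tau}
     + \boldsymbol{\tau}\mathbf{W},
  &
  \mathbf{W}
    &= \tfrac{1}{2}\left(\nabla\mathbf{u} - (\nabla\mathbf{u})^{T}\right).
\end{align}

% von Mises yielding, written with the second invariant of the
% deviatoric stress
\begin{align}
  \sqrt{J_2} \le \tau_y ,
  \qquad
  J_2 = \tfrac{1}{2}\,\boldsymbol{\tau} : \boldsymbol{\tau} .
\end{align}
```

In the Eulerian setting the advection term $\mathbf{u}\cdot\nabla\boldsymbol{\tau}$ appears explicitly, and the spin terms keep the stress rate unchanged under rigid-body rotation. The yield condition is shown in the $\sqrt{J_2}$ form often used in geodynamics; the equivalent-stress form $\sqrt{3 J_2} \le \sigma_y$ differs only by a constant factor in the yield stress.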
The resulting nonlinear systems are solved using semi-implicit time integration and Newton's method. Numerical verification and validation, including the method of manufactured solutions, reveal challenges in achieving mesh-convergent solutions under uniform refinement. Finally, we compare the proposed models against classical viscoelastic benchmark problems, providing insight into the effects of modeling and numerical choices in Eulerian geophysical viscoelasticity.
Master of Science
Geological processes in the Earth's lithosphere and upper mantle behave like elastic solids over short timescales and flow like viscous fluids over long timescales. Accurately modeling this intermediate, viscoelastic behavior is essential for understanding how stresses build up and are released, such as during subduction and earthquakes. In this work, we derive and implement a geophysically realistic model for viscoelastic flow using techniques adapted from classical computational fluid dynamics. The model also accounts for material failure through established physical criteria.
We solve the resulting models numerically and assess their accuracy and robustness using verification tests and established benchmark problems. This work contributes to ongoing efforts to develop efficient and reliable tools for modeling viscoelastic processes in the Earth