Columbia University

Columbia University Academic Commons
    49755 research outputs found

    Privacy-Preserving Techniques for Genotype-Phenotype Association Studies

    No full text
    Genotype-phenotype association studies play a central role in precision medicine by enabling researchers to identify genetic variants that influence complex traits and diseases, with applications ranging from disease risk prediction to treatment stratification. Notable approaches include genome-wide association studies (GWAS), quantitative trait loci (QTL) mapping, and, increasingly, machine learning models that learn predictive relationships between genotype and phenotype data. The power of these approaches grows substantially in collaborative research settings, where pooling data across cohorts increases the statistical power for variant discovery, and sharing trained machine learning models enables institutions with limited computational resources to perform inference on their own cohorts. However, these collaborative gains come with a fundamental privacy-utility tradeoff: sharing raw genomic and phenotypic data can expose sensitive information about study participants, and sharing machine learning models introduces the risk of information leakage through model parameters. These concerns are amplified by strict data governance policies and institutional silos that restrict data movement, underscoring the urgent need for privacy-preserving frameworks that enable secure, collaborative analysis while maintaining statistical rigor and practical scalability. This dissertation develops end-to-end, cryptography-based solutions for privacy-preserving genotype-phenotype association studies and addresses the entire analysis pipeline, from preprocessing and harmonization of phenotype data, to secure multi-site eQTL mapping, to inference on sensitive data using machine learning models. The overarching goal is to design frameworks that (i) provide strong cryptographic security guarantees, (ii) scale computationally to large multi-institution cohorts, and (iii) remain statistically robust by preserving accuracy comparable to non-secure baselines. 
In Chapter 3, we design a privacy-preserving gene expression preprocessing framework that addresses key bottlenecks in federated transcriptomic studies. Our framework provides two secure normalization options — quantile normalization (QN) and relative log expression (RLE) — to allow flexibility depending on data sharing standards, a secure multiparty computation (MPC)-based protocol for inverse normal transformation, and a scalable local principal component analysis (PCA)-based hidden covariate correction strategy. We validate our approach using both simulated multi-institution datasets and real-world gene expression data to show that our methods achieve phenotype correction accuracy comparable to centralized, non-secure pipelines while maintaining privacy of individual-level data. These results demonstrate that federated preprocessing with local computation is feasible and effective for collaborative studies. Building on this foundation, Chapter 4 introduces privateQTL, a secure and scalable framework for multi-center eQTL mapping. privateQTL implements practical genotype and phenotype correction strategies, including genotype population stratification via projection on public reference panels and the privacy-preserving gene expression preprocessing discussed in Chapter 3. We further propose a one-shot matrix multiplication approach that enables efficient nominal association testing and permutation-based false discovery control without repeated communication rounds, significantly reducing runtime. Our evaluation compares privateQTL against meta-analyses and centralized pipelines across multiple axes — eGene and eVariant discovery rates, robustness to batch effects, statistical power, runtime, and memory footprint — using both simulated federated datasets and real-world multi-site data with known batch heterogeneity. 
Our results demonstrate that privateQTL achieves superior discovery rates compared to meta-analyses, particularly in heterogeneous data settings, while maintaining strong privacy guarantees under a semi-honest adversary model. Finally, in Chapter 5, we address the emerging challenge of secure machine learning inference on sensitive genomic data. We present two homomorphic encryption (HE)-based frameworks: (i) a secure inference method for linear models where both inputs and model weights are encrypted, enabling end-to-end confidential inference without compromising predictive performance, and (ii) a method for secure inference on transformer architectures using approximations for non-linear functions. For linear model inference, we introduce an efficient encoding method that improves computational efficiency during encrypted dot product computation, and for transformer inference, we develop polynomial approximations for nonlinear functions such as softmax, ReLU, and layer normalization to balance computational feasibility with model accuracy. We validate our linear model inference framework on both continuous and binary phenotype prediction tasks using simulated and real data, achieving performance comparable to plaintext inference. For transformer inference, we discuss the challenges of implementing our approximations in a practical, scalable setting for large-scale models and lay the groundwork for future work. Collectively, this dissertation makes significant contributions to the field of privacy-preserving biomedical informatics. By providing scalable, modular, and cryptographically sound methods for phenotype preprocessing, federated eQTL mapping, and secure machine learning inference, this work enables collaborative genomic research while rigorously protecting sensitive participant data. 
The frameworks and findings presented here create a foundation for future developments in privacy-aware collaborative studies, advancing the realization of precision medicine in a manner that respects individual privacy, complies with regulatory requirements, and preserves scientific reproducibility.

    AgMIP Policy Brief: Advancing Climate Resilient Agriculture In Ghana

    No full text
    The study titled “AgMIP Demand-Scoping Study in Sub-Saharan Africa,” conducted by the Agricultural Model Intercomparison and Improvement Project (AgMIP) and led by the University of Ghana, engaged representatives from government (including the Ministry of Food and Agriculture and the National Development Planning Commission), research institutions, academia, farmer-based organizations, and development partners through interviews and a workshop held in Accra, Ghana, in May 2025. The interviews asked policy and decision makers what science-based information they need. The workshop prioritized the science needs communicated in the interviews and discussed how to improve collaboration between researchers and stakeholders to co-produce the science. This policy brief summarizes insights from the interviews and the workshop. It illustrates how AgMIP's Integrated National to Regional Assessments (INaRA) framework can be advanced by integrating the Rural Investment and Policy Analysis (RIAPA) model, a national economic model developed by the International Food Policy Research Institute (IFPRI). This enhanced framework can provide the evidence base that policy and decision makers are asking for to help advance policy and investment responses to the urgent climate crisis and growing food insecurity challenges.

    Climate Predictability Tool version 18.8.7

    No full text
    The Climate Predictability Tool (CPT) is a software package for constructing a seasonal climate forecast model, performing model validation, and producing forecasts given updated data. Its design has been tailored for producing seasonal climate forecasts using model output statistics (MOS) corrections to climate predictions from general circulation models (GCMs), or for producing forecasts using fields of sea-surface temperatures or similar predictors. Although the software is specifically tailored for these applications, it can be used in more general settings to perform canonical correlation analysis (CCA), principal components regression (PCR), or multiple linear regression (MLR) on any data, and for any application.

    A Light in South America: The Impact of the International Research Network on Climate Resilience

    No full text
    The International Climate Resilience Research Network (RIPERC) is a transdisciplinary initiative founded in 2019 to strengthen climate resilience and sustainable development in the trinational region of Brazil, Argentina, and Paraguay. Coordinated by the State University of Western Paraná (UNIOESTE), RIPERC connects universities, research centers, public institutions, and civil society to generate open-access social, economic, and environmental indicators that support evidence-based public policy and community action aligned with the UN Sustainable Development Goals. This case study focuses on pilot activities in Cascavel and Foz do Iguaçu, territories marked by intensive agribusiness and tourism alongside significant socio-environmental challenges, including greenhouse gas emissions, pesticide contamination, and shared watershed vulnerabilities. By promoting cross-border integration, environmental education, and continuous monitoring, RIPERC demonstrates the value of international research networks in addressing transboundary climate risks and fostering locally grounded, scalable resilience strategies.

    Treatment Effects with Targeting Instruments

    No full text
    Multivalued treatments are commonplace in applications. We explore the use of discrete-valued instruments to control for selection bias in this setting. Our discussion revolves around the concept of targeting: which instruments target which treatments. It allows us to establish conditions under which counterfactual averages and treatment effects are point- or partially-identified for composite complier groups. We explore the additional identifying power of a positive selection assumption. We illustrate its usefulness by revisiting the findings of Kline and Walters (2016) on the Head Start Impact Study. We derive informative bounds that suggest less beneficial effects of Head Start expansions than their parametric estimates. Keywords: identification, selection, multivalued treatment

    Governing AI Infrastructure: How India Can Steer the Boom Towards Sustainability and Security

    No full text
    Across the European Union, protests against data centres have become a common sight. In Ireland, grid operators have frozen new connections. In the Netherlands, municipalities have restricted where facilities can be constructed. In Germany, lawmakers now mandate waste-heat reuse and renewable power for large facilities. The European Union, once content to let cloud infrastructure grow silently in the background, is imposing restrictive guidelines under its expanding AI governance framework, explicitly linking artificial intelligence to energy security, climate targets, and public scrutiny through policy programmes such as the AI Continent Action Plan and the “Apply AI” strategy.

    Complete Bibliography of Cited Works for David M. Carr, Unmaking Eden: Genesis and the Domestication of the World (Cambridge University Press, 2026)

    No full text
    This file is an initial typescript version of the complete bibliography of all works cited in David M. Carr's book, Unmaking Eden: Genesis and the Domestication of the World (Cambridge: Cambridge University Press, 2026). The book's bibliography only contains a selection of these works.

    Essays on Informational Efficiency of the Financial Markets

    No full text
    This dissertation studies the effects of informational frictions and information acquisition on asset prices and investor decision-making. Chapter 1 builds models of capital market opening based on the noisy rational expectations equilibrium (NREE) framework. To reconcile the mixed empirical evidence on whether domestic investors have an advantage over foreign investors in domestic stock market trading, I distinguish soft information, which is qualitative and cannot be transmitted if the receiver is not physically present, from hard information, which can be stored in the form of data, purchased, and transmitted regardless of the presence of the receiver. The models feature a mismatch between the level of data processing technology and the availability of soft information. Furthermore, the partial opening model also features a mismatch between the accuracy of hard information and the foreign investability of shares. The equivalent data amount and the information spillover rate govern domestic and foreign institutional investors' data choices, respectively. When the unit precision of hard information for foreign investable shares obtained directly from processing data on them exceeds that acquired indirectly through processing data on foreign non-investable shares, this is sufficient for foreign institutional investors to exhaust their data processing capacity on foreign investable shares. The proportion of foreign institutional investors and their technology level both need to be large enough to overcome the dilution of soft information. Given that the signal-to-noise ratio increases, if the domestic investors learn more about foreign investable shares or the proportion of foreign investors is larger than a threshold, then the average precision of dividend innovation for foreign investable shares increases. In the full opening model, the average posterior precision of dividend innovation changes in the same direction as the signal-to-noise ratio for all shares. 
The case of China is characterized by a very small proportion of foreign investors, a large proportion of domestic individual investors, a large disparity between domestic and foreign data processing technology, and high precision of soft information. Foreign institutional investors would allocate their data processing capacity across both types of shares, and the signal-to-noise ratio for foreign investable shares decreases. Domestic institutional investors are also induced to learn slightly less about those shares, and thus the average posterior precision of dividend innovation decreases. In the full opening counterfactual, the signal-to-noise ratios for both types of shares increase. Chapter 2 fills a gap in the theoretical literature on NREE models by studying the effects of learning about future demand shocks. Learning about future demand shocks makes price more informative in the future but less informative now. Ex-ante utility improves through what I term the future uncertainty channel: at the aggregate level, fundamental information acquisition decreases, which makes dividends riskier and leads to a higher risk premium; at the individual level, the investor is more certain about the asset payoff in the future due to the signal about future demand shocks she acquires today. As a result, her ex-ante utility improves. Chapter 3 zooms in on the learning process and highlights a specific subprocess: the perception process. Agents take the same sensory information as inputs, direct it through different perceptual sets, and thus form different perceptual information, which is then used to update posterior beliefs. Biased perceptual sets thus lead to persistent disagreement among agents, while set revision corrects any perceptual bias and allows agreement to be reached. 
Depending on the intensity of perceptual shocks, agents' sensitivity to perceptual shocks, and their initial entrenchment levels, the price may overreact or underreact to sensory information in the markets. When overreaction happens, trading volume also increases.

    Neoliberalismo global: arte, tiempo y feminismos en Argentina. 1990s-2000s

    No full text
    This research examines the relationships between neoliberalism, globalization, and the arts in Argentina during the height of neoliberal policies in the 1990s and through their destabilization with the social and economic crisis of the early twenty-first century. In an artistic context with neither a consolidated market nor a large-scale cultural industry, the work proposes a theoretical and analytical approach to the micropolitical effects of global neoliberalism, centered on the formation of subjectivities and of modes and practices of life in society and in the arts. The study addresses several axes: the circulation of discourses on globalization and technology in newspapers and cultural magazines; the emergence of spaces for artistic training independent of official education and in dialogue with the international sphere; a set of artistic practices driven by feminist premises; and the impacts of the economic crisis on these dynamics. Among the cases analyzed are the Beca Kuitca and the Taller de Barracas, which emerged between 1991 and 1994 and were oriented toward the art clinic format and toward promoting exchange among artists and international alliances. The study concentrates on the works produced there, the discussions among artists, and the sessions led by figures such as Roberto Jacoby and Jorge Gumier Maier, who question the foundations of the clinic model. The thesis also focuses in particular on the production of the artists Alicia Herrero and Ana Gallardo. The analysis addresses a critical framework, based on feminist premises, that in the artists' multifaceted work links colonization, patriarchy, and contemporaneity through pictorial, object-based, performance, and self-managed practices. 
The research thus traces the unfolding of a conflictive and critical vision and practice directed at the processes of neoliberal globalization from Argentina, and reveals a map marked by contradictions associated with the contemporary and colonial monopoly of knowledge, the consumerist configuration of neoliberal society, history and time as mutable entities, and precarization shaped by gender inequalities.

    The Reinvention of the Sexual: Radical Sexual Cultures, Politics, and Queer Marxism in Cold War Brazil and Cuba

    No full text
    The Cold War in Latin America is marked as much by the spread of leftist revolutionary political organizations and Cuban-inspired guerrillas as by the emergence of radical sexual cultures and politics throughout the region. From Brazilian alternative media and activisms to Cuban painting, sex education, and cinema, in my thesis I study homosexual and gender and sexual non-normative intellectual and artistic production, forms of political action, sex education initiatives, and queer Marxist cultural and political practices in Brazil and Cuba between the 1970s and the early 1980s. In Chapter One, I examine the centrality that the critique of guerrilla sexual politics and socialist cisheteronormativity had in pioneering 1970s homosexual intellectual and cultural production in Brazil and Cuba. I build an archive for this chapter structured around the Brazilian homosexual newspaper Lampião da Esquina and the work of Cuban painter Servando Cabrera Moreno. In Chapter Two, I move on to an analysis of the actually existing alternative radical sexual politics that emerged in the late 1970s in both countries and aimed to transform Brazilian and Cuban societies. My focus in this chapter is on a memoir written by Monika Krause, the head of the Cuban Sex Education Program, on the relationship between Cuba and East Germany in the field of sex education, and on the theoretical production and politics of the Homosexual Faction of the Brazilian Trotskyist organization Socialist Convergence. In Chapter Three, I examine why the interrogation of the relation between cisheteronormativity, racialization, and the production of labor took center stage in the formulations of rising black activists and artists in Brazil and Cuba. In this chapter, I focus on the early 1980s intellectual production of the Brazilian black homosexual activist group Adé Dúdu, and on One Way or Another, the posthumous feature-length film of Cuban revolutionary filmmaker Sara Gómez. 
Queer Marxism plays a double role in my thesis. It is the main field of research in which I situate my dissertation and an object of inquiry within my own archive. Above all, I show in my thesis that the questioning of the continuities of twentieth-century socialist gender and sexual epistemologies and politics with bourgeois idealism, of patriarchal social reproduction, and of a racialized brand of cisheteronormativity structuring the division of labor weaves together my archive. Finally, I also argue that the liberationist form of cultural and political action is not enough to trace the radical sexual cultures and politics of Latin America, nor the broader struggle for sexual freedom that overcame Cold War divisions between capital and twentieth-century socialism. My argument is that a dematerialized consideration of gay liberationism fails to grasp how local dynamics of accumulation and dispossession, and revolutionary challenges to them, distinctively shaped the political imaginary of the radical sexual cultures and politics emerging in Cold War Latin America, which were often informed by existing internationalist networks and shared political commitments.

    35,413 full texts

    49,755 metadata records
    Updated in last 30 days.
    Columbia University Academic Commons is based in the United States.