Princeton University

Princeton University Open Access Repository

9682 research outputs found

Data-Driven Incentive Alignment in Capitation Schemes

Author: Braverman Mark
Chassang Sylvain
Publication date: 2022

This paper explores whether big data, taking the form of extensive high-dimensional records, can reduce the cost of adverse selection by private insurers in government-run capitation schemes, such as Medicare Advantage. We argue that using data to improve the ex ante precision of capitation regressions is unlikely to be helpful. Even if types become essentially observable, the high dimensionality of covariates makes it infeasible to precisely estimate the cost of serving a given type: big data makes types observable, but not necessarily interpretable. This gives an informed private operator scope to select types that are relatively cheap to serve. Instead, we argue that data can be used to align incentives by forming unbiased and non-manipulable ex post estimates of a private operator’s gains from selection.

Belayer: Modeling discrete and continuous spatial variation in gene expression from spatially resolved transcriptomics

Author: Ma Cong
Chitra Uthsav
Zhang Shirley
Raphael Benjamin J.
Publication date: 19/10/2022

Spatially resolved transcriptomics (SRT) technologies measure gene expression at known locations in a tissue slice, enabling the identification of spatially varying genes or cell types. Current approaches for these tasks assume either that gene expression varies continuously across a tissue or that a tissue contains a small number of regions with distinct cellular composition. We propose a model for SRT data from layered tissues that includes both continuous and discrete spatial variation in expression and an algorithm, Belayer, to learn the parameters of this model. Belayer models gene expression as a piecewise linear function of the relative depth of a tissue layer with possible discontinuities at layer boundaries. We use conformal maps to model relative depth and derive a dynamic programming algorithm to infer layer boundaries and gene expression functions. Belayer accurately identifies tissue layers and biologically meaningful spatially varying genes in SRT data from the brain and skin
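The piecewise linear model with breaks at layer boundaries can be illustrated with a small dynamic program. This is a minimal sketch, not the Belayer implementation: it assumes a single gene, a one-dimensional relative depth already computed for every spot, and a squared-error loss, all of which are simplifications introduced here. The function names `segment_cost` and `fit_layers` are hypothetical.

```python
import numpy as np

def segment_cost(x, y):
    """Squared error of one least-squares line fitted to the points (x, y)."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

def fit_layers(depth, expr, n_layers, min_size=2):
    """Split spots, ordered by depth, into n_layers contiguous layers.

    Each layer gets its own line, so the fit may jump at a boundary.
    Returns the depths at which layers 2..n_layers start, and the total error.
    """
    order = np.argsort(depth)
    x, y = depth[order], expr[order]
    n = len(x)

    # cost[i, j]: error of a single line through points i .. j-1
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + min_size, n + 1):
            cost[i, j] = segment_cost(x[i:j], y[i:j])

    # best[k, j]: minimum error covering the first j points with k layers
    best = np.full((n_layers + 1, n + 1), np.inf)
    back = np.zeros((n_layers + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_layers + 1):
        for j in range(1, n + 1):
            cand = best[k - 1, :j] + cost[:j, j]
            i = int(np.argmin(cand))
            best[k, j] = cand[i]
            back[k, j] = i

    # Walk the back-pointers to recover where each layer starts.
    starts, j = [], n
    for k in range(n_layers, 0, -1):
        j = int(back[k, j])
        starts.append(j)
    starts = sorted(starts)[1:]  # drop the start of the first layer (index 0)
    return [float(x[s]) for s in starts], float(best[n_layers, n])
```

On noise-free data made of two lines with a jump between them, `fit_layers(depth, expr, n_layers=2)` returns the depth of the first spot in the second layer and a total error close to zero.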

Machine Learning Assisted Security Analysis of 5G-Network-Connected Systems

Author: Saha Tanujay
Aaraj Najwa
Jha Niraj K
Publication date: 02/02/2022

The core network architecture of telecommunication systems has undergone a paradigm shift in fifth-generation (5G) networks. 5G networks have transitioned to software-defined infrastructures, thereby reducing their dependence on hardware-based network functions. New technologies, like network function virtualization and software-defined networking, have been incorporated in the 5G core network (5GCN) architecture to enable this transition. This has resulted in significant improvements in efficiency, performance, and robustness of the networks. However, this has also made the core network more vulnerable, as software systems are generally easier to compromise than hardware systems. In this article, we present a comprehensive security analysis framework for the 5GCN. The novelty of this approach lies in the creation and analysis of attack graphs of the software-defined and virtualized 5GCN through machine learning. This analysis points to 119 novel possible exploits in the 5GCN. We demonstrate that these possible exploits of 5GCN vulnerabilities generate five novel attacks on the 5G Authentication and Key Agreement protocol. We combine the attacks at the network, protocol, and application layers to generate complex attack vectors. In a case study, we use these attack vectors to find four novel security loopholes in WhatsApp running on a 5G network.

INFORM: Inverse Design Methodology for Constrained Multi-objective Optimization

Author: Terway Prerit
Jha Niraj K
Publication date: 01/01/2022

Many system design methods use population-based optimization or a surrogate model for solving constrained multi-objective optimization. When designing a system with multiple objectives and constraints, the designer may first be interested in understanding the trade-offs among different objectives from a small number of simulations. In the next step, the designer may focus on specific regions of interest in the design space near a set of non-dominated solutions to further improve performance on the targeted objectives. This may help make the search process sample-efficient. We propose INFORM: a two-step approach for sample-efficient constrained multi-objective optimization of real-world nonlinear systems. In the first step, we modify a genetic algorithm (GA) to make the design process sample-efficient. We inject candidate solutions into the GA population using inverse design methods instead of determining the candidate solutions for the next generation using only crossover and mutation, as is done in a standard GA. We present three types of inverse design techniques based on a (i) neural network verifier, (ii) neural network, and (iii) Gaussian mixture model. The candidate solutions for the next generation are thus a mix of those generated using crossover/mutation and solutions generated using inverse design. At the end of the first step, we obtain a set of non-dominated solutions. In the second step, we choose the regions of interest around the non-dominated solutions to further improve the objective function values using inverse design methods. We demonstrate the efficacy of INFORM through synthesis of nonlinear systems and analog circuits. The experimental results show that INFORM reduces synthesis time by up to 29× and improves the value of the objective function by up to 33% compared to a state-of-the-art baseline design methodology.
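The set of non-dominated solutions that closes the first step can be made concrete with a short filter. This is only an illustrative sketch: it assumes every objective is minimized and that each design is summarized by a tuple of objective values, and the names `dominates` and `non_dominated` are hypothetical rather than taken from INFORM.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one.

    Assumes all objectives are minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def non_dominated(points):
    """Keep the points that no other point dominates, in their original order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the objective pairs `(1, 5)`, `(2, 2)`, `(3, 3)` and `(5, 1)`, only `(3, 3)` is dropped, because `(2, 2)` is better on both objectives.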

Water Adsorption on Mica Surfaces with Hydrophilicity Tuned by Counterion Types (Na, K, and Cs) and Structural Fluorination

Author: Koishi Ayumi
Lee Sang Soo
Fenter Paul
Fernandez-Martinez Alejandro
Bourg Ian C
Publication date: 20/09/2022

The stability of adsorbed water films on mineral surfaces has far-reaching implications in the Earth, environmental, and materials sciences. Here, we use the basal plane of phlogopite mica, an atomically smooth surface of a natural mineral, to investigate water film structure and stability as a function of two features that modulate surface hydrophilicity: the type of adsorbed counterion (Na, K, Cs) and the substitution of structural OH groups by F atoms. We use molecular dynamics (MD) simulations combined with in situ high-resolution X-ray reflectivity to examine surface hydration over a range of water loadings, from the adsorption of isolated water molecules to the formation of clusters and films. We identify four regimes characterized by distinct adsorption energetics and different sensitivities to cation type and mineral fluorination: from 0 to ½ monolayer film thickness, the hydration of adsorbed ions; from ½ to 1 monolayer, the hydration of uncharged regions of the siloxane surface; from 1 to 1½ monolayer, the attachment of isolated water molecules on the surface of the first monolayer; and for > 1½ monolayer, the formation of an incipient electrical double layer at the mineral-water interface

Powerful Organic Molecular Oxidants and Reductants Enable Ambipolar Injection in a Large-Gap Organic Homojunction Diode

Author: Smith Hannah L
Dull Jordan T
Mohapatra Swagat K
Al Kurdi Khaled
Barlow Stephen
Marder Seth R
Rand Barry P
Kahn Antoine
Publication date: 03/01/2022

Doping has proven to be a critical tool for enhancing the performance of organic semiconductors in devices like organic light-emitting diodes. However, the challenge in working with high-ionization-energy (IE) organic semiconductors is to find p-dopants with correspondingly high electron affinity (EA) that will improve the conductivity and charge carrier transport in a film. Here, we use an oxidant that has been recently recognized to be a very strong p-type dopant, hexacyano-1,2,3-trimethylene-cyclopropane (CN6-CP). The EA of CN6-CP has been previously estimated via cyclic voltammetry to be 5.87 eV, almost 300 meV higher than other known high-EA organic molecular oxidants. We measure the frontier orbitals of CN6-CP using ultraviolet and inverse photoemission spectroscopy techniques and confirm a high EA value of 5.88 eV in the condensed phase. The introduction of CN6-CP in a film of large-band-gap, large-IE phenyldi(pyren-1-yl)phosphine oxide (POPy2) leads to a significant shift of the Fermi level toward the highest occupied molecular orbital and a two-orders-of-magnitude increase in conductivity. Using CN6-CP and the n-dopant (pentamethylcyclopentadienyl)(1,3,5-trimethylbenzene)ruthenium (RuCp*Mes)2, we fabricate a POPy2-based rectifying p–i–n homojunction diode with a 2.9 V built-in potential. Blue light emission is achieved under forward bias. This effect demonstrates the dopant-enabled hole injection from the CN6-CP-doped layer and electron injection from the (RuCp*Mes)2-doped layer in the diode.

Ensembles of realistic power distribution networks

Author: Meyur Rounak
Vullikanti Anil
Swarup Samarth
Mortveit Henning S
Centeno Virgilio
Phadke Arun
Poor H Vincent
Marathe Madhav V
Publication date: 10/10/2022

The power grid is going through significant changes with the introduction of renewable energy sources and the incorporation of smart grid technologies. These rapid advancements necessitate new models and analyses to keep up with the various emergent phenomena they induce. A major prerequisite of such work is the acquisition of well-constructed and accurate network datasets for the power grid infrastructure. In this paper, we propose a robust, scalable framework to synthesize power distribution networks that resemble their physical counterparts for a given region. We use openly available information about interdependent road and building infrastructures to construct the networks. In contrast to prior work based on network statistics, we incorporate engineering and economic constraints to create the networks. Additionally, we provide a framework to create ensembles of power distribution networks to generate multiple possible instances of the network for a given region. The comprehensive dataset consists of nodes with attributes, such as geocoordinates; type of node (residence, transformer, or substation); and edges with attributes, such as geometry, type of line (feeder lines, primary or secondary), and line parameters. For validation, we provide detailed comparisons of the generated networks with actual distribution networks. The generated datasets represent realistic test systems (as compared with standard test cases published by the Institute of Electrical and Electronics Engineers (IEEE)) that can be used by network scientists to analyze complex events in power grids and to perform detailed sensitivity and statistical analyses over ensembles of networks.

Induction of broadly neutralizing antibodies using a secreted form of the hepatitis C virus E1E2 heterodimer as a vaccine candidate

Author: Wang Ruixue
Suzuki Saori
Guest Johnathan D
Heller Brigitte L
Almeda Maricar
Andrianov Alexander K
Marin Alexander
Mariuzza Roy A
Keck Zhen-Yong
Foung Steven KH
Yunus Abdul S
Pierce Brian G
Toth Eric A
Ploss Alexander
Fuerst Thomas R
Publication date: 09/03/2022

Hepatitis C virus (HCV) is a global disease burden, and a preventive vaccine is needed to control or eradicate the virus. Despite the advent of effective antiviral therapy, this treatment is not accessible to many patients and does not prevent reinfection, making chronic hepatitis C an ongoing global health problem. Thus, development of a prophylactic vaccine will represent a significant step toward global eradication of HCV. HCV exhibits high genetic variability, which frequently leads to immune escape. A considerable challenge in HCV vaccine development is therefore designing an antigen that elicits broadly neutralizing antibodies. Here, we characterized the immunogenicity of a vaccine based on a soluble, secreted form of the E1E2 envelope heterodimer (sE1E2.LZ). Sera from mice immunized with sE1E2.LZ exhibited an anti-E1E2–specific response comparable to mice immunized with membrane-bound E1E2 (mbE1E2) or a soluble E2 ectodomain (sE2). In competition-inhibition ELISA using antigenic domain-specific neutralizing and nonneutralizing antibodies, sera from sE1E2.LZ-immunized mice showed nearly identical or stronger competition toward neutralizing antibodies when compared with mbE1E2. In contrast, sera from mice immunized with sE2, and to a lesser extent mbE1E2, competed more effectively with nonneutralizing antibodies. An assessment of neutralization activity using both HCV pseudoparticles and cell culture–derived infectious HCV showed that immunization with sE1E2.LZ elicited the broadest neutralization activity of the three antigens, and sE1E2.LZ induced neutralization activity against all genotypes. These results indicate that our native-like soluble glycoprotein design, sE1E2.LZ, induces broadly neutralizing antibodies and serves as a promising vaccine candidate for further development.

Improved Information-Theoretic Generalization Bounds for Distributed, Federated, and Iterative Learning

Author: Barnes Leighton Pate
Dytso Alex
Poor Harold Vincent
Publication date: 24/08/2022

We consider information-theoretic bounds on the expected generalization error for statistical learning problems in a network setting. In this setting, there are K nodes, each with its own independent dataset, and the models from the K nodes have to be aggregated into a final centralized model. We consider both simple averaging of the models and more complicated multi-round algorithms. We give upper bounds on the expected generalization error for a variety of problems, such as those with Bregman divergence or Lipschitz continuous losses, that demonstrate an improved dependence of 1/K on the number of nodes. These “per node” bounds are in terms of the mutual information between the training dataset and the trained weights at each node and are therefore useful in describing the generalization properties inherent to having communication or privacy constraints at each node.

Fundamental limitations on efficiently forecasting certain epidemic measures in network models

Author: Rosenkrantz Daniel J
Vullikanti Anil
Ravi SS
Stearns Richard E
Levin Simon
Poor H Vincent
Marathe Madhav V
Publication date: 19/01/2022

The ongoing COVID-19 pandemic underscores the importance of developing reliable forecasts that would allow decision makers to devise appropriate response strategies. Despite much recent research on the topic, epidemic forecasting remains poorly understood. Researchers have attributed the difficulty of forecasting contagion dynamics to a multitude of factors, including complex behavioral responses, uncertainty in data, the stochastic nature of the underlying process, and the high sensitivity of the disease parameters to changes in the environment. We offer a rigorous explanation of the difficulty of short-term forecasting on networked populations using ideas from computational complexity. Specifically, we show that several forecasting problems (e.g., the probability that at least a given number of people will get infected at a given time and the probability that the number of infections will reach a peak at a given time) are computationally intractable. For instance, efficient solvability of such problems would imply that the number of satisfying assignments of an arbitrary Boolean formula in conjunctive normal form can be computed efficiently, violating a widely believed hypothesis in computational complexity. This intractability result holds even under the ideal situation, where all the disease parameters are known and are assumed to be insensitive to changes in the environment. From a computational complexity viewpoint, our results, which show that contagion dynamics become unpredictable for both macroscopic and individual properties, bring out some fundamental difficulties of predicting disease parameters. On the positive side, we develop efficient algorithms or approximation algorithms for restricted versions of forecasting problems
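The counting problem mentioned in the abstract, the number of satisfying assignments of a Boolean formula in conjunctive normal form, can be stated in a few lines of code. The brute-force counter below is only an illustration of what is being counted, not anything from the paper: it tries all 2^n assignments, so its running time grows exponentially with the number of variables. The clause encoding (signed integers, with `-2` meaning "not x2") and the name `count_satisfying` are assumptions made for this sketch.

```python
from itertools import product

def count_satisfying(clauses, n_vars):
    """Count the assignments of n_vars Boolean variables that satisfy a CNF formula.

    clauses is a list of clauses; each clause is a list of non-zero integers,
    where k stands for variable k and -k for its negation (variables are 1-based).
    """
    count = 0
    for bits in product([False, True], repeat=n_vars):
        # A CNF formula holds when every clause has at least one true literal.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count
```

For the formula (x1 or x2) and (not x1 or x3), encoded as `[[1, 2], [-1, 3]]` over three variables, the function returns 4.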

Full texts: 0

Metadata records: 9,682

Updated in last 30 days.

Princeton University Open Access Repository is based in the United States
