Knowledge UChicago

University of Chicago

    15064 research outputs found

    Steering Model Robustness via Minimal Training Data Modification

    No full text
    Data is at the core of training machine learning models because it provides the foundation from which a model learns patterns, features, and structures. Recent advances in machine learning have resulted in models of increasing size and complexity. For example, recent generative models are trained on billions of samples and feature billions of parameters. Given the massive amounts of training data, many assume that these large models are inherently robust to training-time attacks, because compromising the model's robustness would require modifying a significant portion of the training data. This dissertation challenges this prevailing assumption, and asks: "Is it possible to steer a model's security behavior by injecting minimal yet strategically optimized data into its training data?" This dissertation addresses this question by empirically validating and analytically verifying its feasibility through defenses on deep neural networks (DNNs) and attacks on large text-to-image generative models. For DNNs, existing work proposes to include intentionally modified samples in the training data to defend against inference-time attacks. However, the fundamental tension between robustness and accuracy remains unresolved. Moreover, practitioners do not have any practical tool to flexibly control the model's security property once a deployed model is breached. To address these gaps, my work develops a theoretical understanding of the robustness-accuracy tradeoff in training-time defenses by characterizing the optimal loss in training robust models. To further improve model robustness post deployment, I propose a fast and robust model versioning mechanism that injects minimal task-irrelevant data into the training data, which allows model owners to recover from model breaches. My work also explores the feasibility of manipulating the performance of generative models through poisoning attacks against large text-to-image models.
My work shows that large text-to-image models, although trained on billions of samples, are surprisingly vulnerable to low-volume optimized attacks against specific prompts during training. My research analyzes the cause of this vulnerability, which lies in the tension between the model's architecture design and data complexity. This line of work has turned into a practical protection tool for human creatives against training on unauthorized data. This dissertation provides an understanding of model robustness under optimized training data modification from both empirical studies and theoretical analysis. The feasibility of steering model robustness with minimal data enables continued control for model owners and proactive protection for data contributors.

    Polymorphism in Self-Assembly of Short Peptoid Sequences

    No full text
    Due to various applications enabled by diverse morphologies of self-assembled sequence-defined polymers, controlling the self-assembly of synthetic peptidomimetics into designed morphologies has emerged as a promising route for the development of bioinspired functional materials. Herein, we report morphological control over the assembly of a series of short peptoids, or poly-N-substituted glycines, that contain asymmetric hydrophobic domains. We demonstrate that the inherent flexibility of amphiphilic peptoid bilayers drives assembly polymorphism, resulting in the coexistence of three distinct morphologies: nanosheets, twisted ribbons, and nanofibers. By tuning peptoid molecular interactions through variations in sequence design, solution pH, and temperature, we demonstrate precise control over the twisting and folding of peptoid bilayers, enabling the formation of well-defined nanosheets and nanohelices. Molecular dynamics simulations further unravel how the introduction of asymmetric hydrophobic domains enables the flexibility of peptoid bilayers and results in peptoid assembly polymorphism. By tuning peptoid molecular interactions through heating, we further demonstrate the transformation of nanosheets into nanohelices. We envision that our mechanistic investigation of peptoid assembly polymorphism provides a strong foundation for leveraging peptoid sequences and chemistries to achieve controlled molecular interactions, driving the creation of biomimetic materials with tailored morphologies and functionalities.

    Sound communities

    No full text
    Bilingualism researchers have intensively studied how learning and using multiple languages affect all levels of linguistic structure. In this strand, examining diversity in the bilingual experience and the extent to which variables like language dominance regulate crosslinguistic interaction has been of special interest. However, most studies sample small groups of bilinguals from a single research site, creating a twofold generalizability problem. First, with small samples it is unlikely that researchers will be able to fully capture and quantify the range of variables known to affect findings. Second, when bilinguals are recruited from a single site, it is impossible to determine if findings are site-specific or apply to bilinguals more broadly. To address these issues, we propose a large(r)-scale, multisite approach to bilingualism research. We believe that such an approach, when informed by open science practices, has the potential to significantly advance the state of the art.

    Detecting Fake People in Historical Records

    No full text
    Data quality is a key input in efforts to link individuals across census records. We examine the extreme case of low data quality by identifying US census enumerators who fabricated entire families. We provide clear evidence of fake people included in the 1920 US Census for Homestead, Pennsylvania. We use the features of this case study to identify other places where information in the census may have been falsified. We develop an automated approach that identifies census sheets that have much lower match rates to other census records than would be expected, given the characteristics of the people recorded on each sheet. We perform a hand-check on the suspicious sheets using standard genealogy tools and identify at least 90 sheets where the entire census sheet appears to have been fabricated
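The flagging step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes each person already has a predicted probability of matching another census record (from some hypothetical model of their characteristics), and it flags sheets whose observed match count falls far below the expected count. The function names, the z-score rule, and the threshold of -3 are all assumptions made for illustration.

```python
from math import sqrt

def sheet_z_score(match_probs, matched):
    """Standardized gap between observed and expected matches on one sheet.

    match_probs: assumed per-person probabilities of linking to another record
    matched: booleans, True where a link was actually found
    """
    expected = sum(match_probs)
    variance = sum(p * (1.0 - p) for p in match_probs)
    observed = sum(1 for m in matched if m)
    if variance == 0:
        return 0.0  # no uncertainty to standardize against
    return (observed - expected) / sqrt(variance)

def flag_suspicious_sheets(sheets, threshold=-3.0):
    """Return IDs of sheets whose z-score falls below the (assumed) threshold."""
    flagged = []
    for sheet_id, (probs, matched) in sheets.items():
        if sheet_z_score(probs, matched) < threshold:
            flagged.append(sheet_id)
    return flagged

# Hypothetical toy data: 50 people per sheet, each with a 0.6 match probability.
sheets = {
    "sheet_A": ([0.6] * 50, [True] * 31 + [False] * 19),  # close to expected
    "sheet_B": ([0.6] * 50, [True] * 2 + [False] * 48),   # far below expected
}
print(flag_suspicious_sheets(sheets))  # ['sheet_B']
```

In this sketch a flagged sheet is only a candidate for the manual genealogy check the abstract describes, not a verdict that it was fabricated.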

    Ion-Specific Effects on Phase Separation of Polyelectrolytes and Polyzwitterions

    No full text
    Polyelectrolytes and polyzwitterions have shown great promise for a wide range of applications, including ion-separation technologies for the recovery of critical minerals, like the Rare-Earth Elements (REEs). In this field, one potential strategy is the use of polymers that form reversible crosslinks in the presence of specific multivalent ions. Despite significant advances in the characterization of charged polymers in different hydrated environments, the mechanism by which local ionic structure and polymer architecture affect selective ion capture has yet to be established. In this dissertation, we discuss the interactions between charged polymers (two polyanions and a polyzwitterion) and REE cations. Specifically, this work examines the phase separation behavior of almost identical, fully ionized, carboxylate-bearing polymers upon the addition of trivalent REE ionic species. The effects of ion identity and monomer chemical structure on polymer phase behavior are systematically studied using optical microscopy, inductively coupled plasma mass spectrometry (ICP-MS), and small angle X-ray scattering (SAXS). We show that carboxylate-bearing polymers phase separate upon the addition of REE³⁺ ions, systematically taking up stoichiometric amounts of ion in the process. Phase separation proceeds in a similar manner regardless of the trivalent cation used and is unaffected by other non-trivalent ionic species. In the presence of multiple REE³⁺ ions, the studied polymers show preferential uptake of specific elements into the precipitate, primarily samarium and adjacent elements. In slightly basic environments, the polyzwitterion exhibits higher affinity for heavier REEs compared to the polyanions. These selectivity trends are linked to energetic costs of ion dehydration and microstructural characteristics of the formed precipitates. Phase separation can be reversed by the addition of acid, catalyzing the release of ions back into solution.
These results provide a framework for the design and development of more efficient and sustainable selective ion-separation systems for REE recovery.

    The Impact of IP Version on Household Internet Speed: A Comparative Study

    No full text
    Despite extensive research measuring broadband quality, limited work has considered the interaction between residential Internet throughput and IP version. Public speed tests used to determine Internet quality do not, by and large, control for IP version; results, analysis, and recommendations are made using a hybrid of IPv4 and IPv6 data. Given the range of protocol designs, software stacks, and network infrastructure differences between the protocols, combined with the recent significant increase in IPv6 adoption, it is critical to understand what role the IP version plays in measuring Internet speeds. In this work, we systematically compare IPv4 and IPv6 speeds in residential access networks, examining differences in throughput experienced by households depending on IP version. Our findings demonstrate that IPv4 and IPv6 throughput differ in many instances and motivate a large-scale re-evaluation of our assumptions about IP version in future speed test analysis. Specifically, we find that IPv4 and IPv6 speeds differ in a significant number of cases, with up to 18.3% of our measurements differing by over 5%. Our findings indicate that substantial speed differences between IP versions can be driven by provider-specific factors such as speed tiers. Furthermore, we observe differences in IPv4 and IPv6 data depending on the speed-test software and testing infrastructure. Thus, this work guides future research about Internet speeds on how to consider and control for IP version.
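As a rough illustration of the "differing by over 5%" statistic, the sketch below computes the share of paired speed tests whose IPv4 and IPv6 throughput differ by more than a threshold. The abstract does not say how the difference is normalized, so dividing by the larger of the two values is an assumption, as are the function names and the sample data.

```python
def relative_difference(v4_mbps, v6_mbps):
    """Absolute throughput gap relative to the larger value (assumed definition)."""
    larger = max(v4_mbps, v6_mbps)
    if larger == 0:
        return 0.0
    return abs(v4_mbps - v6_mbps) / larger

def share_differing(pairs, threshold=0.05):
    """Fraction of (IPv4, IPv6) pairs whose relative difference exceeds threshold."""
    if not pairs:
        return 0.0
    count = sum(1 for v4, v6 in pairs if relative_difference(v4, v6) > threshold)
    return count / len(pairs)

# Hypothetical paired measurements in Mbps: (IPv4, IPv6).
pairs = [(100.0, 99.0), (200.0, 150.0), (50.0, 50.0), (940.0, 800.0)]
print(share_differing(pairs))  # 0.5
```

Pairing tests taken from the same household close together in time, as assumed here, keeps the comparison from being dominated by differences in time of day or network load.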

    Progeny, Fertility, and Divine Blessing: An Analysis of Pentateuchal Texts in Their Iron Age Context

    No full text
    My dissertation analyzes portrayals of infertility and anxiety related to reproduction in the Priestly (P), Elohistic (E), and Yahwistic (J) sources of the Pentateuch alongside material evidence from the Iron Age Levant. By situating the composition of the pentateuchal texts in the Neo-Assyrian period, I engage with questions of how biblical authors responded to, described, and solved problems of survival during the period of Assyrian hegemony, an era characterized by death and destruction. In the first section of the project, utilizing the Neo-documentary hypothesis approach, I provide an updated source division and translation of pentateuchal texts employing a philologically rigorous, historical-critical method alongside insights from gender studies and disability studies. I analyze J, E, and P as independent, coherent narratives, arguing that each source differs in its portrayals of divine intervention and male and female autonomy to overcome infertility and challenges to survival of offspring. While synchronic approaches tend to emphasize a singular divine promise of land and progeny, I demonstrate that these sources contain distinct ideological claims about how the Israelite deity intervenes to ensure population growth. In the latter section of the project, I turn to evidence from art and archaeology of the Levant with a focus on Judean Pillar Figurines (JPFs), which emerge as a phenomenon during the Neo-Assyrian period. I argue that written and visual evidence of how Ištar’s cult was adopted, adapted, and integrated alongside local religious symbols offers an important lens for examining JPFs as religious objects. I analyze nine complete figurines and the archaeological contexts in which they were excavated, which include domestic, funerary, and administrative loci.
By discussing the figurines as evidence of cultural discourse related to reproduction, I conclude that they offer a point of contrast to biblical ideas, indicating a plurality of beliefs and practices among scribal elites and members of non-elite households in Judah during this period. As biblical writers may have written their accounts to gain authoritative status or to stand apart from alternate views, my project analyzes archaeological evidence as an “additional voice,” representing a perspective that existed alongside the variety of ideas contained in the biblical texts.

    Open data for PIHD manuscript

    No full text
    This project contains de-identified raw data underlying analyses reported in the paper "Identifying psychiatrist characteristics associated with likelihood of recommending involuntary hospitalization for patients using a novel tool to assess decision-making". It includes one worksheet with the raw data and one worksheet with the codes for the variables.

    Editor’s Foreword

    No full text
    The subject of this special issue of Mamlūk Studies Review is “The Languages of the Mamluk Empire.” It brings together several papers presented at the Ninth Conference of the School of Mamluk Studies, which was held at Brown University in Providence, Rhode Island (USA), in June 2023

    13,029 full texts

    15,064 metadata records
    Updated in last 30 days.
    Knowledge UChicago is based in the United States