The Rockefeller University

5430 research outputs found

A Neuronal Circuit Motif for Leaky Vector Integration

Author: Janke Abby
Publication venue: Digital Commons @ RU
Publication date: 2025

To efficiently navigate their environments, animals ranging from insects to humans can keep track of their trajectory by continuously summing their direction and distance traveled over time. It is not clear how brains perform this fundamental computation, known as path integration, even though neural correlates and computer models abound. In arthropods, spatial-cognition-related processes such as path integration are thought to be implemented in the central complex. Experimenters have shown that contiguously arrayed populations of neurons in the Drosophila fan-shaped body, a substructure of the central complex, display sinusoidally shaped bumps of calcium activity that signal vectors relevant for spatial navigation. The phase of the bump signal represents the angle of a vector, and the amplitude represents the length. To date, the sinusoidal bumps discovered in the fan-shaped body signal real-time variables, such as a fly's traveling or goal directions, rather than a vector that is integrated over time. Here we characterize a class of neurons in the fan-shaped body, called hΔG cells, whose activity drops dramatically when a fly encounters a food source (i.e., a sugar drop). After this reset, the mean activity of hΔG neurons rises over many minutes and the amplitude of a sinusoidally shaped bump signal grows and shrinks on the seconds timescale. We show that the amplitude of the bump tracks the distance the fly has walked in the past few seconds. We further argue, based on calcium imaging and optogenetic perturbations, that the bump signal in hΔG neurons is built via integration of synaptic input from vΔE cells. Based on these data, we developed a formal model for the vΔE-hΔG integration process that combines (1) a minutes-long rise in baseline activity, (2) a vΔE-driven slow-down of this rise and (3) a continuous leak of the hΔG signal.
This leak helps preserve the sinusoidal shape of the bump signal but also causes the hΔG signal to not reflect a perfect (i.e., leak-free) path integral. Our experimental results focus on only vΔE and hΔG neurons, but the fly's connectome reveals multiple vΔ-hΔ circuits, spanning different layers of the fan-shaped body. By varying leak rates and ramp rates across different vΔ-hΔ circuits, the fly brain could build a repertoire of path-integrated signals with spatiotemporal dynamics suitable for a range of navigational tasks.
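
The general idea of leaky vector integration described in this abstract can be illustrated with a minimal sketch. It is not the thesis's formal model: it omits the baseline ramp and the vΔE-driven slow-down, and simply Euler-integrates dV/dt = -V/tau + v(t) for a 2D vector. The function names, the time constant `tau`, and the eight-column readout are assumptions made for illustration only.

```python
import math


def leaky_vector_integrate(velocities, dt, tau):
    """Euler-integrate dV/dt = -V/tau + v(t) for a 2D vector V.

    velocities: iterable of (vx, vy) samples, one per time step.
    dt: step size; tau: leak time constant (hypothetical parameter).
    Returns the list of integrated (x, y) vectors, one per step.
    """
    x, y = 0.0, 0.0
    trace = []
    for vx, vy in velocities:
        # The leak term pulls the stored vector toward zero,
        # the input term adds the current velocity.
        x += dt * (-x / tau + vx)
        y += dt * (-y / tau + vy)
        trace.append((x, y))
    return trace


def bump(vec, n_columns=8):
    """Encode a 2D vector as a sinusoid across columns.

    The amplitude of the sinusoid is the vector length and
    its phase is the vector angle.
    """
    amplitude = math.hypot(vec[0], vec[1])
    phase = math.atan2(vec[1], vec[0])
    return [
        amplitude * math.cos(2 * math.pi * i / n_columns - phase)
        for i in range(n_columns)
    ]
```

With a constant input velocity the stored vector saturates at a length of `tau` times the speed, so the bump amplitude reflects only the recent past rather than a perfect path integral. This is the trade-off the abstract attributes to the leak.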

A Scent of Direction: Exploring Adaptive Olfactory Navigation in Drosophila

Author: Morton Chad
Publication venue: Digital Commons @ RU
Publication date: 2025

Animals rely on environmental cues, such as odors, to navigate through complex and often dynamic landscapes. While olfactory navigation has traditionally been modeled as a reflexive behavior driven by immediate sensory cues, whether flies can leverage more sophisticated strategies, integrating sensory input with spatial memory and internal representations of their surroundings, remains unclear. A key challenge in studying olfactory navigation lies in the invisible and unpredictable nature of odor plumes, which has made it difficult to accurately assess the ongoing sensory experiences of an animal. To address this challenge, we developed a novel virtual reality (VR) system that allows head-fixed Drosophila to navigate spatially structured chemical landscapes, enabling precise control and manipulation of their sensory experience. Using this platform, we demonstrate that Drosophila employs a robust navigational strategy known as edge-tracking, in which flies ascend along the boundaries of an odor plume through a repeated pattern of rapid counter-turning to exit the plume and biased exploration outside of the plume in order to return. Our findings suggest that edge-tracking is a flexible, memory-guided behavior, wherein flies continuously update their spatial representation of the plume's boundary based on sensory feedback. We further demonstrate that edge-tracking relies on dopaminergic reinforcement pathways within the mushroom body, a brain center that mediates associative olfactory learning. Anatomically, the mushroom body is well-positioned to modulate edge-tracking behavior due to its many connections with other key brain regions involved in spatial navigation and motor control, including the central complex. Our results show that optogenetic manipulations of dopaminergic input to the mushroom body can promote or disrupt edge-tracking, underscoring the importance of these reinforcement signals during ongoing navigation.
Through a combination of behavioral experiments, modeling, and neural circuit manipulations, this work provides new insights into the mechanisms that support olfactory navigation in complex environments. By studying their behavior in a controlled virtual environment, we demonstrate that flies engage in sophisticated olfactory navigational strategies that extend beyond simple reflexive responses to sensory cues. Instead, they rely on the integration of sensory input with spatial memory to navigate through their environment.

Germline Genetic Modulator of Cancer Metastasis

Author: Mei Wenbin
Publication venue: Digital Commons @ RU
Publication date: 2025

Metastasis formation is the most critical determinant of cancer survival outcome. Identifying patients at risk for metastatic relapse facilitates clinical decision-making and the use of appropriate adjuvant therapy, thus having a profound impact on patient survival. A central dogma in cancer is that during tumorigenesis, cancer cells acquire somatic mutations that regulate metastatic likelihood during evolution of the primary tumor. However, despite extensive tumor sequencing efforts, such causal somatic metastasis driver mutations have not been found, limiting the discovery of biomarkers, mechanisms and drug targets of cancer metastasis. In this thesis, I report a germline genetic driver of breast cancer metastasis, suggesting that germline genetic differences between individuals underlie distinct metastatic outcomes. I identified a common missense germline variant in PCSK9 (rs562556, V474I) that associates with reduced survival in breast cancer cohorts from multiple countries. This highly prevalent variant is homozygous in ~70% of people of European ancestry. Genetic modeling of this gain-of-function single nucleotide variant in mice revealed that it causally promotes breast cancer metastasis. Conversely, host PCSK9 deletion reduced metastatic colonization in multiple breast cancer models. Host PCSK9 promoted metastatic initiation events in lung and enhanced metastatic proliferative competence by targeting tumoral LRP1 receptors, which signaled to the nucleus and repressed metastasis-promoting genes XAF1 and USP18. Mechanistically, the V474I mutation may enhance PCSK9's suppression of LRP1 by increasing its binding affinity. Antibody-mediated therapeutic inhibition of PCSK9 suppressed breast cancer metastasis in a variety of murine, human, transplantable, and genetically initiated models.
These findings reveal that a commonly inherited genetic alteration governs breast cancer metastasis and predicts survival, uncovering a hereditary basis underlying breast cancer metastasis. My work also highlights the therapeutic potential of PCSK9-inhibitory therapy, which has been approved in more than 75 countries, for the prevention and treatment of metastatic breast cancer.

Decoding Dynamic Gene Regulation in Hair Cell Development and Regeneration

Author: Reagor Caleb C
Publication venue: Digital Commons @ RU
Publication date: 2025

Identifying the causal interactions between genes and their proteins during the differentiation of specialized cells such as mechanosensory hair cells in vertebrates' inner ears and fishes' lateral lines requires an accurate description of the time-lagged relationships between transcription factors and their target genes. Here I describe Depicting Lagged Causality (DELAY), a convolutional neural network for the inference of gene-regulatory relationships across pseudotime-ordered single-cell trajectories. I first show that combining supervised deep learning with joint probability matrices of pseudotime-lagged trajectories allows the neural network to overcome important limitations of ordinary Granger causality-based methods, for example, the inability to infer cyclic relationships such as feedback loops. The algorithm outperforms several common methods for inferring gene regulation and, when given partial ground-truth labels, predicts novel gene-regulatory networks from single-cell RNA sequencing and single-cell ATAC sequencing data sets. To validate this approach, I use DELAY to identify important genes and modules in the regulatory network for auditory hair cell development in the murine inner ear, as well as likely DNA-binding partners for two hair cell cofactors (Hist1h1c and Ccnd1) and a novel DNA-binding sequence for the transcription factor Fiz1. In zebrafish, lateral-line neuromasts can regenerate damaged hair cells by expressing genes such as atoh1a (the master regulator of hair cell fate) in progenitors known as supporting cells. To identify adaptations that promote the rapid regeneration of hair cells in larval zebrafish, I also use DELAY to infer regenerating neuromasts' early gene-regulatory network. The central hub in the network, Y-box binding protein 1 (ybx1), is highly expressed in hair cell progenitors and young hair cells and its protein can recognize binding sites in the candidate regeneration-responsive promoter element for atoh1a.
I show that neuromasts from ybx1 mutant zebrafish larvae display consistent, regeneration-specific deficits in hair cell number and initiate both hair cell regeneration and atoh1a expression 20% slower than in siblings. By demonstrating that ybx1 promotes rapid hair cell regeneration in neuromasts through early atoh1a upregulation, these results strongly support DELAY's ability to identify key regulators of gene expression dynamics. I provide a user-friendly implementation of DELAY under an open-source license at https://github.com/calebclayreagor/DELAY
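
The "joint probability matrices of pseudotime-lagged trajectories" mentioned in this abstract can be pictured with a toy sketch. This is not DELAY's implementation (see the repository linked above for that). The function name, the equal-width binning scheme, and the parameters `lag` and `n_bins` are assumptions chosen to illustrate the general idea: a 2D histogram of a regulator's expression against a target's expression a fixed number of pseudotime steps later.

```python
def lagged_joint_probability(tf_expr, target_expr, lag, n_bins=4):
    """Toy joint probability matrix for one gene pair at one pseudotime lag.

    tf_expr, target_expr: equal-length expression series, already ordered
    by pseudotime. The regulator at step t is paired with the target at
    step t + lag. Returns an n_bins x n_bins matrix whose entries sum to 1.
    """
    if len(tf_expr) != len(target_expr):
        raise ValueError("series must have equal length")
    if lag < 0 or lag >= len(tf_expr):
        raise ValueError("lag must satisfy 0 <= lag < len(series)")

    def bin_index(value, lo, hi):
        # Equal-width bins over the observed range; the maximum value
        # is clamped into the last bin.
        if hi == lo:
            return 0
        return min(int((value - lo) / (hi - lo) * n_bins), n_bins - 1)

    lo_tf, hi_tf = min(tf_expr), max(tf_expr)
    lo_tg, hi_tg = min(target_expr), max(target_expr)

    pairs = list(zip(tf_expr[: len(tf_expr) - lag], target_expr[lag:]))
    weight = 1.0 / len(pairs)
    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    for tf_value, target_value in pairs:
        row = bin_index(tf_value, lo_tf, hi_tf)
        col = bin_index(target_value, lo_tg, hi_tg)
        matrix[row][col] += weight
    return matrix
```

If the target simply copies the regulator one step later, the matrix at `lag=1` concentrates on the diagonal, while the matrix at `lag=0` does not. A classifier that sees matrices for several lags can in principle pick up this kind of time-lagged structure.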

Making Molecular Movies: Using Single-Molecule Techniques to Unveil Hidden Features of Protein-Chromatin Interactions

Author: Chua Gabriella N. L.
Publication venue: Digital Commons @ RU
Publication date: 2025

Beyond the genetic code embedded within a sequence of DNA, the physical features of the double helix as a semi-flexible polymer are a critical dimension that regulates the function of every DNA-binding protein. These proteins can assemble, translocate, and change conformation while bound to DNA and can even alter its physical state, activities that are often crucial to their physiological roles in the cell. Eukaryotic genomic DNA is also packaged into units called nucleosomes, which offer another layer of physical regulation for proteins operating on chromatin. Previously, tracking these protein behaviors on DNA and chromatin had been technically challenging, dependent on tools that could access short-lived, heterogeneous, and out-of-equilibrium interactions. It is now possible to observe these activities in real time, owing to the advent of single-molecule fluorescence- and force-based techniques that have enabled dynamic and quantitative measurements of protein-chromatin interactions central to nuclear biology. This thesis describes my work, in collaboration with others, in harnessing single-molecule techniques to answer several pressing questions in biology. I hope it serves to highlight the utility and power of single-molecule microscopy for obtaining never-before-seen multidimensional, real-time data on protein behavior potentially key to their roles in the cell. Chapter 1 provides an introduction to the tool of choice, including a brief history of the technology and how it has previously been used in the literature to access mechanistic biology. It then offers a brief overview of the projects discussed in the thesis body, which comprise several investigations into a sampling of diverse but all chromatin-related questions, whether surrounding the basic science of disease-prone processes or clinically relevant mutational data.
Indeed, the chapter may stand alone to describe the broad utility of correlative single-molecule force and fluorescence microscopy (smCFFM) for investigating nuclear biology. Chapter 2 offers a discussion dedicated to understanding the function of the epigenetic reader methyl-CpG-binding protein 2 (MeCP2), a protein whose mutations are solely responsible for causing Rett syndrome, a severe neurological disorder. Rett syndrome primarily affects young girls and does not currently have a cure. MeCP2 is best known for being a reader of CpG methylation on DNA, yet it binds pervasively across the neuronal genome, in part due to its extensive disorder and multivalent binding. It is classically viewed as a transcriptional repressor, but several studies have reported its association with active genes. Finally, there are many clinically reported mutations, but how they lead to dysfunction and disease remains elusive. In sum, MeCP2's substrate preferences, biological function, and dysregulation in the context of Rett syndrome remain challenging to study and an ongoing mystery for a disease-motivated field. To address these challenges, we studied MeCP2 from a previously unexamined angle of biophysics. Using smCFFM, we found that MeCP2 uses differential dynamics on DNA and chromatin to specify methylation- and nucleosome-specific functions, which are altered when mutant forms of MeCP2 are present. We also discovered that MeCP2 preferentially binds nucleosomes over bare DNA, contrasting with its canonical role as a DNA reader and providing a new therapeutic opportunity to modulate its genomic distribution in vivo. Overall, our study provides a new perspective for understanding MeCP2 function that may help clarify its role in disease. Chapter 3 takes a break from biological investigation and presents a new method for loading nucleosomes across DNA directly within an smCFFM instrument.
Beyond packaging, nucleosomes serve as interactive hotspots for many important chromatin-binding proteins. The mainstream method for reconstituting nucleosome templates for single-molecule techniques has been assembling nucleosomes on custom DNA templates by salt dialysis; however, this approach suffers from biases stemming from non-native nucleosome positioning sequences, a requirement for high amounts of DNA and octamers, and a lengthy timeline of one to several days of work. To address these disadvantages, we report a new method that utilizes the histone chaperone Nap1 to directly assemble nucleosomes on DNA substrates stably tethered within an smCFFM instrument. In addition to more closely resembling the physiological pathway of nucleosome formation in vivo, this method allows users to reconstitute nucleosomes on non-specific DNA sequences, easily adjust nucleosome density, and use significantly smaller amounts of reagents, all within minutes. When uniform and specific nucleosome positioning is not needed, this protocol provides a useful way to investigate nucleosome mechanics or protein behavior on chromatin, for which example experiments are also described. Chapter 4 discusses a study on DNA replication, specifically the dynamics of the eukaryotic sliding clamp, proliferating cell nuclear antigen (PCNA). PCNA is an essential ring-shaped protein required for replication in addition to diverse cellular processes including DNA repair, chromatin maintenance, and sister chromatid cohesion. To perform its functions, it must be loaded onto and encircle DNA, which is performed by its canonical loader, replication factor C (RFC). Despite rigorous biochemical studies that have characterized RFC and PCNA's substrate preferences and mechanism of loading, it had remained unknown how RFC navigates and then loads PCNA to its target DNA sites.
To answer this question, we employed smCFFM to visualize RFC-PCNA complexes on varying DNA substrates, which revealed that RFC frequently remains bound to PCNA after DNA loading. This is in contrast to the prevailing model in the field, which hypothesizes that RFC is immediately ejected at this step. Additionally, we found that RFC-PCNA complexes are active for fill-in synthesis and can assemble with the lagging strand polymerase δ (Polδ). Finally, we show that this activity is dependent on the BRCT domain of Rfc1 and that deficient PCNA-Polδ fill-in can be rescued by another PCNA-binding partner and flap endonuclease, FEN1. Together our findings show how PCNA-enabled DNA synthesis is regulated by functions of its binding partners that are separate from their own catalytic activities and assign a role for the previously elusive RFC BRCT domain. This study is in close collaboration with Michael O'Donnell's lab at Rockefeller. Chapter 5 examines a unique mycobacterial helicase, Lhr, whose activities are required to confer cellular resistance to DNA damage. Specifically, biochemical assays have revealed that Lhr translocates along single-stranded DNA in the 3' to 5' direction, unwinding DNA:DNA or RNA:DNA duplexes en route. They have also shown that its helicase and tetramerization activities are dependent on an intact C-terminal domain (CTD), which are all required to confer mycobacterial resistance; however, the mechanisms behind the CTD and what its role is in resistance had remained unknown. We used smCFFM to demonstrate that the CTD is required to grip Lhr onto the DNA upon 5' engagement during ssDNA translocation, which explains its requirement for helicase activity. Hidden previously from bulk experiments, we also found that Lhr at the 3' junction exhibits single-stranded DNA reeling activity.
Finally, we show that reeling frequency is increased by an intact CTD, which confers Lhr preference for binding both the 3' and 5' junctions as compared to single-stranded DNA. Our findings inform the mechanistic details of Lhr function at DNA and may help explain its role in DNA repair in vivo. This study is in close collaboration with Stewart Shuman's lab at Memorial Sloan Kettering Cancer Center. Finally, Chapter 6 presents a study surrounding the chromatin-binding protein linker histone H1, which has been well known for compacting nucleosomes but whose interaction with DNA has remained under-studied. H1 is a highly disordered protein and undergoes liquid-liquid phase separation with chromatin; nevertheless, the biophysical basis and biological relevance of these condensates had remained unknown. Using smCFFM, we found that H1 exhibits enhanced phase separation with single-stranded DNA, which relies on multivalent and transient engagement between the two. Cellular imaging led us to propose that H1 accumulates on nascent ssDNA that is formed after DNA damage, potentially protecting it from degradation. Our results highlight H1's multifaceted roles at chromatin and provide a new role for the protein in the context of DNA damage. This study is in close collaboration with Yael David's lab at Memorial Sloan Kettering Cancer Center.

The Evolution of Ribosome Assembly: Making Ribosomes in the Mitochondria

Author: Burnside Chloe
Publication venue: Digital Commons @ RU
Publication date: 2025

Mitochondrial ribosomes (mitoribosomes) synthesize proteins encoded within the mitochondrial genome that are assembled into oxidative phosphorylation complexes. As such, the biogenesis of mitoribosomes is essential for ATP production and cellular metabolism. Despite significant progress in elucidating the structures of mature mitoribosomes, a comprehensive understanding of their assembly pathways remains elusive. This thesis presents the first structural characterization of assembly intermediates of the mitochondrial small subunit (mtSSU) in eukaryotes. To gain mechanistic insight into the process of SSU assembly in the mitochondria, I set out to enrich and characterize intermediates of SSU assembly in the mitochondria of S. cerevisiae. By employing a combination of endogenous tagging, affinity purification, mass spectrometry, and cryo-electron microscopy, three distinct assembly intermediates (States 1-3) were successfully visualized at resolutions ranging from 3.2 to 3.8 Å, revealing the stepwise progression of mtSSU biogenesis. These structures unveiled the roles of previously uncharacterized assembly factors Rsm22 and Ccm1. Initial biochemical analysis of additional intermediates suggests a role for the factors Mtg3 and Rmd9 at earlier stages of assembly (Chapter 2). Parallel work in H. sapiens was completed by Nathan Harper, allowing the visualization of six assembly intermediates at resolutions ranging from 2.4 Å to 3.0 Å. These structures provide insight into the functions of assembly factors NOA1, TFB1M, RBFA, ERAL1, METTL17, and MCAT and allow visualization of the early stages of mtSSU assembly in humans for the first time (Chapter 3). Comparative analysis of assembly pathways in these two key species (Chapter 4) reveals conserved and unique features of mtSSU assembly in both species.
By comparing conserved mechanisms of assembly within the mitochondria of these two species to their bacterial homologs, we shed light on the evolution of ribosome assembly in the mitochondria. This thesis significantly advances our understanding of mtSSU assembly mechanisms in yeast and humans and lays the foundation for understanding the evolutionary trajectory of SSU assembly across different species.

Recording Cellular Interactions in Germinal Centers and Beyond

Author: Nakandakari-Higa Sandra
Publication venue: Digital Commons @ RU
Publication date: 2025

In multicellular organisms, each cell has a distinct identity characterized by unique transcriptional and chromatin accessibility profiles. However, to truly understand the organization and function of such systems, it is not enough to merely characterize individual components; we must also investigate how these components interact and communicate through direct cellular interactions. This is especially true in the immune system, where cell-cell interactions are fundamental. For instance, the priming of antigen-specific T cells relies on their selective engagement with antigen-presenting cells (APCs) displaying the T cell's cognate antigen, distinguishing them from a pool of non-specific counterparts. Similarly, the production of antibodies by plasma cells depends on B cell receptors capturing antigen for presentation to helper T cells. These processes exemplify the immune system's dependence on its cells' ability to locate their interacting partners and exchange critical information. During infection, the affinity of circulating serum antibodies increases over time in a process known as antibody affinity maturation. This phenomenon is driven by the positive selection of B cells expressing higher-affinity B cell receptors (BCRs) to expand and occurs within specialized structures called germinal centers (GCs). While it is well-established that T-B interactions are essential for the formation and maintenance of GCs, the precise role of T cell help as the primary driver of positive selection remains debated. In this work, I introduce a reengineered method, LIPSTICv2, to address this long-standing question. By integrating interaction-based single-cell transcriptomics, I provide clear evidence that T cell help is preferentially delivered to B cells expressing higher-affinity BCRs. Furthermore, the magnitude of this T cell help dictates the extent of upregulation of transcriptional programs associated with B cell clonal expansion and positive selection.
Building on this work, I then developed uLIPSTIC, an enhanced version of LIPSTIC designed to enable quantitative and unbiased analysis of cellular interactions. Whereas earlier iterations of LIPSTIC were confined to interactions mediated by CD40 and CD40L, uLIPSTIC eliminates this limitation by anchoring Sortase A (SrtA) and its oligoglycine acceptor non-specifically to the cell membrane at high density. This innovation allows enzymatic labeling between any membranes that come into close proximity, vastly expanding the range of potential applications. By coupling uLIPSTIC with droplet-based scRNA-seq, quantitative interaction-based transcriptomics is achieved. This approach has two distinct applications: First, in an 'atlas' mode, it allows for the comprehensive mapping of the cellular interactome within a given donor cell population. Second, in a 'mechanistic' mode, uLIPSTIC signal intensity can be correlated with specific gene expression profiles or signatures, enabling the elucidation of molecular pathways underlying specific interactions. This body of work underscores the transformative potential of adding interaction resolution to our understanding of cellular function. Both LIPSTICv2 and uLIPSTIC represent cutting-edge technologies that allow for the detailed study of cellular interactions in vivo, offering new insights into the molecular dynamics that govern immune responses and beyond.

Aromatic Amino Acid Metabolome In The Human Gut Microbiota

Author: Hsieh David Chun-cheng
Publication venue: Digital Commons @ RU
Publication date: 2025

Simple metabolites derived from common substrates are key candidates for host-microbiota crosstalk due to the potential for convergent biosynthetic pathways. Several human signaling molecules are simple structures derived from aromatic amino acids (phenylalanine, tryptophan, and tyrosine). In this thesis, I employed targeted and untargeted mass spectrometry-based approaches to identify novel bacterial metabolites derived from aromatic amino acids. In Chapter 2, I used a 3-step mass spectrometry pipeline to identify microbiota-dependent aromatic amino acid-derived signaling molecules in vivo, identify their commensal microbial producers and identify their biosynthetic genes. This led to the identification of Enterococci and Streptococci as commensal sources of LacPhe, an exercise-inducible metabolite that regulates appetite. Monocolonization of germ-free mice with Streptococcus restored the physiological concentration of LacPhe in the ileum, and analysis of human microbiome datasets revealed that the microbial LacPhe biosynthetic gene, pepV, was abundant in the GI tract and correlated inversely with obesity. This study provides an example of cross-kingdom metabolite overlap and suggests that the microbiota's impact on LacPhe metabolism should be considered when examining the effect this metabolite has on appetite and obesity. In Chapter 3, I describe an untargeted metabolomics approach to profile the aromatic amino acid-derived metabolome of the gut microbiota. By feeding individual bacterial species with each aromatic amino acid, I discovered that these aromatic amino acids generate a diverse metabolome, much of which is absent from the current metabolite database. Among the 80 strains of human-isolated bacteria, C. difficile produces the largest number of metabolites, primarily belonging to N-acyl amino acids. Through comparison with synthetic standards, I identified 28 novel C. difficile-specific metabolites. Furthermore, C. 
difficile demonstrated the highest production levels of phenylacetic acid, a precursor of a negative allosteric modulator (NAM) of the β2-adrenergic receptor (β2AR). This work highlights several interesting metabolites specific to C. difficile, with potential biological roles that warrant further exploration.

Towards Models of Mesoscale Chromatin Structure and Radiative DNA Damage via Computational Simulation

Author: West Devany Walsh
Publication venue: Digital Commons @ RU
Publication date: 2025

The spatial organization of chromatin fiber at the level of several nucleosomes (the mesoscale) is an area of active study. Recent results have shown that it differs between functional states, affects higher orders of chromosome organization, and is likely involved in transcriptional control. However, the heterogeneity of nucleosome positioning and histone variant composition in cells as well as the high density of chromatin in situ make the mesoscale difficult to study. Recent methodological advances have made it possible to derive nucleosome-resolution three-dimensional contact information directly from cells. We therefore sought to develop a structural inference tool for inferring mesoscale chromatin structures consistent with such contact data sets. We have built a Bayesian inference framework that combines a simplified worm-like chain model of DNA and steric interactions between nucleosomes with a pseudopotential that represents the structure's fit to experimental data. We show how this framework can be used to fit oligonucleosome structures to high-coverage Region Capture Micro-C data at several-kilobase regions of interest, evaluate the goodness-of-fit, and discuss how the framework's modularity could lead to promising future work with other types of experimental data. Additionally, we show preliminary data from radiation simulations that will be used to develop models for Radiation-Induced Correlated Cleavage experiments, which will also be incorporated into the structural inference model in future work.

Genome-Wide Regulatory Principles of Essential Transcription Factors from Mycobacterium Tuberculosis

Author: Froom Ruby
Publication venue: Digital Commons @ RU
Publication date: 2025

All life requires the expression of the genetic information stored in DNA into RNA, a process known as transcription. Hence, DNA-dependent RNA polymerases (RNAPs) perform the first step of gene expression, thereby dictating the amounts of gene products that proceed to downstream cellular reactions. The activity of RNAP can be modulated by numerous molecules including protein transcription factors (TFs), ligands, and antibiotics. Since bacteria only contain a single RNAP to express all genes, it is a prime therapeutic target. Most notably, the antibiotic rifampicin targets RNAP in the bacterial pathogen Mycobacterium tuberculosis (Mtb) and has therefore been a cornerstone of the frontline therapeutic regimen to treat tuberculosis (TB) since the 1960s. However, the rise of rifampicin resistance necessitates new strategies to target RNAP and combat TB, which remains the leading cause of death from infectious disease worldwide. Efficient Mtb transcription requires multiple TFs that are essential for bacterial viability, pointing to specific steps of the transcription process that may be useful to target therapeutically. Yet, the direct genomic targets for most Mtb TFs are not known, limiting our understanding of Mtb gene expression. Obstacles include the lack of easily predicted binding motifs, the degree of motif degeneracy tolerated, and the compensatory regulatory cascades triggered by perturbation of TFs in cells. In addition, the lack of tools to study non-model TFs, particularly those from difficult-to-culture microbes like Mtb, exacerbates these challenges. Specifically, we found that a major hindrance to identifying direct TF targets was the methodological gap between genomics and biochemistry. In cellulo genomics (e.g., ChIP-seq, RNA-seq) can provide genome-wide information, but pleiotropic indirect effects frequently obscure primary TF effects in cells, particularly when a TF is essential for viability or a global regulator.
Conversely, in vitro transcription assays using purified RNAP can measure direct TF effects on RNA synthesis for a single gene, but these assays are too low-throughput for genome-scale transcription measurements and discovery of general principles. Therefore, the central aim of this thesis was the development of a novel cell-free genomics (CFG) method to bridge biochemistry and genomics. We reconstitute genome-wide transcription in vitro using purified components. We then quantify the output RNA using single-nucleotide-resolution RNA-seq and robust statistical analysis to count RNA 5' ends (to study transcription initiation) or RNA 3' ends (to study transcription elongation/termination) in the presence versus absence of essential TFs from Mtb. CFG thus permits us to identify promoters and terminators whose expression is directly affected by TFs. We first validated CFG by counting the 5' ends of transcripts in the presence versus absence of the cyclic AMP receptor protein (CRP), the archetypal TF that originated the study of transcription regulation in 1970. CFG revealed 90 promoters where Mtb CRP alone is sufficient to modulate transcription initiation levels. From these direct promoter targets, we re-discover the known Mtb CRP binding motif and reveal that the predicted strength of CRP binding to its consensus site is a quantitative predictor of its effect size on a given promoter. We also identify known target genes found in cellulo. Integration of the CFG-derived sufficiency regulon of CRP with the necessity regulon previously determined using RNA-seq and ChIP-seq in exponentially growing Mtb cells revealed where CRP can act autonomously; where it requires other cellular regulators to modulate transcription; and where it exerts indirect transcriptional effects. Our interdisciplinary synthesis thus provides a roadmap to gain unprecedented resolution of transcription regulatory networks.
We next applied CFG to identify promoters regulated by the actinobacteria-specific transcription initiation factor holo-WhiB1 in Mtb. Holo-WhiB1 was previously intractable by other approaches, and thus its direct genomic targets were unknown in any species. We first performed a whiB1 knockdown RNA-seq time course in Mtb cells, demonstrating its essentiality and revealing its global effects on the Mtb transcriptome. We next used CFG to identify the direct effects of holo-WhiB1 on transcription initiation genome-wide. Integration of in cellulo and cell-free hits permitted the identification of promoters that are directly regulated by holo-WhiB1 in exponentially growing Mtb cells, revealing that holo-WhiB1 activates numerous genes involved in translation and fatty acid biosynthesis. CFG revealed that unlike CRP, holo-WhiB1 does not appear to bind to a consensus motif at the position where it contacts promoter DNA (directly upstream of the housekeeping –35 promoter element). Rather, holo-WhiB1 activates transcription of promoters that have suboptimal –35 elements and represses transcription from promoters that have strong –35 contacts. We validate direct holo-WhiB1 activation and repression using in vitro transcription initiation assays with minimal promoter DNA constructs, allowing further mechanistic dissection of holo-WhiB1's effects using single-particle cryo-EM and functional mutagenesis. Lastly, we apply CFG to quantify the 3' ends of transcripts, permitting the first genome-scale in vitro quantification of transcription termination in a cell-free system. We chose to study the essential pro-termination TFs Mtb NusA (conserved in all bacteria and archaea) and Mtb NusG (the only transcription factor conserved in all three domains of life), alone and in combination. Their mechanisms have remained elusive in part because neither has a predicted nucleic acid binding motif. 
We validate new NusA and NusG terminator targets in the Mtb genome using gold-standard in vitro transcription termination assays and reveal a novel selectivity mechanism for how NusA and NusG regulate some terminators but not others. Specifically, we find that the distinct contacts that NusA and NusG make with pre-termination complexes likely favor distinct mechanisms of RNA release from the RNAP active site. NusG contacts both RNAP and DNA, but not RNA; therefore, NusG fails to stimulate terminators that have AT-rich downstream DNA and thus likely favor forward translocation of RNAP to stimulate RNA release. In contrast, NusA contacts both RNA and RNAP, but not DNA; therefore, NusA fails to stimulate terminators with RNA terminator hairpins that are predicted to invade the RNA–DNA hybrid and thus likely require a concomitant rotational wrenching of the hairpin away from RNAP to stimulate RNA release. Since these NusA/G contacts with transcription complexes are highly conserved, aspects of our model may generalize across cellular life. In sum, we propose that cell-free genomics fills a critical methodological gap in the field of gene expression. CFG facilitates the study of TFs with various modes of action, including those intractable to study by other approaches, those lacking clear bioinformatic predictors like binding motifs, or both. CFG also holds promise for broader applications, including higher-order experimental designs, other transcriptional perturbations (e.g., ligands, antibiotics), and diverse species. By complementing existing approaches, CFG brings biochemistry and enzymology into the genomics era to transform our understanding of fundamental transcriptional regulation.
