
    Development of computational methods to infer cell-cell communication using single-cell RNA sequencing data

    In the era of big data, data science and bioinformatics play a pivotal role in analyzing and interpreting complex biological data. This requires the development of advanced mathematical models and efficient algorithms to extract meaningful insights from high-volume data, such as those involved in cell-cell communication. Cell-cell communication is a vital process through which cells constantly exchange signals to coordinate and regulate numerous biological processes, from maintaining homeostasis and driving cell development to mediating disease progression. Deciphering cellular communication mechanisms is of paramount importance in comprehending the molecular basis of the physiopathology of a living organism. Advancements in next-generation sequencing technologies have revolutionized our understanding of molecular biology. In this endeavor, single-cell RNA sequencing technology has emerged as a powerful and invaluable tool to reveal cellular heterogeneity in gene expression patterns in a high-throughput manner and at the individual cell level, with a resolution never possible before. This technological breakthrough has led to the development of numerous computational tools designed to systematically infer cellular communication mechanisms from single-cell RNA sequencing data. However, despite the availability of various computational methods, the bioinformatics analysis of cell-cell communication remains a relatively young and rapidly evolving research area, with considerable room for methodological and computational improvement. The aim of this Ph.D. research project was to design and develop novel computational methods to advance the analysis of cellular communication from single-cell RNA sequencing data.
First, a complete characterization of the cellular communication inference landscape was performed, identifying state-of-the-art methods and highlighting the complexity of the biological questions and the diversity of the approaches proposed in the literature. In response to key challenges, three computational tools, namely scSeqComm, scSeqCommDiff, and CellGOSSIP, were developed. Each tool addresses specific methodological and computational challenges associated with different aspects of cellular communication, including intercellular and intracellular signaling, as well as differential analysis across conditions. The validation and assessment of these methods demonstrated their robustness and reliability in providing accurate and biologically meaningful results. Overall, this thesis advances the state of the art in the analysis of cell-cell communication, offering novel computational methods that enhance our understanding of the complexity of cellular communication mechanisms in diverse biological contexts.

    Identify, quantify and characterize cellular communication from single cell RNA sequencing data with scSeqComm

    Motivation: Recently, single cell RNA-seq (scRNA-seq) data have been used to study cellular communication. Most bioinformatics methods infer only the intercellular signaling between groups of cells, mainly exploiting ligand-receptor expression levels. Only a few methods consider the entire intercellular and intracellular signaling, mainly inferring lists or networks of signaling-involved genes. Results: Here we present scSeqComm, a computational method to identify and quantify the evidence of ongoing intercellular and intracellular signaling from scRNA-seq data, while also providing a functional characterization of the inferred cellular communication. The possibility to quantify the evidence of ongoing communication assists the prioritization of the results, while the combined evidence of both intercellular and intracellular signaling increases the reliability of inferred communication. The application to a scRNA-seq dataset of tumor microenvironment, the agreement with independent bioinformatics analysis, the validation using spatial transcriptomics data and the comparison with state-of-the-art intercellular scoring schemes confirmed the robustness and reliability of the proposed method. Availability: scSeqComm R package is freely available at https://gitlab.com/sysbiobig/scseqcomm and https://sysbiobig.dei.unipd.it/software/#scSeqComm. Submitted software version and test data are available in Zenodo, at https://dx.doi.org/10.5281/zenodo.5833298. Supplementary information: Supplementary data are available at Bioinformatics online.

    quickSparseM: a library for memory- and time-efficient computation on large, sparse matrices with application to omics data

    Omics data have revolutionized molecular biology by introducing large-scale data analysis, pushing the field into the realm of big data and presenting substantial challenges in data storage and analysis. Despite describing distinct aspects of molecular biology, most omics data share common characteristics, such as being representable as large, sparse matrices, and requiring similar computational approaches, mainly involving embarrassingly parallel tasks across rows or columns. While R is a popular choice for omics analysis, it encounters performance bottlenecks when handling large datasets due to its reliance on dense data formats and constraints like 32-bit indexing in some structures. Even when sparse representations are utilized, the inherent limitations of R lead to inefficiencies. Additionally, its lack of native support for shared-memory parallelism prevents it from fully utilizing modern parallel computing architectures. Many other data-intensive fields that rely on R face the same challenges, with large, sparse data requiring fast and memory-efficient row-wise and column-wise operations. To address these challenges, we introduce quickSparseM, a time- and memory-efficient library for storing and processing large, sparse matrices, available as an R package. Developed in C++ with OpenMP for parallelism, quickSparseM achieves efficient performance while remaining compatible with existing R-based workflows. The library utilizes the R dgCMatrix format to represent sparse matrices in a compressed, column-oriented format and provides functions to compute basic statistics and operations commonly used in omics analyses. Experiments varying dataset sizes and core counts, as well as two case studies using omics data, demonstrate the library’s efficiency and scalability. The results indicate that quickSparseM outperforms state-of-the-art R packages for sparse matrix computation in terms of time, memory usage, and scalability.
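
As a rough illustration of the compressed, column-oriented layout mentioned in the abstract, the sketch below stores a small matrix as three arrays, analogous to the `x`, `i` and `p` slots of an R dgCMatrix, and computes column means and row sums from them. It is written in plain Python for readability; the function names `csc_column_means` and `csc_row_sums` are hypothetical and are not part of quickSparseM, whose actual API is not described in the abstract.

```python
from typing import List


def csc_column_means(values: List[float], col_pointers: List[int],
                     n_rows: int) -> List[float]:
    """Column means of a matrix in compressed sparse column (CSC) form.

    The non-zeros of column j sit in values[col_pointers[j]:col_pointers[j + 1]],
    so each column is one contiguous slice and columns are independent tasks.
    """
    n_cols = len(col_pointers) - 1
    means = []
    for j in range(n_cols):
        start, end = col_pointers[j], col_pointers[j + 1]
        # Implicit zeros contribute nothing to the sum but count in n_rows.
        means.append(sum(values[start:end]) / n_rows)
    return means


def csc_row_sums(values: List[float], row_indices: List[int],
                 n_rows: int) -> List[float]:
    """Row sums from the same layout: each non-zero is scattered to its row."""
    sums = [0.0] * n_rows
    for value, row in zip(values, row_indices):
        sums[row] += value
    return sums


# Hypothetical 3 x 3 example matrix (illustrative data only):
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 0, 0]]
values = [1.0, 4.0, 2.0, 3.0]   # non-zero entries, column by column
row_indices = [0, 2, 0, 1]      # row of each non-zero entry
col_pointers = [0, 2, 2, 4]     # where each column starts in `values`

column_means = csc_column_means(values, col_pointers, n_rows=3)
row_sums = csc_row_sums(values, row_indices, n_rows=3)
```

Column-wise statistics read contiguous slices, which is why they parallelize naturally over columns; row-wise operations on the same layout have to scatter values through the row index array instead.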

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
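
The difference between the two counting schemes can be sketched in a few lines of Python. This is a simplified illustration, not the study's procedure: the function `cocitation_counts`, the toy reference lists and the author names are all hypothetical, and the sketch assumes that a pair of authors is counted once per citing paper, including pairs who are coauthors of the same cited work.

```python
from collections import Counter
from itertools import combinations
from typing import List


def cocitation_counts(citing_papers: List[List[List[str]]],
                      max_authors: int) -> Counter:
    """Count author pairs that appear together in a reference list.

    citing_papers: one reference list per citing paper; each cited work
    is given as its ordered author list.
    max_authors: how many leading authors of each cited work are credited
    (1 = first-author counting, 5 = first-five-authors counting).
    """
    counts: Counter = Counter()
    for cited_works in citing_papers:
        credited = set()
        for author_list in cited_works:
            credited.update(author_list[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers and their cited works.
citing_papers = [
    [["Smith", "Lee"], ["Garcia"]],
    [["Smith", "Lee"], ["Chen", "Garcia"]],
]

first_author = cocitation_counts(citing_papers, max_authors=1)
first_five = cocitation_counts(citing_papers, max_authors=5)
```

In this toy data, first-author counting never sees Lee, and it links Smith to Garcia only through the first citing paper; crediting up to five authors recovers both.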

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.