Working with Ontologies
Ontologies are powerful and popular tools to encode data in a structured format and manage knowledge. A large variety of existing ontologies offer users access to biomedical knowledge. This chapter contains a short theoretical background of ontologies and introduces two notable examples: the Gene Ontology and the ontology for Biological Pathways Exchange. For both ontologies, a short overview and working bioinformatic applications, i.e., Gene Ontology enrichment analyses and pathway data visualization, are provided.
mully: an R package to create, modify and visualize multilayered graphs
The modelling of complex biological networks such as pathways has been a necessity for scientists over the last decades. The study of these networks also imposes a need to investigate different aspects of nodes or edges within the networks, or other biomedical knowledge related to them. Our aim is to provide a generic modelling framework to integrate multiple pathway types and further knowledge sources influencing these networks. This framework is defined by a multi-layered model allowing automatic network transformations and documentation. By providing a tool that generates this model, we aim to facilitate data integration, boost reproducibility and increase interoperability between different sources and databases in the field of pathways. We present the mully R package, which allows the user to create, modify and visualize graphs with multiple layers. The package is implemented with features to specifically handle multilayered graphs.
A simulation framework for correlated count data of features subsets in high-throughput sequencing or proteomics experiments
As part of the data processing of high-throughput sequencing experiments, count data are produced that represent the number of reads mapping to specific genomic regions. Count data also arise in mass spectrometric experiments for the detection of protein-protein interactions. Artificial count data are therefore required for evaluating new computational methods for the analysis of sequencing count data or spectral count data from proteomics experiments. Although some methods for the generation of artificial sequencing count data have been proposed, all of them either simulate single sequencing runs, thus omitting the correlation structure between the individual genomic features, or are limited to specific structures. We propose to draw correlated data from the multivariate normal distribution and round these continuous data in order to obtain discrete counts. In our approach, the required distribution parameters can either be constructed in different ways or estimated from real count data. Because rounding affects the correlation structure, we evaluate the use of shrinkage estimators that have already been used in the context of artificial expression data from DNA microarrays. Our approach turned out to be useful for the simulation of counts for defined subsets of features such as individual pathways or GO categories.
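The core idea in the abstract above (draw correlated values from a multivariate normal distribution, then round them to counts) can be sketched as follows. This is only a minimal illustration using the Python standard library, not the authors' implementation: the function names, the example mean vector and covariance matrix, and the clipping of negative values to zero are all assumptions, and the shrinkage estimation step is not shown.

```python
import math
import random


def cholesky(cov):
    """Return the lower-triangular factor L with L * L^T == cov."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def simulate_counts(mean, cov, n_samples, seed=0):
    """Draw correlated normal vectors and round them to non-negative counts."""
    rng = random.Random(seed)
    L = cholesky(cov)
    p = len(mean)
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        # x = mean + L z has covariance L L^T = cov
        x = [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)]
        # Assumption: negative draws are clipped to zero so that all values are valid counts
        samples.append([max(0, round(v)) for v in x])
    return samples


def correlation(xs, ys):
    """Pearson correlation, used here only to inspect the simulated counts."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical parameters for two features with a target correlation of 0.75
counts = simulate_counts([100.0, 80.0], [[400.0, 300.0], [300.0, 400.0]], 2000)
print(correlation([c[0] for c in counts], [c[1] for c in counts]))
```

Comparing the printed correlation with the target value gives a rough sense of how much rounding distorts the correlation structure, which is the effect the abstract says motivates the shrinkage estimators.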
Utilizing Molecular Network Information via Graph Convolutional Neural Networks to Predict Metastatic Event in Breast Cancer
Gene expression data is commonly available in cancer research and provides a snapshot of the molecular status of a specific tumor tissue. This high-dimensional data can be analyzed for diagnoses, prognoses, and to suggest treatment options. Machine learning based methods are widely used for such analysis. Recently, a set of deep learning techniques was successfully applied in different domains including bioinformatics. One prominent technique of this kind is the convolutional neural network (CNN). Currently, CNNs are being extended to non-Euclidean domains like graphs. Molecular networks are commonly represented as graphs detailing interactions between molecules. Gene expression data can be assigned to the vertices of these graphs, and the edges can depict interactions, regulations and signal flow. In other words, gene expression data can be structured by utilizing molecular network information as prior knowledge. Here, we applied graph CNN to gene expression data of breast cancer patients to predict the occurrence of metastatic events. To structure the data we utilized a protein-protein interaction network. We show that the graph CNN exploiting the prior knowledge is able to provide classification improvements for the prediction of metastatic events compared to existing methods.
rBiopaxParser — an R package to parse, modify and visualize BioPAX data
Motivation: Biological pathway data, stored in structured databases, is a useful source of knowledge for a wide range of bioinformatics algorithms and tools. The Biological Pathway Exchange (BioPAX) language has been established as a standard to store and annotate pathway information. However, use of these data within statistical analyses can be tedious. On the other hand, the statistical computing environment R has become the standard for bioinformatics analysis of large-scale genomics data. With this package, we hope to enable R users to work with BioPAX data and make use of the ever-increasing amount of biological pathway knowledge within data analysis methods. Results: rBiopaxParser is a software package that provides a comprehensive set of functions for parsing, viewing and modifying BioPAX pathway data within R. These functions enable the user to access and modify specific parts of the BioPAX model. Furthermore, it allows users to generate and lay out regulatory graphs of controlling interactions and to visualize BioPAX pathways.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
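The two counting schemes compared in the abstract above can be illustrated with a small sketch. It is a hedged illustration under stated assumptions, not the study's procedure: the function name, the toy reference lists, and the choice to count each author pair at most once per citing paper (including pairs of co-authors of the same cited work) are all assumptions made here.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often two authors are cited together by the same paper.

    citing_papers: list of reference lists; each reference is a list of
        author names in byline order.
    max_authors: number of leading authors of each cited work that are
        counted (1 = first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Assumption: each unordered pair counts once per citing paper
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: two citing papers, each with two cited works
papers = [
    [["Smith", "Lee"], ["Garcia"]],
    [["Smith", "Lee"], ["Chen", "Garcia"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(first_author)
print(first_five)
```

In this toy data, Garcia is a second author in one cited work, so the Garcia and Smith pair is found only once under first-author counting but twice when up to five authors per cited work are considered.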
New perspectives: systems medicine in cardiovascular disease
Background
Cardiovascular diseases (CVD) represent one of the most important causes of morbidity and mortality worldwide. Innovative approaches to increase the understanding of the underpinnings of CVD promise to enhance CVD risk assessment and might pave the way to tailored therapies. In recent years, systems medicine has emerged as a novel tool to study genetic, molecular and physiological interactions.
Conclusion
In this review, we provide an overview of the current molecular-experimental, epidemiological and bioinformatic tools applied in systems medicine in the cardiovascular field. We discuss the status and challenges in implementing interdisciplinary systems medicine approaches in CVD.
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
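To make the contrast between the Pearson correlation and the cosine concrete, the sketch below computes both for two cocitation profiles. The profiles are invented, and the property it demonstrates (appending zeros to both profiles leaves the cosine unchanged but changes the Pearson correlation) is a commonly noted difference between the two measures; it is not necessarily the argument made by the authors.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two profile vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation: the cosine of the mean-centered vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles of two authors with four other authors
a = [3, 0, 2, 5]
b = [1, 4, 0, 2]

# The same profiles after adding two authors with whom neither is cocited
a_padded = a + [0, 0]
b_padded = b + [0, 0]

print(cosine(a, b), cosine(a_padded, b_padded))
print(pearson(a, b), pearson(a_padded, b_padded))
```

For these particular profiles the Pearson correlation even changes sign once the two zero entries are added, while the cosine is identical in both cases.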
