Discriminating Graph Pattern Mining from Gene Expression Data
We consider the problem of mining gene expression data in order to single out interesting features characterizing healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attempt to build a different graph for each sample and, then, to have a database of graphs for representing a sample set. Our main goal is to single out interesting differences between healthy and unhealthy samples through the extraction of "discriminative patterns" among graphs belonging to the two different sample sets. Unlike other approaches presented in the literature, our technique is able to take into account important local similarities as well as collaborative effects involving interactions between multiple genes. In particular, we use edge-labelled graphs and we measure the discriminative power of a pattern based on the edge weights, which represent how relevant the co-expression between two genes is
Discriminative pattern discovery for the characterization of different network populations
Motivation: An interesting problem is to study how gene co-expression varies in two different populations, associated with healthy and unhealthy individuals, respectively. To this aim, two important aspects should be taken into account: (i) in some cases, pairs/groups of genes show collaborative behaviour, which emerges in the study of disorders and diseases; (ii) information coming from each single individual may be crucial to capture specific details underlying complex cellular mechanisms; therefore, it is important to avoid missing potentially powerful information associated with the single samples. Results: Here, a novel approach is proposed in which two different input populations are considered and represented by two datasets of edge-labeled graphs. Each graph is associated with an individual, and the edge label is the co-expression value between the two genes associated with the nodes. Discriminative patterns among graphs belonging to different sample sets are searched for, based on a statistical notion of 'relevance' able to take into account important local similarities as well as collaborative effects involving the co-expression among multiple genes. Four different gene expression datasets have been analyzed by the proposed approach, each associated with a different disease. An extensive set of experiments shows that the extracted patterns significantly characterize important differences between healthy and unhealthy samples, both in the cooperation and in the biological functionality of the involved genes/proteins. Moreover, the provided analysis confirms some results already presented in the literature on genes with a central role for the considered diseases, while also yielding novel and useful insights on this aspect
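The representation described in this abstract (one edge-labeled graph per individual, with two populations of graphs) can be illustrated with a small sketch. The abstract does not define its statistical notion of 'relevance', so the score below, the absolute gap in mean edge label between the two populations, is only a stand-in assumption and not the authors' measure. The gene names, values, and the helpers `mean_weight` and `edge_gap` are hypothetical.

```python
from statistics import mean

# One edge-labelled graph per sample: {(gene_a, gene_b): co-expression value}.
# All gene names and values are made up for illustration.
healthy = [
    {("G1", "G2"): 0.9, ("G2", "G3"): 0.2},
    {("G1", "G2"): 0.8, ("G2", "G3"): 0.3},
]
unhealthy = [
    {("G1", "G2"): 0.1, ("G2", "G3"): 0.25},
    {("G1", "G2"): 0.2, ("G2", "G3"): 0.35},
]

def mean_weight(graphs, edge):
    """Average label of `edge` over a population; a missing edge counts as 0."""
    return mean(g.get(edge, 0.0) for g in graphs)

def edge_gap(pop_a, pop_b):
    """Assumed stand-in score: absolute gap in mean edge label between populations."""
    edges = set().union(*pop_a, *pop_b)  # union of the edge keys of every graph
    return {e: abs(mean_weight(pop_a, e) - mean_weight(pop_b, e)) for e in edges}

scores = edge_gap(healthy, unhealthy)
# Edges ordered from most to least discriminating under the assumed score.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Keeping one graph per sample, rather than a single aggregated network per population, means a per-edge statistic like this could be replaced by any measure that looks at the distribution of labels across individuals.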
Mining sponge phenomena in RNA expression data
In the last few years, the interactions among competing endogenous RNAs (ceRNAs) have been recognized as a key post-transcriptional regulatory mechanism in cell differentiation, tissue development, and disease. Notably, such sponge phenomena, which subtract active microRNAs from their silencing targets, have been recognized as having a potential oncosuppressive, or oncogenic, role in several cancer types. Hence, the ability to predict sponges from the analysis of large expression data sets (e.g. from international cancer projects) has become an important data mining task in bioinformatics. We present a technique designed to mine sponge phenomena whose presence or absence may discriminate between healthy and unhealthy populations of samples in tumoral or normal expression data sets, thus providing lists of candidates potentially relevant in the pathology. With this aim, we search for pairs of elements acting as ceRNA for a given miRNA, namely, we aim at discovering miRNA-RNA pairs involved in phenomena which are clearly present in one population and almost absent in the other one. The results on tumoral expression data, concerning five different cancer types, confirmed the effectiveness of the approach in mining interesting knowledge. Indeed, 32 out of 33 miRNAs and 22 out of 25 protein-coding genes identified as top scoring in our analysis are corroborated by having been similarly associated with cancer processes in independent studies. In fact, the subset of miRNAs selected by the sponge analysis results in a significant enrichment of annotation for the KEGG pathway "microRNAs in cancer" when tested with the commonly used bioinformatic resource DAVID. Moreover, the cancer datasets in which our sponge analysis identified a miRNA as top scoring often match those already reported in the pertinent literature
Discovering discriminative graph patterns from gene expression data
We consider the problem of mining gene expression data in order to single out interesting features characterizing healthy/unhealthy samples of an input dataset. We present an approach based on a network model of the input gene expression data, where there is a labelled graph for each sample. To the best of our knowledge, this is the first attempt to build a different graph for each sample and, then, to have a database of graphs for representing a sample set. Our main goal is to single out interesting differences between healthy and unhealthy samples through the extraction of "discriminative patterns" among graphs belonging to the two different sample sets. Unlike other approaches presented in the literature, our technique is able to take into account important local similarities as well as collaborative effects involving interactions between multiple genes. In particular, we use edge-labelled graphs and we measure the discriminative power of a pattern based on the edge weights, which represent how relevant the co-expression between two genes is
Discriminative Pattern Discovery on Biological Networks
This work provides a review of biological networks as a model for analysis, presenting and discussing a number of illuminating analyses. Biological networks are an effective model for providing insights about biological mechanisms. Networks with different characteristics are employed for representing different scenarios. This powerful model allows analysts to perform many kinds of analyses which can be mined to provide interesting information about underlying biological behaviors.
The text also covers techniques for discovering exceptional patterns, such as a pattern accounting for local similarities and also collaborative effects involving interactions between multiple actors (for example genes). Among these exceptional patterns, of particular interest are discriminative patterns, namely those which are able to discriminate between two input populations (for example healthy/unhealthy samples).
In addition, the work includes a discussion on the most recent proposal on discovering discriminative patterns, in which there is a labeled network for each sample, resulting in a database of networks representing a sample set. This enables the analyst to achieve a much finer analysis than with traditional techniques, which are only able to consider an aggregated network of each population
IP6K gene identification in plant genomes by tag searching
Abstract
Background
Plants have played a special role in inositol polyphosphate (IP) research since the first IP, the fully phosphorylated inositol ring of phytic acid (IP6), was discovered in plant seeds. It is now known that phytic acid is further metabolized by the IP6 Kinases (IP6Ks) to generate IPs containing a pyrophosphate moiety. The IP6Ks are evolutionarily conserved enzymes identified in several mammalian, fungal and amoebal species. Although IP6K has not yet been identified in plant chromosomes, there are many clues suggesting its presence in plant cells.
Results
In this paper we propose a new approach to search for the plant IP6K gene, which led to the identification in a plant genome of a nucleotide sequence corresponding to a specific tag of the IP6K family. Such a tag has been found in all IP6K genes identified up to now, as well as in all genes belonging to the Inositol Polyphosphate Kinases superfamily (IPK). The tag sequence corresponds to the inositol-binding site of the enzyme, and it can be considered as characterizing all IPK genes. To search for it we applied a technique based on motif discovery. We exploited DLSME, a recently proposed software tool, which allows the motif structure to be only partially specified by the user. First we applied the new method to the mitochondrial DNA (mtDNA) of plants, where such a gene could have been nested, possibly encrypted and hidden by virtue of the editing and/or trans-splicing processes. Then we looked for the gene in the nuclear genomes of two model plants, Arabidopsis thaliana and Oryza sativa.
Conclusions
The analysis we conducted in plant mitochondria provided the negative, though we argue relevant, result that IP6K does not actually occur in plant mtDNA. Very interestingly, the tag search in nuclear genomes led us to identify a promising sequence in chromosome 5 of Oryza sativa. Further analyses are under way to confirm that this sequence actually corresponds to the mammalian IP6K gene.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
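The difference between the two counting schemes can be made concrete with a minimal sketch. It assumes that two authors are co-cited whenever a single citing paper cites at least one work by each, and that co-authors of the same cited work also count as co-cited with one another; the abstract does not specify these conventions. The function name `cocitation_counts`, its `max_authors` parameter, and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors):
    """Count author pairs co-cited by the same citing paper.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names. Only the first `max_authors` names of each cited
    work are considered (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts

# Two citing papers, each citing two works (made-up author names).
papers = [
    [["Smith", "Lee"], ["Jones", "Kim"]],
    [["Smith", "Park"], ["Jones"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting only the Jones/Smith pair appears, whereas counting the first five authors also brings in pairs involving Lee, Kim and Park, which gives the subsequent statistical analysis more data to work with.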
- …
