
TICO: a tool for postprocessing the predictions of prokaryotic translation initiation sites

Author: Tech Maike
Meinicke Peter
Morgenstern Burkhard
Publication venue
Publication date: 01/01/2006
Field of study

Exact localization of the translation initiation sites (TIS) in prokaryotic genomes is difficult to achieve using conventional gene finders. We recently introduced the program TICO for postprocessing TIS predictions based on a completely unsupervised learning algorithm. The program can be used through our web interface at http://tico.gobics.de/ and is also freely available as a command-line version for Linux and Windows. The latest version of our program provides a tool for visualization of the resulting TIS model. Although the underlying method is not based on any specific assumptions about characteristic sequence features of prokaryotic TIS, the prediction rates of our tool are competitive on experimentally verified test data.

Orphelia: predicting genes in metagenomic sequencing reads

Author: Lingner Thomas
Tech Maike
Meinicke Peter
Hoff Katharina J.
Publication venue
Publication date: 01/01/2009
Field of study

Metagenomic sequencing projects yield numerous sequencing reads from a diverse range of uncultivated and mostly as yet unknown microorganisms. In many cases, these sequencing reads cannot be assembled into longer contigs. Thus, gene prediction tools that were originally developed for whole-genome analysis are not suitable for processing metagenomes. Orphelia is a program for predicting genes in short DNA sequences that is available through a web server application (http://orphelia.gobics.de). Orphelia utilizes prediction models that were created with machine learning techniques on the basis of a wide range of annotated genomes. In contrast to other methods for metagenomic gene prediction, Orphelia has fragment length-specific prediction models for the two most popular sequencing techniques in metagenomics, chain termination sequencing and pyrosequencing. These models ensure highly specific gene predictions.

TICO: a tool for improving predictions of prokaryotic translation initiation sites

Author: Tech Maike
Pfeifer N.
Meinicke Peter
Morgenstern Burkhard
Publication venue
Publication date: 01/01/2005
Field of study

We provide the tool 'TICO' (Translation Initiation site COrrection) for improving the results of conventional gene finders for prokaryotic genomes with regard to exact localization of the translation initiation site (TIS). In its current state, TICO provides an interface for direct postprocessing of the predictions obtained from the widely used program GLIMMER. Our program is based on a clustering algorithm for completely unsupervised scoring of potential TIS locations.

University of Regensburg Publication Server

Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites

Author: Merkl Rainer
Tech Maike
Morgenstern Burkhard
Meinicke Peter
Publication venue
Publication date: 01/01/2004
Field of study

Background: Kernel-based learning algorithms are among the most advanced machine learning methods and have been successfully applied to a variety of sequence classification tasks within the field of bioinformatics. Conventional kernels utilized so far do not provide an easy interpretation of the learnt representations in terms of positional and compositional variability of the underlying biological signals. Results: We propose a kernel-based approach to datamining on biological sequences. With our method it is possible to model and analyze positional variability of oligomers of any length in a natural way. On the one hand, this is achieved by mapping the sequences to an intuitive but high-dimensional feature space, well-suited for interpretation of the learnt models. On the other hand, by means of the kernel trick we can provide a general learning algorithm for that high-dimensional representation because all required statistics can be computed without performing an explicit feature space mapping of the sequences. By introducing a kernel parameter that controls the degree of position-dependency, our feature space representation can be tailored to the characteristics of the biological problem at hand. A regularized learning scheme enables application even to biological problems for which only small sets of example sequences are available. Our approach includes a visualization method for transparent representation of characteristic sequence features. The importance of features can thereby be measured in terms of discriminative strength with respect to classification of the underlying sequences. To demonstrate and validate our concept on a biochemically well-defined case, we analyze E. coli translation initiation sites in order to show that we can find biologically relevant signals. For that case, our results clearly show that the Shine-Dalgarno sequence is the most important signal upstream of a start codon. The variability in position and composition we found for that signal is in accordance with previous biological knowledge. We also find evidence for signals downstream of the start codon, previously introduced as transcriptional enhancers. These signals are mainly characterized by occurrences of adenine in a region of about 4 nucleotides next to the start codon. Conclusions: We showed that the oligo kernel can provide a valuable tool for the analysis of relevant signals in biological sequences. In the case of translation initiation sites we could clearly deduce the most discriminative motifs and their positional variation from example sequences. Attractive features of our approach are its flexibility with respect to oligomer length and position conservation. By means of these two parameters oligo kernels can easily be adapted to different biological problems.
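The abstract above describes a kernel over oligomer occurrences with a parameter that controls how strongly position matters. The following is a minimal Python sketch of that idea, not the authors' implementation: the Gaussian weighting of position differences, the function names, and the parameter name `sigma` are assumptions made for illustration, and any constant scaling factor is omitted.

```python
import math
from collections import defaultdict


def oligo_positions(seq, k):
    """Map each k-mer in seq to the list of positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def oligo_kernel(s, t, k=3, sigma=1.0):
    """Position-aware k-mer similarity between sequences s and t.

    Every pair of occurrences of the same k-mer contributes a weight that
    decays with the squared distance between their positions. The Gaussian
    form and `sigma` are assumptions of this sketch.
    """
    pos_s = oligo_positions(s, k)
    pos_t = oligo_positions(t, k)
    total = 0.0
    for oligo, ps in pos_s.items():
        for p in ps:
            for q in pos_t.get(oligo, ()):
                total += math.exp(-((p - q) ** 2) / (4.0 * sigma ** 2))
    return total


seq = "AGGAGGTATG"
strict = oligo_kernel(seq, seq, k=3, sigma=0.01)  # only same-position matches count
loose = oligo_kernel(seq, seq, k=3, sigma=1e6)    # position is effectively ignored
```

Under these assumptions, a very small `sigma` rewards only oligomers that occur at identical positions, while a very large `sigma` reduces the score to a product of oligomer counts, which is one way to read the "degree of position-dependency" mentioned in the abstract.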

Springer - Publisher Connector

Directory of Open Access Journals

Gene prediction in metagenomic fragments: A large scale machine learning approach

Author: Daniel Rolf
Lingner Thomas
Morgenstern Burkhard
Tech Maike
Hoff Katharina J.
Meinicke Peter
Publication venue
Publication date: 01/01/2008
Field of study

Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. The short length of most fragments (Sanger sequencing, for example, yields fragments of 700 bp on average) and their unknown phylogenetic origin require approaches to gene prediction that differ from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability. Conclusion: Large scale machine learning methods are well-suited for gene prediction in metagenomic DNA fragments. In particular, the combination of linear discriminants and neural networks is promising and should be considered for integration into metagenomic analysis pipelines. The data sets can be downloaded from the URL provided (see Availability and requirements section).
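The two-stage design described in this abstract can be sketched in Python as a data flow: linear discriminants reduce sequence features to scalar scores, and a small neural network combines those scores with open reading frame length and GC-content into a probability. Everything below is a hypothetical illustration. The function names, the single hidden layer, the activation functions, and the all-zero placeholder weights are assumptions, not the trained models from the paper, and only the monocodon discriminant is shown.

```python
import math
from itertools import product

CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(orf):
    """Relative frequency of each of the 64 codons in an in-frame ORF."""
    counts = dict.fromkeys(CODONS, 0)
    n = 0
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon in counts:  # skip codons with ambiguous bases
            counts[codon] += 1
            n += 1
    return [counts[c] / n if n else 0.0 for c in CODONS]


def gc_content(seq):
    """Fraction of G and C bases in seq."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def linear_discriminant(features, weights, bias=0.0):
    """Stage 1: project a feature vector onto one scalar score."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def coding_probability(stage1_scores, orf_length, gc,
                       hidden_w, hidden_b, out_w, out_b):
    """Stage 2: a one-hidden-layer network with a sigmoid output."""
    inputs = list(stage1_scores) + [orf_length, gc]
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    z = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return 1.0 / (1.0 + math.exp(-z))


orf = "ATGAAATAA"
fragment = "GG" + orf + "CC"
# Placeholder (untrained) weights, used only to show the data flow.
mono_score = linear_discriminant(codon_frequencies(orf), [0.0] * 64)
prob = coding_probability([mono_score], len(orf), gc_content(fragment),
                          hidden_w=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
                          hidden_b=[0.0, 0.0], out_w=[0.0, 0.0], out_b=0.0)
```

In a real pipeline the weights would come from training, and `stage1_scores` would also hold the dicodon and translation initiation site discriminant scores named in the abstract.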

Springer - Publisher Connector

Directory of Open Access Journals

Author Instructions

Author: Instructions Author
Publication venue
Publication date: 04/11/2013
Field of study

Cartographic Perspectives (E-Journal - North American Cartographic Information Society, NACIS)

Going Beyond Counting First Authors in Author Co-citation Analysis

Author: Zhao Dangzhi
Publication venue
Publication date: 01/01/2005
Field of study

The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
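The difference between the two counting schemes in this abstract can be illustrated with a short Python sketch. The data layout (each citing paper as a list of cited works, each cited work as an ordered author list), the function name, and the choice to let co-authors of the same cited work count as co-cited are assumptions made for illustration; the study may define these details differently.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how many citing papers co-cite each pair of authors.

    max_authors=1 gives first-author counting; max_authors=5 considers
    the first five authors of every cited work.
    """
    counts = Counter()
    for cited_works in citing_papers:
        authors = set()
        for work_authors in cited_works:
            authors.update(work_authors[:max_authors])
        # Each citing paper contributes at most 1 to any author pair.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers with their cited works.
papers = [
    [["A", "B"], ["C", "D"]],  # paper 1 cites works by A+B and C+D
    [["C"], ["A"]],            # paper 2 cites works by C and by A
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With this toy data, first-author counting sees only the pair (A, C), while counting the first five authors also links B and D to the other authors.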

E-LIS

Variations on the Author

Author: Sayad Cecilia
Publication venue
Publication date: 01/01/2016
Field of study

“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.