1,720,988 research outputs found

    Stylized facts of financial time series and hidden semi-Markov models

    No full text
    Hidden Markov models reproduce most of the stylized facts about daily series of returns. A notable exception is the inability of the models to reproduce one ubiquitous feature of such time series, namely the slow decay in the autocorrelation function of the squared returns. It is shown that this stylized fact can be described much better by means of hidden semi-Markov models. This is illustrated by examining the fit of two such models to 18 series of daily sector returns. (c) 2006 Elsevier B.V. All rights reserved

    Improving Hidden Markov Models for Classification of Human Immunodeficiency Virus-1 Subtypes through Linear Classifier Learning

    No full text
    Profile Hidden Markov Models (pHMMs) are widely used to model nucleotide or protein sequence families. In many applications, a sequence family classified into several subfamilies is given and each subfamily is modeled separately by one pHMM. A major drawback of this approach is the difficulty of coping with subfamilies composed of very few sequences. Correct subtyping of human immunodeficiency virus-1 (HIV-1) sequences is one of the most crucial bioinformatic tasks affected by this problem of small subfamilies, i.e., HIV-1 subtypes with a small number of known sequences. To deal with small samples for particular subfamilies of HIV-1, we employ a machine learning approach. More precisely, we make use of an existing HMM architecture and its associated inference engine, while replacing the unsupervised estimation of emission probabilities by a supervised method. For that purpose, we use regularized linear discriminant learning together with a balancing scheme to account for the widely varying sample size. After training the multiclass linear discriminants, the corresponding weights are transformed to valid probabilities using a softmax function. We apply this modified algorithm to classify HIV-1 sequence data (in the form of partial-length HIV-1 sequences and semi-artificial recombinants) and show that the performance of pHMMs can be significantly improved by the proposed technique.Deutsche Forschungsgemeinschaft [STA 1009/5-1

    hsmm - An R package for analyzing hidden semi-Markov models

    No full text
    Hidden semi-Markov models are a generalization of the well-known hidden Markov model. They allow for a greater flexibility of sojourn time distributions, which implicitly follow a geometric distribution in the case of a hidden Markov chain. The aim of this paper is to describe hsmm, a new software package for the statistical computing environment R. This package allows for the simulation and maximum likelihood estimation of hidden semi-Markov models. The implemented Expectation Maximization algorithm assumes that the time spent in the last visited state is subject to right-censoring. it is therefore not subject to the common limitation that the last visited state terminates at the last observation. Additionally, hsmm permits the user to make inferences about the underlying state sequence via the Viterbi algorithm and smoothing probabilities. (C) 2008 Elsevier B.V. All rights reserved.German Research Foundation (DFG

    Catalogue as a tool for reinforcing habits: Empirical evidence from a multichannel retailer

    No full text
    Retailers are experiencing a systematic shift in the buying habits of their customers as more customers buy across different channels. Marketing managers face the daunting task of embracing online and offline channels to engage consumers, influence choice, and create habits to sustain a competitive advantage. We develop a dynamic segmentation model of channel choice and purchase frequency to assess the responsiveness of segments to catalogues and email communications. In addition, we perform profitability analysis to offer insights on the profitability of using catalogues and emails to reach customers. For certain firms, especially those with a history of using catalogue mailings, the findings suggest that catalogues remain relevant and are an effective tool at influencing purchases across both online and offline channels despite the increasing trend toward digital marketing. In addition, we found a segment of digital consumers respond favorably to both emails and catalogues. We argue catalogues have retained their competitive advantage over email marketing communication because the catalogue may not compete for attention with consumers' other digital distractions. Crown Copyright (C) 2019 Published by Elsevier B.V. All rights reserved

    Detection of viral sequence fragments of HIV-1 subfamilies yet unknown

    Full text link
    Abstract Background Methods of determining whether or not any particular HIV-1 sequence stems - completely or in part - from some unknown HIV-1 subtype are important for the design of vaccines and molecular detection systems, as well as for epidemiological monitoring. Nevertheless, a single algorithm only, the Branching Index (BI), has been developed for this task so far. Moving along the genome of a query sequence in a sliding window, the BI computes a ratio quantifying how closely the query sequence clusters with a subtype clade. In its current version, however, the BI does not provide predicted boundaries of unknown fragments. Results We have developed Unknown Subtype Finder (USF), an algorithm based on a probabilistic model, which automatically determines which parts of an input sequence originate from a subtype yet unknown. The underlying model is based on a simple profile hidden Markov model (pHMM) for each known subtype and an additional pHMM for an unknown subtype. The emission probabilities of the latter are estimated using the emission frequencies of the known subtypes by means of a (position-wise) probabilistic model for the emergence of new subtypes. We have applied USF to SIV and HIV-1 sequences formerly classified as having emerged from an unknown subtype. Moreover, we have evaluated its performance on artificial HIV-1 recombinants and non-recombinant HIV-1 sequences. The results have been compared with the corresponding results of the BI. Conclusions Our results demonstrate that USF is suitable for detecting segments in HIV-1 sequences stemming from yet unknown subtypes. Comparing USF with the BI shows that our algorithm performs as good as the BI or better.</p

    A Robust and Versatile Method of Combinatorial Chemical Synthesis of Gene Libraries via Hierarchical Assembly of Partially Randomized Modules.

    No full text
    A major challenge in gene library generation is to guarantee a large functional size and diversity that significantly increases the chances of selecting different functional protein variants. The use of trinucleotides mixtures for controlled randomization results in superior library diversity and offers the ability to specify the type and distribution of the amino acids at each position. Here we describe the generation of a high diversity gene library using tHisF of the hyperthermophile Thermotoga maritima as a scaffold. Combining various rational criteria with contingency, we targeted 26 selected codons of the thisF gene sequence for randomization at a controlled level. We have developed a novel method of creating full-length gene libraries by combinatorial assembly of smaller sub-libraries. Full-length libraries of high diversity can easily be assembled on demand from smaller and much less diverse sub-libraries, which circumvent the notoriously troublesome long-term archivation and repeated proliferation of high diversity ensembles of phages or plasmids. We developed a generally applicable software tool for sequence analysis of mutated gene sequences that provides efficient assistance for analysis of library diversity. Finally, practical utility of the library was demonstrated in principle by assessment of the conformational stability of library members and isolating protein variants with HisF activity from it. Our approach integrates a number of features of nucleic acids synthetic chemistry, biochemistry and molecular genetics to a coherent, flexible and robust method of combinatorial gene synthesis

    Going Beyond Counting First Authors in Author Co-citation Analysis

    Full text link
    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
    corecore