Viral proteins originated de novo by overprinting can be identified by codon usage: application to the "gene nursery" of Deltaretroviruses
A well-known mechanism through which new protein-coding genes originate is by modification of pre-existing genes, e.g. by duplication or horizontal transfer. In contrast, many viruses generate protein-coding genes de novo, via the overprinting of a new reading frame onto an existing (“ancestral”) frame. This mechanism is thought to play an important role in viral pathogenicity, but has been poorly explored, perhaps because identifying the de novo frames is very challenging. Therefore, a new approach to detect them was needed. We assembled a reference set of overlapping genes for which we could reliably determine the ancestral frames, and found that their codon usage was significantly closer to that of the rest of the viral genome than the codon usage of de novo frames. Based on this observation, we designed a method that identifies de novo frames from their codon usage with very good specificity but intermediate sensitivity. Using our method, we predicted that the Rex gene of deltaretroviruses has originated de novo by overprinting the Tax gene. Intriguingly, several genes in the same genomic region have also originated de novo and encode proteins that regulate the functions of Tax. Such “gene nurseries” may be common in viral genomes. Finally, our results confirm that the genomic GC content is not the only determinant of codon usage in viruses and suggest that a constraint linked to translation must influence codon usage.
sj-docx-1-eso-10.1177_23969873221139410 – Supplemental material for Prolonged cardiac monitoring for stroke prevention: A systematic review and meta-analysis of randomized-controlled clinical trials
Supplemental material, sj-docx-1-eso-10.1177_23969873221139410 for Prolonged cardiac monitoring for stroke prevention: A systematic review and meta-analysis of randomized-controlled clinical trials by Georgios Tsivgoulis, Lina Palaiodimou, Sokratis Triantafyllou, Martin Köhrmann, Polychronis Dilaveris, Konstantinos Tsioufis, Gkikas Magiorkinis, Christos Krogias, Peter D Schellinger, Valeria Caso, Maurizio Paciaroni, Mukul Sharma, Robin Lemmens, David J Gladstone, Tommaso Sanna, Rolf Wachter, Gerasimos Filippatos and Aristeidis H Katsanos in European Stroke Journal
Prediction of the ancestral frame in overlapping genes from the benchmark dataset.
(1)<p>The last two overlaps have entered their genome by horizontal transfer and are not taken into account for calculations of specificity and sensitivity of the method.</p><p>Abbreviations and conventions are the same as in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi-1003162-t002" target="_blank">Table 2</a>. A frame is predicted ancestral if its <i>r<sub>s</sub></i> is positive and significantly higher than the <i>r<sub>s</sub></i> of the other frame (P<0.05, corresponding to t-Hotelling >1.70). If no prediction is possible, the field is left blank. Numerical values are the same as in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi-1003162-t003" target="_blank">Table 3</a> for actual frames, but are reproduced here for clarity.</p>
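The prediction rule in the note above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes that "t-Hotelling" refers to Hotelling's (1940) t statistic for comparing two correlations that share one variable (here, the genome's codon usage), and that `n` is the number of codon categories over which the correlations were computed. The function names, the `r_12` argument (the correlation between the two frames) and the default of 64 categories in the usage note are hypothetical. Only the 1.70 threshold and the "positive and significantly higher" condition come from the text.

```python
import math


def hotelling_t(r_a, r_b, r_ab, n):
    """Hotelling's t for two dependent correlations sharing one variable.

    r_a, r_b : correlation of frame A / frame B with the genome (assumed inputs)
    r_ab     : correlation between the two frames themselves (assumed input)
    n        : number of observations behind each correlation (assumed)
    """
    # Determinant of the 3x3 correlation matrix of (genome, frame A, frame B).
    det = 1 - r_a**2 - r_b**2 - r_ab**2 + 2 * r_a * r_b * r_ab
    if n <= 3 or det <= 0:
        raise ValueError("need n > 3 and a positive-definite correlation matrix")
    return (r_a - r_b) * math.sqrt((n - 3) * (1 + r_ab) / (2 * det))


def predict_ancestral(r_1, r_2, r_12, n, t_crit=1.70):
    """Return 'frame 1', 'frame 2', or None when no prediction is possible.

    A frame is called ancestral only if its r_s is positive AND significantly
    higher than the other frame's r_s (threshold 1.70 as quoted in the text).
    """
    t = hotelling_t(r_1, r_2, r_12, n)
    if t > t_crit and r_1 > 0:
        return "frame 1"
    if -t > t_crit and r_2 > 0:
        return "frame 2"
    return None
```

For example, `predict_ancestral(0.6, 0.1, 0.2, 64)` calls frame 1 ancestral, whereas two nearly equal coefficients, or a higher coefficient that is still negative, yield `None`, matching the "field is left blank" convention.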
Benchmark dataset of 27 overlapping genes with known genealogy.
(1)<p>gene overlaps described previously (see reference <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi.1003162-Rancurel1" target="_blank">[3]</a>).</p>(2)<p>additional overlaps collected for this study.</p>(3)<p>The function is that of the overlapping region of the protein; if it is not known, the field is left blank.</p>(4)<p>The NS2 proteins of <i>brevidensoviruses</i> and that of <i>densoviruses</i> are not homologous (they are encoded in different frames relative to NS1).</p>(5)<p>The <i>alphacarmotetravirus</i> polymerase and <i>machlomovirus</i> capsid have originated by horizontal transfer and thus the two corresponding overlaps are not part of the benchmark dataset, although we perform the same analyses on them as on the other overlaps (see text).</p><p>Abbreviations: AAP, assembly-activating protein; dsRNA, double-stranded RNA; C-term, C-terminal; L, large envelope protein; MP, movement protein; NABP, nucleic-acid binding protein; NS, non-structural protein; NSs, non-structural protein of the small RNA segment; N-term, N-terminal; Pog, predicted overlapping gene; Pol, Polymerase; SAT, small alternatively translated protein; ssDNA, single-stranded DNA; ssRNA, single-stranded RNA (+, positive or −, negative); TGBp2, Triple Gene Block protein 2; TGBp3, Triple Gene Block protein 3; VP, viral protein.</p>
Prediction, by codon usage, of the ancestral frame in overlapping reading frames with identical phylogenetic distribution.
<p>Conventions are the same as in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi-1003162-t003" target="_blank">Table 3</a>. A frame is predicted ancestral if its <i>r<sub>s</sub></i> is positive and significantly higher than the <i>r<sub>s</sub></i> of the other frame (P<0.05, corresponding to t-Hotelling >1.70).</p>
A “gene nursery”: the pX region of <i>deltaretroviruses</i>.
<p>The pX region of HTLV1 encodes five genes unique to <i>deltaretroviruses</i> by a complex pattern of alternative splicing and leaky scanning <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi.1003162-Gessain1" target="_blank">[36]</a>, <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi.1003162-Baydoun1" target="_blank">[39]</a>. The initial exons of these genes are very short and are not represented, nor are the alternatively expressed shorter versions of p12 and p30. Only the 3′ end of the Env gene is represented. The figure is approximately to scale. Ancestral regions are shown in red and <i>de novo</i> regions in blue. Frame numbering is as in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi.1003162-Firth2" target="_blank">[45]</a>, with the Tax frame taken as “0”. Protein regions with unusually low sequence complexity are indicated by dashed, grey lines.</p>
Presumed evolution of the <i>deltaretrovirus</i> pX region.
<p>The deltaretrovirus phylogeny is shown as a cladogram. Conventions are the same as in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi-1003162-g003" target="_blank">Figure 3</a>.</p>
Analysis of the codon usage of overlapping frames from the benchmark dataset.
<p>Abbreviations are the same as in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003162#pcbi-1003162-t001" target="_blank">Table 1</a>. The last two overlaps have entered their genome by horizontal transfer (see text).</p><p><i>r<sub>sA</sub></i> is the Spearman rank correlation coefficient <i>r<sub>s</sub></i> between the codon usage of the ancestral frame and that of its genome. <i>r<sub>sN</sub></i> is the equivalent coefficient for the <i>de novo</i> frame. N<sub>A</sub> and N<sub>N</sub> are the number of codons on which <i>r<sub>sA</sub></i> and <i>r<sub>sN</sub></i> were calculated. The first row indicates whether calculations are presented for the actual overlapping frames or for the corresponding simulated frames. The calculation of P for the actual frames is based on Hotelling's t-test, whereas for simulated frames P is based on the distribution of the simulated <i>d<sub>21</sub></i> (see text). Agreement between t-Hotelling and simulation is calculated on the basis of whether corresponding P-values are both <0.05 or >0.05.</p>
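The coefficients defined in this caption can be illustrated with a minimal Python sketch that computes a Spearman rank correlation between the codon usage of one reading frame and that of a reference sequence. It is a simplified illustration rather than the authors' pipeline. Counting over all 64 codons, passing the frame as an offset of 0, 1 or 2, and the function names are assumptions made here; the paper's exact codon set and its handling of the rest of the genome are not given in this listing.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order (assumption: stop codons are kept).
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_counts(seq, frame=0):
    """Count codons of `seq` read in the given frame offset (0, 1 or 2)."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return [counts.get(c, 0) for c in CODONS]


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman r_s: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("r_s is undefined when one input is constant")
    return cov / (vx * vy) ** 0.5
```

Under these assumptions, something like `spearman(codon_counts(overlap, 0), codon_counts(genome_rest, 0))` would play the role of <i>r<sub>sA</sub></i> or <i>r<sub>sN</sub></i>, depending on which frame of the overlap is read; `overlap` and `genome_rest` are placeholder names for sequences the user supplies.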
A genomic hotspot of origination of silencing suppressors in plus-strand RNA viruses.
<p>The replicases of <i>Nodaviridae</i> and <i>Bromoviridae</i> contain C-terminal extensions predicted to be disordered (thin boxes) downstream of their homologous polymerase (RdRP) domain. These extensions encode structurally unrelated suppressors of RNA silencing, B2 and 2b (PDB accession codes 2AZ2 and 2ZI0, respectively), in different reading frames. Neither the C-terminal extensions nor the suppressors of RNA silencing have detectable sequence similarity, even between closely related genera. Which region is ancestral in each overlap could not be determined (see text).</p>
