Integrative Analysis of Next-Generation Sequencing for Next-Generation Cancer Research toward Artificial Intelligence
The rapid improvement of next-generation sequencing (NGS) technologies and their application in large-scale cohorts in cancer research led to common challenges of big data. It opened a new research area incorporating systems biology and machine learning. As large-scale NGS data accumulated, sophisticated data analysis methods became indispensable. In addition, NGS data have been integrated with systems biology to build better predictive models to determine the characteristics of tumors and tumor subtypes. Therefore, various machine learning algorithms were introduced to identify underlying biological mechanisms. In this work, we review novel technologies developed for NGS data analysis, and we describe how these computational methodologies integrate systems biology and omics data. Subsequently, we discuss how deep neural networks outperform other approaches, the potential of graph neural networks (GNN) in systems biology, and the limitations in NGS biomedical research. To reflect on the various challenges and corresponding computational solutions, we discuss three topics: (i) molecular characteristics, (ii) tumor heterogeneity, and (iii) drug discovery. We conclude that machine learning and network-based approaches can add valuable insights and build highly accurate models. However, a well-informed choice of learning algorithm and biological network information is crucial for the success of each specific research question.
Fostering reproducibility, reusability, and technology transfer in health informatics
Summary: Computational methods can transform healthcare. In particular, health informatics with artificial intelligence has shown tremendous potential when applied in various fields of medical research and has opened a new era for precision medicine. The development of reusable biomedical software for research or clinical practice is time-consuming and requires rigorous compliance with quality requirements as defined by international standards. However, research projects rarely implement such measures, hindering smooth technology transfer into the research community or manufacturers as well as reproducibility and reusability. Here, we present a guideline for quality management systems (QMS) for academic organizations incorporating the essential components while confining the requirements to an easily manageable effort. It provides a starting point to implement a QMS tailored to specific needs effortlessly and greatly facilitates technology transfer in a controlled manner, thereby supporting reproducibility and reusability. Ultimately, the emerging standardized workflows can pave the way for an accelerated deployment in clinical practice.
Fractal construction of constrained code words for DNA storage systems
Abstract: The use of complex biological molecules to solve computational problems is an emerging field at the interface between biology and computer science. There are two main categories in which biological molecules, especially DNA, are investigated as alternatives to silicon-based computer technologies. One is to use DNA as a storage medium, and the other is to use DNA for computing. Both strategies come with certain constraints. In the current study, we present a novel approach derived from chaos game representation for DNA to generate DNA code words that fulfill user-defined constraints, namely GC content, homopolymers, and undesired motifs, and thus can be used to build codes for reliable DNA storage systems.
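The three constraints named in this abstract (GC content, homopolymers, undesired motifs) can be illustrated with a small checker. This is only a sketch of constraint validation, not the chaos-game-based construction the authors describe; the function names, the default GC range of 0.4 to 0.6, and the default maximum run length of 3 are assumptions chosen for illustration.

```python
def gc_content(word: str) -> float:
    """Fraction of G and C bases in a non-empty DNA word."""
    return sum(base in "GC" for base in word) / len(word)


def longest_homopolymer(word: str) -> int:
    """Length of the longest run of one repeated base in a non-empty word."""
    longest = run = 1
    for prev, cur in zip(word, word[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def satisfies_constraints(word, gc_range=(0.4, 0.6), max_run=3, forbidden=()):
    """Check a candidate code word against user-defined constraints.

    gc_range, max_run and forbidden are hypothetical parameters,
    not values taken from the paper.
    """
    # Reject empty words and anything outside the DNA alphabet.
    if not word or set(word) - set("ACGT"):
        return False
    low, high = gc_range
    return (
        low <= gc_content(word) <= high
        and longest_homopolymer(word) <= max_run
        and not any(motif in word for motif in forbidden)
    )
```

For example, `satisfies_constraints("ACGTACGT")` passes under the assumed defaults, while the same word fails once `"GTAC"` is listed as a forbidden motif.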
CORDITE: The Curated CORona Drug InTERactions Database for SARS-CoV-2
Since the outbreak in 2019, researchers have been trying to find effective drugs against the SARS-CoV-2 virus based on de novo drug design and drug repurposing. The former approach is very time-consuming and needs extensive testing in humans, whereas drug repurposing is more promising, as the drugs have already been tested for side effects, etc. At present, there is no treatment for COVID-19 that is clinically effective, but there is a huge amount of data from studies that analyze potential drugs. We developed CORDITE to efficiently combine state-of-the-art knowledge on potential drugs and make it accessible to scientists and clinicians. The web interface also provides access to an easy-to-use API that allows wide use by other software and applications, e.g., for meta-analysis, design of new clinical studies, or simple literature search. CORDITE is currently empowering many scientists across all continents and accelerates research in the knowledge domains of virology and drug design.
Evaluation of machine learning strategies for imaging confirmed prostate cancer recurrence prediction on electronic health records
Federated Random Forests can improve local performance of predictive models for various healthcare applications
Abstract. Motivation: Limited data access has hindered the field of precision medicine from exploring its full potential, e.g. concerning machine learning and privacy and data protection rules. Our study evaluates the efficacy of federated Random Forests (FRF) models, focusing particularly on the heterogeneity within and between datasets. We addressed three common challenges: (i) number of parties, (ii) sizes of datasets and (iii) imbalanced phenotypes, evaluated on five biomedical datasets. Results: The FRF outperformed the average local models and performed comparably to the data-centralized models trained on the entire data. With an increasing number of models and decreasing dataset size, the performance of local models decreases drastically. The performance of the FRF, however, does not decrease significantly. When combining datasets of different sizes, the FRF vastly improve compared to the average local models. By analyzing different class imbalances, we demonstrate that the FRF remain more robust than the local models and outperform them. Our results support that FRF overcome boundaries of clinical research and enable collaborations across institutes without violating privacy or legal regulations. Clinicians benefit from a vast collection of unbiased data aggregated from different geographic locations, demographics and other varying factors. They can build more generalizable models to make better clinical decisions, which will have relevance, especially for patients in rural areas and rare or geographically uncommon diseases, enabling personalized treatment. In combination with secure multi-party computation, federated learning has the power to revolutionize clinical practice by increasing the accuracy and robustness of healthcare AI and thus paving the way for precision medicine. Availability and implementation: The implementation of the federated random forests can be found at https://featurecloud.ai/. Supplementary information: Supplementary data are available at Bioinformatics online.
3'-Phosphoadenosine 5'-phosphosulfate (PAPS) synthases, naturally fragile enzymes specifically stabilized by nucleotide binding
Activated sulfate in the form of 3'-phosphoadenosine 5'-phosphosulfate (PAPS) is needed for all sulfation reactions in eukaryotes with implications for the build-up of extracellular matrices, retroviral infection, protein modification, and steroid metabolism. In metazoans, PAPS is produced by bifunctional PAPS synthases (PAPSS). A major question in the field is why two human protein isoforms, PAPSS1 and -S2, are required that cannot complement each other. We provide evidence that these two proteins differ markedly in their stability as observed by unfolding monitored by intrinsic tryptophan fluorescence as well as circular dichroism spectroscopy. At 37 °C, the half-life for unfolding of PAPSS2 is in the range of minutes, whereas PAPSS1 remains structurally intact. In the presence of their natural ligand, the nucleotide adenosine 5'-phosphosulfate (APS), PAPS synthase proteins are stabilized. Invertebrates only possess one PAPS synthase enzyme that we classified as PAPSS2-type by sequence-based machine learning techniques. To test this prediction, we cloned and expressed the PPS-1 protein from the roundworm Caenorhabditis elegans and also subjected this protein to thermal unfolding. With respect to thermal unfolding and the stabilization by APS, PPS-1 behaved like the unstable human PAPSS2 protein, suggesting that the less stable protein is evolutionarily older. Finally, APS binding more than doubled the half-life for unfolding of PAPSS2 at physiological temperatures and effectively prevented its aggregation on a time scale of days. We propose that protein stability is a major contributing factor for PAPS availability that has not as yet been considered. Moreover, naturally occurring changes in APS concentrations may be sensed by changes in the conformation of PAPSS2.
Artificial Intelligence Bioinformatics: Development and Application of Tools for Omics and Inter-Omics Studies
This eBook is a collection of articles from a Frontiers Research Topic. Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: frontiersin.org/about/contac
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
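The difference between the two counting schemes compared in this abstract can be sketched in a few lines. This is an illustrative reading, not the study's actual procedure: the function name, the input layout, and two counting rules are assumptions, namely that each citing paper adds at most one count per author pair, and that co-authors of the same cited work count as co-cited with each other.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, max_authors=1):
    """Count author co-citations across citing papers.

    reference_lists: one entry per citing paper; each entry is a list of
    cited works, and each cited work is an ordered list of author names.
    max_authors=1 gives traditional first-author counting; max_authors=5
    credits the first five authors of every cited work.
    """
    counts = Counter()
    for cited_works in reference_lists:
        # Collect the credited authors of this citing paper once each,
        # so a pair gains at most one count per citing paper (assumption).
        credited = set()
        for authors in cited_works:
            credited.update(authors[:max_authors])
        # Sorting gives every pair a canonical (alphabetical) key.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts
```

Calling the function twice on the same reference lists, once with `max_authors=1` and once with `max_authors=5`, yields the two co-citation matrices whose downstream mappings the study compares.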
