
    Hardware and Software Solutions for Energy-Efficient Computing in Scientific Programming

    Energy consumption is one of the major issues in today's computer science, and an increasing number of scientific communities are interested in evaluating the tradeoff between time-to-solution and energy-to-solution. Although computing has revolved around centralized infrastructures, such as supercomputing and data centers, over the last two decades, the wide adoption of the Internet of Things (IoT) paradigm is currently inverting this trend: the huge amount of data it generates pushes computing power back to the places where the data are generated, the so-called fog/edge computing. This shift towards a decentralized model requires an equivalent change in the software engineering paradigms, development environments, hardware tools, languages, and computation models for scientific programming, because local computational capabilities are typically limited and require a careful evaluation of power consumption. This paper shows how these concepts can be implemented in scientific software by presenting the state of the art of powerful yet less power-hungry processors on the one hand and energy-aware tools and techniques on the other.

    Advantages of using graph databases to explore chromatin conformation capture experiments

    Background: High-throughput sequencing Chromosome Conformation Capture (Hi-C) allows the study of DNA interactions and 3D chromosome folding at the genome-wide scale. Usually, these data are represented as matrices describing the binary contacts among the different chromosome regions. A graph-based representation, on the other hand, can be advantageous for describing the complex topology adopted by DNA in the nucleus of eukaryotic cells. Methods: Here we discuss the use of a graph database for storing and analysing data obtained from Hi-C experiments. The main issue is the size of the produced data and, when working with a graph-based representation, the consequent need to adequately manage a large number of edges (contacts) connecting nodes (genes), which represent the sources of information. For this task, currently available graph visualisation tools and libraries fall short with Hi-C data. Graph databases, instead, support both the analysis and the visualisation of the spatial patterns present in Hi-C data, in particular for comparing different experiments or for efficiently re-mapping omics data in a space-aware context. Specifically, the possibility of describing graphs through statistical indicators and, even more, the capability of correlating them through statistical distributions make it possible to highlight similarities and differences among Hi-C experiments in different cell conditions or different cell types. Results: These concepts have been implemented in NeoHiC, an open-source and user-friendly web application for the progressive visualisation and analysis of Hi-C networks based on the Neo4j graph database (version 3.5). Conclusion: With the accumulation of more experiments, the tool will provide invaluable support for comparing the neighbours of genes across experiments and conditions, helping to highlight changes in functional domains and to identify new co-organised genomic compartments.

    Porting bioinformatics applications from grid to cloud: A macromolecular surface analysis application case study

    In this paper we describe our experience in exploiting different cloud-based environments for an actual use case taken from the bioinformatics domain: molecular surface analysis, which identifies similarities and possible complementarities among protein surfaces. The analysis of macromolecular surfaces is important because protein surface conformations drive many biological reactions. We developed a workflow that performs the macromolecular surface analysis and provides interesting results from a scientific point of view. An important issue is that the workflow is highly compute-intensive: it cannot be run on a single-CPU system for meaningful use cases, and a parallel infrastructure is required to obtain reasonable execution times. For a decade, grid infrastructures have been suitable solutions for obtaining cost-effective computational power for bioinformatics applications. However, these solutions do not offer adequate customisation of the computational environment (e.g. installing databases and configuring virtual networks) because of the rigid organisation of the storage and computational sites. Running applications on customised machines built from user-defined images simplifies the computing model, decreases the failure rate, and therefore reduces waiting times for production analysis with respect to canonical grid computations. For these reasons, a cloud-based approach is more suitable than a pure grid paradigm. We experimented with two cloud-based approaches, based on the Worker Node On Demand Service and on OpenStack, to run the molecular surface analysis use case, and we compared the results with grid computing in terms of performance, efficiency, and the effort required to build the computing model.

    Latest advances in parallel, distributed, and network-based processing

    This editorial introduces the articles selected for the special issue on the International Conferences on Parallel, Distributed, and Network-Based Processing, which provided insights into the efficient exploitation of parallel and distributed architectures, including power-aware computing, application scheduling, and application development for GPUs.

    NeoHiC: A web application for the analysis of Hi-C data

    High-throughput sequencing Chromosome Conformation Capture (Hi-C) allows the study of chromatin interactions and 3D chromosome folding on a large scale. A graph-based multi-level representation of Hi-C data is essential for proper visualisation of the spatial patterns they represent, in particular for comparing different experiments or for re-mapping omics data in a space-aware context. The size of Hi-C data hampers the straightforward use of currently available graph visualisation tools and libraries. In this paper, we present the first version of NeoHiC, a user-friendly web application for the progressive graph visualisation of Hi-C data based on the Neo4j graph database. The user can select the richness of the environment of the query gene by choosing among a large number of proximity and distance metrics.

    Combining Edge and Cloud computing for low-power, cost-effective metagenomics analysis

    Metagenomic studies are becoming increasingly widespread, yielding important insights into microbial communities across diverse environments, from terrestrial to aquatic ecosystems. This is also because genome sequencing is likely to become a routine and ubiquitous analysis in the near future, thanks to a new generation of portable devices such as the Oxford Nanopore MinION. The main issue, however, is the huge amount of data produced by these devices, whose management is challenging given the resources required for efficient data transfer and processing. In this paper we discuss these aspects, and in particular how Edge and Cloud computing can be coupled to manage the full analysis pipeline. In general, a proper scheduling of the computational services between the data center and smart devices equipped with low-power processors represents an effective solution.

    Parallel Computing in Deep Learning: Bioinformatics Case Studies

    In the last two decades, deep learning has attracted a lot of attention internationally, solving problems in different application domains and achieving results beyond expectations. For example, it has been applied in bioinformatics, game playing, image processing, object detection, robotics, and drug discovery. One of the main reasons for the increased use of deep learning algorithms is the need for approaches to analyse the large amount of data produced in every field, which has led researchers to dedicate their work to deep learning development. One of the main topics discussed to date is the possibility of training deep models in parallel, so as to reduce the time otherwise needed to find the hyperparameters and to obtain results faster.