Progetto ARES: Advanced networking for EU genomic RESearch

Author: VALOCCHI DARIO
NUNZI Emilia
Publication date: 01/01/2014

Genomic data are being generated at a rate that grows faster than Moore's law, and therefore significantly faster than upgrades to the transmission and storage capacity of data networks. As a result, users experience increasing difficulty in managing genomic data, to the point that data are sometimes transferred by means other than networks. For example, the Beijing Genomics Institute, which currently processes 2,000 human genomes per day, ships hard disks containing the data by express courier instead of transmitting them over the Internet or other networks [8]. To get an idea of how serious the problem is, suppose a researcher wants to determine the characteristics of a genome with respect to a specific disease found in several countries around the world. In that case, not only does the number of genome files to manage and analyse become extremely large, but each individual's data set is itself significantly large, on the order of tens of GB. Genome processing, particularly for the human genome, generally proceeds by running a pipeline of software packages. There are several types of pipeline, each specific to particular research or diagnostic needs [1]. The input files of a pipeline include genome files, results of earlier processing (called annotations), and the human genome reference model [6] used to align the data [5], [6], [7]. Although a patient's genome can be stored in a local database, all the other files, which reside in databases hosted on physically and geographically distinct servers, must be downloaded over the network. The overall size of these files varies from a few GB to tens of GB. Processing can begin only once all files have been transferred, and it can itself take hours.
In short, the total time required to obtain the results of a processing request could exceed 24 hours. Given the rapid and imminent spread of sequencing and of the use of genomic data for diagnostic purposes, this problem raises at least two critical issues: minimising service delivery time, for example when a serious disease must be diagnosed, and managing data traffic in the network. While researchers at a relatively small number of prestigious organisations have powerful parallel computing facilities at their disposal [3], this is generally not true of general medical centres and public hospitals, particularly in countries where the network and service infrastructure does not offer high performance. In this context, the Perugia research unit acts as a unit leader in the ARES project (Advanced networking for EU genomic RESearch) [4], whose main objective is to optimise the management of network resources for processing and transferring human genome data, which, if treated as generic “big data”, lead to sub-optimal management of network resources. Besides describing the system, this paper reports preliminary experimental results showing that a careful choice of the parameters of the processing, management and service delivery algorithms, which are based on integrating the Content Distribution Network (CDN) model with the Cloud model, makes it possible to tailor network services to the specific needs of medical and healthcare staff who require processing of genomic data characterised by very large file sizes. The ARES project, accepted under the first open call of the Géant/GN3plus project, is co-funded by the European Commission

IRIS - Res&Arch Institutional Research Information System Università degli Studi di Perugia

ARES: Advanced Networking for Distributing Genomic Data

Author: FEMMINELLA Mauro
VALOCCHI DARIO
REALI Gianluca
NUNZI Emilia
Publication date: 01/01/2014

This paper presents the network and service architecture being implemented within the project ARES (Advanced networking for the EU genomic RESearch). The architecture is designed both to deliver genomic data sets over the GÉANT network and to support genomic research in EU countries. To this end, the strategic objective of the project ARES is to create a novel Content Distribution Network (CDN) architecture suitable for handling the rapidly increasing diffusion of genomic data. This paper summarizes the status of the project, the ongoing research, and the achieved and expected results. The CDN architecture is based on an evolved NSIS signalling, and addresses the major challenges of managing genomic data sets over a shared wideband network with a limited amount of resources made available to the service. Besides a detailed description of the functional entities included in the ARES architecture, we illustrate the signalling protocols that support their interaction, and provide preliminary experimental results obtained by implementing and deploying two significant research scenarios within our research laboratories

IRIS - Res&Arch Institutional Research Information System Università degli Studi di Perugia

A resource discovery framework for cloud-based genomics computing

Author: Dario Valocchi
Gianluca Reali
Emilia Nunzi
Mauro Femminella
Publication date: 01/01/2014

In recent years scientific computing has evolved towards massive usage of cloud computing, due to its flexibility in managing computing resources. In this paper, we focus on genomic data processing, which is rapidly gaining momentum in research and medical activities. The main characteristic of these data sets is that not only is the number of available genome files becoming extremely large, but each individual data set is also significantly large, in the order of tens of GB. Hence, a wide diffusion of cloud-based genomic data processing will have a significant impact on network resources, since each processing request will require the transfer of tens of GB to computing nodes. To face this issue, in this paper we propose a resource discovery framework which provides decision agents with the information needed to select the most suitable computing nodes. We have implemented this resource discovery function in a distributed fashion, and extensively tested it in a lab testbed consisting of about 70 nodes. We found that the overhead of the proposed solution is negligible in comparison with the amount of transferred data

Crossref

IRIS - Res&Arch Institutional Research Information System Università degli Studi di Perugia

The ARES Project: Cloud Services for Medical Genomics

Author: Dario Valocchi
Gianluca Reali
Valerio Napolioni
Emilia Nunzi
Matteo Picciolini
Mauro Femminella
Publication date: 01/01/2014

This paper presents the cloud services provided by the project ARES. The network solutions are illustrated in a companion paper at the same conference. The ARES project aims to deploy CDN services over a broadband network for accessing and exchanging genomic datasets, accessible by medical and research personnel through a Cloud interface. This paper illustrates the procedure defined to access such services, and provides a case-study simulation to show the implementation of the bioinformatics pipeline included. The experimental activity in ARES aims to gain a detailed understanding of the network problems relating to its sustainability given the increasing use of genomics for diagnostic purposes. The main aim is to allow an extensive use of genomic data through the collection of relevant information on diseases available from the network in the medical and diagnostic field

Crossref

IRIS - Res&Arch Institutional Research Information System Università degli Studi di Perugia

Author Instructions

Author: Instructions Author
Publication date: 04/11/2013

Crossref

Cartographic Perspectives (E-Journal - North American Cartographic Information Society, NACIS)

Going Beyond Counting First Authors in Author Co-citation Analysis

Author: Zhao Dangzhi
Publication date: 01/01/2005

The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed

E-LIS

Variations on the Author

Author: Sayad Cecilia
Publication date: 01/01/2016

“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship

Crossref

Kent Academic Repository

Appropriate Similarity Measures for Author Cocitation Analysis

Author: Waltman L.R.
Eck N.J.P. van

We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis

Research Papers in Economics

Dispelling the Myths Behind First-author Citation Counts

Author: Zhao Dangzhi
Publication date: 01/01/2006

We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods

E-LIS

Author Index

Author: Author Index

Not provided