H2F: A Hierarchical Hadoop Framework for big data processing in geo-distributed environments

Author: Marco Cavallo, Orazio Tomarchio, Giuseppe Di Modica, Carmelo Polito
Publication date: 01/01/2016

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A LAHC-based Job Scheduling Strategy to Improve Big Data Processing in Geo-distributed Contexts

Author: Marco Cavallo, Orazio Tomarchio, Giuseppe Di Modica, Carmelo Polito
Publication date: 01/01/2017

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Multi-job Hadoop scheduling to process Geo-distributed big data

Author: Marco Cavallo, Orazio Tomarchio, Giuseppe Di Modica, Carmelo Polito
Publication date: 01/01/2017

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Application Profiling in Hierarchical Hadoop for Geo-distributed Computing Environments

Author: Marco Cavallo, Orazio Tomarchio, Giuseppe Di Modica, Carmelo Polito
Publication date: 01/01/2016

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Context-aware MapReduce for Geo-distributed Big Data

Author: Marco Cavallo, Orazio Tomarchio, Giuseppe Di Modica, Carmelo Polito
Publication date: 01/01/2015

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A hierarchical Hadoop framework to handle big data in geo-distributed computing environments

Author: Marco Cavallo, Orazio Tomarchio, Giuseppe Di Modica, Carmelo Polito
Publication date: 01/01/2018

Advances in communication technologies, along with the birth of new communication paradigms that leverage the power of social networking, have fostered the production of huge amounts of data. Old-fashioned computing paradigms are unfit to handle the volume of data produced daily by countless sources of information distributed worldwide. So far, MapReduce has kept its promise of speeding up computation over Big Data within a cluster. This article focuses on scenarios in which Big Data are distributed worldwide. While criticizing the poor performance of the Hadoop framework when deployed in such scenarios, it proposes a Hierarchical Hadoop Framework (H2F) to cope with the issues that arise when Big Data are scattered over geographically distant data centers. The article highlights the novelty of H2F with respect to other hierarchical approaches. Tests run on a software prototype are also reported to show the performance gain that H2F achieves in geographical scenarios over a plain Hadoop approach.

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A Scheduling Strategy to Run Hadoop Jobs on Geodistributed Data

Author: Marco Cavallo, Orazio Tomarchio, G. Di Modica, Lorenzo Cusmà, Carmelo Polito
Publication date: 01/01/2016

Internet-of-Things scenarios will typically be characterized by huge amounts of available data. A challenging task is to manage such data efficiently by analyzing and processing them and extracting useful information. Distributed computing frameworks such as Hadoop, based on the MapReduce paradigm, have been used to process such amounts of data by exploiting the computing power of many cluster nodes. As long as the computing context consists of clusters of homogeneous nodes interconnected through high-speed links, the benefit brought by such frameworks is clear and tangible. Unfortunately, in many real big data applications the data to be processed reside in many computationally heterogeneous data centers distributed over the planet. In those contexts, Hadoop has been shown to perform very poorly. The proposal presented in this paper addresses this limitation. We designed a context-aware Hadoop framework that is capable of scheduling and distributing tasks among geographically distant clusters in a way that minimizes the overall job execution time. The proposed scheduler leverages the integer partitioning technique and a priori knowledge of big data application patterns to explore the space of all possible task schedules and estimate the one expected to perform best. Final experiments conducted on a scheduler prototype demonstrate the benefit of the approach.
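The abstract above mentions integer partitioning as a way to explore the space of task schedules. The following is only a minimal sketch of that general idea, not the authors' scheduler: it splits a number of equally sized data blocks among clusters by enumerating integer partitions and keeps the split with the lowest estimated completion time. The function names, the `throughput` parameter (blocks per second per cluster), and the cost model (completion time equals that of the slowest cluster, with no transfer cost) are all hypothetical simplifications.

```python
from itertools import permutations


def integer_partitions(n, max_parts):
    """Yield partitions of n into at most max_parts positive parts,
    each as a tuple in non-increasing order."""
    def helper(remaining, largest, parts_left):
        if remaining == 0:
            yield ()
            return
        if parts_left == 0:
            return
        for first in range(min(remaining, largest), 0, -1):
            for rest in helper(remaining - first, first, parts_left - 1):
                yield (first,) + rest

    yield from helper(n, n, max_parts)


def best_schedule(total_blocks, throughput):
    """Return (estimated_time, assignment) for the best split found.

    throughput[i] is an assumed processing rate of cluster i in blocks
    per second. The estimate ignores data transfer and treats the job
    as finished when the slowest cluster finishes (toy cost model).
    """
    clusters = len(throughput)
    best = None
    for part in integer_partitions(total_blocks, clusters):
        # Pad with zeros so every cluster gets an entry, then try every
        # distinct way of mapping the parts onto the clusters.
        padded = part + (0,) * (clusters - len(part))
        for assignment in set(permutations(padded)):
            estimate = max(b / t for b, t in zip(assignment, throughput))
            if best is None or estimate < best[0]:
                best = (estimate, assignment)
    return best


if __name__ == "__main__":
    # Six blocks, two clusters; the second is assumed twice as fast.
    print(best_schedule(6, [1.0, 2.0]))
```

Exhaustive enumeration grows quickly with the number of blocks and clusters, so a sketch like this is only practical for small inputs; a realistic estimator would also need to account for data location and link bandwidth.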

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A Hadoop based Framework to Process Geo-distributed Big Data

Author: Marco Cavallo, Lorenzo Cusmà, O. Tomarchio, Giuseppe Di Modica, Carmelo Polito
Publication date: 01/01/2016

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Author Instructions

Author: Instructions Author
Publication date: 04/11/2013

Crossref

Cartographic Perspectives (E-Journal - North American Cartographic Information Society, NACIS)

Going Beyond Counting First Authors in Author Co-citation Analysis

Author: Dangzhi Zhao
Publication date: 01/01/2005

The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, when the same number of top-ranked authors is selected and analyzed, this picture represents fewer specialties in the research field being studied than the one produced through traditional first-author co-citation counting. Reasons for these effects are discussed.
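The difference between the two counting schemes described in the abstract above can be illustrated with a small sketch. It is not the study's actual procedure: the function name, the input layout (each citing paper as a list of cited works, each cited work as an ordered author list), and the choice to count co-authors of the same cited work as co-cited are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how many citing papers cite each pair of authors together.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names. max_authors=1 mimics first-author counting;
    max_authors=5 considers the first five authors of each cited work.
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    # Two made-up citing papers with made-up author labels.
    papers = [
        [["A", "B"], ["C"]],
        [["A"], ["C", "B"]],
    ]
    print(cocitation_counts(papers, max_authors=1))
    print(cocitation_counts(papers, max_authors=5))
```

With first-author counting, author B never appears in the sample data above; raising `max_authors` brings B into the co-citation matrix, which is the kind of change in the resulting picture that the abstract discusses.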

E-LIS