1,721,181 research outputs found
Emerging Internet-based services: New frontiers for performance models and applications
N/
Programming and Runtime Supports for SPMD Cluster Computing
this paper, we describe the programming environment called DAME (DAta Migration Environment) that supports the development and execution of SPMD applications on heterogeneous clusters. DAME overcomes the difficulties of SPMD cluster computing by means of transparent supports that hide irregular network topology, dynamically adapt data distribution to platform conditions, and mask the consequent non-uniform distribution to the programmer. The originality of DAME with respect to frameworks that share analogous goals are the following: - DAME adopts a traditional programming paradigm such as SPMD, while other tools are oriented to different models. For example, Dataparallel C supports SIMD computations [Ned93], Piranha supports Linda programs [Car95]. - DAME provides data migrations, while other frameworks such as Utopia [Zho93], UPVM and MPVM [Cas94, Cas95] are oriented to task migrations. - DAME includes applications with explicit message-passing while other tools that provide data migration (such as ADM [Cas94] and the framework described in [Ham95]) are oriented to master-slave applications without inter-task communications. - DAME is highly portable because is built on PVM. Moreover, it provides transparent supports for static and dynamic load balancing
DAME: An environment for preserving the efficiency of data-parallel computations on distributed systems
Data parallel programming is the most widely adopted paradigm for a large class of problems on traditional multicomputers (see SPMD programming model sidebar, p. 23). Nevertheless, it is a very hard task to preserve efficiency when this style is adopted on a cluster of heterogeneous nodes having nonuniform and time varying computational powers. Very popular packages, such as PVM and MPI, 1,2 allow the programmer to use these clusters as an homogeneous parallel virtual machine. However, they do not avoid the potential inefficiencies due to the unpredictable variability of usually shared resources. DAME, an acronym for DAta Migration Environment, overcomes such drawbacks by means of transparent supports that hide irregular network topology, adapt the data distribution dynamically to platform conditions, and mask the consequent nonuniform distribution to the programmer
Dynamic data decomposition in a message-passing environment
The performance of a data parallel program is critically dependent on the data decomposition that the programmer chooses at implementation time. This choice must take into account a combination of different factors such as the kind of the problem, the machine architecture and the data domain size. When these elements are known before execution, the programmer can adopt traditional message-passing languages and optimise performance by means of programs which are dependent on the chosen data decomposition. On the other hand, when the factors that determine the best decomposition are known at run-time only, adequate efficiency can be achieved by a code that dynamically adapts its computation/communication pattern to various decompositions. To assist the programmer in the implementation of a decomposition-independent code, we propose a new programming environment, namely PLUS. It provides the programmer with message-passing primitives that avoid the specification of a data decomposition during the implementation phase, and a run-time support that permits dynamic changes among regular decompositions without affecting the implemented program
Dynamic Data Reconfiguration for SPMD Programs in Faulty Multicomputer
The single-program multiple-data (SPMD) paradigm is becoming the most diffuse way to program commercial multicomputers. In this paper we demonstrate that for a wide class of SPMD algorithms it is possible to achieve an efficient fault tolerance avoiding hardware redundancy. We propose a software approach that aims to reconfigure data, thus achieving a good slowdown in computation owing to the fine granularity of the workload to redistribute. In particular, we present and compare three data reconfiguration strategies applied to a problem model that includes a wide class of SPMD iterative algorithms characterized by nonlocal communications among the nodes. The result is that in most of the cases it is better to introduce some communication overhead than to leave idle a few healthy processor
Subcube matrix decomposition: A unifying view for LU factorization on multicomputers
This paper represents the first attempt towards a decomposition-independent implementation of parallel algorithms for matrix computations. Main framework is the Subcube Matrix Decomposition, given in this work, that allows to view in a unifying way the most diffuse matrix decompositions on multicomputers. Its decomposition-independent properties extend, moreover, to the design of parallel algorithms, to their optimization, and to the performance analysis. We have verified all these characteristics by implementing an LU factorization meta-algorithm that unifies the known parallel programs and allows a decomposition-independent performance analysis both analytical and experimental
Load balancing and parallel implementation of iterative algorithms for row-continuous Markov chains
Presents the first parallel algorithms for solving row-continuous or generalized birth-death (GBD) Markov chains on distributed memory MIMD multiprocessors. These systems are characterized by very large transition probability matrices, decomposable in heterogeneous tridiagonal blocks. The parallelization of three aggregation/disaggregation iterative methods is carried out by a unique framework that keeps into account the special matrix structure. Great effort has been also devoted to define a general algorithm for approximating the optimum workload. Various computational experiments show that Vantilborgh's (1985) method is the fastest of the three algorithms on any data set dimensio
Adaptive TTL schemes for load balancing of distributed web servers
With ever increasing web traffic, a distributed Web system can provide scalability and flexibility to cope with growing client demands. Load balancing algorithms to spread the load across multiple Web servers are crucial to achieve the scalability. Various domain name server (DNS) based schedulers have been proposed in the literature, mainly for multiple homogeneous servers. DNS provides (logical) host name to IP-address mapping (i.e., the server assignment), but the mapping is not done for each server access. This is because the address mapping is cached for a time-to-live (TTL) period to reduce network traffic. The presence of heterogeneous Web servers not only increases the complexity of the DNS scheduling problem, but also makes previously proposed algorithms for homogeneous distributed systems such as round robin not directly applicable. This leads us to propose new policies, called adaptive TTL algorithms, that take both the uneven distribution of client request rates and heterogeneity of Web servers into account to adaptively set the TTL value for each address mapping request. Extensive simulation results show that these strategies are effective in balancing load among geographically distributed heterogeneous Web servers
Unifying and optimizing parallel linear algebra algorithms
Two issues in linear algebra algorithms for multicomputers are addressed. First, how tounify parallel implementations of the same algorithm in a decomposition-independent way. Second, how to optimize naive parallel programs maintaining the decompositionindependence. Several matrix decompositions are viewed as instances of a more generalallocation function called subcube matrix decomposition. By this meta-decomposition, aprogramming environment characterized by general primitives that allow one to designmeta-algorithms independently of a particular decomposition. The authors apply such aframework to the parallel solution of dense matrices. This demonstrates that most of theexisting algorithms can be derived by suitably setting the primitives used in themeta-algorithm. A further application of this programming style concerns the optimization of parallel algorithms. The idea to overlap communication and computation has been extended from 1-D decompositions to 2-D decompositions. Thus, a first attempt towards a decomposition-independent definition of such optimization strategies is provided
- …
