A parametric study of the term structure dynamics
We present an analysis of the dynamics of the term structure of interest rates based on the time evolution of the parameters of a variation of the Nelson–Siegel model. The results show that it is extremely difficult to find a relation between the evolution of the term structure and the behavior of macroeconomic variables other than the official interest rate.
A GPU implementation of the Factored Sparse Approximate Inverse preconditioner for the iterative solution of SPD linear systems
Analysis and experimentation over heterogeneous wireless networks
Wireless and mobile networks represent an enabling technology for ubiquitous access to information systems. However, there are critical issues that still prevent the widespread use of these technologies. In this paper we analyze and discuss our experience with a real ubiquitous network testbed capable of providing a seamless hand-off among heterogeneous networks. We describe Mobile IPv6/IPv4 interoperability and an efficient mechanism, based on link-layer information, for a seamless hand-off among wired and wireless networks. We present the solutions adopted in setting up a real testbed and provide an evaluation of the observed performance, including a characterization of interoperability among three wireless access network technologies: 802.11 WLAN, GPRS, and UMTS.
A Dynamic Pattern Factored Sparse Approximate Inverse Preconditioner on Graphics Processing Units
A factored sparse approximate inverse preconditioned conjugate gradient solver on graphics processing units
Graphics Processing Units (GPUs) exhibit significantly higher peak performance than conventional CPUs. However, in general only highly parallel algorithms can exploit their potential. In this scenario, the iterative solution to sparse linear systems of equations could be carried out quite efficiently on a GPU as it requires only matrix-by-vector products, dot products, and vector updates. However, to be really effective, any iterative solver needs to be properly preconditioned and this represents a major bottleneck for a successful GPU implementation. Due to its inherent parallelism, the factored sparse approximate inverse (FSAI) preconditioner represents an optimal candidate for the conjugate gradient-like solution of sparse linear systems. However, its GPU implementation requires a nontrivial recasting of multiple computational steps. We present our GPU version of the FSAI preconditioner along with a set of results that show how a noticeable speedup with respect to a highly tuned CPU counterpart is obtained
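The abstract above notes that a preconditioned conjugate gradient iteration needs only matrix-by-vector products, dot products, and vector updates. The following is a minimal CPU sketch of that structure in plain Python, not the authors' GPU code. The preconditioner is applied in factored form as G^T (G r), but here G is simply assumed to be diagonal with entries 1/sqrt(a_ii) (equivalent to Jacobi scaling), standing in for the sparse triangular FSAI factor; the function names `pcg` and `diagonal_factor_prec` are hypothetical.

```python
import math


def matvec(A, x):
    """Dense matrix-by-vector product (A is a list of rows)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, apply_prec, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for an SPD matrix A.

    apply_prec(r) must return an approximation of A^{-1} r.
    Returns the solution and the number of iterations performed.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual for the zero initial guess
    z = apply_prec(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for it in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        Ap = matvec(A, p)            # the only matrix-by-vector product per step
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_prec(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def diagonal_factor_prec(A):
    """Assumed stand-in for FSAI: diagonal G with g_ii = 1/sqrt(a_ii)."""
    g = [1.0 / math.sqrt(A[i][i]) for i in range(len(A))]

    def apply(r):
        # G^T (G r); with a diagonal G this is an elementwise scaling
        return [gi * gi * ri for gi, ri in zip(g, r)]

    return apply


A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iters = pcg(A, b, diagonal_factor_prec(A))
```

With a sparse G, the same `apply` step becomes two sparse matrix-by-vector products, which is why a factored approximate inverse maps naturally onto the operations the solver already uses.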
Solutions to the st-connectivity problem using a GPU-based distributed BFS
The st-connectivity problem (ST-CON) is a decision problem that asks, for vertices s and t in a graph, whether t is reachable from s. Although originally defined for directed graphs, it can also be studied on undirected graphs and used as a building block for solving more complex tasks on large-scale graphs. We present solutions to ST-CON based on a high-performance Breadth First Search (BFS) executed on clusters of Graphics Processing Units (GPUs) using the Nvidia CUDA platform. To measure performance, we use the number of ST-CONs per second. We present the results for two different implementations that highlight the impact of atomic operations in CUDA.
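As an illustration of the decision problem described above, here is a minimal sequential sketch of ST-CON via BFS in Python. It is not the distributed CUDA implementation from the paper; the adjacency-dictionary representation and the name `st_connected` are assumptions made for the example.

```python
from collections import deque


def st_connected(adj, s, t):
    """Return True if t is reachable from s in the directed graph adj.

    adj maps each vertex to an iterable of its out-neighbours.
    """
    if s == t:
        return True
    visited = {s}
    frontier = deque([s])
    while frontier:
        u = frontier.popleft()
        for v in adj.get(u, ()):
            if v == t:
                return True          # stop as soon as the target is found
            if v not in visited:
                visited.add(v)
                frontier.append(v)
    return False


# Small directed example: 3 -> 0 -> 1 -> 2
graph = {0: [1], 1: [2], 2: [], 3: [0]}
forward = st_connected(graph, 0, 2)
backward = st_connected(graph, 2, 0)
```

Because the search can stop as soon as t is discovered, a query often visits only part of the graph, which is one reason ST-CONs per second is a natural throughput measure.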
Kite attack: reshaping the cube attack for a flexible GPU-based maxterm search
Dinur and Shamir’s cube attack has attracted significant attention in the literature. Nevertheless, the lack of implementations achieving effective results casts doubt on its practical relevance. On the theoretical side, promising results have recently been achieved by leveraging division trails. The present paper follows a more practical approach and aims at giving new impetus to this line of research by means of a cipher-independent flexible framework that is able to carry out the cube attack on GPU/CPU clusters. We address all issues posed by a GPU implementation, providing evidence in support of parallel variants of the attack and identifying viable directions for solving open problems in the future. We report the results of running our GPU-based cube attack against round-reduced versions of three well-known ciphers: Trivium, Grain-128 and SNOW 3G. Our attack against Trivium improves the state of the art, permitting full key recovery for Trivium reduced to (up to) 781 initialization rounds (out of 1152) and finding the first-ever maxterm after 800 rounds. In this paper, we also present the first standard cube attack (i.e., neither dynamic nor tester) to yield maxterms for Grain-128 up to 160 initialization rounds on non-programmable hardware. We include a thorough evaluation of the impact of system parameters and GPU architecture on the performance. Moreover, we demonstrate the scalability of our solution on multi-GPU systems. We believe that our extensive set of results can be useful for the cryptographic engineering community at large and can pave the way to further results in the area.
Dynamic merging of frontiers for accelerating the evaluation of betweenness centrality
Betweenness Centrality (BC) is a widely used metric of the relevance of a node in a network. The fastest-known algorithm for the evaluation of BC on unweighted graphs builds a tree representing information about the shortest paths for each vertex to calculate its contribution to the BC score. Actually, for specific vertices, the shortest-path trees of neighboring nodes could be leveraged to reduce the computational burden, but existing BC algorithms do not exploit that information and carry out redundant computations. We propose a new algorithm, called dynamic merging of frontiers, which makes use of such information to derive the BC score of degree-2 vertices by re-using the results of the sub-trees of the neighbors. We implemented our idea in parallel fashion exploiting Graphics Processing Units. Compared to state-of-the-art implementations, our approach achieves a linear improvement in the number of degree-2 vertices and an average improvement of × over a variety of real-world graphs
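The baseline that the abstract above refers to, building a shortest-path tree from each vertex and accumulating its contribution to the BC score, is commonly attributed to Brandes. The sketch below is a plain sequential Python version of that baseline for unweighted graphs; it does not include the dynamic merging of frontiers proposed in the paper, and the function name and adjacency-dictionary input are assumptions for illustration.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes-style BC for an unweighted graph.

    adj maps every vertex to a list of neighbours. Scores count ordered
    (source, target) pairs, so halve them for an undirected graph.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}      # BFS distance from s (-1 = unseen)
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma[s] = 1
        dist[s] = 0
        order = []                       # vertices in non-decreasing distance
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Dependency accumulation, from the farthest vertices back to s
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


# Undirected path 0 - 1 - 2: only the middle vertex lies between others
path = {0: [1], 1: [0, 2], 2: [1]}
scores = betweenness_centrality(path)
```

Each source requires its own BFS and accumulation pass, which is the per-vertex cost that reusing neighbours' sub-trees for degree-2 vertices aims to avoid.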
Scalable betweenness centrality on multi-GPU systems
Betweenness Centrality (BC) is steadily growing in popularity as a metric of the influence of a vertex in a graph. The BC score of a vertex is proportional to the number of all-pairs shortest paths passing through it. However, complete and exact BC computation for a large-scale graph is an extraordinary challenge that requires high performance computing techniques to provide results in a reasonable amount of time. Our approach combines bi-dimensional (2-D) decomposition of the graph and multi-level parallelism together with a suitable data-thread mapping that overcomes most of the difficulties caused by the irregularity of the computation on GPUs. In order to reduce the time and space requirements of BC computation, a heuristic based on a 1-degree reduction technique is developed as well. Experimental results on synthetic and real-world graphs show that the proposed techniques are well suited to compute BC scores in graphs which are too large to fit in the memory of a single computational node.
Betweenness centrality on Multi-GPU systems
Betweenness Centrality (BC) is steadily growing in popularity as a metric of the influence of a vertex in a graph. The exact BC computation for a large-scale graph is an extraordinary challenge and requires high performance computing techniques to provide results in a reasonable amount of time. Here, we present the techniques we developed to speed up the computation of BC on multi-GPU systems. Our approach combines the bi-dimensional (2-D) decomposition of the graph and multi-level parallelism. Experimental results show that the proposed techniques are well suited to compute BC scores in graphs which are too large to fit in the memory of a single GPU. In particular, the computation time for a graph with 234 million edges is reduced to less than 2 hours.
