MIMS EPrints

Not a member yet

2151 research outputs found

Sort by

Matrices with Tunable Infinity-Norm Condition Number and No Need for Pivoting in LU Factorization

Author: Fasi Massimiliano
Higham Nicholas J.
Publication venue
Publication date: 01/08/2020
Field of study

We propose a two-parameter family of nonsymmetric dense n ⨉ n matrices A(α, β) for which LU factorization without pivoting is numerically stable, and we show how to choose α and β to achieve any value of the ∞-norm condition number. The matrix A(α, β) can be formed from a simple formula in O(n²) flops. The matrix is suitable for use in the HPL-AI Mixed-Precision Benchmark, which requires an extreme scale test matrix (dimension n > 10⁷) that has a controlled condition number and can be safely used in LU factorization without pivoting. It is also of interest as a general-purpose test matrix

Stochastic Rounding and its Probabilistic Backward Error Analysis

Author: Connolly Michael P.
Higham Nicholas J.
Mary Theo
Publication venue
Publication date: 28/04/2020
Field of study

Stochastic rounding rounds a real number to the next larger or smaller floating-point number with probabilities

1

minus the relative distances to those numbers. It is gaining attention in deep learning because it can increase the success of low precision computations. We compare basic properties of stochastic rounding with those for round to nearest, finding properties in common as well as significant differences. We prove that for stochastic rounding the rounding errors are mean independent random variables with zero mean. We derive a new version of our probabilistic error analysis theorem from [{\em SIAM J. Sci. Comput.}, 41 (2019), pp.\ A2815--A2835], weakening the assumption of independence of the random variables to mean independence. These results imply that for a wide range of linear algebra computations the backward error for stochastic rounding is unconditionally bounded by a multiple of

\sqrt{n}\mkern1muu

to first order, with a certain probability, where

n

is the problem size and

u

is the unit roundoff. This is the first scenario where the rule of thumb that one can replace

nu

\sqrt{n}\mkern1muu

in a rounding error bound has been shown to hold without any additional assumptions on the rounding errors. We also explain how stochastic rounding avoids the phenomenon of stagnation in sums, whereby small addends are obliterated by round to nearest when they are too small relative to the sum

A Multiprecision Derivative-Free Schur--Parlett Algorithm for Computing Matrix Functions

Author: Higham Nicholas J.
Liu Xiaobo
Publication venue
Publication date: 07/09/2020
Field of study

The Schur--Parlett algorithm, implemented in MATLAB as \texttt{funm}, computes a function

f(A)

of an

n\times n

matrix

A

by using the Schur decomposition and a block recurrence of Parlett. The algorithm requires the ability to compute

f

and its derivatives, and it requires that

f

has a Taylor series expansion with a suitably large radius of convergence. We develop a version of the Schur--Parlett algorithm that requires only function values and uses higher precision arithmetic to evaluate

f

on the diagonal blocks of order greater than

2

(if there are any) of the reordered and blocked Schur form. The key idea is to compute by diagonalization the function of a small random diagonal perturbation of each triangular block, where the perturbation ensures that diagonalization will succeed. This multiprecision Schur--Parlett algorithm is applicable to arbitrary functions

f

and, like the original Schur--Parlett algorithm, it generally behaves in a numerically stable fashion. Our algorithm is inspired by Davies's randomized approximate diagonalization method, but we explain why that is not a reliable numerical method for computing matrix functions. We apply our algorithm to the matrix Mittag--Leffler function and show that it yields results of accuracy similar to, and in some cases much greater than, the state of the art algorithm for this function. The algorithm will be useful for evaluating any matrix function for which the derivatives of the underlying function are not readily available or accurately computable

Mixed Precision LU Factorization on GPU Tensor Cores: Reducing Data Movement and Memory Footprint

Author: Lopez Florent
Mary Theo
Publication venue
Publication date: 13/09/2020
Field of study

Three-Precision GMRES-Based Iterative Refinement for Least Squares Problems

Author: Carson Erin
Higham Nicholas J.
Pranesh Srikara
Publication venue
Publication date: 03/02/2020
Field of study

Performance Impact of Precision Reduction In Sparse Linear Systems Solvers

Author: Zounon Mawussi
Higham Nicholas J.
Lucas Craig
Tisseur Françoise
Publication venue
Publication date: 20/09/2020
Field of study

It is well established that reduced precision arithmetic can be exploited to accelerate the solution of dense linear systems. Typical examples are mixed precision algorithms that reduce the execution time and the energy consumption of parallel solvers for dense linear systems by factorizing a matrix at a precision lower than the working precision. Much less is known about the efficiency of reduced precision in parallel solvers for sparse linear systems, and existing work focuses on single core experiments. We evaluate the benefits of using single precision arithmetic in solving a double precision sparse linear system using multiple cores. We consider both direct methods and iterative methods and we focus on using single precision for the key components of LU factorization and matrix–vector products. Our results show that the anticipated speedup of 2 over a double precision LU factorization is obtained only for the very largest of our test problems. We point out two key factors underlying the poor speedup. First, we find that single precision sparse LU factorization is prone to a severe loss of performance due to the intrusion of subnormal numbers. We identify a mechanism that allows cascading fill-ins to generate subnormal numbers and show that automatically flushing subnormals to zero avoids the performance penalties. The second factor is the lack of parallelism in the analysis and reordering phases of the solvers and the absence of floating-point arithmetic in these phases. For iterative solvers, we find that for the majority of the matrices computing or applying incomplete factorization preconditioners in single precision provides at best modest performance benefits compared with the use of double precision. We also find that using single precision for the matrix–vector product kernels provides an average speedup of 1.5 over double precision kernels. In both cases some form of refinement is needed to raise the single precision results to double precision accuracy, which will reduce performance gains

Generating extreme-scale matrices with specified singular values or condition numbers

Author: Fasi Massimiliano
Higham Nicholas J.
Publication venue
Publication date: 27/03/2020
Field of study

A widely used form of test matrix is the randsvd matrix constructed as the product A = USV*, where U and V are random orthogonal or unitary matrices from the Haar distribution and S is a diagonal matrix of singular values. Such matrices are random but have a specified singular value distribution. The cost of forming an m-by-n randsvd matrix is m³ + n³ flops, which is prohibitively expensive at extreme scale; moreover, the randsvd construction requires a significant amount of communication, making it unsuitable for distributed memory environments. By dropping the requirement that U and V be Haar distributed and that both be random, we derive new algorithms for forming A that have cost linear in the number of matrix elements and require a low amount of communication and synchronization. We specialize these algorithms to generating matrices with specified 2-norm condition number. Numerical experiments show that the algorithms have excellent efficiency and scalability

A Multiprecision Derivative-Free Schur--Parlett Algorithm for Computing Matrix Functions

Author: Higham Nicholas J.
Liu Xiaobo
Publication venue
Publication date: 07/09/2020
Field of study

The Schur--Parlett algorithm, implemented in MATLAB as \texttt{funm}, computes a function

f(A)

of an

n\times n

matrix

A

by using the Schur decomposition and a block recurrence of Parlett. The algorithm requires the ability to compute

f

and its derivatives, and it requires that

f

has a Taylor series expansion with a suitably large radius of convergence. We develop a version of the Schur--Parlett algorithm that requires only function values and not derivatives. The algorithm requires access to arithmetic of a matrix-dependent precision at least double the working precision, which is used to evaluate

f

on the diagonal blocks of order greater than

2

(if there are any) of the reordered and blocked Schur form. The key idea is to compute by diagonalization the function of a small random diagonal perturbation of each diagonal block, where the perturbation ensures that diagonalization will succeed. Our algorithm is inspired by Davies's randomized approximate diagonalization method, but we explain why that is not a reliable numerical method for computing matrix functions. This multiprecision Schur--Parlett algorithm is applicable to arbitrary functions

f

and, like the original Schur--Parlett algorithm, it generally behaves in a numerically stable fashion. The algorithm is especially useful when the derivatives of the underlying function are not readily available or accurately computable. We apply our algorithm to the matrix Mittag--Leffler function and show that it yields results of accuracy similar to, and in some cases much greater than, the state of the art algorithm for this function

Mixed Precision LU Factorization on GPU Tensor Cores: Reducing Data Movement and Memory Footprint

Author: Lopez Florent
Mary Theo
Publication venue
Publication date: 13/09/2020
Field of study

Random Matrices Generating Large Growth in LU Factorization with Pivoting

Author: Higham Desmond J.
Higham Nicholas J.
Pranesh Srikara
Publication venue
Publication date: 14/05/2020
Field of study

We identify a class of random, dense

n\times n

matrices for which LU factorization with any form of pivoting produces a growth factor of at least

n/(4 \log n)

for large

n

with high probability. The condition number of the matrices can be arbitrarily chosen and large growth also happens for the transpose. No previous matrices with all these properties were known. The matrices can be generated by the MATLAB function \verb"gallery('randsvd',..)", and they are formed as the product of two random orthogonal matrices from the Haar distribution with a diagonal matrix having only one diagonal entry different from

1

, which lies between

0

and

1

(the ``one small singular value'' case). Our explanation for the large growth uses the fact that the maximum absolute value of any element of a Haar distributed orthogonal matrix tends to be relatively small for large

n

. We verify the behavior numerically, finding that for partial pivoting the actual growth is significantly larger than the lower bound, and much larger than the growth observed for random matrices with elements from the uniform

[0,1

] or standard normal distributions. We show more generally that a rank-

1

perturbation to an orthogonal matrix producing large growth for any form of pivoting also generates large growth under reasonable assumptions. Finally, we demonstrate that GMRES-based iterative refinement can provide stable solutions to

Ax = b

when large growth occurs in low precision LU factors, even when standard iterative refinement cannot

1,445

full texts

2,151

metadata records

Updated in last 30 days.

MIMS EPrints

Access Repository Dashboard

Do you manage Open Research Online? Become a CORE Member to access insider analytics, issue reports and manage access to outputs from your repository in the CORE Repository Dashboard! 👇