MIMS EPrints
Not a member yet
2151 research outputs found
Sort by
Exploiting Fast Matrix Arithmetic in Block Low-Rank Factorizations
We consider the LU factorization of an matrix represented as a block low-rank (BLR) matrix: most of its off-diagonal blocks are approximated by matrices of small rank , which reduces the asymptoticcomplexity of computing the LU factorization of down to \O(n^2r). In this article, our aim is to further reduce this complexity by exploiting fast matrix arithmetic, that is, the ability to multiply two full-rank matrices together for \O(n^\w) flops, where \w<3. This is not straightforward: simply accelerating the intermediate operations performed in the standard BLR factorization algorithm does not suffice to reduce the quadratic complexity in , because these operations are performed on matrices whose size is too small. To overcome this obstacle, we devise a new BLR factorization algorithm that, by recasting the operations so as to work on intermediate matrices of larger size, can exploit more efficiently fast matrix arithmetic. This new algorithm achieves an asymptotic complexity of \O(n^{(\w+1)/2}r^{(\w-1)/2}), which represents an asymptotic improvement compared to the standard BLR factorization as soon as \w<3. In particular, for Strassen's algorithm, \w\approx2.81 yields a complexity of \O(n^{1.904}r^{0.904}). Our numerical experiments are in good agreement with this theoretical result
Accurate Computation of the Log-Sum-Exp and Softmax Functions
Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically equivalent, these variants behave differently in floating-point arithmetic. We give rounding error analyses of different evaluation algorithms and interpret the error bounds using condition numbers for the functions. We conclude, based on the analysis and numerical experiments, that the shifted formulas are of similar accuracy to the unshifted ones and that the shifted softmax formula is typically more accurate than a division-free variant
Determinants of Normalized Bohemian Upper Hessemberg Matrices
A matrix is Bohemian if its elements are taken from a finite
set of integers. We enumerate all possible determinants of Bohemian upper
Hessenberg matrices with subdiagonal fixed to one, and consider the special
case of families of matrices with only zeros on the main diagonal, whose
determinants proved to be related to a generalization of Fibonacci numbers.
Several conjectures recently stated by Corless and Thornton follow from our
results
An Updated Set of Nonlinear Eigenvalue Problems
This is the description of the new changes made in version 4.0 of the NLEVP MATLAB toolbox. The collection and its organization are described in a separate paper.
A user's guide describes how to install and use the NLEVP MATLAB toolbox
Substitution algorithms for rational matrix equations
We study equations of the form , where is a rational function and and are square matrices of the same size. We develop two techniques for solving these equations by inverting, through a substitution strategy, two schemes for the evaluation of rational functions of matrices. For block triangular matrices, the new methods yield the same computational cost as the evaluation schemes from which they are obtained. A general equation is reduced to block upper triangular form by exploiting the Schur decomposition of the given matrix. For real data, using the real Schur decomposition, the algorithms compute the real solutions using only real arithmetic, and numerical experiments show that our implementations are superior, in terms of both accuracy and speed, to existing alternatives. Finally, we discuss how these techniques can be used as building blocks in algorithms for computing functions of matrices defined through matrix equation of the type , where is a primary matrix function
SIMULATING LOW PRECISION FLOATING-POINT ARITHMETIC
The half precision (fp16) floating-point format, defined in the 2008 revision of the IEEE standard for floating-point arithmetic, and a more recently proposed half precision format bfloat16, are increasingly available in GPUs and other accelerators. While the support for low precision arithmetic is mainly motivated by machine learning applications, general purpose numerical algorithms can benefit from it, too, gaining in speed, energy usage, and reduced communication costs. Since the appropriate hardware is not always available, and one may wish to experiment with new arithmetics not yet implemented in hardware, software simulations of low precision arithmetic are needed. We discuss how to simulate low precision arithmetic using arithmetic of higher precision. We examine the correctness of such simulations and explain via rounding error analysis why a natural method of simulation can provide results that are more accurate than actual computations at low precision. We provide a MATLAB function chop that can be used to efficiently simulate fp16, bfloat16, and other low precision arithmetics, with or without the representation of subnormal numbers and with the options of round to nearest, directed rounding, stochastic rounding, and random bit flips in the significand. We demonstrate the advantages of this approach over defining a new MATLAB class and overloading operators
The Geometry of Elliptical Probability Contours for a Fix using Multiple Lines of Position
Navigation methods, traditional and modern, use lines of position in the plane. Standard Gaussian assumptions about errors leads to a constant sum of squared distances from the lines defining a probability contour. It is well known these contours are a family of ellipses centred on the most probable position and they can be computed using algebraic methods. In this paper we show how the most probable position, the axes and foci of ellipses can be found using geometric methods. This results in a ruler and compasses construction of these points and this gives insight into the way the shape and orientation of the probability contours depend on the angles between the lines of position. We start with the classical case of three lines of position with equal variances, we show how this can be extended to the case where the variances in the errors in the lines of position differ, and we go on to consider the case of four lines of position using a methodology that generalises to an arbitrary number of lines
Exploiting Lower Precision Arithmetic in Solving Symmetric Positive Definite Linear Systems and Least Squares Problems
What is the fastest way to solve a linear system in arithmetic of a given precision when is symmetric positive definite and otherwise unstructured? The usual answer is by Cholesky factorization, assuming that can be factorized. We develop an algorithm that can be faster, given an arithmetic of precision lower than the working precision as well as (optionally) one of higher precision. The arithmetics might, for example, be of precisions half, single, and double; half and double, possibly with quadruple; or single and double, possibly with quadruple. We compute a Cholesky factorization at the lower precision and use the factors as preconditioners in GMRES-based iterative refinement. To avoid breakdown of the factorization we shift the matrix by a small multiple of its diagonal. We explain why this is preferable to the common approach of shifting by a multiple of the identity matrix, We also incorporate scaling in order to avoid overflow and reduce the chance of underflow when working in IEEE half precision arithmetic. We extend the algorithm to solve a linear least squares problem with a well conditioned coefficient matrix by forming and solving the normal equations. In both algorithms most of the work is done at low precision provided that iterative refinement and the inner iterative solver converge quickly. We explain why replacing GMRES by the conjugate gradient method causes convergence guarantees to be lost, but show that this change has little effect on convergence in practice. Our numerical experiments confirm the potential of the new algorithms to provide faster solutions in environments that support multiple precisions of arithmetic