MIMS EPrints
Not a member yet
    2151 research outputs found

    Exploiting Fast Matrix Arithmetic in Block Low-Rank Factorizations

    Full text link
    We consider the LU factorization of an n×nn\times n matrix AA represented as a block low-rank (BLR) matrix: most of its off-diagonal blocks are approximated by matrices of small rank rr, which reduces the asymptoticcomplexity of computing the LU factorization of AA down to \O(n^2r). In this article, our aim is to further reduce this complexity by exploiting fast matrix arithmetic, that is, the ability to multiply two n×nn\times n full-rank matrices together for \O(n^\w) flops, where \w<3. This is not straightforward: simply accelerating the intermediate operations performed in the standard BLR factorization algorithm does not suffice to reduce the quadratic complexity in nn, because these operations are performed on matrices whose size is too small. To overcome this obstacle, we devise a new BLR factorization algorithm that, by recasting the operations so as to work on intermediate matrices of larger size, can exploit more efficiently fast matrix arithmetic. This new algorithm achieves an asymptotic complexity of \O(n^{(\w+1)/2}r^{(\w-1)/2}), which represents an asymptotic improvement compared to the standard BLR factorization as soon as \w<3. In particular, for Strassen's algorithm, \w\approx2.81 yields a complexity of \O(n^{1.904}r^{0.904}). Our numerical experiments are in good agreement with this theoretical result

    Accurate Computation of the Log-Sum-Exp and Softmax Functions

    Full text link
    Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically equivalent, these variants behave differently in floating-point arithmetic. We give rounding error analyses of different evaluation algorithms and interpret the error bounds using condition numbers for the functions. We conclude, based on the analysis and numerical experiments, that the shifted formulas are of similar accuracy to the unshifted ones and that the shifted softmax formula is typically more accurate than a division-free variant

    Determinants of Normalized Bohemian Upper Hessemberg Matrices

    No full text
    A matrix is Bohemian if its elements are taken from a finite set of integers. We enumerate all possible determinants of Bohemian upper Hessenberg matrices with subdiagonal fixed to one, and consider the special case of families of matrices with only zeros on the main diagonal, whose determinants proved to be related to a generalization of Fibonacci numbers. Several conjectures recently stated by Corless and Thornton follow from our results

    An Updated Set of Nonlinear Eigenvalue Problems

    No full text
    This is the description of the new changes made in version 4.0 of the NLEVP MATLAB toolbox. The collection and its organization are described in a separate paper. A user's guide describes how to install and use the NLEVP MATLAB toolbox

    Substitution algorithms for rational matrix equations

    Full text link
    We study equations of the form r(X)=Ar(X) = A, where rr is a rational function and AA and XX are square matrices of the same size. We develop two techniques for solving these equations by inverting, through a substitution strategy, two schemes for the evaluation of rational functions of matrices. For block triangular matrices, the new methods yield the same computational cost as the evaluation schemes from which they are obtained. A general equation is reduced to block upper triangular form by exploiting the Schur decomposition of the given matrix. For real data, using the real Schur decomposition, the algorithms compute the real solutions using only real arithmetic, and numerical experiments show that our implementations are superior, in terms of both accuracy and speed, to existing alternatives. Finally, we discuss how these techniques can be used as building blocks in algorithms for computing functions of matrices defined through matrix equation of the type f(X)=Af(X) = A, where ff is a primary matrix function

    SIMULATING LOW PRECISION FLOATING-POINT ARITHMETIC

    Full text link
    The half precision (fp16) floating-point format, defined in the 2008 revision of the IEEE standard for floating-point arithmetic, and a more recently proposed half precision format bfloat16, are increasingly available in GPUs and other accelerators. While the support for low precision arithmetic is mainly motivated by machine learning applications, general purpose numerical algorithms can benefit from it, too, gaining in speed, energy usage, and reduced communication costs. Since the appropriate hardware is not always available, and one may wish to experiment with new arithmetics not yet implemented in hardware, software simulations of low precision arithmetic are needed. We discuss how to simulate low precision arithmetic using arithmetic of higher precision. We examine the correctness of such simulations and explain via rounding error analysis why a natural method of simulation can provide results that are more accurate than actual computations at low precision. We provide a MATLAB function chop that can be used to efficiently simulate fp16, bfloat16, and other low precision arithmetics, with or without the representation of subnormal numbers and with the options of round to nearest, directed rounding, stochastic rounding, and random bit flips in the significand. We demonstrate the advantages of this approach over defining a new MATLAB class and overloading operators

    The Geometry of Elliptical Probability Contours for a Fix using Multiple Lines of Position

    Full text link
    Navigation methods, traditional and modern, use lines of position in the plane. Standard Gaussian assumptions about errors leads to a constant sum of squared distances from the lines defining a probability contour. It is well known these contours are a family of ellipses centred on the most probable position and they can be computed using algebraic methods. In this paper we show how the most probable position, the axes and foci of ellipses can be found using geometric methods. This results in a ruler and compasses construction of these points and this gives insight into the way the shape and orientation of the probability contours depend on the angles between the lines of position. We start with the classical case of three lines of position with equal variances, we show how this can be extended to the case where the variances in the errors in the lines of position differ, and we go on to consider the case of four lines of position using a methodology that generalises to an arbitrary number of lines

    Exploiting Lower Precision Arithmetic in Solving Symmetric Positive Definite Linear Systems and Least Squares Problems

    No full text
    What is the fastest way to solve a linear system Ax=bAx= b in arithmetic of a given precision when AA is symmetric positive definite and otherwise unstructured? The usual answer is by Cholesky factorization, assuming that AA can be factorized. We develop an algorithm that can be faster, given an arithmetic of precision lower than the working precision as well as (optionally) one of higher precision. The arithmetics might, for example, be of precisions half, single, and double; half and double, possibly with quadruple; or single and double, possibly with quadruple. We compute a Cholesky factorization at the lower precision and use the factors as preconditioners in GMRES-based iterative refinement. To avoid breakdown of the factorization we shift the matrix by a small multiple of its diagonal. We explain why this is preferable to the common approach of shifting by a multiple of the identity matrix, We also incorporate scaling in order to avoid overflow and reduce the chance of underflow when working in IEEE half precision arithmetic. We extend the algorithm to solve a linear least squares problem with a well conditioned coefficient matrix by forming and solving the normal equations. In both algorithms most of the work is done at low precision provided that iterative refinement and the inner iterative solver converge quickly. We explain why replacing GMRES by the conjugate gradient method causes convergence guarantees to be lost, but show that this change has little effect on convergence in practice. Our numerical experiments confirm the potential of the new algorithms to provide faster solutions in environments that support multiple precisions of arithmetic

    1,445

    full texts

    2,151

    metadata records
    Updated in last 30 days.
    MIMS EPrints
    Access Repository Dashboard
    Do you manage Open Research Online? Become a CORE Member to access insider analytics, issue reports and manage access to outputs from your repository in the CORE Repository Dashboard! 👇