SOL Logo

Systems Optimization Laboratory

Stanford University
Dept of Management Science and Engineering (MS&E)

Huang Engineering Center

Stanford, CA 94305-4121  USA

LSMR: Sparse Equations and Least Squares

  • AUTHORS: David Fong, Michael Saunders.
  • CONTRIBUTORS: Dominique Orban, Austin Benson, Victor Minden, Matthieu Gomez, Nick Gould, Jennifer Scott.
  • CONTENTS: Implementation of a conjugate-gradient type method for solving sparse linear equations and sparse least-squares problems: \begin{align*} \text{Solve } & Ax=b \\ \text{or minimize } & \|Ax-b\|^2 \\ \text{or minimize } & \|Ax-b\|^2 + \lambda^2 \|x\|^2 \end{align*} where the matrix \(A\) may be square or rectangular (over-determined or under-determined), and may have any rank. It is represented by a routine for computing \(Av\) and \(A^T u\) for given vectors \(v\) and \(u\).

    The scalar \(\lambda\) is a damping parameter. If \(\lambda > 0\), the solution is "regularized" in the sense that a unique solution always exists, and \(\|x\|\) is bounded.

    The method is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying MINRES to the normal equation \( (A^T A + \lambda^2 I) x = A^T b, \) but has better numerical properties, especially if \(A\) is ill-conditioned.

    If \(A\) is symmetric, use SYMMLQ, MINRES, or MINRES-QLP.

    If \(A\) has low column rank and \(\lambda=0\), the solution is not unique. LSMR returns the solution of minimum length. Thus for under-determined systems, it solves the problem \(\min \|x\| \text{ subject to } Ax=b\). More generally, it solves the problem \(\min \|x\| \text{ subject to } A^T Ax=A^T b\), where \(A\) may have any shape or rank.

    For \(\min \|x\| \text{ subject to } Ax=b\), consider using CRAIG.

    Special feature: Both \(\|r\|\) and \(\|A^T r\|\) decrease monotonically, where \(r = b - Ax\) is the current residual. For LSQR, only \(\|r\|\) is monotonic. LSQR is recommended for compatible systems \(Ax=b\), but on least-squares problems with loose stopping tolerances, LSMR may be able to terminate significantly sooner than LSQR.

    Special application: To find a null vector of a singular (square or rectangular) matrix \(A\), apply LSMR to the system \( \min \|A^T x - b\| \) with any nonzero vector \(b\) (e.g. a random \(b\)). At a minimizer, the residual vector \(r = b - A^T x \) will satisfy \( Ar=0 \). See [1] for examples.

    Preconditioning: If \(A\) is an explicit sparse matrix, it is straightforward to scale its columns so that every column has unit 2-norm. This is called diagonal preconditioning and should be done wherever possible. It is likely to reduce the number of LSMR iterations significantly without affecting the time per iteration. LSMR solves the problem \( \min \|ADy - b\| \) and then \( x = Dy \) solves the original problem, where the \(j\)th diagonal of \(D\) is the reciprocal of the 2-norm of the \(j\)th column of \(A\).

    If \(A\) is an operator, diagonal preconditioning is not simple unless the user can estimate the column norms of \(A\) accurately.

    A general preconditioner depends on the user's knowledge of \(A\). For some square nonsingular matrix \(B\), the user asks LSMR to solve the problem \( \min \|AB^{-1}y - b\| \) and then recovers the final solution by solving \(Bx = y\). The user now needs to code matrix-vector products of the form \(AB^{-1}v\) and \(B^{-T}A^T u\) by solving linear systems involving \(B\) and \(B^T\) before and after the products with \(A\) and \(A^T\) respectively.

    \(B\) will be a good preconditioner if \(AB^{-1}\) is significantly better conditioned that \(A\). An ideal preconditioner would be \(B=R\), where \(R\) is a nonsingular triangular matrix coming from the QR factors of \(A\). Thus incomplete QR factors of \(A\) might give a good preconditioner.

    Reverse communication: One f90 version (lsmrReverse) implements reverse communication for the products \(Av\) and \(A^T u\). Each time LSMR needs a product, it returns control to the user. This permits users to implement their own stopping rule, and perhaps their own preconditioner. Ideally the stopping rule should apply to the original problem and not to the preconditioned problem.

    [1] S.-C. T. Choi (2006). Iterative Methods for Singular Linear Equations and Least-Squares Problems, PhD thesis, ICME, Stanford University.

    [2] D. C.-L. Fong and M. A. Saunders, LSMR: An iterative algorithm for sparse least-squares problems, SIAM J. Sci. Comput. 33:5, 2950-2971, published electronically Oct 27, 2011.

    14 Apr 2010: Matlab implementation.
    03 Jun 2010: Python implementation.
    17 Jul 2010: f90 implementation (Beta).
    07 Sep 2010: f90 local reorthogonalization fixed.
    26 Oct 2012: f90 test program updated.
    05 Aug 2013: Complex f90 version added.
    02 May 2014: Exit message corrected for LS solutions.
    30 Sep 2015: Links added to Julia versions.
    27 Nov 2015: f90 lsmrReverse added (reverse communication).
    05 Jul 2016: Link to C++ implementation added (Tom Vercauteren).