Least-squares approximate solution of overdetermined equations
Suppose $A\in {\mathbf R}^{m \times n}$ is skinny and full rank, $y \in {\mathbf R}^m$, and $x_\mathrm{ls} = (A^TA)^{-1}A^Ty$.
$Ax_\mathrm{ls}$ is the point in $\mathcal R(A)$ closest (in terms of norm) to $y$.
If $y \in \mathcal R (A)$, then $Ax_\mathrm{ls} =y$.
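These two facts can be checked numerically. The sketch below (a hypothetical example; the matrix sizes and random seed are arbitrary) computes $x_\mathrm{ls}$ from the normal equations, verifies that the residual $y - Ax_\mathrm{ls}$ is orthogonal to $\mathcal R(A)$, and confirms that $Ax_\mathrm{ls} = y$ when $y \in \mathcal R(A)$:

```python
import numpy as np

# Hypothetical example: skinny, full-rank A and a generic y.
rng = np.random.default_rng(0)
m, n = 6, 3
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# Least-squares solution x_ls = (A^T A)^{-1} A^T y via the normal equations.
x_ls = np.linalg.solve(A.T @ A, A.T @ y)

# A @ x_ls is the orthogonal projection of y onto range(A):
# the residual y - A @ x_ls is orthogonal to every column of A.
residual = y - A @ x_ls
assert np.allclose(A.T @ residual, 0)

# If y is already in range(A), the residual vanishes and A @ x_ls == y.
y_in_range = A @ np.array([1.0, -2.0, 0.5])
x2 = np.linalg.solve(A.T @ A, A.T @ y_in_range)
assert np.allclose(A @ x2, y_in_range)
```

In practice one would call `np.linalg.lstsq(A, y)` rather than forming $A^TA$, which squares the condition number; the normal equations are used here only to mirror the formula above.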
Suppose $y=Ax+v$, where $x\in {\mathbf R}^n$ is a set of parameters we wish to estimate, $y\in {\mathbf R}^m$ is a set of measurements, and $v$ represents measurement noise. We assume $m>n$ and $A$ is full rank. Consider a linear estimator of the form $\hat x=By$.
The choice $B = A^\dagger = (A^TA)^{-1}A^T$ yields $\hat x = x$ when $v = 0$, since $A^\dagger A = I$; for nonzero noise the estimation error is $\hat x - x = A^\dagger v$, which is small when $v$ is small.
For any left inverse $B$ of $A$, the estimation error is $\hat x - x = Bv$; the choice $B = A^\dagger$ does not, in general, yield the $\hat x$ closest to $x$, since the error depends on the particular noise $v$.
The choice $B = A^\dagger$ yields the estimate $\hat x$ that minimizes $\|A\hat x - y\|$ over all $\hat x \in {\mathbf R}^n$.
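A short sketch of the estimator properties above (all sizes and the noise level are hypothetical): $B = A^\dagger$ is a left inverse of $A$, recovers $x$ exactly in the noise-free case, and has error $\hat x - x = A^\dagger v$ otherwise:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 3
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)

# Pseudo-inverse B = A^dagger = (A^T A)^{-1} A^T; solve (A^T A) B = A^T.
B = np.linalg.solve(A.T @ A, A.T)
assert np.allclose(B @ A, np.eye(n))  # B is a left inverse of A

# Noise-free measurements: y = A x gives exact recovery, x_hat = B y = x.
y_clean = A @ x_true
assert np.allclose(B @ y_clean, x_true)

# With noise v, the estimation error is exactly B v, small when v is small.
v = 1e-6 * rng.standard_normal(m)
x_hat = B @ (y_clean + v)
assert np.allclose(x_hat - x_true, B @ v)
```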
If $B$ is any left inverse of $A$, then $\sum_{i,j} B_{ij}^2 \geq \sum_{i,j} (A^\dagger)_{ij}^2$, i.e., $A^\dagger$ is the smallest left inverse of $A$ in the Frobenius norm. (The entry-wise inequality $|B_{ij}| \geq |A^\dagger_{ij}|$ does not hold in general.)
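The Frobenius-norm minimality can be checked by constructing another left inverse. Any left inverse of $A$ has the form $A^\dagger + C$ with $CA = 0$, i.e., the rows of $C$ lie in the left null space of $A$; the sketch below builds one such $C$ (a hypothetical construction via the SVD) and compares norms:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 6, 3
A = rng.standard_normal((m, n))
A_pinv = np.linalg.pinv(A)  # equals (A^T A)^{-1} A^T for full-rank skinny A

# Build C with C @ A = 0: rows of C drawn from the left null space of A.
U, s, Vt = np.linalg.svd(A)
N = U[:, n:]                                  # columns span null(A^T)
C = rng.standard_normal((n, m - n)) @ N.T     # so C @ A = 0
B = A_pinv + C
assert np.allclose(B @ A, np.eye(n))          # B is also a left inverse

# A^dagger is the smallest left inverse in Frobenius norm:
# ||B||_F^2 = ||A^dagger||_F^2 + ||C||_F^2, since trace(A^dagger C^T) = 0.
assert np.linalg.norm(B, "fro") >= np.linalg.norm(A_pinv, "fro")
```

The norms split as shown because $A^TC^T = (CA)^T = 0$, so the cross term $\mathop{\mathrm{tr}}(A^\dagger C^T)$ vanishes.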