Let $r(x) = g(x) - b$ denote the
residual.
In terms of the residual, the optimality condition $\nabla f(x) = 0$
becomes
$$J_g(x)^T r(x) = 0.$$
Since $g$ is non-linear in general, this system is
non-linear.
We could apply Newton's method to this system.
However, differentiating the product $J_g(x)^T r(x)$ requires
the second derivatives of $g$, so a method that avoids them is preferable!
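Concretely, differentiating once more shows that for $f(x) = \tfrac{1}{2}||g(x) - b||^2$ the Hessian is
$$\nabla^2 f(x) = J_g(x)^T J_g(x) + \sum_i r_i(x)\, \nabla^2 g_i(x),$$
so Newton's method would need every second derivative $\nabla^2 g_i$ of the model.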
Gauss-Newton
We'll develop an iterative method based on a
Taylor expansion!
At the $\color{var(--emphColor)}{k}$th step, we approximate
$$g(x^{(\color{var(--emphColor)}{k})} + h) \approx g(x^{(\color{var(--emphColor)}{k})}) + J_g(x^{(\color{var(--emphColor)}{k})}) h.$$
Then we solve the linear least squares
problem
$$\min_h ||J_g(x^{(\color{var(--emphColor)}{k})}) h - (b - g(x^{(\color{var(--emphColor)}{k})})) ||^2$$
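This subproblem arises by substituting the linearization into the objective:
$$||g(x^{(\color{var(--emphColor)}{k})} + h) - b||^2 \approx ||g(x^{(\color{var(--emphColor)}{k})}) + J_g(x^{(\color{var(--emphColor)}{k})}) h - b||^2.$$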
We may solve the
linear least squares problem
in the variable $h$ by
solving the
normal equations
$$J_g(x^{(\color{var(--emphColor)}{k})})^T J_g(x^{(\color{var(--emphColor)}{k})}) h
= -J_g(x^{(\color{var(--emphColor)}{k})})^T r(x^{(\color{var(--emphColor)}{k})}).$$
Pseudocode
\begin{algorithm}
\caption{Gauss-Newton method}
\begin{algorithmic}
\Input{initial guess $x^{(0)}$, tolerance $\epsilon$}
\FOR{$k \gets 0$ to max\_iter}
\STATE{$J \gets J_g(x^{(k)})$}
\STATE{solve $(J^T J) h = -J^T r(x^{(k)})$ for $h$}
\STATE{$x^{(k+1)} \gets x^{(k)} + h$}
\IF{$||h|| < \epsilon$}
\STATE{break}
\ENDIF
\ENDFOR
\end{algorithmic}
\end{algorithm}
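As a sketch (not part of the lecture itself), the pseudocode above might be implemented in Python/NumPy as follows. The names `gauss_newton`, `g`, and `jac` are our own; the normal equations are solved through `np.linalg.lstsq`, which is mathematically equivalent to forming $(J^T J) h = -J^T r$ but avoids building $J^T J$ explicitly.

```python
import numpy as np

def gauss_newton(g, jac, b, x0, tol=1e-8, max_iter=50):
    """Minimize ||g(x) - b||^2 by the Gauss-Newton iteration.

    g   : callable returning the model vector g(x)
    jac : callable returning the Jacobian J_g(x)
    b   : data vector, x0 : initial guess
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = g(x) - b              # residual r(x) = g(x) - b
        J = jac(x)                # Jacobian of g at the current iterate
        # Least-squares solve of J h = -r; its normal equations are
        # exactly (J^T J) h = -J^T r from the pseudocode.
        h, *_ = np.linalg.lstsq(J, -r, rcond=None)
        x = x + h
        if np.linalg.norm(h) < tol:   # stop when the update is small
            break
    return x
```

When $g$ is linear, a single step recovers the ordinary least-squares solution, which makes for a quick sanity check.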
The convergence properties of the Gauss-Newton method,
like those of Newton's method,
depend on the choice of initial guess.
At the heart of our method is a linear system involving
the Jacobian of $g$.
We stop iterating when consecutive updates are sufficiently
close to each other.
(Bacterial population modeling)
Consider a bacterial population $N(t)$ which obeys the
logistic growth
equation
$${ dN \over dt} = rN (1 - {N \over \kappa}), \quad N(0) = N_0,$$
where $r$ is the growth rate, $\kappa$ is the carrying capacity, and $N_0 = N(0)$ is the initial population.
This means that the population $N(t)$ grows like
$$N(t) = {\color{var(--emphColor)}{N_0 \kappa} e^{\color{var(--emphColor)}{r}t} \over \color{var(--emphColor)}{\kappa} + \color{var(--emphColor)}{N_0} (e^{\color{var(--emphColor)}{r}t} - 1)}.$$
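For concreteness, the closed-form solution can be evaluated with a short Python helper (a sketch; the name `logistic` is ours):

```python
import numpy as np

def logistic(t, N0, kappa, r):
    """Closed-form solution of the logistic growth equation:
    N(t) = N0 * kappa * e^{rt} / (kappa + N0 * (e^{rt} - 1))."""
    e = np.exp(r * t)
    return N0 * kappa * e / (kappa + N0 * (e - 1.0))
```

A quick check: $N(0) = N_0$, and for large $t$ the population approaches the carrying capacity $\kappa$.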
Suppose the lab next door gives us measurements of the bacterial
population $b_i$ at times $t_i$, for $i = 1, 2, \ldots, m$.
Our task is to find the parameters
$$\mathbf{x} = \begin{bmatrix} \color{var(--emphColor)}{N_0} \\ \color{var(--emphColor)}{\kappa} \\ \color{var(--emphColor)}{r} \end{bmatrix}$$
that best fit the data.
In other words, we seek to solve the minimization problem
$$\min_{\color{var(--emphColor)}{\mathbf{x}}} \sum_{i=1}^m \bigg({\color{var(--emphColor)}{N_0 \kappa} e^{\color{var(--emphColor)}{r}t_i} \over \color{var(--emphColor)}{\kappa} + \color{var(--emphColor)}{N_0} (e^{\color{var(--emphColor)}{r}t_i} - 1)} - b_i \bigg)^2$$
Equivalently, we can stack the model predictions and measurements into vectors:
$$\min_{\color{var(--emphColor)}{\mathbf{x}}} \bigg| \bigg|
\begin{bmatrix}
{\color{var(--emphColor)}{N_0 \kappa} e^{\color{var(--emphColor)}{r}t_1} \over \color{var(--emphColor)}{\kappa} + \color{var(--emphColor)}{N_0} (e^{\color{var(--emphColor)}{r}t_1} - 1)} \\
\vdots \\
{\color{var(--emphColor)}{N_0 \kappa} e^{\color{var(--emphColor)}{r}t_m} \over \color{var(--emphColor)}{\kappa} + \color{var(--emphColor)}{N_0} (e^{\color{var(--emphColor)}{r}t_m} - 1)}
\end{bmatrix}
-
\begin{bmatrix}
b_1 \\
\vdots \\
b_m
\end{bmatrix}
\bigg|\bigg|^2$$
In terms of the components of $\mathbf{x}$, this is
$$\min_{\color{var(--emphColor)}{\mathbf{x}}} \bigg| \bigg|
\begin{bmatrix}
{\color{var(--emphColor)}{x_1 x_2} e^{\color{var(--emphColor)}{x_3}t_1} \over \color{var(--emphColor)}{x_2} + \color{var(--emphColor)}{x_1} (e^{\color{var(--emphColor)}{x_3}t_1} - 1)} \\
\vdots \\
{\color{var(--emphColor)}{x_1 x_2} e^{\color{var(--emphColor)}{x_3}t_m} \over \color{var(--emphColor)}{x_2} + \color{var(--emphColor)}{x_1} (e^{\color{var(--emphColor)}{x_3}t_m} - 1)}
\end{bmatrix}
-
\begin{bmatrix}
b_1 \\
\vdots \\
b_m
\end{bmatrix}
\bigg|\bigg|^2$$
Recall $\mathbf{x} = \begin{bmatrix} N_0 & \kappa & r \end{bmatrix}^T$.
Abbreviating the $i$th model prediction as $g_i(\mathbf{x})$, the problem becomes
$$\min_{\color{var(--emphColor)}{\mathbf{x}}} \bigg| \bigg|
\begin{bmatrix}
g_1(\color{var(--emphColor)}{\mathbf{x}}) \\
\vdots \\
g_m(\color{var(--emphColor)}{\mathbf{x}})
\end{bmatrix}
-
\begin{bmatrix}
b_1 \\
\vdots \\
b_m
\end{bmatrix}
\bigg|\bigg|^2$$
or, compactly,
$$\min_{\color{var(--emphColor)}{\mathbf{x}}} ||g(\color{var(--emphColor)}{\mathbf{x}}) - b ||^2$$
In order to implement the Gauss-Newton scheme, we must compute the
Jacobian of $g$, with $(i, j)$-entry given by ${\partial g_i \over \partial x_j}$.
For instance, the second column of the Jacobian is
$$(J_g(\mathbf{x}))_{i,\color{var(--emphColor)}{2}} = \begin{bmatrix}
{\partial g_1 \over \partial \color{var(--emphColor)}{x_2}} \\
\vdots \\
{\partial g_m \over \partial \color{var(--emphColor)}{x_2}}
\end{bmatrix}
$$
Writing out $g_i$ explicitly, the second column is
$$(J_g(\mathbf{x}))_{i,\color{var(--emphColor)}{2}} = \begin{bmatrix}
{\partial \over \partial \color{var(--emphColor)}{x_2}}\bigg[{x_1\color{var(--emphColor)}{ x_2} e^{x_3 t_1} \over \color{var(--emphColor)}{x_2} + x_1 (e^{x_3 t_1} - 1)}\bigg] \\
\vdots \\
{\partial \over \partial \color{var(--emphColor)}{x_2}}\bigg[{x_1\color{var(--emphColor)}{ x_2} e^{x_3 t_m} \over \color{var(--emphColor)}{x_2} + x_1 (e^{x_3 t_m} - 1)}\bigg] \\
\end{bmatrix}.
$$
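Rather than coding each partial derivative by hand, one pragmatic sketch approximates the Jacobian columns by central finite differences. This is an illustrative shortcut, not the analytic Jacobian derived above, and the names `g_model` and `jac_fd` are ours:

```python
import numpy as np

def g_model(x, t):
    """Model vector g(x): the logistic curve evaluated at each time t_i,
    with x = (N0, kappa, r)."""
    N0, kappa, r = x
    e = np.exp(r * t)
    return N0 * kappa * e / (kappa + N0 * (e - 1.0))

def jac_fd(x, t, eps=1e-6):
    """Finite-difference Jacobian: column j approximates dg/dx_j
    by a central difference in the j-th parameter."""
    J = np.empty((t.size, x.size))
    for j in range(x.size):
        xp = x.copy(); xp[j] += eps
        xm = x.copy(); xm[j] -= eps
        J[:, j] = (g_model(xp, t) - g_model(xm, t)) / (2 * eps)
    return J
```

The second column of `jac_fd` can be compared against the analytic derivative $\partial g_i / \partial x_2 = x_1^2 e^{x_3 t_i}(e^{x_3 t_i} - 1) / (x_2 + x_1(e^{x_3 t_i} - 1))^2$, which follows from the quotient rule applied to the expression above.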
Congratulations! You reached the end of this lecture.