Let $\color{var(--emphColor)}{k} = \min(1/2, \delta_1)$. Since $g'$ is continuous and $g'(p) = 0$, there exists $\delta_2 > 0$
such that
$$|g'(x)| < \color{var(--emphColor)}{k}$$
whenever $x \in (p - \delta_2, p + \delta_2)$.
We take $\color{var(--emphColor)}{k} = \min(1/2, \delta_1)$ here to
ensure $|g'(x)| \leq k < 1$.
Find an interval $I$ containing $p$ such that $g$ is continuous on $I$. $\color{var(--emphColor)}{\checkmark}$
Show $g$ is differentiable on $I$ and $|g'(x)| \leq k < 1$ for
some $k$ and every $x \in I$. $\color{var(--emphColor)}{\checkmark}$
Again, we've used $I'$ because we'll need a subinterval $I \subset I'$
to satisfy the third condition.
$g: I \to I$
Now set $\delta = \min(\delta_1, \delta_2)$ and take $\color{var(--emphColor)}{I = (p - \delta, p+ \delta)}$.
Since $\delta \leq \delta_1$, the interval $I$ is contained in $I' = (p - \delta_1, p+ \delta_1).$
For each $x \in I$, we use the Mean Value Theorem (MVT) to compute
\begin{align}
\color{var(--emphColor)}{|g(x) - p|} &= |g(x) - g(p)| = |g'(c)||x-p| \\
&\leq k|x - p| < |x - p| \\
&< \color{var(--emphColor)}{\delta}.
\end{align}
$\color{var(--emphColor)}{g: I \to I}$
$$|g(x) - p| < \delta$$
For each $\color{var(--emphColor)}{x \in I} = (p - \delta, p + \delta)$,
the distance between $g(x)$ and $p$ is less than $\delta$, so
$\color{var(--emphColor)}{g(x) \in I}$.
Find an interval $I$ containing $p$ such that $g$ is continuous on $I$.
$\color{var(--emphColor)}{\checkmark}$
Show $g$ is differentiable on $I$ and $|g'(x)| \leq k < 1$ for
some $k$ and every $x \in I$. $\color{var(--emphColor)}{\checkmark}$
Ensure that $g : I \to I$. That is, the image $g(I)$ of $I$ under $g$ is
contained in $I$. $\color{var(--emphColor)}{\checkmark}$
Since our interval $I \subseteq I'$, and $I'$ satisfies 1 and 2,
$I$ satisfies all three conditions.
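The three conditions above can be checked concretely. As a minimal sketch (not from the lecture; $g(x) = \cos x$ on $[0, 1]$ is an assumed example where all three conditions hold, since $|g'(x)| = |\sin x| \leq \sin 1 < 1$ there and $g$ maps $[0,1]$ into itself):

```python
import math

def fixed_point(g, x0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = g(x_n) until successive iterates are within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# g(x) = cos(x) satisfies the three conditions on I = [0, 1],
# so the iteration converges to the unique fixed point in I.
p = fixed_point(math.cos, 0.5)
print(p)                      # approximately 0.7390851332
print(abs(math.cos(p) - p))   # essentially 0: p is a fixed point
```

The convergence here is linear with rate roughly $|g'(p)| = \sin p \approx 0.67$, which is why it takes dozens of iterations to reach the tolerance.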
Newton's convergence
We have just proved the following precise statement.
If $f$ is twice continuously differentiable,
$f(p) = 0$, and $f'(p) \neq 0$, there is an interval $I$
around $p$ such that, by the
convergence theorem for fixed point iteration schemes,
Newton's method is guaranteed to converge to $p$ whenever $x_0 \in I$.
So what's all this fuss about $I$?
The crucial point is that Newton's method converges to the unique
root of $f$ in $I$
if $x_0 \in I$.
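As a minimal sketch of the method itself (assumptions: $f$ is smooth, $f'(p) \neq 0$, and $x_0$ is close enough to $p$; the `newton` helper and the example $f(x) = x^2 - 2$ are ours, not the lecture's):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: fixed-point iteration with g(x) = x - f(x)/f'(x)."""
    x = x0
    for _ in range(max_iter):
        x_next = x - f(x) / fprime(x)
        if abs(x_next - x) < tol:  # stopping condition on successive iterates
            return x_next
        x = x_next
    return x

# f(x) = x^2 - 2 has root p = sqrt(2); x0 = 1 lies in a suitable I.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
print(root)  # approximately 1.41421356
```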
Achilles' heel: Newton's method is guaranteed to converge to a root $p$
only if the starting point is sufficiently close to $p$.
Theory vs. practice
While our convergence theorem may appease our inner mathematician, it
is hard to apply in practice.
In practice we don't usually know $\delta$ a priori: we don't know
which starting points are guaranteed to produce sequences of
approximations converging to a root of $f$.
“Performance and convergence properties of Newton's method
are very sensitive to the choice of starting point.”
Later in the course we'll see how this sensitivity impacts some optimization algorithms,
partly explaining why initializing parameters in the right
way may be critical to your application.
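This sensitivity is easy to see numerically. A sketch with an assumed example (not from the lecture): for $f(x) = \arctan x$, Newton's method converges rapidly to the root $p = 0$ from $x_0 = 0.5$, but from $x_0 = 1.5$ the iterates alternate in sign and grow without bound.

```python
import math

def newton_steps(x0, n=10):
    """Run n Newton steps for f(x) = arctan(x), f'(x) = 1 / (1 + x^2)."""
    x = x0
    for _ in range(n):
        x = x - math.atan(x) * (1 + x * x)
        if abs(x) > 1e6:  # give up: the iterates are blowing up
            return x
    return x

print(newton_steps(0.5))  # converges to the root 0
print(newton_steps(1.5))  # diverges: |x_n| grows without bound
```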
Theory vs. practice
Once again, our theorem does not provide a bound on the absolute error
$|x_n - p|$ at each step.
Thus we must use $|x_{n} - x_{n-1}| < \epsilon$ or
$|f(x_n)| < \epsilon$ as our stopping condition, recalling that each
has its limitations.
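One such limitation, in a short sketch (an assumed example, not from the lecture): for a very flat function the residual test $|f(x_n)| < \epsilon$ can pass even when $x_n$ is nowhere near the root.

```python
# f(x) = (x - 1)^10 has root p = 1, but is extremely flat near it.
f = lambda x: (x - 1) ** 10

eps = 1e-6
x = 1.2                  # absolute error |x - p| = 0.2
print(abs(f(x)) < eps)   # True: the residual is about 1e-7, yet x is far from p
```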
Loose ends
We've mentioned that when both methods converge to the same root,
Newton's method is faster than the bisection method.
What about multiplicity?
Does $f$ have to change sign in an interval
containing $p$ to hope for convergence of Newton's method?
Order of convergence
A sequence $p_n$ which converges to $p$ is said to be of
order
$\alpha > 0$ with asymptotic error constant $\lambda > 0$ if
\begin{equation*}
\lim_{n \to \infty} \frac{|p_{n+1} - p|}{|p_n - p|^{\alpha}} = \lambda.
\end{equation*}
If a sequence $p_n \to p$ is of order $\alpha = 1$, we say it converges
linearly; if it is of order 2, we say it
converges quadratically.
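When $p$ is known, the order can be estimated from three consecutive errors: taking logarithms in the definition gives $\alpha \approx \log(e_{n+1}/e_n) / \log(e_n/e_{n-1})$ with $e_n = |p_n - p|$. A sketch (the helper and the example $f(x) = x^2 - 2$ are assumptions of ours):

```python
import math

def estimate_order(p_seq, p):
    """Estimate the order of convergence from the last three iterates."""
    e = [abs(q - p) for q in p_seq[-3:]]
    return math.log(e[2] / e[1]) / math.log(e[1] / e[0])

# Newton's method on f(x) = x^2 - 2 from x0 = 1; expect alpha close to 2.
xs = [1.0]
for _ in range(4):
    x = xs[-1]
    xs.append(x - (x * x - 2) / (2 * x))

alpha = estimate_order(xs, math.sqrt(2))
print(alpha)  # close to 2
```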
Quadratic Newton, linear bisection
If $f$ is twice continuously differentiable, $f(p) = 0$, and
$f'(p) \neq 0$, there is an interval $I$ around $p$ and a constant $M$
such that the sequence $x_n$ produced by Newton's method with $x_0 \in I$
converges to $p$ with
$$|x_{n+1} - p| < M |x_n - p|^{\color{var(--emphColor)}{2}}.$$
The convergence is quadratic! We'll prove
this statement in our next real-time session.
Quadratic Newton, linear bisection
Recall that if $f$ is continuous and there is an
interval $[a, b]$ such that $f(a) f(b) < 0$, then the sequence
$p_n$ of approximations produced by the bisection method converges
to a root $p$ of $f$ in $[a, b]$ with
$$|p_{n+1} - p| < {1 \over 2} |p_n - p|.$$
In HW1 you will
empirically verify in one example that Newton's method converges faster
than the bisection method.
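The gap between quadratic and linear convergence can be sketched by counting iterations (an assumed example of ours, in the spirit of the HW exercise; for the demo we stop using the true root $p$, which of course is unknown in practice):

```python
import math

f = lambda x: x * x - 2
p = math.sqrt(2)

# Bisection on [1, 2]: the error halves each step (linear convergence).
a, b, bis_iters = 1.0, 2.0, 0
while (b - a) / 2 > 1e-10:
    m = (a + b) / 2
    if f(a) * f(m) < 0:
        b = m
    else:
        a = m
    bis_iters += 1

# Newton from x0 = 1: the error is roughly squared each step.
x, newt_iters = 1.0, 0
while abs(x - p) > 1e-10:
    x = x - f(x) / (2 * x)
    newt_iters += 1

print(bis_iters, newt_iters)  # bisection needs many more iterations
```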
Newton's vs. Bisection method
$\begin{array}{c|c|c|c}
& \text{Convergence?} & \text{Order} & \text{Iteration cost} \\
\hline
\text{Newton's} & \text{Depends on } x_0 & \text{Quadratic} & f(x_n), f'(x_n) \\
\hline
\text{Bisection} & \text{Guaranteed if } f \text{ changes sign on } [a,b] & \text{Linear} & f(x_n)
\end{array}$
Root multiplicity
We say a root $p$ of $f$ is of
multiplicity $m$ if $f$ can be written as
$f(x)= (x-p)^m q (x)$,
where $\lim_{x\to p} q(x)$ exists and is nonzero.
Equivalently, assuming $f$ is sufficiently smooth,
$p$ is of multiplicity $m$ if
$f(p)=f'(p)= \cdots=f^{(m-1)}(p)=0$ and $f^{(m)}(p) \neq0$.
Verifying that these two definitions are equivalent is a great exercise!
Use a Taylor expansion.
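The derivative criterion is easy to check on a concrete polynomial. A sketch (the example $f(x) = (x-1)^3(x+2)$ is an assumption of ours, and we lean on NumPy's polynomial class for exact derivatives):

```python
from numpy.polynomial import Polynomial

# f(x) = (x - 1)^3 (x + 2): root p = 1 of multiplicity m = 3.
# Coefficients are listed low degree first: [-1, 1] means -1 + x.
f = Polynomial([-1, 1]) ** 3 * Polynomial([2, 1])

# The criterion predicts f(1) = f'(1) = f''(1) = 0 and f'''(1) != 0.
vals = [f(1.0)] + [f.deriv(k)(1.0) for k in (1, 2, 3)]
print(vals)  # first three entries vanish; the last does not
```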
Multiplicity and Newton's method
In the last lecture we mentioned that the
bisection method fails
to find roots of even multiplicity.
Newton's method can handle roots of multiplicity
$m > 1$. Convergence can be guaranteed when $x_0$ is close to a root of $f$,
but the convergence is only linear. If the multiplicity $m$ is known in
advance, the method can be
modified to achieve quadratic convergence.
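One standard modification, when $m$ is known, replaces the Newton step with $x_{n+1} = x_n - m\, f(x_n)/f'(x_n)$. A sketch on an assumed example (for $f(x) = (x-1)^2$ the modified step happens to land on the root exactly in one step, since $f$ is exactly quadratic; in general the convergence is quadratic rather than exact):

```python
f = lambda x: (x - 1.0) ** 2       # root p = 1 of multiplicity m = 2
fp = lambda x: 2.0 * (x - 1.0)

def iterate(step, x0, n):
    """Apply the given step map n times starting from x0."""
    x = x0
    for _ in range(n):
        x = step(x)
    return x

# Plain Newton: the error only halves each step (linear convergence).
plain = iterate(lambda x: x - f(x) / fp(x), 2.0, 10)
# Modified Newton with m = 2: quadratic in general, exact on this f.
modified = iterate(lambda x: x - 2 * f(x) / fp(x), 2.0, 1)

print(abs(plain - 1.0), abs(modified - 1.0))
```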
Congratulations! You reached the end of this lecture.
This week's Zoom password is “true” if Newton's method
has been around since the 17th century, else the password is “false.”