Let $\color{var(--emphColor)}{k} = \min(1/2, \delta_1)$. Since $g'$ is continuous and $g'(p) = 0$, there exists $\delta_2 > 0$
such that
$$|g'(x)| < \color{var(--emphColor)}{k}$$
whenever $x \in (p - \delta_2, p + \delta_2)$.
We take $\color{var(--emphColor)}{k} = \min(1/2, \delta_1)$ here to
ensure $|g'(x)| \leq k < 1$.
Find an interval $I$ containing $p$ such that $g$ is continuous on $I$. $\color{var(--emphColor)}{\checkmark}$
Show $g$ is differentiable on $I$ and $|g'(x)| \leq k < 1$ for
some $k$ and every $x \in I$. $\color{var(--emphColor)}{\checkmark}$
Again, we've used $I'$ because we'll need a subinterval $I \subset I'$
to satisfy the third condition.
$g: I \to I$
Now set $\delta = \min(\delta_1, \delta_2)$ and take $\color{var(--emphColor)}{I = (p - \delta, p+ \delta)}$.
Since $\delta \leq \delta_1$, the interval $I$ is contained in $I' = (p - \delta_1, p+ \delta_1).$
For each $x \in I$, we use the Mean Value Theorem (MVT) to compute
\begin{align}
\color{var(--emphColor)}{|g(x) - p|} &= |g(x) - g(p)| = |g'(c)||x-p| \\
&\leq k|x - p| < |x - p| \\
&< \color{var(--emphColor)}{\delta}.
\end{align}
$\color{var(--emphColor)}{g: I \to I}$
$$|g(x) - p| < \delta$$
For each $\color{var(--emphColor)}{x \in I} = (p - \delta, p + \delta)$,
the distance between $g(x)$ and $p$ is less than $\delta$, so
$\color{var(--emphColor)}{g(x) \in I}$.
Find an interval $I$ containing $p$ such that $g$ is continuous on $I$.
$\color{var(--emphColor)}{\checkmark}$
Show $g$ is differentiable on $I$ and $|g'(x)| \leq k < 1$ for
some $k$ and every $x \in I$. $\color{var(--emphColor)}{\checkmark}$
Ensure that $g : I \to I$. That is, the image $g(I)$ of $I$ under $g$ is
contained in $I$. $\color{var(--emphColor)}{\checkmark}$
Since our interval $I \subseteq I'$, and $I'$ satisfies 1 and 2,
$I$ satisfies all three conditions.
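The three conditions above can be checked concretely. As a minimal sketch (not from the lecture; $g(x) = \cos x$ on $[0, 1]$ is an assumed example where all three conditions hold, since $|g'(x)| = |\sin x| \leq \sin 1 < 1$ there and $g$ maps $[0,1]$ into itself):

```python
import math

def fixed_point(g, x0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = g(x_n) until successive iterates are within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = g(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# g(x) = cos(x) satisfies the three conditions on I = [0, 1],
# so the iteration converges to the unique fixed point in I.
p = fixed_point(math.cos, 0.5)
print(p)                      # approximately 0.7390851332
print(abs(math.cos(p) - p))   # essentially 0: p is a fixed point
```

The convergence here is linear with rate roughly $|g'(p)| = \sin p \approx 0.67$, which is why it takes dozens of iterations to reach the tolerance.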
Newton's convergence
We have just proved the following precise statement.
If $f$ is twice continuously differentiable,
$f(p) = 0$, and $f'(p) \neq 0$, there is an interval $I$
around $p$ such that, by the
convergence theorem for fixed point iteration schemes,
Newton's method is guaranteed to converge to $p$ whenever $x_0 \in I$.
So what's all this fuss about $I$?
The crucial point is that Newton's method converges to the unique
root of $f$ in $I$
if $x_0 \in I$.
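As a minimal sketch of the method itself (assumptions: $f$ is smooth, $f'(p) \neq 0$, and $x_0$ is close enough to $p$; the `newton` helper and the example $f(x) = x^2 - 2$ are ours, not the lecture's):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: fixed-point iteration with g(x) = x - f(x)/f'(x)."""
    x = x0
    for _ in range(max_iter):
        x_next = x - f(x) / fprime(x)
        if abs(x_next - x) < tol:  # stopping condition on successive iterates
            return x_next
        x = x_next
    return x

# f(x) = x^2 - 2 has root p = sqrt(2); x0 = 1 lies in a suitable I.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
print(root)  # approximately 1.41421356
```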
Achilles' heel: Newton's method is guaranteed to converge to a root $p$
only if the starting point is sufficiently close to $p$.
Theory vs. practice
While our convergence theorem may appease our inner mathematician, it
is hard to apply in practice.
In practice we don't usually know $\delta$ a priori: we don't know
which starting points are guaranteed to produce sequences of
approximations converging to a root of $f$.
“Performance and convergence properties of Newton's method
are very sensitive to the choice of starting point.”
Later in the course we'll see how this sensitivity impacts some optimization algorithms,
partly explaining why initializing parameters in the right
way may be critical to your application.
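This sensitivity is easy to see numerically. A sketch with an assumed example (not from the lecture): for $f(x) = \arctan x$, Newton's method converges rapidly to the root $p = 0$ from $x_0 = 0.5$, but from $x_0 = 1.5$ the iterates alternate in sign and grow without bound.

```python
import math

def newton_steps(x0, n=10):
    """Run n Newton steps for f(x) = arctan(x), f'(x) = 1 / (1 + x^2)."""
    x = x0
    for _ in range(n):
        x = x - math.atan(x) * (1 + x * x)
        if abs(x) > 1e6:  # give up: the iterates are blowing up
            return x
    return x

print(newton_steps(0.5))  # converges to the root 0
print(newton_steps(1.5))  # diverges: |x_n| grows without bound
```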
Theory vs. practice
Once again, our theorem does not provide a bound on the absolute error
$|x_n - p|$ at each step.
Thus we must use $|x_{n} - x_{n-1}| < \epsilon$ or
$|f(x_n)| < \epsilon$ as our stopping condition, recalling that each
has its limitations.
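One such limitation, in a short sketch (an assumed example, not from the lecture): for a very flat function the residual test $|f(x_n)| < \epsilon$ can pass even when $x_n$ is nowhere near the root.

```python
# f(x) = (x - 1)^10 has root p = 1, but is extremely flat near it.
f = lambda x: (x - 1) ** 10

eps = 1e-6
x = 1.2                  # absolute error |x - p| = 0.2
print(abs(f(x)) < eps)   # True: the residual is about 1e-7, yet x is far from p
```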
Loose ends
We've mentioned that when both methods converge to the same root,
Newton's method is faster than the bisection method.
What about multiplicity?
Does $f$ have to change sign in an interval
containing $p$ to hope for convergence of Newton's method?
Order of convergence
A sequence $p_n$ which converges to $p$ is said to be of
order
$\alpha > 0$ with asymptotic error constant $\lambda > 0$ if
\begin{equation*}
\lim_{n \to \infty} \frac{|p_{n+1} - p|}{|p_n - p|^{\alpha}} = \lambda.
\end{equation*}
If a sequence $p_n \to p$ is of order $\alpha = 1$, we say it converges
linearly; if it is of order 2, we say it
converges quadratically.
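When $p$ is known, the order can be estimated from three consecutive errors: taking logarithms in the definition gives $\alpha \approx \log(e_{n+1}/e_n) / \log(e_n/e_{n-1})$ with $e_n = |p_n - p|$. A sketch (the helper and the example $f(x) = x^2 - 2$ are assumptions of ours):

```python
import math

def estimate_order(p_seq, p):
    """Estimate the order of convergence from the last three iterates."""
    e = [abs(q - p) for q in p_seq[-3:]]
    return math.log(e[2] / e[1]) / math.log(e[1] / e[0])

# Newton's method on f(x) = x^2 - 2 from x0 = 1; expect alpha close to 2.
xs = [1.0]
for _ in range(4):
    x = xs[-1]
    xs.append(x - (x * x - 2) / (2 * x))

alpha = estimate_order(xs, math.sqrt(2))
print(alpha)  # close to 2
```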
Quadratic Newton, linear bisection
If $f$ is twice continuously differentiable, $f(p) = 0$, and
$f'(p) \neq 0$, there is an interval $I$ around $p$ and a constant $M$
such that the sequence $x_n$ produced by Newton's method with $x_0 \in I$
converges to $p$ with
$$|x_{n+1} - p| < M |x_n - p|^{\color{var(--emphColor)}{2}}.$$
The convergence is quadratic! We'll prove
this statement in our next real-time session.
Quadratic Newton, linear bisection
Recall that if $f$ is continuous and there is an
interval $[a, b]$ such that $f(a) f(b) < 0$, then the sequence
$p_n$ of approximations produced by the bisection method converges
to a root $p$ of $f$ in $[a, b]$ with
$$|p_{n+1} - p| < {1 \over 2} |p_n - p|.$$
In HW1 you will
empirically verify in one example that Newton's method converges faster
than the bisection method.
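The gap between quadratic and linear convergence can be sketched by counting iterations (an assumed example of ours, in the spirit of the HW exercise; for the demo we stop using the true root $p$, which of course is unknown in practice):

```python
import math

f = lambda x: x * x - 2
p = math.sqrt(2)

# Bisection on [1, 2]: the error halves each step (linear convergence).
a, b, bis_iters = 1.0, 2.0, 0
while (b - a) / 2 > 1e-10:
    m = (a + b) / 2
    if f(a) * f(m) < 0:
        b = m
    else:
        a = m
    bis_iters += 1

# Newton from x0 = 1: the error is roughly squared each step.
x, newt_iters = 1.0, 0
while abs(x - p) > 1e-10:
    x = x - f(x) / (2 * x)
    newt_iters += 1

print(bis_iters, newt_iters)  # bisection needs many more iterations
```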
Newton's vs. Bisection method
$\begin{array}{c|c|c|c}
& \text{Convergence?} & \text{Order} & \text{Iteration cost} \\
\hline
\text{Newton's} & \text{Depends on } x_0 & \text{Quadratic} & f(x_n), f'(x_n) \\
\hline
\text{Bisection} & \text{Guaranteed if } f \text{ changes sign on } [a,b] & \text{Linear} & f(x_n)
\end{array}$
Root multiplicity
We say a root $p$ of $f$ is of
multiplicity $m$ if $f$ can be written as
$f(x)= (x-p)^m q (x)$,
where $\lim_{x\to p} q(x)$ exists and is nonzero.
Equivalently, assuming $f$ is sufficiently smooth,
$p$ is of multiplicity $m$ if
$f(p)=f'(p)= \cdots=f^{(m-1)}(p)=0$ and $f^{(m)}(p) \neq0$.
Verifying that these two definitions are equivalent is a great exercise!
Use a Taylor expansion.
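The derivative criterion is easy to check on a concrete polynomial. A sketch (the example $f(x) = (x-1)^3(x+2)$ is an assumption of ours, and we lean on NumPy's polynomial class for exact derivatives):

```python
from numpy.polynomial import Polynomial

# f(x) = (x - 1)^3 (x + 2): root p = 1 of multiplicity m = 3.
# Coefficients are listed low degree first: [-1, 1] means -1 + x.
f = Polynomial([-1, 1]) ** 3 * Polynomial([2, 1])

# The criterion predicts f(1) = f'(1) = f''(1) = 0 and f'''(1) != 0.
vals = [f(1.0)] + [f.deriv(k)(1.0) for k in (1, 2, 3)]
print(vals)  # first three entries vanish; the last does not
```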
Multiplicity and Newton's method
In the last lecture we mentioned that the
bisection method fails
to find roots of even multiplicity.
Newton's method can handle roots of multiplicity
$m > 1$. Convergence can be guaranteed when $x_0$ is close to a root of $f$,
but the convergence is only linear. If the multiplicity $m$ is known in
advance, the method can be
modified to achieve quadratic convergence.
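One standard modification, when $m$ is known, replaces the Newton step with $x_{n+1} = x_n - m\, f(x_n)/f'(x_n)$. A sketch on an assumed example (for $f(x) = (x-1)^2$ the modified step happens to land on the root exactly in one step, since $f$ is exactly quadratic; in general the convergence is quadratic rather than exact):

```python
f = lambda x: (x - 1.0) ** 2       # root p = 1 of multiplicity m = 2
fp = lambda x: 2.0 * (x - 1.0)

def iterate(step, x0, n):
    """Apply the given step map n times starting from x0."""
    x = x0
    for _ in range(n):
        x = step(x)
    return x

# Plain Newton: the error only halves each step (linear convergence).
plain = iterate(lambda x: x - f(x) / fp(x), 2.0, 10)
# Modified Newton with m = 2: quadratic in general, exact on this f.
modified = iterate(lambda x: x - 2 * f(x) / fp(x), 2.0, 1)

print(abs(plain - 1.0), abs(modified - 1.0))
```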
Congratulations! You reached the end of this lecture.
This week's Zoom password is “true” if Newton's method
has been around since the 17th century, else the password is “false.”