Interpolation

GUIDING QUESTION: How can I construct and evaluate a polynomial interpolant efficiently and with high numerical precision?

Motivation

If we use the defining formula directly, each evaluation of the interpolating polynomial costs $O(n^2)$ operations and the roundoff error may easily accumulate.
What's more, if we receive a new data point, we must re-compute everything from scratch.

Barycentric weights

We address these numerical issues by re-writing the Lagrange basis polynomials in a more convenient form.
Given nodes $x_0, x_1, \ldots, x_n$, define the barycentric weights $w_j$ by \begin{align} \color{var(--emphColor)}{w_j}= \frac{1}{\prod_{k\not= j}(x_j - x_k)}. \end{align}
The $j$th barycentric weight $\color{var(--emphColor)}{w_j}$ is the reciprocal of the denominator of the $j$th Lagrange basis polynomial $L_j$.
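As a concrete sketch (plain NumPy; the helper name `barycentric_weights` is ours), the weights can be computed straight from the definition in $O(n^2)$ operations:

```python
import numpy as np

def barycentric_weights(x):
    """w_j = 1 / prod_{k != j} (x_j - x_k), computed in O(n^2)."""
    x = np.asarray(x, dtype=float)
    w = np.ones(len(x))
    for j in range(len(x)):
        for k in range(len(x)):
            if k != j:
                w[j] /= x[j] - x[k]
    return w

# For nodes 0, 1, 2: w_0 = 1/((0-1)(0-2)) = 1/2, w_1 = -1, w_2 = 1/2
print(barycentric_weights([0.0, 1.0, 2.0]))
```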

Barycentric formula

Let $\color{var(--emphColor)}{\ell}(x)=(x- x_0)(x- x_1) \cdots (x-x_n)$.

Then the $j$th Lagrange basis polynomial $L_j$ becomes

$$L_j(x) = \prod_{\substack{k = 0 \\ k \neq j}}^n{x - x_k \over x_j - x_k} = \color{var(--emphColor)}{\ell}(x) \frac{w_j}{x-x_j}.$$

Barycentric formula

The interpolating polynomial then becomes \begin{align*} p(x) &= \sum^n_{j=0} y_j L_j(x) \\ &=\color{var(--emphColor)}{\ell}(x) \sum^n_{j=0} \frac{w_j}{x-x_j} y_j. \end{align*}
The polynomial $\color{var(--emphColor)}{\ell}$ is independent of $j$, so it is a common factor across all summands.

Barycentric formula++

Let's re-write the interpolant one more time.
If we were interpolating the constant function $f(x) = 1$, then
  1. $y_j = 1$ for every node $x_j$, and
  2. the interpolating polynomial $p$ must be the constant $p(x) = 1$ as well, since it is a polynomial of degree at most $n$ that interpolates $f$ and there is only one such polynomial.

Barycentric formula++

So we obtain an equation expressing $\color{var(--emphColor)}{\ell}$ in terms of the barycentric weights:
$1 = p(x) = \color{var(--emphColor)}{\ell}(x)\sum^n_{j=0} \frac{w_j}{x-x_j},$

or equivalently, $$\color{var(--emphColor)}{\ell}(x) = 1/ \left( \sum^n_{j=0} \frac{w_j}{x-x_j} \right).$$

Barycentric formula++

We combine our results to obtain the barycentric formula $$p(x) = \frac{\sum^n_{j=0}\frac{w_j}{x-x_j}y_j}{\sum^n_{j=0} \frac{w_j}{x-x_j}}.$$
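A minimal sketch of evaluating the barycentric formula (function and variable names are ours; the weights below are those of the nodes $0, 1, 2$), with the node-coincidence case handled separately since the formula degenerates there:

```python
def bary_eval(nodes, y, w, x):
    """Evaluate the interpolant at x via the barycentric formula, in O(n)."""
    num, den = 0.0, 0.0
    for xj, yj, wj in zip(nodes, y, w):
        if x == xj:          # x is a node: return the datum directly
            return yj
        t = wj / (x - xj)
        num += t * yj
        den += t
    return num / den

# Interpolate y = x^2 at nodes 0, 1, 2 (weights 1/2, -1, 1/2)
nodes = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 4.0]
w = [0.5, -1.0, 0.5]
print(bary_eval(nodes, y, w, 1.5))   # ≈ 2.25: the interpolant of x^2 data is x^2
```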

So what's the big deal?

The barycentric formula enjoys a few key advantages.
  1. Once we've computed the weights, we may evaluate the interpolating polynomial in $O(n)$.
  2. The evaluation is much more numerically stable.
  3. It's easy to update if more data becomes available.
    To include a new point $(x_{n+1},y_{n+1})$,
    1. divide every $w_j$ by $x_j- x_{n+1}$ using $O(n)$ operations, and
    2. compute the new weight $w_{n+1}$ from its definition using another $O(n)$ operations.
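The update can be sketched as follows (NumPy; `add_node` is our own name). Step 1 divides the old weights by $x_j - x_{n+1}$; step 2 computes the new weight from its definition:

```python
import numpy as np

def add_node(nodes, w, x_new):
    """Update barycentric weights after appending node x_new, in O(n)."""
    nodes = np.asarray(nodes, dtype=float)
    w = np.asarray(w, dtype=float) / (nodes - x_new)  # step 1: O(n) divisions
    w_new = 1.0 / np.prod(x_new - nodes)              # step 2: O(n) products
    return np.append(nodes, x_new), np.append(w, w_new)

# Start from nodes 0, 1, 2 (weights 1/2, -1, 1/2) and append the node 3
nodes, w = add_node([0.0, 1.0, 2.0], [0.5, -1.0, 0.5], 3.0)
print(w)   # matches the weights recomputed from scratch for nodes 0, 1, 2, 3
```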
If you want to plot the interpolating polynomial, you must evaluate it at many points, so pre-computing the barycentric weights will lead to big savings!
The barycentric weights are known for evenly-spaced nodes, so you don't even have to compute them!
Barycentric weights are also known for Chebyshev points, which we'll define in a few slides.
For more details, please refer to Berrut and Trefethen's Barycentric Lagrange Interpolation.

Good fit?

Suppose the data points $(x_0, y_0), \ldots, (x_n, y_n)$ are sampled from some continuous $f$.
How well does the interpolating polynomial $p$ approximate $f$? When is $p$ a good proxy for the unknown function $f$?
Consider $n+1$ distinct nodes $x_i$ in an interval $[a, b]$. If $(x_i, y_i)$ are samples of the $(n+1)$-times continuously differentiable function $f$, and $p$ denotes the interpolating polynomial of degree at most $n$, then for each $x \in [a, b]$ there exists a number $\xi \in [a, b]$ such that $$f(x)= p(x) + \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod^n_{i=0} (x-x_i).$$
Much like Taylor's approximation theorem, this theorem gives $f$ as a sum of a polynomial and an error term.

Pre-requisites

To prove our approximation theorem, we'll need Rolle's theorem and its generalization, which are consequences of the Intermediate Value Theorem (IVT).

Rolle's

Rolle's theorem: if $g$ is continuous on $[a, b]$, differentiable on $(a, b)$, and $g(a) = g(b)$, then $g'(c) = 0$ for some $c \in (a, b)$. Applying it repeatedly gives the generalization we need: if $g$ is $(n+1)$-times differentiable on $[a, b]$ and has $n+2$ distinct roots there, then $g^{(n+1)}(\xi) = 0$ for some $\xi \in [a, b]$.

Interpolation error

(Interpolation error.) Since $\prod^n_{i=0} (x_j-x_i) = 0$ and $p(x_j) = f(x_j)$, the statement holds when $x = x_j$ is one of the given abscissas.
It remains to show the conclusion holds when $x \in [a, b]$ is not one of the given nodes.

Interpolation error

(Interpolation error.) Fix $x$ and consider the following gadget, which is a function of $t$: $g(t) = \displaystyle f(t) - p(t) - \left [f(x) - p(x) \right ]\prod^n_{i=0} \frac{t-x_i}{x-x_i}$.
When $t = \color{var(--emphColor)}{x_j}$, $g(\color{var(--emphColor)}{x_j}) = \displaystyle f(\color{var(--emphColor)}{x_j}) - p(\color{var(--emphColor)}{x_j}) - [f(x) -p(x)]\prod^n_{i=0}\frac{\color{var(--emphColor)}{x_j} - x_i}{x - x_i}$ $\qquad =0,$
because the product term vanishes and $p(x_j) = f(x_j)$.

Interpolation error

(Interpolation error.) Similarly, when $t$ equals the fixed number $\color{var(--emphColor)}{x}$, $g(\color{var(--emphColor)}{x}) = \displaystyle f(\color{var(--emphColor)}{x}) - p(\color{var(--emphColor)}{x}) - [f(x) -p(x)]\prod^n_{i=0}\frac{\color{var(--emphColor)}{x} - x_i}{x - x_i}$ $\qquad =0,$
because the product term collapses: it is simply $1$.
So our gadget has $n+2$ distinct roots ($t = x, x_0, x_1, \ldots, x_n$).
It is also $(n+1)$-times differentiable with respect to $t$, being the difference of the $(n+1)$-times differentiable function $f$ and a polynomial.

Interpolation error

(Interpolation error.) By the generalized Rolle's theorem, $$g^{(n+1)} (\color{var(--emphColor)}{\xi})=0$$ for some $\color{var(--emphColor)}{\xi} \in [a, b]$.
You should now ask yourself,
What does $g^{(n+1)}$ look like?

Interpolation error

(Interpolation error.) The interpolant $p$ is a polynomial of degree at most $n$, so $p^{(n+1)} (t) = 0$ for every $t$.
The product, $\prod_{i=0}^n \frac{t - x_i}{x - x_i}$, is also a polynomial (in $t$).
In particular, it is a polynomial of degree $\color{var(--emphColor)}{n+1}$ with leading coefficient $1/\prod (x - x_i)$, so $$\frac{d^{\color{var(--emphColor)}{n+1}}}{dt^{\color{var(--emphColor)}{n+1}}} \left[\prod^n_{i=0} \frac{t-x_i}{x-x_i}\right] = { (\color{var(--emphColor)}{n+1})! \over \prod(x-x_i)}.$$

Interpolation error

(Interpolation error.) Therefore, $$g^{(n+1)}(\xi) = f^{(n+1)}(\xi) - 0 - [f(x) - p(x)]{ (n+1)! \over \prod(x-x_i)}.$$ Since $g^{(n+1)} ({\xi})=0,$ we can rearrange to conclude that $$f(x) = p(x) + \frac{f^{(n+1)}(\xi)}{(n+1)!}\left[\prod^n_{i=0} (x-x_i)\right].$$

OK, so what?

Our theorem gives us a bound on the interpolation error!
$$\max_{x\in[a,b ]}|f(x) - p(x)| \leq \frac{1}{(n+1)!} \max_{t\in [a,b]}|f^{(n+1)}(t)| \cdot \max _{s\in[a,b]} |\ell(s)|,$$ with $\ell(s) = (s-x_0)(s - x_1) \cdots (s-x_n)$.
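As a sanity check (a sketch with our own variable names), we can compare the actual interpolation error with this bound for $f = \sin$ on $[0, \pi]$ with the three nodes $0, \pi/2, \pi$, where $n = 2$ and $|f'''| \leq 1$:

```python
import numpy as np

f = np.sin
nodes = np.array([0.0, np.pi / 2, np.pi])

# Barycentric weights for the three nodes
w = np.array([1.0 / np.prod(nodes[j] - np.delete(nodes, j)) for j in range(3)])

xs = np.linspace(0.0, np.pi, 1000)[1:-1]       # off-node evaluation grid
t = w / (xs[:, None] - nodes)                  # barycentric terms, one row per x
p = (t @ f(nodes)) / t.sum(axis=1)             # interpolant values
err = np.max(np.abs(p - f(xs)))

ell = np.prod(xs[:, None] - nodes, axis=1)     # node polynomial ell(s)
bound = np.max(np.abs(ell)) / 6.0              # (n+1)! = 3! = 6 and |f'''| <= 1
print(err, bound)                              # the error sits below the bound
```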


Our theorem also suggests that the interpolation error depends on the choice of nodes $x_i$. Recall Problem 4 on HW4.

More data = better interpolation?

Now that we can quantify the interpolation error, we may ask the following question.
Does the approximation improve if we have more samples?
Not necessarily!
In fact, in some cases, the interpolation error may grow with the number of nodes.

Runge's phenomenon
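The classic illustration is Runge's function $f(x) = 1/(1+25x^2)$ on $[-1, 1]$. A minimal sketch (helper names are ours) shows the maximum error of equispaced interpolation growing with $n$:

```python
import numpy as np

def runge(x):
    """Runge's function, the classic example."""
    return 1.0 / (1.0 + 25.0 * x**2)

def max_error(n):
    """Max |f - p| on a fine grid, for n+1 equispaced nodes on [-1, 1]."""
    nodes = np.linspace(-1.0, 1.0, n + 1)
    w = np.array([1.0 / np.prod(nodes[j] - np.delete(nodes, j))
                  for j in range(n + 1)])
    grid = np.linspace(-0.999, 0.999, 800)     # off-node evaluation grid
    t = w / (grid[:, None] - nodes)            # barycentric terms
    p = (t @ runge(nodes)) / t.sum(axis=1)
    return np.max(np.abs(p - runge(grid)))

errors = {n: max_error(n) for n in (5, 10, 20)}
print(errors)   # the error grows with n, blowing up near the endpoints
```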

Towards better points

Suppose we want to interpolate a function $f$ on the interval $[a, b]$, and suppose you can choose where to sample $f$.
Which nodes would you pick?
We'll show that the Chebyshev points, the roots of the Chebyshev polynomials, are the “best” nodes, in the sense that they minimize the node-dependent factor $\max_{s}|\ell(s)|$ in the interpolation error bound.

Towards better points

From now on, let's focus on the interval $[-1, 1]$.
We can then translate our results to an arbitrary interval $[a, b]$ by scaling and shifting $$x= a +\frac{b -a}{2} (t+1)$$ for $t \in[-1, 1]$.
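In code the change of variables is one line (a trivial sketch; the function name is ours):

```python
def to_interval(t, a, b):
    """Map t in [-1, 1] to x in [a, b] by scaling and shifting."""
    return a + (b - a) / 2.0 * (t + 1.0)

print(to_interval(-1.0, 2.0, 5.0), to_interval(0.0, 2.0, 5.0), to_interval(1.0, 2.0, 5.0))
# endpoints map to endpoints, the midpoint to the midpoint: 2.0 3.5 5.0
```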

Chebyshev polynomials

First recall that if $-1 \leq x \leq 1$, then $x = \cos \theta$ for some $\theta \in [0, \pi]$.
For $x \in [-1, 1]$, the $n$th Chebyshev polynomial $T_n$ is given by $$T_n(x) = \cos(n\arccos x).$$
Equivalently, $T_n(\cos \theta)= \cos n \theta$.

A polynomial?

Using basic trigonometry, we may show that $T_n(x)$ is a degree $n$ polynomial of $x$.
Fix an $x$ in $[-1, 1]$ and set $\theta = \cos^{-1} x$.
Notice that \begin{align} T_{n\color{var(--emphColor)}{+}1}(\cos \theta) &= \cos((n\color{var(--emphColor)}{+}1) \theta) \\ &= \cos n \theta \cos \theta \color{var(--emphColor)}{-} \sin n \theta \sin \theta, \end{align}
and similarly, \begin{align} T_{n\color{var(--emphColor)}{-}1}(\cos \theta) &= \cos((n\color{var(--emphColor)}{-}1) \theta) \\ &= \cos n \theta \cos \theta \color{var(--emphColor)}{+}\sin n \theta \sin \theta. \end{align}

A polynomial?

Using basic trigonometry, we may show that $T_n(x)$ is a degree $n$ polynomial of $x$. Combining $T_{n\color{var(--emphColor)}{+}1}$ and $T_{n\color{var(--emphColor)}{-}1}$ we see that \begin{align} T_{n\color{var(--emphColor)}{-}1}(\cos \theta) + T_{n\color{var(--emphColor)}{+}1}(\cos \theta) &= 2 \cos \theta \cos n \theta \\ &= 2 \cos \theta \, T_n (\cos \theta). \end{align}
Since $x = \cos \theta$, \begin{align} T_{n\color{var(--emphColor)}{+}1}(x) = 2x T_n(x) - T_{n\color{var(--emphColor)}{-}1}(x). \end{align}
From the definition, $T_0(x) = 1$ and $T_1(x) = x$. Since $$T_{n\color{var(--emphColor)}{+}1}(x) = 2x T_n(x) - T_{n\color{var(--emphColor)}{-}1}(x),$$
it follows that $$T_2(x) = 2x T_1 (x) - T_0 (x) = 2x^2 - 1,$$
and $$T_3(x) = 2x T_2 (x) - T_1 (x) = 4x^3 - 3x.$$
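The three-term recurrence translates directly into code (a sketch; `cheb_T` is our own name), and we can check it against the defining identity $T_n(\cos\theta) = \cos n\theta$:

```python
import math

def cheb_T(n, x):
    """Evaluate T_n(x) via the recurrence T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)."""
    if n == 0:
        return 1.0
    t_prev, t = 1.0, x          # T_0 and T_1
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t

# Check the defining identity T_n(cos t) = cos(n t)
theta = 0.7
for n in range(6):
    assert abs(cheb_T(n, math.cos(theta)) - math.cos(n * theta)) < 1e-12
print("T_n(cos t) = cos(n t) verified for n = 0..5")
```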