Non-Standard Approach to Interpolation and More

August 23, 2021 | 10 minutes, 48 seconds


More recently, I've been working with APOs and how they can be applied/played around with (if you don't know what they are, please see my previous post). An interesting consequence of these incredibly interesting objects include the possibility of polynomial interpolation.

Furthermore, one of the examples I've worked closely with is the sums of consecutive numbers raised to a given power (e.g. \(1+2+\dots+n\)), for which there is a brief theorem that I've written. I'll cover this first. For now however, I'll talk about interpolation.

Polynomial interpolation is the process of using a set of points and manipulating them in such a way that a polynomial function passes through each point. That is, given points, say, \(\left\{(x_0,y_0),(x_1,y_1),\dots,(x_n,y_n)\right\}\) then we can create a polynomial \(P(x)\) such that \(P(x_0)=y_0,P(x_1)=y_1,\dots,P(x_n)=y_n\).

The naïve method of achieving such a thing is to let \(P(x)=a_0x^n+a_1x^{n-1}+\dots+a_n\), and then solve a system of linear equations such that:

\[ \begin{pmatrix} x_0^n & x_0^{n-1} & \dots & x_0 & 1 \\ x_1^n & x_1^{n-1} & \dots & x_1 & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_n^n & x_n^{n-1} & \dots & x_n & 1 \end{pmatrix} \begin{pmatrix} a_0\\ a_1\\ \vdots\\ a_n \end{pmatrix} =\begin{pmatrix} y_0\\ y_1\\ \vdots\\ y_n \end{pmatrix}\]

For large \(n\), this is an incredibly difficult problem to solve by hand and, perhaps difficult for computers also. This is where two major interpolation methods/forms comes into play:

  1. Lagrange polynomials & interpolation
  2. Newton-form polynomials & divided differences

When interpolating, there is also a theorem to be known - Uniqueness of the interpolating polynomial.

Given a set of points \(\left\{(x_0,y_0),\dots,(x_n,y_n)\right\}\) there exists only a single polynomial of degree \(n\) that interpolates those points.

With that being said - there is no reason a polynomial "should" be of degree \(n\); that is, a polynomial that satisfies all this points is always degree \(n\) at a minimum. However, it is easiest to consider the case such that the polynomial is degree \(n\), although you can parameterise higher-degree polynomials.

Get out of Induction Free

A fair warning to you: this section is going to be fairly notation-heavy. A lot of people are familiar with the common formulas:

\[ 1+2+\dots+n=\frac{1}{2}n(n+1)\\ 1^2+2^2+\dots+n^2=\frac{1}{6}n(n+1)(2n+1)\]

In my last post, I touched on a way to derive these using the lagrange-piecewise approach. But now I want to prove these formulas are correct for general powers \(p\) - or rather, come up with a simple method of proving they are, avoiding induction.

Theorem: Suppose there exists a polynomial \(S_p:\mathbb{R}^{+}\cup\left\{0\right\}\to\mathbb{R}^{+}\cup\left\{0\right\}\) such that for all \(x,\ S_p(x)-S_p(x-1)=x^p\), and \(S_p(0)=0\), \(S_p(1)=1\), \(S_p(2)=1+2^p,\ \dots\), \(\ S_p(p+1)=1+2^p+\dots+(p+1)^p\), then \(S_p(n)=\sum_{k=1}^{n}{k^p}\) for all \(n\in\mathbb{Z}^{+}\cup\left\{0\right\}\).


Suppose \(\exists S_p:\mathbb{R}^+\cup\left\{0\right\}\to\mathbb{R}^+\cup\left\{0\right\}\) such that \(\forall n\in\mathbb{Z}^+\cup\left\{0\right\}\), \(S_p(n)=\sum_{k=1}^{n}{k^p}\), then:

By definition, \(S_p\) must satisfy the following conditions:

\[ \left\{\begin{align} S_p(n)&=S_p(n-1)+n^p&(1)\\ S_p(0)&=0&(2) \end{align}\right.\]

Suppose \(S_p\) is a polynomial - then it is of degree \(p+1\). A proof by contradiction is given below.

If \(S_p\) is of degree \(p\), \(S_p(n)\) and \(S_p(n-1)\) have the same coefficient for \(n^{p}\) (i.e. the highest coefficient, most easily seen via binomial theorem) and hence they zero out in equation \((1)\) - leaving at most a polynomial of degree \(p-1\) - a contradiction to \((1)\). Additionally, for lower powers, an identical argument applies.

Then, suppose \(S_p\) is of degree \(p+2\). Since \(S_p\) would uniquely describe the polynomial with \(p+3\) points, then by the uniqueness of interpolating polynomial, any polynomial of degree \(p+1\) could satisfy those same \(p+3\) points if and only if \(1\) of those points are redundant (and hence \(S_p\) is a polynomial of degree \(p+1\)). This is however a contradiction; the function \(S_0(n)=n^1\) does in fact satisfy the above conditions. For higher powers, the same argument applies.

Therefore, the polynomial can only be of degree \(p+1\). Then, by the uniqueness of interpolating polynomial, there are (any) \(p+2\) points required to determine such a polynomial \(S_p\), i.e. \(S_p(0)=0, S_p(1)=1, \dots, S_p(p+1)=\sum_{k=1}^{p+1}{k^p}\).

Notational Introduction

An APO can be used to represent a set of points. For example, given \(\left\{(x_0,y_0),\dots,(x_n,y_n)\right\}\) then we can represent this as:

\[ f(x)=\begin{cases} y_0 & x=x_0 \\ y_1 & x=x_1 \\ \vdots & \vdots \\ y_n & x=x_n \\ \star & \star \end{cases}\]

This workings if and only if the elements of the set describe mappings (in this case, a 2-dimensional coordinate can be a mapping from \(x\) to \(y\) in cartesian coordinates, etc.), and \(f\) is originally defined on \(\left\{x_0,x_1,\dots,x_n\right\}\), although the APO resolves this over \(\mathbb{R}\) - a restriction of the function after its characteristic form is found is entirely possible.

Furthermore, during the 'resolution' of the APO, one can choose to alter the conditions for each case provided the original case still occurs. That is, for example, \(x=x_0\implies f(x)=f(x_0)\iff f(x)=f(x_0)\not\implies x=x_i, i\neq 0\). Therefore, there is no reason (in the appropriate context) for the case conditions to be written the way they are traditionally written. The consequence of this is that the resultant characteristic form found for the function representing the points will be significantly different to the original. This is touched on by my previous post, also.

Resolution of the APO is possible when a root is determined and 'extracted'. This is because regardless of what the value of the APO is, the piecewise term will always zero out at that root. This means that the zeroed out case can be ignored (or integrated into the \(\star\) term).

Lastly, the traditional operations to piecewise functions apply to APOs. That is, if you apply an operation to all of the case values, then you must apply its inverse on the outside -- and vice versa. Unique to APOs, but, additionally, if that function is parameterised, then apply that parameter to each case uniquely on the inside. (e.g. if you have \(x=x_0\) on the inside, and you apply \(f(x)\) to the case value; what you're really applying is \(f(x_0)\)). This is unique to APOs because of the unresolved case/s (i.e. \(\star\)). In traditional piecewise functions, you can apply this, albeit a little bit differently and with more difficulty.


This method was touched on in the last post, but I shall write it more formally here. The output of this method is identical to standard lagrange interpolation. It is also one of the more intuitive methods of dealing with the points.

Given a set of points \(\left\{(x_0,y_0),(x_1,y_1),\dots,(x_n,y_n)\right\}\) then we can write the interpolating polynomial as:

\[ P(x)=\sum_{i=0}^{n}y_i\left(\prod_{k\neq i}{\frac{x-x_k}{x_i-x_k}}\right)\]

For \(0\leq k\leq n\). We can derive this using the following method:

\[ \begin{align} f(x)&=\begin{cases} y_0 & x=x_0 \\ y_1 & x=x_1 \\ \vdots & \vdots \\ y_n & x=x_n \\ \star & \star \end{cases}\\ &=\begin{cases} y_0 & x=x_0 \\ 0 & x=x_1 \\ \vdots & \vdots \\ 0 & x=x_n \\ \star & \star \end{cases}+\begin{cases} 0 & x=x_0 \\ y_1 & x=x_1 \\ \vdots & \vdots \\ 0 & x=x_n \\ \star & \star \end{cases}+\dots+\begin{cases} 0 & x=x_0 \\ 0 & x=x_1 \\ \vdots & \vdots \\ y_n & x=x_n \\ \star & \star \end{cases}\\ &=\left((x-x_1)\dots(x-x_n)\right)\begin{cases} \frac{y_0}{(x_0-x_1)\dots(x_0-x_n)} & x=x_0 \\ \star & \star \end{cases}+\\ &\left((x-x_0)(x-x_2)\dots(x-x_n)\right)\begin{cases} \frac{y_1}{(x_1-x_0)(x_1-x_2)\dots(x_0-x_n)} & x=x_1 \\ \star & \star \end{cases}+\\ &\dots+\left((x-x_0)\dots(x-x_{n-1})\right)\begin{cases} \frac{y_n}{(x_n-x_0)\dots(x_n-x_{n-1})} & x=x_n \\ \star & \star \end{cases}\\ &=\frac{y_0(x-x_0)\dots(x-x_n)}{(x_0-x_1)\dots(x_0-x_n)}+\frac{y_1(x-x_0)(x-x_2)\dots(x-x_n)}{(x_1-x_0)(x_1-x_2)\dots(x_1-x_n)}\\&+\dots+\frac{y_n(x-x_0)\dots(x-x_{n-1})}{(x_n-x_0)\dots(x_n-x_{n-1})}\\ &=\sum_{i=0}^{n}y_i\left(\prod_{k\neq i}{\frac{x-x_k}{x_i-x_k}}\right) \end{align}\]

In essence - we split apart the piecewise function such that we had roots in all but one of each case, in each respective piecewise function. This ensures that each coefficient component will be linear, and \(n\) roots 'attached' to each term. This is obviously a very tedious method, but is very systematic and fairly straightforward to calculate, but will leave quite a complicated expression at the end as seen above.


It turns out this is another intuitive method of interpolating a set of points. A newton-form polynomial is a polynomial in the form of

\[ P(x)=[y_0]+[y_0,y_1](x-x_0)+\dots[y_0,\dots y_n](x-x_0)(x-x_1)\dots(x-x_n)\]

The notation in square brackets denotes divided difference. While you'll see these directly in the derivation, if you wish to, you can find more information about them here.

This approach instead entails zeroing cases one at a time via subtracting a case to zero, then repeating. See below:

\[ \begin{align} f(x) &= \begin{cases} y_0 & x=x_0 \\ y_1 & x=x_1 \\ \vdots & \vdots \\ y_n & x=x_n \\ \star & \star \end{cases} \\ &= y_0+\begin{cases} 0 & x=x_0 \\ y_1-y_0 & x=x_1 \\ \vdots & \vdots \\ y_n-y_0 & x=x_n \\ \star & \star \end{cases} \\ &= y_0+(x-x_0)\begin{cases} \frac{y_1-y_0}{x_1-x_0} & x=x_1 \\ \frac{y_2-y_0}{x_2-x_0} & x=x_2\\ \vdots & \vdots \\ \frac{y_n-y_0}{x_n-x_0} & x=x_n \\ \star & \star \end{cases} \\ &= y_0+(x-x_0)\left(\frac{y_1-y_0}{x_1-x_0}+(x-x_1)\begin{cases} \frac{1}{x_2-x_1}\left(\frac{y_2-y_0}{x_2-x_0}-\frac{y_1-y_0}{x_1-x_0}\right) & x=x_2\\ \vdots & \vdots \\ \frac{1}{x_n-x_1}\left(\frac{y_n-y_0}{x_n-x_0}-\frac{y_1-y_0}{x_1-x_0}\right) & x=x_n \\ \star & \star \end{cases}\right) \\ \end{align}\]

And from there we can expand and repeat the process:

\[ \begin{align} &= y_0+\frac{y_1-y_0}{x_1-x_0}(x-x_0)+(x-x_0)(x-x_1)\begin{cases} \frac{1}{x_2-x_1}\left(\frac{y_2-y_0}{x_2-x_0}-\frac{y_1-y_0}{x_1-x_0}\right) & x=x_2\\ \vdots & \vdots \\ \frac{1}{x_n-x_1}\left(\frac{y_n-y_0}{x_n-x_0}-\frac{y_1-y_0}{x_1-x_0}\right) & x=x_n \\ \star & \star \end{cases} \\ \end{align}\]

At each stage, the coefficients of the binomials we generate are known as divided differences. Convenient, considering writing this out in a general form is tedious at best. There is, however, good news. For specific instances, these calculations are incredibly trivial most of the time and oftentimes deriving it in the same way as above provides more opportunity for factoring before the expression becomes complicated. Additionally, you can employ mixing of this method alongside the lagrange method.

Other Objects and Conclusion

One of the questions I have from this process is if other objects such as vectors, or more variables (i.e. higher dimensions in general) be used in this process. That is something I'm working on finding out, and hopefully at some point I'll also be able to apply this process to say, parametric equations.

In any case, this has so far been an interesting investigation for me, and having come up with two major results by accident, is very exciting. I'll continue to go down this path and look into especially APOs in the context of general functions.