Numerical Analysis (aimee)

Newton’s Divided Difference (2)

Posted by: aimeemarie88 on: December 12, 2008


We learned a more efficient way to code Newton’s divided difference in class last week. We began by trying the code with the equation f(x)=sin(\pi*x) with five equally spaced points:

f=@(x) sin(pi*x)

n=5

x=linspace(-1,1,n)’

c=zeros(n,1)

F=f(x)

c(1)=F(1)

for i=1:n-1

F = diff(F)./(x(i+1:n)-x(1:n-i))

c(i+1)=F(1)

end

m=101;

xx = linspace(-1,1,m)’;

phi = ones(m,n);

for i=2:n

phi(:,i) = phi(:,i-1).*(xx-x(i-1));

end

P=phi*c;

plot(xx,f(xx),‘-r’)

hold on

plot(xx,P,‘-ob’)

sinpix

With the error:

sin_equi_error

Using the same function with Gaussian points:

sin_gauss

With the error:

sin_gauss_error

Next we used Gaussian points with the equation f(x)= \frac{1}{1+4x^2}

%f=@(x) sin(pi*x)

f=@(x) 1./(1+4.*x.^2)

n=7

%x=linspace(-1,1,n)’

x = -cos(pi*(0:n-1)/(n-1))

c=zeros(n,1)

F=f(x)

c(1)=F(1)

for i=1:n-1

F = diff(F)./(x(i+1:n)-x(1:n-i))

c(i+1)=F(1)

end

m=101;

xx = linspace(-1,1,m)’;

phi = ones(m,n);

for i=2:n

phi(:,i) = phi(:,i-1).*(xx-x(i-1));

end

P=phi*c;

plot(xx,f(xx),‘-r’)

hold on

plot(xx,P,‘-ob’)

runge_gauss

With the error:

runge_gauss_error

Using the same function with equidistant points:

runge_equi

With the error:

runge_equi_error

I thought it would be interesting to do a little research into finding out some more information on the Runge Phenomenon. The Runge phenomenon occurs with polynomial interpolation for polynomials of high degree. The higher the order of the polynomial used, the larger the bound of error becomes, showing that instead of moving towards zero, the error is increasing.

Using equally spaced points, as the degree of the polynomial increases, the error near the endpoints gets increasingly larger. It can be proven that the error of the Runge phenomenon using equally spaced points will approach infinity as the degree of the polynomial increases. This contradicts the Weierstrass approximation theorem that explains that there exists a sequence of approximating polynomials with the error converging to zero.

The following picture demonstrates this:

graph1


The red curve is the Runge function.

The blue curve is a 5th order interpolating polynomial.

The green curve is a 9th order interpolating polynomial

At the interpolating points, the error between the function f(x) and the polynomial p(x) is zero. Between the interpolating points, especially near the endpoints, as the degree of the polynomial increases, the error increases.

The following table shows the Runge function, its derivatives, and their evaluations at x=1

  f(x)= \frac{1}{1+25x^2}   |f(1)|=0.038462
  f'(x)= -\frac{50x}{(1+25x^2)^2}   |f'(1)|=0.073964
  f''(x)= \frac{50*(75x^2-1)}{(1+25x^2)^3}   |f''(1)|=0.210514
  f'''(x)= \frac{-15000x(25x^2-1)}{(25x^2+1)^4}   |f'''(1)|=0.787788

Since the magnitude of higher order derivatives of the Runge function gets larger, the bound for error for higher order polynomial interpolations is larger.

The Runge phenomenon shows that using equally spaced points with interpolation of higher degree polynomials can cause problems. Using Gaussian points, the error appears to be more regular, oscillations are minimized, and the error decreases as the degree of the polynomial interpolation increases.

Using the cubic spline interpolation method can help to avoid this since it uses piecewise polynomials.

A Try at Newton’s Divided Difference

Posted by: aimeemarie88 on: December 9, 2008


When looking for a method that can be more useful in cases where the Lagrange polynomial may not work as well, Newton’s divided differences is one possibility. Starting with the interpolating polynomial:

p(x)=a_0+a_1(x-x_0)+a_2(x-x_0)(x-x_1)+ \ldots +a_n(x-x_0)(x-x_1) \ldots (x-x_n-1)

Imposing the conditions:

p(x_1)=f(x_i),i=0,1, \ldots ,n

And finding:

a_0=p(x_0)=f(x_0)

a_1=\frac{p(x_1)-a_0}{x_1-x_0}=\frac{f(x_1)-f(x_0)}{x_1-x_0}

a_2=\frac{p(x_2)-a_0-a_1(x_2-x_0)}{(x_2-x_0)(x_2-x_1)}

Etc…

One advantage this method has over the Lagrange method is that the degree of p can be increased from n to n+1 by adding the term:

a_{n+1}(x-x_0)(x-x_1) \ldots (x-x_{n-1})(x-x_n)

There would only need to be a calculation done for this specific added point, rather than in the Lagrange formula where every term would change.

Since the term a_n depends only on f,x_0,x_1, \ldots x_n, the nth divided difference can be expressed as a_n=f[x_o,x_1, \ldots ,x_n]

With some algebraic manipulation, the divided difference can be represented as:

f[x_0,x1,...,x_n]=\frac{f[x_1,...,x_n]-f[x_0,...,x_{n-1}]}{x_n-x_0} for any value of n. Newton’s divided difference can be easier to use than the Vandermonde matrix method because it uses a lower or upper triangular matrix, rather than the more complicated Vandermonde matrix.

I found a website: http://www.math.ucla.edu/~ronmiech/YAN/ndivdiff.html that will generate the upper triangular matrix for Newton’s divided difference. Using this program and 5 values on the interval [-1,1] with the function sin(pi*x), I received the following

-1.0000   -0.5000         0        0.5000    1.0000

0       -1.0000         0        1.0000         0

-2.0000    2.0000    2.0000   -2.0000         0

4.0000         0       -4.0000         0             0

-2.6667   -2.6667         0             0             0

I tried to write a MATLAB code to compute the matrix. With some knowledge of MATLAB and compiling various sources of MATLAB help, I came up with the following code:

x=[-1 -.5 0 .5 1];

y=[sin(-pi) sin(-.5.*pi) 0 sin(0.5*pi) sin(pi)];

n=length(x);

c=zeros(n,n+1);

c(:,1)=x’;

c(:,2)=y’;

for j=3:n+1;

for i=j-1:n;

i1=i-j+2;

c(i,j)=( c(i,j-1)-c(i-1,j-1) )/( c(i,1)-c(i1,1) );

end;

end;

c’

This code yielded the following matrix

-1.0000   -0.5000         0    0.5000    1.0000

-0.0000   -1.0000         0    1.0000    0.0000

0   -2.0000    2.0000    2.0000   -2.0000

0         0    4.0000         0   -4.0000

0         0         0   -2.6667   -2.6667

The two matrices are relatively the same, except that the columns are in a different order, so it seems as though the code worked well.

The interpolation polynomial found using Newton’s method is the same as the one found using Lagrange method as well as Vandermonde method. As a theorem:

Given n+1 distinct points x_0,x_1, \ldots ,x_n and corresponding values y_o,y_1, \ldots ,y_n there exists a unique polynomial p of degree n in x for which p(x_i)=y_i,i=0,1, \ldots ,n

Cubic Spline (2)

Posted by: aimeemarie88 on: December 9, 2008


In order to code the cubic spline interpolation in MATLAB, Katie found the book Numerical Computing with MATLAB by Cleve B. Moler in the library and it gave a good description on what to do.

The book explained to create the following m files:

function d = splineslopes(h, delta)

n = length(h) +1;
a= zeros(size(h)); b=a; c=a; r=a;

a(1:n-2) = h(2:n-1);
a(n-1) = h(n-2) + h(n-1);
b(1) = h(2);
b(2:n-1) = 2*(h(2:n-1)+h(1:n-2));
b(n) = h(n-2);
c(1) = h(1) + h(2);
c(2:n-1) = h(1:n-2);

r(1)=((h(1)+2*c(1))*h(2)*delta(1)+...
    h(1)^2*delta(2))/c(1);
r(2:n-1) = 3*(h(2:n-1).*delta(1:n-2)+...
    h(1:n-2).*delta(2:n-1));
r(n) = (h(n-1)^2*delta(n-2)+...
    (2*a(n-1)+h(n-1))*h(n-2)*delta(n-1))/a(n-1);

d = tridisolve(a,b,c,r);

function v = splinetx(x,y,u)
%SPLINETX  Textbook spline function.
%  v = splinetx(x,y,u) finds the piecewise cubic interpolatory
%  spline S(x), with S(x(j)) = y(j), and returns v(k) = S(u(k)).
%
%  See SPLINE, PCHIPTX.

%  First derivatives

   h = diff(x);
   delta = diff(y)./h;
   d = splineslopes(h,delta);

%  Piecewise polynomial coefficients

   n = length(x);
   c = (3*delta - 2*d(1:n-1) - d(2:n))./h;
   b = (d(1:n-1) - 2*delta + d(2:n))./h.^2;

%  Find subinterval indices k so that x(k) <= u < x(k+1)

   k = ones(size(u));
   for j = 2:n-1
      k(x(j) <= u) = j;
   end

%  Evaluate spline

   s = u - x(k);
   v = y(k) + s.*(d(k) + s.*(c(k) + s.*b(k)));

function x = tridisolve(a,b,c,d)
%   TRIDISOLVE  Solve tridiagonal system of equations.
%     x = TRIDISOLVE(a,b,c,d) solves the system of linear equations
%     b(1)*x(1) + c(1)*x(2) = d(1),
%     a(j-1)*x(j-1) + b(j)*x(j) + c(j)*x(j+1) = d(j), j = 2:n-1,
%     a(n-1)*x(n-1) + b(n)*x(n) = d(n).
%
%   The algorithm does not use pivoting, so the results might
%   be inaccurate if abs(b) is much smaller than abs(a)+abs(c).
%   More robust, but slower, alternatives with pivoting are:
%     x = T\d where T = diag(a,-1) + diag(b,0) + diag(c,1)
%     x = S\d where S = spdiags([[a; 0] b [0; c]],[-1 0 1],n,n)

x = d;
n = length(x);

for j = 1:n-1
   mu = a(j)/b(j);
   b(j+1) = b(j+1) - mu*c(j);
   x(j+1) = x(j+1) - mu*x(j);
end

x(n) = x(n)/b(n);
for j = n-1:-1:1
   x(j) = (x(j)-c(j)*x(j+1))/b(j);
end

Then typing in the MATLAB command window: 

>> n=5;
>> x=linspace(-1,1,n)';
y= sin(pi*x);
m=101;
xx = linspace(-1,1,m)';
>> P= splinetx(x,y,xx);
>> plot(xx, P)
>> y1 = sin(pi*xx);
>> plot(xx, y1)
>> plot(xx,y1,xx,P)
>> y2= abs(y1 - P);
>> plot(xx, y2)

We used the code and changed the functions and amount of points to generate the following graphs today in class:

cubic-spline1

sin(pi*x) with 5 points

cubic-spline-error

error for sin(pi*x) with 5 points

cubicspline-10

sin(pi*x) w 10 points

cubicspline-10error

error for sin(pi*x) w 10 points

cubicspline-100error1

error for sin(pi*x) w 100 points

cubicspline-1000error

error for sin(pi*x) w 1000 points

We tried to use the Runge function with Gaussian points and it was able to graph the polynomial interpolation but we couldn’t get it to graph the actual function.

Cubic Spline

Posted by: aimeemarie88 on: December 9, 2008

Cubic Spline

DEFINITION: Suppose that {(x_k,y_k)}_{k=0}^n are n+1 points, where a=x_0<x_1 \ldots <x_n=b. The function S(x) is called a cubic spline if there exists n cubic polynomials S_k(x) with coefficients S_{k,0},S_{k,1}, S_{k,2}, S_{k,3} that satisfy the properties:

(i) S(x)=S_k(x)=S_{k,0}+S_{k,1}(x-x_k)+S_{k,2}(x-x_k)^2+S_{k,3}(x-x_k)^3 for x\in[x_k,x_{k+1}] and k=0,1, \ldots ,n-1

(ii) The spline passes through each data point: S(x_k)=Y_k for k=0,1, \ldots n

(iii) The spline forms a continuous function over [a,b]: S_k(x_{k+1})=s_{k+1}(s_{k+1}) for k=0,1. \ldots ,n-2

(iv) The spline forms a smooth function: S'_k(x_{k+1})=S'{k+1}(x_{k+1}) for k=0,1, \ldots ,n-2

(v) The second derivative is continous: S''_k(x_{k+1})=S''{k+1}(x_{k+1})

Natural Spline: There exists a unique cubic spline with the free boundary conditions S''(a)=0 and S''(b)=0

Clamped Spline: There exists a unique cubic spline with the first derivative boundary conditions S'(a)=d_0 and S'(b)=d_n

Theorem: Minimum Property of Clamped cubic splines: Assume f\in C^2[a,b] and S(x) is the unique clamped spline interpolant for f(x) which passes through {[x_k,f(x_k))}_{k=0}^n and satisfies the clamped end conditions S'(a)=f(a) and S'(b)=f(b). Then \int_{a}^{b}(S''(x))^2dx\le \int_{a}^{b}(f''(x))^2dx

Interpolation vs. Function

Posted by: aimeemarie88 on: November 22, 2008

Given the function f(x) = sin(pi*x) interpolate a polynomial for the interval [-1, 1]

syms t L P

n=5

x=linspace(-1,1,n)

Y=sin(pi*x)

for j=1:5

L(j)=1;

for i=1:5

if (i~=j)

L(j) = L(j)*(t-x(i))/(x(j)-x(i));

end

end

end

P=L*Y’

expand(P)

ans =

-27043212804868892242677135040851/10141204801825835211973625643008*t^3+108172851219475573938466140184915/40564819207303340847894502572032*t

To graph the polynomial versus the function:

x=[-1:.01:1]

y=(108172851219475573938466140184915*x)/40564819207303340847894502572032 – (27043212804868892242677135040851*x.^3)/10141204801825835211973625643008

y1=sin(pi*x)

plot(x,y,x,y1)

untitled

From the graph, it can be seen that the polynomial interpolation (which used 5 points) is the same as the actual function in exactly five places. At all other points it is very similar.

To graph the error:

y3=abs((108172851219475573938466140184915*x)/40564819207303340847894502572032 – (27043212804868892242677135040851*x.^3)/10141204801825835211973625643008-(sin(pi*x)))

plot(x,y3)

untitled2

Given the function f(x) = exp(x) interpolate a polynomial for the interval [-1, 1]

syms t L P

n=5

x=linspace(-1,1,n)

Y=exp(x)

for j=1:5

L(j)=1;

for i=1:5

if (i~=j)

L(j) = L(j)*(t-x(i))/(x(j)-x(i));

end

end

end

P=L*Y’

expand(P)

ans =

48904249067013/1125899906842624*t^4+149756602621819/844424930131968*t^3+2250200748318013/4503599627370496*t^2+13481801331389657/13510798882111488*t+1

To graph the polynomial versus the function:

x=[-1:.01:1]

y=48904249067013/1125899906842624*t^4+149756602621819/844424930131968*t^3+2250200748318013/4503599627370496*t^2+13481801331389657/13510798882111488*t+1

y2=exp(x)

plot(x,y,x,y2)

untitled3

Zooming in closer at a point that is not an interpolation point:

untitled4

If zooming in close to an interpolation point, the graphs are the same.

It appears from far away that the polynomial interpolation and the function are the same, but after zooming it, it is clear they are very close but not the same.

When zooming in around zero, or one of the points that was used in interpolation, it appears that the graphs are the same, similar to what happened with the sin(pi*x) error from above.

When graphing the error:

y2=abs((48904249067013*x.^4)/1125899906842624 + (149756602621819*x.^3)/844424930131968 + (2250200748318013*x.^2)/4503599627370496 + (13481801331389657*x)/13510798882111488 + 1-(exp(x)))

plot(x,y2)

untitled5

This proves that the error is zero at the points that were interpolated.

Trying to break either method

Posted by: aimeemarie88 on: November 22, 2008

In an attempt to find examples that can cause the Vandermonde method or Lagrange method to fail, Katie and I made up various examples. The first example was chosen at random.

Example 1:

Using 6 points: (0, -5) (0.2, 0) (0.2, 0.1) (0.5, 1) (0.52, -0.1) (1,8)

Works in both Vandermonde and Lagrange

We then decided to try to make the interval between points smaller.

Example 2:

Using 4 points: (0,0) (0.001, 0.005) (0.002, 0.004) (0.005, -0.001)

Works in Vandermonde and Lagrange

In the previous example, we saw that there was no constant term. We thought this to be because the point (0,0) was chosen. Next, we decided to choose points that were very far apart.

Example 3:

Using 5 points: (-100, 89) (75, 200) (0,1) (1000,1000) (-20, 2)

Works in Vandermonde and Lagrange

After not being able to find any sort of error when choosing points that were close together then points that were far apart, we decided to try to choose points that had x values that were very close together and y values that were very far apart.

Example 4:

Using 3 points: (0.0001, 1000) (0.0002, -2) (0.0003, 200)

Worked in Vandermonde and Lagrange

Next, instead of working with points, we decided to use a specific function.

Example 5:

Y=cos(x) with 20 points chosen between -1 and 1

Works in Vandermonde and Lagrange

Since the cos function seemed to work efficiently in both methods, we thought to use the cos function of x to a negative power and add more points

Example 6:

n=50

x=linspace(-1,1,n)

Y=cos(x.^-1)

A=vander(x);

B=Y’;

Ax=B;

x=A\B

Result: Warning: Matrix is close to singular or badly scaled. Results may be inaccurate.

In the Lagrange method, it seemed to compute ok, but took a very long time.

Overall, our results were inconclusive. We thought we may have found something that caused either one or both methods to fail (by putting x^-1) or increasing the amount of points but we were unable to find any concrete results.

Polynomial interpolation

Posted by: aimeemarie88 on: November 11, 2008

When given various points in the x-y plane, the task to attempt to find a function that satisfies these points can sometimes be very difficult. One approach that can be taken uses the idea of the Vandermonde matrix and solving a linear system.

Suppose the polynomial is of the form p(x)=a_nx^n+a_{n-1}x^{n-1}+...+a_2x^2 + a_1x+a_0. A linear system of the form Ax=B can be formed using the Vandermonde matrix of the points as A, the unknown coefficients as x, and the known function values as B.

A linear system of this form can be solved by hand using linear algebra, or much easier by using MATLAB. Taking the following example using data given in class:

A = vander(-1:0.5:1)

A =

1.0000   -1.0000    1.0000   -1.0000    1.0000

0.0625   -0.1250    0.2500   -0.5000    1.0000

0         0             0             0            1.0000

0.0625    0.1250    0.2500    0.5000    1.0000

1.0000    1.0000    1.0000    1.0000    1.0000

B=[0;-.5;-1;.75;-0]

B =

0

-0.5000

-1.0000

0.7500

0

x=A\B

x =

-4.6667

-1.6667

5.6667

1.6667

-1.0000

This data shows the polynomial which satisfies these points is of the form:

p(x)=-\frac{14}{3}x^4+\frac{5}{3}x^3+\frac{17}{3}x^2-\frac{5}{3}x-1

There is also another approach that can be used to find a polynomial which satisfies a set of points. This is known as Lagrange interpolation. Lagrange polynomials are interpolating polynomials which equal zero in at all points given, except one. For example: L_0(x)=\frac{(x-x_1)(x-x_2)(x-x_3)...(x-x_n)}{(x_0-x_1)(x_0-x_2)(x_0-x_3)...(x_0-x_n)} will equal one at L_0(x_0) and zero at \forall L_0(x_j)j\ne0

The polynomial interpolation can be defined as p(x)=y_1L_1(x)+y_2L_2(x)+y_NL_N(x)

After computing the solution by hand, we received help with constructing the MATLAB code for the Lagrange interpolation and saw there were two different ways it could be done. The first used the code

syms t L P

x=[-1, -1/2, 0, 1/2, 1]

L(1) = ((t-x(2))*(t-x(3))*(t-x(4))*(t-x(5)))/((x(1)-x(2))*(x(1)-x(3))*(x(1)-x(4))*(x(1)-x(5)))

L(2) = ((t-x(1))*(t-x(3))*(t-x(4))*(t-x(5)))/((x(2)-x(1))*(x(2)-x(3))*(x(2)-x(4))*(x(2)-x(5)))

L(3) = ((t-x(2))*(t-x(1))*(t-x(4))*(t-x(5)))/((x(3)-x(2))*(x(3)-x(1))*(x(3)-x(4))*(x(3)-x(5)))

L(4) = ((t-x(2))*(t-x(3))*(t-x(1))*(t-x(5)))/((x(4)-x(2))*(x(4)-x(3))*(x(4)-x(1))*(x(4)-x(5)))

L(5) = ((t-x(2))*(t-x(3))*(t-x(4))*(t-x(1)))/((x(5)-x(2))*(x(5)-x(3))*(x(5)-x(4))*(x(5)-x(1)))

Y=[0; 3/4; -1; -1/2; 0]

P=L*Y

expand(P)

We received the answer:

P =

-2*(t+1)*t*(t-1/2)*(t-1)-4*(t+1/2)*(t+1)*(t-1/2)*(t-1)+4/3*(t+1/2)*t*(t+1)*(t-1)

ans =

-14/3*t^4+5/3*t^3+17/3*t^2-5/3*t-1

The second code, although more complex at first, will be better for future use.

syms t L P

x=[-1, -1/2, 0, 1/2, 1];

Y=[0; 3/4; -1; -1/2; 0];

for j=1:5

L(j)=1;

for i=1:5

if (i~=j)

L(j) = L(j)*(t-x(i))/(x(j)-x(i));

end

end

end

P=L*Y;

expand(P)

ans =

-14/3*t^4+5/3*t^3+17/3*t^2-5/3*t-1

It can be seen that both of these codes yield the same answer, which is also the same as the answer found using the Vandermonde matrix approach. The next step is to look more into these two approaches and find when one method will be more efficient than the other and whether or not both methods will work for all cases.

Higher order iterations

Posted by: aimeemarie88 on: October 29, 2008

Introduction:

After understanding that Newton’s method generally has second order convergence, I wondered if it was possible to have 3rd-order, 4th-order, or even nth order of convergence.

A third order algorithm will triple the number of correct digits in each iteration. To find a third order iteration, one must attempt to determine an expression q_n such that x_{n+1} - \sqrt{s} = \frac{(x_n-\sqrt{s})^3}{q_n} will lead to the desired 3rd-order iterative formula. Reqriting as x_{n+1} = \frac{x_n^3 + 3x_ns}{q_n} + (1-\frac{3x_n^2 + s}{q_n})\sqrt{s}

The choice of letting q_n = 3x_n^2 will eliminate \sqrt{s} from the formula, which is required.

Iteration formula:

This yields the third-order formula x_{n+1} = x_n(\frac{x_n^2 + 3s}{3x_n^2 + s}) which has the property x_{n+1} - \sqrt{s} = \frac{(x_n-\sqrt{s})^3}{3x_n^2 + s}

In all cases of third order iterations, |x_{n+1} - \sqrt{s}| = \frac{|x_n - \sqrt{s}|^3}{3x_n^2 + s}\le \frac{1}{s}|x_n - \sqrt{s}|^3

Since \sqrt{s} \ge 1, the typical third-order behavior is |x_{n+1} - \sqrt{s}| \le |x_n - \sqrt{s}|^3

Example:

Using this third-order formula to estimate radical 2 with the following MATLAB code:

x(1)=1.0;
for i=1 : 6
    x(i+1) = sqrt(2) + ((x(i)-sqrt(2))^3)/(3*((x(i)))^2 +2)
end

x =
  Columns 1 through 3
   1.000000000000000   1.400000000000000   1.414213197969543
 Columns 4 through 6
   1.414213562373095   1.414213562373095   1.414213562373095
 Column 7
   1.414213562373095


Graphing the error,

Halley’s method:

The method of convergence with higher order than Newton’s method that is often looked at is Halley’s method. Halley is most famous for analyzing the orbit of the comet in 1682 and making the correct prediction of its return 76 years later.

This method uses the quadratic term of the taylor series expansion of f and requires that the second derivative is continuous.

x_{n+1} is defined as the route of the quadratic expansion:

0 = f(x_n) + (x-x_n)f'(x_n) + \frac{1}{2}(x-x_n)^2f''(x_n)

Each iteration would be of the form

x_{n+1} = x_n - \frac{2f(x_n)}{f'(x_n) \pm\sqrt{|f'(x_n)|^2 - 2f(x_n)f''(x_n)}}

Since there are two signs in the denominator, the sign is chosen to try to minimize the difference between x_n and x_{n+1}

The order of convergence of Halley’s method is three. Again, Halley’s function involves the evaluation of the second derivative and each iteration requires three function evaluations.

Thoughts:

It seems as though the third-order iteration formula appears to converge faster than the second-order formula and is in turn more efficient. Fewer iterations are needed to obtain the same accuracy. However, the third-order formula is much more complex than previous iteration formulas, almost entirely cancelling out its positives. In actuality, as the order of convergence increases, the complexity of the computation of derivatives, as well as increasingly large formulas, make the benefits of these higher order iteration methods not worth the trouble.

Newton

Posted by: aimeemarie88 on: October 29, 2008

Introduction:

Newton’s method is generally of second order convergence. The number of correct digits will double with each iteration. Newton’s method is of the form:

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}

There are some points that are necessary in order to ensure convergence of Newton’s method:

  • The derivative must be calculated directly, not approximated by using two points and finding the slope.
  • The initial point chosen must be sufficiently close to the actual root, if it is not fairly close, the method could diverge.
  • The root must be a simple root, or of multiplicity one. If it is not, Newton’s method will be no faster than the bisection method would be.

Kantorovich?:

I wanted to look more into trying to find out if there was some sort of theorem or information that determines how close the initial guess must be to the actual root for Newton’s method to converge. It is clear that if any of the iterations of Newton’s method require a tangent line that is horizontal, the algorithm will not converge. I also found that the Kantorovich theorem can be used to describe this, however when looking more into it, I couldn’t really understand it.

Rate of Convergence of Newton’s Method:

To find the rate of convergence of Newton’s method:

  • Assume that p_n converges to p
  • Set      E_n = p - p_n for  n \ge 0
  • Let two positive constants A \ne 0 and R > 0 exist
  • \mathop{\lim}\limits_{n      \to \infty}\frac{|p-p_{n+1}|}{|p-p_n|^R} = \mathop{\lim}\limits_{n \to      \infty} \frac{|E_{n+1}|}{|E_n|^R} = A
  • The sequence converges to p with order of convergence R.
  • The number A is called the asymptotic error constant
  • If $latexR=1$, there is linear convergence
  • If R=2, there will be quadratic convergence

Simple Root versus Multiple Root:

Assume that Newton-Raphson iteration produces a sequence {{p_k}}_{k=0}^\infty that converges to the root p of the function f(x)

Simple Root: Convergence is quadratic and  |E_{k+1}|\approx \frac{|f''(p)|}{2|f'(p)|}(|E_k|)^2

for k sufficiently large.

Multiple root of order m: Convergence is linear and

|E_{k+1}| \approx\frac{m-1}{m}|E_k|

for k sufficiently large.

Proof of Order of Convergence:

Expand the Taylor polynomial of degree n=1 around the point x=pk

f(x) = f(p_k) + f'(p_k)(x-p_k) + \frac{1}{2!}f''(c_k)(x-p_k)^2

Since p is a zero of f(x), let x=p

0 = f(p_k) + f'(p_k)(p-p_k) + \frac{1}{2!}f''(c_k)(p-p_k)^2 .

With some rearranging,

f(p_k) + f'(p_k)(p-p_k) = -\frac{f''(c_k)}{2!}(p-p_k)^2 .

Assuming that f'(x) \ne 0 for all x near p

Since f'(p_k) \ne 0, you can divide by it

\frac{f(p_k)}{f'(p_k)} + \frac{f'(p_k)}{f'(p_k)}(p-p_k) = -\frac{f''(c_k)}{2f'(p_k)}(p-p_k)^2

Since \frac{f'(p_k)}{f'(p_k)} = 1

\frac{f(p_k)}{f'(p_k)} + (p-p_k) = \frac{f''(c_k)}{2f'(p_k)}(p-p_k)^2

Rearranging to

p - (p_k - \frac{f(p_k)}{f'(p_k)}) = -\frac{f''(c_k)}{2f'(p_k)}(p-p_k)^2

Using the fact that the Newton iteration formula is p_{k+1} = p_k - \frac{f(p_k)}{f'(p_k)} and substituting:

p-(p_{k+1}) = -\frac{f''(c_k)}{2f'(p_k)}(p-p_k)^2.

Using f'(p_k) \approx f'(p) and f''(c_k) \approx f''(p) when k is sufficiently large
p-(p_{k+1}) \approx -\frac{f''(p)}{2f'(p)}(p-p_k)^2
E_{k+1} \approx -\frac{f''(p)}{2f'(p)}(E_k)^2

Taking the absolute values:

|E_{k+1}| \approx -\frac{|f''(p)|}{|2f'(p)|}(|E_k|)^2

Error of fixed point method

Posted by: aimeemarie88 on: October 26, 2008

One of the last things to do concerning fixed point methods was to take a look at the error produced with each iteration using the following code.

x(1)=1.0;

for i=1 : 7

x(i+1) = -sqrt(1/8)*(x(i)^2 – 2) + x(i)

y=sqrt(2)-x(i+1)

end

Using the equation with radical two as the fixed point, starting with an initial point of x0=1, and allowing the constant to equal the optimal constant, which is –sqrt(1/8), the following graph demonstrates how quickly the error decreases. After just six iterations, the root is found, with the error = 0.

Regarding my previous post on types of convergence of the fixed point method, I would like to clarify some things. Let F’ be the derivative of the function and r be the actual root.

  • If |F’(r)|<1, the iterations will converge, therefore the error will decrease
  • If |F’(r)|>1, the iterations will diverge, therefore the error will increase

If there is convergence,

  • The error will decrease monotonically if 0 ≤ F’(r) < 1
  • The error will decrease in an oscillatory manner if -1 < F’(r) < 0

I made up the following example which demonstrates this.

I designed the fixed point method, beginning at point x=0 to find the roots of following equation:

x^2 + x – 2 = 0

x=\frac{2}{x+1}

xn= \frac{2}{xn +1}

G(x) = \frac{2}{x+1}

G'(x) = \frac{-2}{(x+1)^2}

G'(1) = \frac{-1}{2}

G'(-2) = 2

This shows that with the guess of an initial point sufficiently close to the root, it is expected that the iterations will converge to the root one in an oscillatory manner and will not converge to the root negative two.

The following MATLAB code confirms this:

x(1) = 0;

for i=1:30

x(i+1)=2*(x(i)+1)^-1

end

Columns 1 through 12

0 2.0000 0.6667 1.2000 0.9091 1.0476 0.9767 1.0118 0.9942 1.0029 0.9985 1.0007

Columns 13 through 24

0.9996 1.0002 0.9999 1.0000

This demonstrates the oscillatory convergence to the root one.

No matter which initial point is chosen, the code will not converge to the other root which is negative two. I attempted to chose negative three and negative one, but still both converge to the root one. This is appealing that the MATLAB code actually works and supports the data I previously found regarding types of convergence.


  • dankatzumd: That is interesting. Putting the points $latex p_1$ and $latex p_2$ in terms of the golden ratio $latex \phi$ gives $latex p_1 = (2\phi + \sqrt{2},
  • aheryudono: Dear Students, I just want to clarify things that we discuss in class. Maybe my guidance in class is not clear for some of you. Some of you ask me
  • sigalgottlieb: Nice observation. Now take a look at the value of g'(x) on that domain and connect it to the speed of convergence. Explain why it is connected.

Categories