NLA Final Project
by David Duran

TEXTBOOK: Iterative Solution Methods. Owe Axelsson. Cambridge University Press, 1994.

Final Project: Write a Matlab program of the Chebyshev iterative method, (5.25),(5.29) pp. 179-181. Fix some value of the polynomial degree p and execute the Chebyshev iterative method in a cycle. Test the program. Use the system diag(1,2,...,100) x = b with b = (1,1,...,1) and the initial approximation (1,1,...,1) as a standard test. Use 1 and 100 as bounds for eigenvalues in this example. Try different values of p. Check if you have any numerical instability, see Remark 5.11 on p.182. If you experience instability, try to find an appropriate ordering. Write a report and put it on the Web along with the program.

We are to code the first-order iterative method with C = I:

x@(l+1) = x@l - T.l(Ax@l-b), l=0,1,... where *.l denotes "* sub l" and *@l denotes the time step.

In this formula A is symmetric and positive definite and {T.l} is a parameter set, the proper choice of which will give an accelerated convergence over the simplest choice, T.l = T = 2/(L.n+L.1). Notation: L.n denotes the largest eigenvalue of the matrix A, L.1 denotes the smallest eigenvalue of the matrix A. The optimal parameters T.l are given by the inverses of the zeros of T.p, where T.p(z)=(1/2)[(z + (z^2-1)^(1/2))^p + (z - (z^2-1)^(1/2))^p].

(5.29) (1/T.l) = ((L.n-L.1)/2) cos(Q.l) + (L.n+L.1)/2, where Q.l = ((2*l+1)/(2*p)) *PI, l=0,1,..., p-1. Where p represents the polynomial degree.

In our standard problem, the ordering of {T.l} ie, l = 0 to p-1 vs. l = p-1 down to 0 in (5.29), yields the same convergence rates. This can be seen from the graphs below, which represent the chebyshev method of polynomial degree's 1 thru 6 executed in 50 cycles. The green graph, (which doesn't appear in the graph corresponding to p = 1), is the result of using T equal to 2/(L.n+L.1). The blue and red graphs represent the ordering of {T} using l = 1 to 100 and l = 100 down to 1 in (5.29) respectively (note: in the code there is a slight scale adjustment to get the bounds correct). Noting that the red graph completely and exactly covered the blue graph, we know the ordering of {T} in this case does not matter.

Not so important observations

It may be worth noting that the higher the order of the polynomial, the faster the method converges. When considering T=2/(L.n+L.1) ie, the green graph, polynomials of even degree return values greater than zero while polynomials of odd degree return values greater than zero only if the cycle is even; if the cycle is odd, the method returns values both positive and negative. Also note how the green graphs never "wave" about the solution, which means it is accurate for more values of x than if the same approximation "waved" about the solution. When considering T as defined in (5.29), the method returns strickly positive values only if the cycle is even; returning even values does not depend on the order of the polynomial as in the previous case with T = 2/(L.n+L.1).


[plot] [plot] [plot] [plot] [plot] [plot] 

Convergence

A logical question to ask is: Does this method convege? I attacked this question by testing the method with polynomials of degree 1 thru 100 using a cycle equal to 1 and T as defined in (5.29). My results are based on the output graphs constructed from my code. What I found was that the method with polynomials of degrees p = 38 thru 57, 61 and 64 did not converge. I then repeated the test with the method being executed in 2 cycles. I found that polynomials of degree p = 38, 48, 50, 52, 53 and 54 all converged after 2 cycles. The first three figures below is an example of the method with polynomial of degree 57 converging in 3 cycles, ie, the first figure corresponds to the method execueted in 1 cycle, the second figure with the method executed in 2 cycles, and the third figure with the method executed in 3 cycles. This made me consider the idea that maybe every method with polynomials of different degrees would converge with enough cycles. This idea is still open, although figures 4,5 and 6 below, which correspond to methods of polynomial degrees 44, 47 and 56 did not converge after 500 cycles. I would like to mention they did not diverge either, and again, my results were based on the graphs, so there could be an error in my code or the matlab language. 

[plot] [plot] [plot] [plot] [plot] [plot] 


Other definitions of T

Remembering that the ordering of {T.l} as defined in (5.29) does not matter in the convergence rate, I realize that in order for the method to converge (at least in fewer cycles), a new definition of T needs to be made. Tim Fritschel and I found that:

(5.29d) (1/T.l) = ((L.n-L.1)/2) sin(Q.l) + (L.n+L.1)/2 where Q.l = ((2*l+1)/2*p)) *PI , l = 0,1,..,p-1

and

(5.29t) (1/T.l) = ((L.n-L.1)/2) cos(Q.l) + (L.n+L.1)/2 where Q.l = (2*l+1)/2*p) , l = 0,1,..,p-1

both converge faster than using the parameter set {T.l} as defined in (5.29). It is important to note that the first few x-values converge faster using (5.29). The ordering of {T.l} as defined in (5.29d) and (5.29t) yield the same convergence rates with respect to themselves. Below are three examples comparing the convergence rates of of the method with parameter sets as defined in (5.29), (5.29d) and (5.29t). Figures 1 - 3 below represent the method with polynomial degree p=5 executed in cycles c=2. The green graph corresponds to the parameter set as defined in (5.29), the blue as defined in (5.29d) and the red as defined in (5.29t). It is easily seen that (5.29d) converges faster than (5.29t) which in turn converges faster than (5.29). [Special note: d beats t TIM, hehe] 


figure 1 [plot] figure 2 [plot] figure 3 [plot] 


The reasoning behind faster convergence of (5.29d and t)

Since it has been established that (5.29d) converges faster than (5.29t) which converges faster than (5.29), all I need to give is some reasoning on why (5.29d) converges faster than (5.29t). I only need this reasoning thanks to Tim Fritschel for showing that (5.29t) converges faster than (5.29). First note that in the general case of (5.29d) and (5.29t), when U = 0 in

1/T.l = ((L.n-L.1)/2) * U + (L.n+L.1)/2 ($$)

T = 2/(L.n+L.1), which we have seen converges rapidly for all but a few "end" values of x. Also note that this convergence is "smooth" and not "wavy" about the solution. It intuitively makes sence to choose a U close to 0 except for the "end" values of x. Since the value of U depends on Q (theta), our problem becomes: choose a Q such that U is close to 0 for the appropriate values of l. In (5.29d), Q (theta) is defined as:

Q = (2*l+1)/2*p)*PI for l = 0 to p-1

==> 0 < PI/(2*p) <= Q <= ((2*p-1)/(2*p))*PI < PI

==> 0 < U.d < 1, where U.d = sin(Q.d)

Tim established that .5403 < U.t < 1, where U.t = sin(Q.t). These limits probably don't mean much without a "sense" of where U mostly resides within these bounds. To help us see where U resides, consider the two graphs below. The first representing U.d and the second U.t. The green represents where the curve U lives and the red represents the bounded region of U. Recalling that T = 2/(L.n+L.1) is the optimal parameter, and can be achieved when U = 0 in ($$), and noticing that U.d is "closer" to 0 more often and in the sense of distance than U.t, we get a feel for why (5.29d) converges faster than (5.29t).


[plot] [plot]