next up previous contents
Next: Iterative Approaches Up: Linear algebraic problems Previous: Linear algebraic problems

Direct Solution of ${\bf A} \cdot {\bf x} = {\bf b}$

The calculation of the solution is greatly facilitated if the matrix ${\bf A}$ has a special structure. For example, if ${\bf A}$ is a non-singular lower triangular matrix ${\bf L}$ (i.e., the elements on the main diagonal lii are non-zero and all the elements above the main diagonal lij, i<j are zero) then the solution can be easily obtained through a successive forward substitution:

   \begin{eqnarray}
x_1 &=& \frac{y_1}{l_{11}} \nonumber \\ x_2 &=& \frac{y_2 - l_{...
 ...n - l_{n1} x_1 - l_{n2} x_2 - \ldots - l_{n(n-1)}
x_{n-1}}{l_{nn}}\end{eqnarray}

Note that this sequence of operations requires approximately $\frac{n^2}{2}$ number of floating point operations (i.e., pairs of add-multiplies). Similarly, if ${\bf A}$ is a non-singular upper triangular matrix ${\bf U}$ (i.e., the elements on the main diagonal uii are non-zero and all the elements below the main diagonal uij, i>j are zero) then the solution can be easily obtained through a successive backward substitution:

   \begin{eqnarray}
x_n &=& \frac{y_n}{u_{nn}} \nonumber \\ x_{n-1} &=& \frac{y_{n-...
 ...=& \frac{y_1 - u_{12} x_2 - u_{13} x_3 - ... - u_{1n}
x_n}{u_{11}}\end{eqnarray}

Note that this procedure also requires approximately $\frac{n^2}{2}$ number of floating point operatons.

However, for a full general matrix, the general direct calculation is more complex. One of the many variants that exist is to first decompose ${\bf A}$ into the product of two matrices, one lower triangular ${\bf L}$ and one upper triangular ${\bf U}$: 
 \begin{displaymath}
{\bf A} = {\bf L} \cdot {\bf U}\end{displaymath} (243)
Then the solution of ${\bf A} \cdot {\bf x} = {\bf b}$ is reduced to the solution of two triangular matrix problems, ${\bf L} \cdot {\bf y} = {\bf
b}$ and ${\bf U}
\cdot {\bf x} = {\bf y}$ which, according to what is said in the preceding paragraph, can be efficiently solved, within approximately n2 number of operations, using one forward and one backward successive substitution, respectively. The forward/backward substitution process is typically carried out by a separate subroutine, which, for example with IMSL is LSARG.

To achieve the L-U decomposition of a full matrix ${\bf A}$, a Gaussian elimination procedure can be followed. According to this procedure one successively eliminates one variable at a time, x1, x2, ... from the rest of the equations using equation 1, 2, ... to solve for x1, x2, ..., respectively, in terms of the remaining variables, i.e., at first variable x1 is eliminated from equations 2, 3,...,n by subtracting equation 1 multiplied by $\frac{a_{21}}{a_{11}}$,$\frac{a_{31}}{a_{11}}$, ... , $\frac{a_{n1}}{a_{11}}$ from equations 2, 3, ..., n, respectively. Then, variable x2 is eliminated by subtracting a suitable multiple of the new equation 2 from equations 3, 4, ..., n, and so on, till the final variable, xn-1 is eliminated from the last equation n. Once equation i is used to eliminate variable xi, each coefficients are used to form the corresponding (partial) row of the upper triangular matrix U while the multiplicative factors used in the elimination of that variable from the remaining equations i+1, i+2, ..., n are used, together with 1 for the diagonal, in order to form the corresponding (partial) column of the lower triangular matrix L. Thus, at the end of the Gaussian elimination procedure both matrices are formed. To perform the L-U decomposition IMSL subroutine LFTRG can be used.

Notice that in order for the matrix U to be non-singular, the diagonal elements a11, ..., which are called ``pivots'' need to be non-zero. Moreover, since the pivots also appear as denominators of the multiplicative factors $\frac{a_{21}}{a_{11}}$, $\frac{a_{31}}{a_{11}}$, ... , $\frac{a_{n1}}{a_{11}}$, ... that are used in order to superimpose the eliminating equation with the rest, if the magnitude of these factors is too large, one tends to wipe out useful information due to the finite precision of the computations, and thus, sometimes introduces unacceptably large error in the final numerical result. However, the order in which the equations are stacked in matrix ${\bf A}$ and the numbering of the variables can readily be changed, thus leading potentially to better conditioned elimination schemes. This is called pivoting and is available as an option in commercial subroutines that the user may want to try if he/she suspects a loss of accuracy due to the presence of large pivots. Of course, pivoting is absolutely necessary whenever any of the original pivots is exactly zero. If pivoting is used, there are several strategies for it depending on whether one exchanges equations or variables or both. The general rule is that the more flexible the pivoting strategy is, the more robust and accurate the numerical answer is, but at the same time more computationally expensive the solution becomes. So, the rule of thumb, when multiple times a similar system of equations is used, is always to try different pivoting schemes before deciding which strategy to follow. Another option that is available in LU decomposition routines to improve the accuracy of the results is equilibration (according to which the various equations are weighted differently before they are solved). This may be of usefulness if one tries to solve several dissimilar equations, such as the ones resulting from the discretization of different conservation principles.

However, all these strategies may eventually continue to fail to improve the numerical error if the equations are by themselves ``ill-conditioned'', i.e., very close to being singular. An upper bound on how possible relative errors, present either in the coefficients of the matrix ${\bf A}$ or in the coefficients of the right-hand-side vector ${\bf b}$, can affect the relative error in the solution is provided by the evaluation of ``condition number'' CN, $CN = \vert\vert{\bf A}^{-1}\vert\vert \cdot \vert\vert{\bf A}\vert\vert$.Equations with high condition numbers may be inherently ill-conditioned independent of the numerical scheme. Very little can be done in that case (sometimes a backward correction can be helpful according to which a correction is made to the solution based on a new right-hand-side calculated from the residual ${\bf
Ax} - {\bf b}$. However, it does not always work. You do not necessarily see a large error all the time associated with the solution of an ill-conditioned equation, since it does depend on the right-hand-side vector as well. All that the presence of a high CN tells you is simply that there is the danger of a high amplification of the truncation error during the solution. Numerical examples should be used to determine if indeed this is the case. But, in the design of a good algorithm is it nevertheless a good practice to avoid schemes which lead to poorly conditioned systems of equations.

In terms of the efficiency of the Gaussian elimination, note that approximately $\frac{n^2}{2}$ operations are required for the elimination of variable x1, $\frac{(n-1)^2}{2}$ for x2, and so on, for a total of approximately (to the highest power of n) $\frac{n^3}{3}$, which is substantially larger than the order n2 operations required for the back substitutions afterwards. However, note that for the solution of multiple right-hand-side vectors b one needs to perform the Gaussian elimination of the matrix only once so that the incremental cost for the solution of each additional b is only of the order n2 operations.


next up previous contents
Next: Iterative Approaches Up: Linear algebraic problems Previous: Linear algebraic problems
Michael Renardy
1998-07-13