next up previous contents
Next: Linear algebraic problems Up: Fundamentals on Numerical Methods Previous: Fundamentals on Numerical Methods

Fundamentals of FORTRAN 90 (95)

The purpose of this short section is not to teach FORTRAN. That would have required a full book and there are very good books already available for that task [1] which are highly recommended. Instead, what we aim to do here is to bring forth the principal concepts underlying a well-structured FORTRAN program given its close relationship with the numerical methods discussed below. Although most, if not all, of the numerical methods discussed in this chapter can also be implemented using a symbolic algebra software (such as Maple or Mathematica), it is still true that, for long and especially repetitious high precision calculations, the efficiency of a FORTRAN implementation remains unsurpassed. However, the availability of several well-tuned subroutine libraries nowadays makes the construction of large codes (from ``scratch'') largely unnecessary, at least for the casual computer end-user. All that this user needs now is a programming interface with available FORTRAN subroutines (such as those discussed further below in the next sections of this chapter, whenever appropriate). For learning the programming elements entering such a programming interface, we believe that the information contained in a short book, such as Ortega [4], is perfectly adequate. However, we also believe that in order to use computational resources judiciously and to avoid long frustrating debugging periods that may ``doom'' a particular numerical application, some good programming practices in handling the FORTRAN programming elements are advisable to be followed. Hopefully, the following discussion can help in strengthening, if not building, such a good programming practice. In addition, it is aimed to help better understand the numerical techniques exposed in the next sections since, at least for some of them, they were built with a FORTRAN implementation in mind.

FORTRAN (for FORmula TRANslator) is a high level language that is the most widely used one for scientific and engineering purposes. Since the time of its original development (in the late 50's (1957) at IBM from the working group of a Delaware engineer, John W. Backus), it has undergone several major changes that have eventually led to the current FORTRAN 90 standard (also, there exists a FORTRAN 95, but it is a minor update). A FORTRAN code is a collection of lines containing alphanumeric (ASCII) characters that is eventually processed by a compiler into a machine-readable (binary) set of instructions. In this version, FORTRAN 90 accommodates features (like pointers) characteristic of more advanced high level languages, like C, thus making it more appealing to experienced programmers who, though, might have preferred a more structural language altogether, as a replacement . Will that happen in the near future? To some extent yes, but not completely.

In my opinion, there are two primary reasons why C++, which is currently the most popular structural programming language (and the only one currently taught to the freshmen in institutions like MIT), still has not managed (and I feel that it will not also manage in the near future) to completely eliminate FORTRAN from high performance scientific and engineering calculations: First (and the most often quoted reason) there are zillions of FORTRAN computer codes already available, that need to be accommodated into new computational environments. However, although that would have explained why in new computers we still need FORTRAN compilers, it would not have explained why we still need to teach FORTRAN in our institutions. That, I feel can be explained by invoking a second argument (which is not so often heard), namely that of an inherent simplicity in using FORTRAN as opposed to C++. Whereas C++ thrives in the development of complex nested control structures providing to the user a lot of programming flexibility but also simultaneously a lot of programming responsibility, FORTRAN adheres to a small number of concrete constructs which make use of several user-hidden default programming actions. That makes it easier for non-computer programmers to develop computer codes. In particular, most of the heavy computer end-users among scientists and engineers have (so far, at least) found it easier to translate their algorithmic ideas for numerical methods into FORTRAN statements and FORTRAN programming than in C++ programming. This is why I feel that it is useful to start a discussion on numerical methods by reminding the user of the key features of the FORTRAN 90 language, anticipating that this is going to continue to be the language of preference in numerical analysis, based on past history.

However, we should not forget that what we have in the selection of the appropriate language really boils down quite frequently to a matter of taste. In other words, I believe that what we see in the competition between FORTRAN and C++ is really not unlike what we see between apple and pc users: it really boils down to a matter of taste between a well structured but complex and a simpler but less powerful programming environment. So far both coexist, and if the apple-pc analogy holds, both will continue to coexist for the foreseeable future. Moreover (again not dissimilar from what we see in the apple-pc race), as both the computational environment and the numerical algorithms evolve, we may continue to observe the current tendency for the two approaches to converge and merge with each other, which might indeed follow in the distant future as a natural evolution.

A FORTRAN program is first put together as a source code which consists of a number of FORTRAN 90 statements. The source code is then subsequently ``compiled'', i.e., used as the data input to a compiler program which translates the source code into cpu and memory instructions for the computer. The compilation is simply performed by invoking the FORTRAN 90 compiler using the source code program name as the first argument and, possibly, one or more compiler directives for compiler program specific work. For example, the compilation is performed as
f90 main_program second_program third_program fourth_program (libraries)
where the libraries involve needed subroutine libraries names for the support of the primary (main) and secondary programs.

In the following, we will use all capitals to declare FORTRAN statements for clarity. Both capitals and small letters can be used in actual programs interchangeably as FORTRAN is not a case-sensitive language. The only place where one has to be careful with the case letters is in specifying the names of a file within a UNIX environment whenever that name is communicated through a FORTRAN statement. Also, note that whereas previous versions of FORTRAN only allowed for a specific FIXED syntax according to which the first five characters in a line were reserved for an optional numeric line label, the sixth for a possible continuation line declaration (if non-blank) and the seventh to 72 (max) for the actual FORTRAN statement, FORTRAN 90 also allows a free format style where each line can have up to 132 characters, a continuation line is preceded in the end of the previous one by & and a comment is indicated by the exclamation point (!) before it, placed anywhere on the line. Whereas this last option is also available in the fixed format style, in that style the comments are usually identified instead with the letter C appearing as the first character in the line. In both styles, multiple statements on a line are also allowed, separated by a semicolon (;).

There are five major categories of FORTRAN statements:
1. Declaration or specification statements
These statements are used to specify the nature of the various variables, functions, and other components of a FORTRAN code, such as, for example
INTEGER X(0:10, -4:5)
REAL VAR
LOGICAL FUNCTION L(V1,V2)
etc. Those statements, placed at the beginning of the program and before any executable statements (i.e., i/o or assignment statements) typically specify both the extent and the form of the information contained in each one of the corresponding variables. For example, in the previous example we declare variable X to correspond to a two-dimensional integer array of 110 elements and a range of row and column indices from 0 to 10 and -4 to 5, respectively. The other two types of variables are LOGICAL, which can only assume two values, .true. and .false. subject to Boolean logic operations, and REAL which contain the numerical information in a ``floating point'' representation corresponding to a ``mantissa'', X, representing a real number between 0 and 1, and an integer exponent, I, which is used to scale the final number R as R = X 10I. For real numbers, the most important feature is the number of significant digits that are carried out in the finite representation of the mantissa. When 4 bytes are used for the real number representation (called REAL*4 or simply REAL), the mantissa is represented with a finite number of binary bits corresponding to 7-8 (depending on the operating system) significant digits in a base ten decimal representation. More useful for numerical calculations is an enhanced accuracy (called REAL*8 or DOUBLE PRECISION) representation corresponding to an allocation of 8 bytes for the full real number and 14-16 significant digits.

The form of the information contained in a variable can also be declared implicitly. For example, the declaration statement
IMPLICIT DOUBLE PRECISION (A-H, O-Z)
can be used to define all variables for which the first character is A-H or O-Z as double precision reals and all variables with first letter between I-N as regular integers (INTEGER*4). Although, such a statement simplifies the code by avoiding to explicitly declare the nature of all the variables involved, it can also become a source of frustration during debugging as with this implicit declaration, misspellings in the variables names remain undetected from the compiler and can be potentially difficult to detect during debugging. Thus, the current practices now call for either avoiding altogether the use of implicit declarations (which is the default if none appears but which also can be explicitly forced through the statement
IMPLICIT NONE
in the beginning of the declaration statements) or, instead, using the statement
IMPLICIT LOGICAL (A-Z)
which, by forcing a non-declared variable to be of a logical type (a type rarely used and of direct conflict with the implementation of arithmetic operations), makes typing errors in the names of variables detectable at compilation time. Of course, the downside with either one of these actions is that the type needs to be explicitly declared for all the variables used in the program.

Since the number of significant digits carried out in the mantissa determines the truncation error caused by a finite representation of real numbers in a computer, and since this truncation error usually amplifies in the course of numerical operations, it is primarily only the double precision representation that it is used in numerical calculations, especially when large scale calculations are involved. Indeed, Cray foresaw that when he decided to default the REAL declaration to double precision in CRAY computers; unfortunately the other manufacturers did not follow through (despite the fact that, due to extensive small scale parallelization inside the CPU and in the bus, it takes approximately the same number of cycles to perform single or double precision calculations). This different default now present only in CRAY computers and software has always been a cause for headaches, as one needs to manually switch from double to single precision all declaration, function, and subroutine calls as one transfers a code from a non-CRAY to a CRAY computer and vice-versa, given that automatic switches offered by the compiler have proven to be, to the author's experience, notoriously unreliable.

As far as the exponent of the floating point representation is concerned, depending on the single or douple precision declaration, it can take integer values bounded between certain limits like $\pm 38$ or $\pm 308$,respectively (those limits also vary depending on the operating system used). If the value falls outside those limits, there is or there is not an error declared (underflow for falling below or overflow for coming above), again, depending on the particular computational environment. Nowadays, underflow usually does not hinder a code from continuing to execute with the result replaced by a computer zero and this is not a problem, however, with certain systems (SGI, in particular) by default overflow is also not reported with the result (declared INF or NANQ, acronym for ``Not a Numerical Quantity'') continuing to participate in the calculations. This can result in a lot of wasted computer time, that can be avoided if suitable compiler options are chosen at compilation time. Finally, note that size limits also apply to all integer variables and those also depend on the byte size allocated to the integer values with the typical (default for INTEGER declaration) value of $\pm
10^{10}$ applicable for 4 bytes integers. Other integer sizes are also available in certain systems, but when used, care needs to be exercised whenever those integers appear as subroutine or common block arguments as they can lead to segmentation faults. Those faults created when not all bits in a ``quantum size'', which depends on the particular computer used, are assigned, can be a serious cause of frustration that can be avoided when no integers are mixed with reals in a common block, or if they always appear at the end of subroutine call arguments.

In addition to the declaration statements mentioned above, we also have declaration statements related to the allocation of certain variables in the physical memory of the computer. For example, the statement
COMMON /CB1/V1, V2, V3
allocates contiguous physical space to the variables V1, V2 and V3 in that order, the statement
EQUIVALENCE /Z1, Z2, Z3/W1, W2, W3/
allocates the same physical space for the variables Z1, Z2, Z3 and the variables W1, W2, W3 etc. Common blocks allow the transfer of information between various program segments. In addition, they have acquired special usefulness in parallel computing since they can be used to effectively align equivalent data structures within different processors so that interprocessor data transfers can be implemented using specialized communication calls. Thus, the user has to be especially careful that all variables entering those statements have been properly declared as parts of a common statement. Therefore, although for a time there was a tendency among programmers to avoid COMMON blocks, that does not seem to be the case any more. In contrast, the use of EQUIVALENCE statement is heavily discouraged considered to be a possible source of confusion. Their only previous justification (to save space while allowing the use of meaningful names to system variables) is not valid anymore, given the possibility of dynamic allocation of the variables, as explained below.

Whereas these statements are implemented at compilation time and the allocation of the space is permanent for the duration of the run, FORTRAN 90 also allows for dynamic variable allocation. For example, let us assume that you may want to allocate the real variable array V(1000) temporarily for some intermediate calculations within the code. You can achieve that through the following series of statements:
REAL, DIMENSION (:) : V
ALLOCATABLE V
ALLOCATE (V(1000), STAT=ISTAT)
DEALLOCATE (V, STAT=ISTAT)
where the first two statements are placed at the beginning of the code together with the other declaration statements of the program, the third is placed right before array V is utilized and the last one right after its last usage within the program. In this way, considerable savings on the system's resources can be accomplished without having to use and reuse the same variables with possibly irrelevant names.

2. Assignment statements
These are used to assign the value of a variable. There are always of the form
A = B
where A is a single variable or a variable array and B represents a compatible algebraic expression involving any one of the four arithmetic operations (*, /, +, -) and several functions operating on suitable data. Note that A can also appear as part of the B expression on the right-hand-side in which case its value is changed to the new one at the end of the statement's execution. This is a useful property when implementing recursive relations. Moreover, when using assignment statements for arrays, care needs to be exercised that all the operations appearing on the right-hand-side are allowed and lead to the desired results. When in doubt, one can always use an explicit evaluation in terms of individual components and a do loop structure. Inequalities or, in general, comparisons between two arithmetic data can also be performed in FORTRAN, but in this case, they do not result in new values, but simply in the combinatiovariables plus comparison being considered as a logical variable assigned to a .true. or .false. value, depending on the case. These expressions find use as arguments of control statements, as explained below.

3. Control statements
These statements control the flow of the execution of the programming instructions within the code. By default, the instructions of a FORTRAN code are executed sequentially, one at a time. However, flow control statements like
DO WHILE (K .LT. 3)
GO TO 300
etc. can change that execution order. As these two examples suggest, there are two types control statements: Conditional, the execution of which depends on the satisfaction of a certain set of criteria represented by various operations on logical variables (such as those resulting from the comparison of two arithmetic variables), and unconditional. The latter tend to be deemphasized in modern FORTRAN applications as their frequent use can create havoc during debugging.

The two most popular FORTRAN structures for the control of the flow of the execution of FORTRAN statements are the DO LOOP and the IF-THEN-ELSE conditional statement. Their combination gives rise to the DO WHILE structure and when all these are used in a nested fashion, they can generate a literally unlimited number of control structures. As mentioned, although explicit unconditional redirection of the flow to explicitly labeled lines in the FORTRAN code is still supported in FORTRAN 90, it is strongly discouraged. The corresponding structures for the DO LOOP and the IF-THEN-ELSE that avoid using explicit line references have the form
DO I=I1,I2,I3
(various FORTRAN statements)
ENDDO
and
IF (LOGICAL) THEN
(one set of various FORTRAN statements)
ELSE
(second set of various FORTRAN statements)
ENDIF

In the DO LOOP structure, the various FORTRAN statements included between the DO and ENDDO lines are executed as many times as necessary while the index variable I (which, by the way is a local variable with a proper value assigned with certainty only within the DO LOOP) takes consecutively the values I1, I1+I3, I1+2*I3 etc till its value becomes larger (or smaller, respectively) than I2 depending on whether I3 is positive (or negative, respectively). Note that in case when I1 > I2 (respectively, I1 < I2) and I3 is positive (or negative, respectively), the DO LOOP is skipped without being executed even once. The possibility to construct such type of do loops, which allow for much more compact codes, since they can arise as part of a general algorithm for various values of the index parameters I1, I2 and I3 (being, in general, variables within the code), as well as the possibility of using real variables as indices, are new features of FORTRAN 90. Note that care needs to be exercised if I, I1, I2 and I3 are declared as reals and I1 + an integer multiple of I2 is equal to I3 since truncation error in the computer, depending on how these values are represented in binary, can make an exact equality to be an inequality of either sign with the consequences as to whether the last iteration in the DO LOOP is or is not executed. Finally, note that in case that I3 is not specified explicitly, the default value taken by the program is (consistent to earlier FORTRAN versions) 1.

4. I/O (Input- output) statements
These statements are used to control the transfer of data information in and out of the code during the code's execution. A considerable amount of work was devoted in old times by the students trying to devise intricate FORMATING FORTRAN statements to provide for a sophisticated FORTRAN I/O interface, including graphics! Fortunately (since FORTRAN is the totally inappropriate language for building graphical interfaces), the availability of excellent graphics software nowadays makes such a task unnecessary. All that the user really needs to know is how to transfer data between the program and (preferably ASCII) files. For a full accuracy preservation and also for convenience, this is simply achieved most easily using unformatted read and write statements of the form
READ (10) A,B
WRITE (20) A,B
which read and write respectively, all the data content of the variables A, B from/to files associated with units 10 and 20, respectively. The association of these units with actual files is done either automatically (in which case the corresponding filenames are specified by the system, usually as fort10 and fort20, respectively) or via special declaration statements of the form
OPEN (10, file = "input.dat")
The only time when a formatted i/o may be necessary is when a particular graphics program needs it. That can be easily constructed based on the notational convention of the FORMAT statements, to which also FORTRAN 90 remains compatible. See [4] for further details.

5. Subprogram statements
These statements precede specific structural subprograms within a
FORTRAN program, namely FUNCTIONS and SUBROUTINES. For example, the statements
LOGICAL FUNCTION L(A1, A2)
SUBROUTINE SAMPLE (B1, B2)
declare the statements that follow them till the statement
RETURN
is reached, as independent sections of the program that are only executed when the corresponding names are invoked. This invocation for a function simply takes place by using its name as another system function; the invocation of a subroutine takes place through a CALL statement, i.e., in the previous example
CALL SAMPLE(V1, G1)
Note that the arguments can have a different name but they have to be of the same number as the one declared in the SUBROUTINE subsection of the code. Finally, we should note here that when a variable is passed as an argument of a subroutine in a subroutine call, what happens at compilation time is that the corresponding variable is of the type declared in the subroutine section and it is physically allocated starting at the same space as the starting position of the variable appearing in the calling part of the code. In this way, the two variables can be declared to be of different types; the type declaration appearing within the subroutine being the one used in the arithmetic manipulations within the subroutine. The only requirement placed is in the space allocated in the outer program which should be sufficient for all the uses within the subroutine so as to avoid erroneous references outside the domain of allocation with unpredictable results. In this respect, FORTRAN is especially forgiving, but also places a significant responsibility to the end user. Many errors have resulted from unsufficient space allocation in the calling routine, and/or inappropriate communication of the data. This is a very important point especially since it is through calls to external functions and subroutines that the use of third party software is primarily taking place in today's programming. The importance of carefully taking into account the needs of the external subroutines properly cannot be underemphasized in FORTRAN programming!

The structure that therefore emerges for a typical programming application is of the following form
A. Declaration statements
B. Data Input and Initialization of intermediate variables and parameters
C. Preprocessing operations
D. Main processing operations, involving calls to external subroutines
E. Postprocessing operations
F. Data output

This is illustrated in Appendix 4.A in a simple program which solves parametrically a system of nonlinear equations using Newton's method. The example is taken from the IMSL manual. Notice that the bulk of the computations load is spent in the L-U decomposition of the Jacobian matrix which is carried out using a call to an appropriate IMSL subroutine. This is typical for the numerical applications discussed in these notes, thereby reducing significantly the programming workload while significantly increasing the computational efficiency. Some information regarding the fundamental concepts behind the most popular algorithmic approaches and about the existing software where these algorithms are implemented, for key numerical tasks of usefulness to a computer-aided stability analysis, is supplied in the remaining sections of this chapter.


next up previous contents
Next: Linear algebraic problems Up: Fundamentals on Numerical Methods Previous: Fundamentals on Numerical Methods
Michael Renardy
1998-07-13