The purpose of this short section is not to teach FORTRAN. That would have required a full book and there are very good books already available for that task [1] which are highly recommended. Instead, what we aim to do here is to bring forth the principal concepts underlying a well-structured FORTRAN program given its close relationship with the numerical methods discussed below. Although most, if not all, of the numerical methods discussed in this chapter can also be implemented using a symbolic algebra software (such as Maple or Mathematica), it is still true that, for long and especially repetitious high precision calculations, the efficiency of a FORTRAN implementation remains unsurpassed. However, the availability of several well-tuned subroutine libraries nowadays makes the construction of large codes (from ``scratch'') largely unnecessary, at least for the casual computer end-user. All that this user needs now is a programming interface with available FORTRAN subroutines (such as those discussed further below in the next sections of this chapter, whenever appropriate). For learning the programming elements entering such a programming interface, we believe that the information contained in a short book, such as Ortega [4], is perfectly adequate. However, we also believe that in order to use computational resources judiciously and to avoid long frustrating debugging periods that may ``doom'' a particular numerical application, some good programming practices in handling the FORTRAN programming elements are advisable to be followed. Hopefully, the following discussion can help in strengthening, if not building, such a good programming practice. In addition, it is aimed to help better understand the numerical techniques exposed in the next sections since, at least for some of them, they were built with a FORTRAN implementation in mind.
FORTRAN (for FORmula TRANslator) is a high level language that is the most widely used one for scientific and engineering purposes. Since the time of its original development (in the late 50's (1957) at IBM from the working group of a Delaware engineer, John W. Backus), it has undergone several major changes that have eventually led to the current FORTRAN 90 standard (also, there exists a FORTRAN 95, but it is a minor update). A FORTRAN code is a collection of lines containing alphanumeric (ASCII) characters that is eventually processed by a compiler into a machine-readable (binary) set of instructions. In this version, FORTRAN 90 accommodates features (like pointers) characteristic of more advanced high level languages, like C, thus making it more appealing to experienced programmers who, though, might have preferred a more structural language altogether, as a replacement . Will that happen in the near future? To some extent yes, but not completely.
In my opinion, there are two primary reasons why C++, which is currently the most popular structural programming language (and the only one currently taught to the freshmen in institutions like MIT), still has not managed (and I feel that it will not also manage in the near future) to completely eliminate FORTRAN from high performance scientific and engineering calculations: First (and the most often quoted reason) there are zillions of FORTRAN computer codes already available, that need to be accommodated into new computational environments. However, although that would have explained why in new computers we still need FORTRAN compilers, it would not have explained why we still need to teach FORTRAN in our institutions. That, I feel can be explained by invoking a second argument (which is not so often heard), namely that of an inherent simplicity in using FORTRAN as opposed to C++. Whereas C++ thrives in the development of complex nested control structures providing to the user a lot of programming flexibility but also simultaneously a lot of programming responsibility, FORTRAN adheres to a small number of concrete constructs which make use of several user-hidden default programming actions. That makes it easier for non-computer programmers to develop computer codes. In particular, most of the heavy computer end-users among scientists and engineers have (so far, at least) found it easier to translate their algorithmic ideas for numerical methods into FORTRAN statements and FORTRAN programming than in C++ programming. This is why I feel that it is useful to start a discussion on numerical methods by reminding the user of the key features of the FORTRAN 90 language, anticipating that this is going to continue to be the language of preference in numerical analysis, based on past history.
However, we should not forget that what we have in the selection of the appropriate language really boils down quite frequently to a matter of taste. In other words, I believe that what we see in the competition between FORTRAN and C++ is really not unlike what we see between apple and pc users: it really boils down to a matter of taste between a well structured but complex and a simpler but less powerful programming environment. So far both coexist, and if the apple-pc analogy holds, both will continue to coexist for the foreseeable future. Moreover (again not dissimilar from what we see in the apple-pc race), as both the computational environment and the numerical algorithms evolve, we may continue to observe the current tendency for the two approaches to converge and merge with each other, which might indeed follow in the distant future as a natural evolution.
A FORTRAN program is first put together as a source code which
consists
of a number of FORTRAN 90 statements. The source code is then
subsequently ``compiled'', i.e., used as the data input to a
compiler
program which translates the source code into cpu and memory
instructions for the computer. The compilation is simply
performed by
invoking the FORTRAN 90 compiler using the source code program
name as
the first argument and, possibly, one or more compiler directives
for
compiler program specific work. For example, the compilation is
performed as
f90 main_program second_program third_program
fourth_program (libraries)
where the libraries involve needed
subroutine libraries names for the support of the primary (main)
and
secondary programs.
In the following, we will use all capitals to declare FORTRAN statements for clarity. Both capitals and small letters can be used in actual programs interchangeably as FORTRAN is not a case-sensitive language. The only place where one has to be careful with the case letters is in specifying the names of a file within a UNIX environment whenever that name is communicated through a FORTRAN statement. Also, note that whereas previous versions of FORTRAN only allowed for a specific FIXED syntax according to which the first five characters in a line were reserved for an optional numeric line label, the sixth for a possible continuation line declaration (if non-blank) and the seventh to 72 (max) for the actual FORTRAN statement, FORTRAN 90 also allows a free format style where each line can have up to 132 characters, a continuation line is preceded in the end of the previous one by & and a comment is indicated by the exclamation point (!) before it, placed anywhere on the line. Whereas this last option is also available in the fixed format style, in that style the comments are usually identified instead with the letter C appearing as the first character in the line. In both styles, multiple statements on a line are also allowed, separated by a semicolon (;).
There are five major categories of FORTRAN statements:
1. Declaration or
specification statements
These statements are used to specify the nature of
the various variables, functions, and other components of a
FORTRAN code, such
as, for example
INTEGER X(0:10, -4:5)
REAL VAR
LOGICAL FUNCTION L(V1,V2)
etc. Those statements, placed at the beginning of the program
and before
any executable statements (i.e., i/o or assignment statements)
typically specify
both the extent and the form of the information contained in each
one of the
corresponding variables. For example, in the previous example we
declare
variable X to correspond to a two-dimensional integer array of
110 elements and
a range of row and column indices from 0 to 10 and -4 to 5,
respectively. The
other two types of variables are LOGICAL, which can only assume
two values,
.true. and .false. subject to Boolean logic operations, and REAL
which contain
the numerical information in a ``floating point'' representation
corresponding
to a ``mantissa'', X, representing a real number between 0
and 1, and an
integer exponent, I, which is used to scale the final number
R as R = X
10I. For real numbers, the most important feature is the
number of
significant digits that are carried out in the finite
representation of the
mantissa. When 4 bytes are used for the real number
representation (called
REAL*4 or simply REAL), the mantissa is represented with a finite
number of
binary bits corresponding to 7-8 (depending on the operating
system) significant
digits in a base ten decimal representation. More useful for
numerical
calculations is an enhanced accuracy (called REAL*8 or DOUBLE
PRECISION)
representation corresponding to an allocation of 8 bytes for the
full real
number and 14-16 significant digits.
The form of the information contained in
a variable can also be declared implicitly. For example, the
declaration
statement
IMPLICIT DOUBLE PRECISION (A-H, O-Z)
can be used to define all
variables for which the first character is A-H or O-Z as double
precision reals
and all variables with first letter between I-N as regular
integers (INTEGER*4).
Although, such a statement simplifies the code by avoiding to
explicitly declare
the nature of all the variables involved, it can also become a
source of
frustration during debugging as with this implicit declaration,
misspellings in
the variables names remain undetected from the compiler and can
be potentially
difficult to detect during debugging. Thus, the current
practices now call for
either avoiding altogether the use of implicit declarations
(which is the
default if none appears but which also can be explicitly forced
through the
statement
IMPLICIT NONE
in the beginning of the declaration statements)
or, instead, using the statement
IMPLICIT LOGICAL (A-Z)
which, by forcing
a non-declared variable to be of a logical type (a type rarely
used and of
direct conflict with the implementation of arithmetic
operations), makes typing
errors in the names of variables detectable at compilation time.
Of course, the
downside with either one of these actions is that the type needs
to be
explicitly declared for all the variables used in the program.
Since the number of significant digits carried out in the mantissa determines the truncation error caused by a finite representation of real numbers in a computer, and since this truncation error usually amplifies in the course of numerical operations, it is primarily only the double precision representation that it is used in numerical calculations, especially when large scale calculations are involved. Indeed, Cray foresaw that when he decided to default the REAL declaration to double precision in CRAY computers; unfortunately the other manufacturers did not follow through (despite the fact that, due to extensive small scale parallelization inside the CPU and in the bus, it takes approximately the same number of cycles to perform single or double precision calculations). This different default now present only in CRAY computers and software has always been a cause for headaches, as one needs to manually switch from double to single precision all declaration, function, and subroutine calls as one transfers a code from a non-CRAY to a CRAY computer and vice-versa, given that automatic switches offered by the compiler have proven to be, to the author's experience, notoriously unreliable.
As far as the exponent of the floating point representation is
concerned,
depending on the single or douple precision declaration, it can
take integer
values bounded between certain limits like
or
,respectively (those limits also vary depending on the
operating system used). If the value falls outside those limits,
there is or
there is not an error declared (underflow for falling below or
overflow for
coming above), again, depending on the particular computational
environment.
Nowadays, underflow usually does not hinder a code from
continuing to execute
with the result replaced by a computer zero and this is not a
problem, however,
with certain systems (SGI, in particular) by default overflow is
also not
reported with the result (declared INF or NANQ, acronym for ``Not
a Numerical
Quantity'') continuing to participate in the calculations. This
can result in a
lot of wasted computer time, that can be avoided if suitable
compiler options
are chosen at compilation time. Finally, note that size limits
also apply to
all integer variables and those also depend on the byte size
allocated to the
integer values with the typical (default for INTEGER declaration)
value of
applicable for 4 bytes integers. Other integer sizes
are also
available in certain systems, but when used, care needs to be
exercised whenever
those integers appear as subroutine or common block arguments as
they can lead
to segmentation faults. Those faults created when not all bits in
a ``quantum
size'', which depends on the particular computer used, are
assigned, can be a
serious cause of frustration that can be avoided when no integers
are mixed with
reals in a common block, or if they always appear at the end of
subroutine call
arguments.
In addition to the declaration statements mentioned above, we
also have
declaration statements related to the allocation of certain
variables in the
physical memory of the computer. For example, the statement
COMMON /CB1/V1, V2, V3
allocates contiguous physical space to the variables V1, V2 and
V3 in that order, the statement
EQUIVALENCE /Z1, Z2, Z3/W1, W2, W3/
allocates the same physical space for the variables Z1, Z2, Z3
and the
variables W1, W2, W3 etc. Common blocks allow the transfer of
information
between various program segments. In addition, they have
acquired special
usefulness in parallel computing since they can be used to
effectively align
equivalent data structures within different processors so that
interprocessor
data transfers can be implemented using specialized communication
calls.
Thus, the user has to be especially careful that all variables
entering those
statements have been properly declared as parts of a common
statement.
Therefore, although for a time there was a tendency among
programmers to avoid
COMMON blocks, that does not seem to be the case any more. In
contrast, the
use of EQUIVALENCE statement is heavily discouraged considered to
be a possible source of
confusion. Their only previous justification (to save space while
allowing the
use of meaningful names to system variables) is not valid
anymore, given the
possibility of dynamic allocation of the variables, as explained
below.
Whereas these statements are implemented at compilation time and
the allocation
of the space is permanent for the duration of the run, FORTRAN 90
also allows for
dynamic variable allocation. For example, let us assume that you
may want to
allocate the real variable array V(1000) temporarily for some
intermediate
calculations within the code. You can achieve that through the
following series
of statements:
REAL, DIMENSION (:) : V
ALLOCATABLE V
ALLOCATE (V(1000), STAT=ISTAT)
DEALLOCATE (V, STAT=ISTAT)
where the first two statements are placed at the beginning of the
code together
with the other declaration statements of the program, the third
is placed right
before array V is utilized and the last one right after its last
usage within
the program. In this way, considerable savings on the system's
resources can be
accomplished without having to use and reuse the same variables
with possibly
irrelevant names.
2. Assignment statements
These are used to assign the value of a variable. There are
always
of the form
A = B
where A is a single variable or a variable array and B represents
a compatible
algebraic expression involving any one of the four arithmetic
operations (*, /,
+, -) and several functions operating on suitable data. Note
that A can also
appear as part of the B expression on the right-hand-side in
which case its
value is changed to the new one at the end of the statement's
execution. This
is a useful property when implementing recursive relations.
Moreover, when
using assignment statements for arrays, care needs to be
exercised that all the
operations appearing on the right-hand-side are allowed and lead
to the desired
results. When in doubt, one can always use an explicit
evaluation in terms of
individual components and a do loop structure. Inequalities or,
in general,
comparisons between two arithmetic data can also be performed in
FORTRAN, but
in this case, they do not result in new values, but simply in the
combinatiovariables plus comparison being considered as a logical
variable
assigned to a .true. or .false. value, depending on the case.
These
expressions find use as arguments of control statements, as
explained below.
3. Control statements
These statements control the flow of the execution of the
programming
instructions within the code. By default, the instructions of a
FORTRAN code are
executed
sequentially, one at a time. However, flow control statements
like
DO WHILE (K .LT. 3)
GO TO 300
etc. can change that execution order. As these two examples
suggest, there are
two types control statements: Conditional, the execution of which
depends on
the satisfaction of a certain set of criteria represented by
various operations
on logical variables (such as those resulting from the comparison
of two
arithmetic variables), and unconditional. The latter tend to be
deemphasized in
modern FORTRAN applications as their frequent use can create
havoc during
debugging.
The two most popular FORTRAN structures for the control of the
flow
of the execution of FORTRAN statements are the DO LOOP and the
IF-THEN-ELSE
conditional statement. Their combination gives rise to the DO
WHILE structure
and when all these are used in a nested fashion, they can
generate a literally
unlimited number of control structures. As mentioned, although
explicit
unconditional redirection of the flow to explicitly labeled lines
in the
FORTRAN code is still supported in FORTRAN 90, it is strongly
discouraged. The
corresponding structures for the DO LOOP and the IF-THEN-ELSE
that avoid using
explicit line references have the form
DO I=I1,I2,I3
(various FORTRAN statements)
ENDDO
and
IF (LOGICAL) THEN
(one set of various FORTRAN statements)
ELSE
(second set of various FORTRAN statements)
ENDIF
In the DO LOOP structure, the various FORTRAN statements included between the DO and ENDDO lines are executed as many times as necessary while the index variable I (which, by the way is a local variable with a proper value assigned with certainty only within the DO LOOP) takes consecutively the values I1, I1+I3, I1+2*I3 etc till its value becomes larger (or smaller, respectively) than I2 depending on whether I3 is positive (or negative, respectively). Note that in case when I1 > I2 (respectively, I1 < I2) and I3 is positive (or negative, respectively), the DO LOOP is skipped without being executed even once. The possibility to construct such type of do loops, which allow for much more compact codes, since they can arise as part of a general algorithm for various values of the index parameters I1, I2 and I3 (being, in general, variables within the code), as well as the possibility of using real variables as indices, are new features of FORTRAN 90. Note that care needs to be exercised if I, I1, I2 and I3 are declared as reals and I1 + an integer multiple of I2 is equal to I3 since truncation error in the computer, depending on how these values are represented in binary, can make an exact equality to be an inequality of either sign with the consequences as to whether the last iteration in the DO LOOP is or is not executed. Finally, note that in case that I3 is not specified explicitly, the default value taken by the program is (consistent to earlier FORTRAN versions) 1.
4. I/O (Input- output) statements
These statements are used to
control the transfer of data information in and out of the code
during the
code's execution. A considerable amount of work was devoted in
old times by
the students trying to devise intricate FORMATING FORTRAN
statements to
provide for a sophisticated FORTRAN I/O interface, including
graphics!
Fortunately (since FORTRAN is the totally inappropriate language
for building
graphical interfaces), the availability of excellent graphics
software
nowadays makes such a task unnecessary. All that the user really
needs to
know is how to transfer data between the program and (preferably
ASCII) files.
For a full accuracy preservation and also for convenience, this
is simply
achieved most easily using unformatted read and write statements
of the form
READ (10) A,B
WRITE (20) A,B
which read and write respectively, all the data content of the
variables A, B
from/to files associated with units 10 and 20, respectively. The
association
of these units with actual files is done either automatically (in
which case
the corresponding filenames are specified by the system, usually
as fort10 and
fort20, respectively) or via special declaration statements of
the form
OPEN (10, file = "input.dat")
The only time when a formatted i/o may be necessary is when a
particular
graphics program needs it. That can be easily constructed based
on the
notational convention of the FORMAT statements, to which also
FORTRAN 90
remains compatible. See [4] for further details.
5. Subprogram statements
These statements precede specific structural
subprograms within a
FORTRAN program, namely FUNCTIONS and
SUBROUTINES. For
example, the statements
LOGICAL FUNCTION L(A1, A2)
SUBROUTINE SAMPLE (B1, B2)
declare the statements that follow them till the statement
RETURN
is reached, as independent sections of the program that are only
executed when
the corresponding names are invoked. This invocation for a
function simply takes
place by using its name as another system function; the
invocation of a
subroutine takes place through a CALL statement, i.e., in the
previous example
CALL SAMPLE(V1, G1)
Note that the arguments can have a different name but they have
to be of the same
number as the one declared in the SUBROUTINE subsection of the
code. Finally, we
should note here that when a variable is passed as an argument of
a subroutine in
a subroutine call, what happens at compilation time is that the
corresponding
variable is of the type declared in the subroutine section and it
is physically
allocated starting at the same space as the starting position of
the variable
appearing in the calling part of the code. In this way, the two
variables can be
declared to be of different types; the type declaration appearing
within the
subroutine being the one used in the arithmetic manipulations
within the
subroutine. The only requirement placed is in the space
allocated in the outer
program which should be sufficient for all the uses within the
subroutine so as
to avoid erroneous references outside the domain of allocation
with unpredictable
results. In this respect, FORTRAN is especially forgiving, but
also places a
significant responsibility to the end user. Many errors have
resulted from
unsufficient space allocation in the calling routine, and/or
inappropriate
communication of the data. This is a very important point
especially since it is
through calls to external functions and subroutines that the use
of third party
software is primarily taking place in today's programming. The
importance of
carefully taking into account the needs of the external
subroutines properly
cannot be underemphasized in FORTRAN programming!
The structure that therefore emerges for a typical programming
application is of
the following form
A. Declaration statements
B. Data Input and Initialization of intermediate variables and
parameters
C. Preprocessing operations
D. Main processing operations, involving calls to external
subroutines
E. Postprocessing operations
F. Data output
This is illustrated in Appendix 4.A in a simple program which solves parametrically a system of nonlinear equations using Newton's method. The example is taken from the IMSL manual. Notice that the bulk of the computations load is spent in the L-U decomposition of the Jacobian matrix which is carried out using a call to an appropriate IMSL subroutine. This is typical for the numerical applications discussed in these notes, thereby reducing significantly the programming workload while significantly increasing the computational efficiency. Some information regarding the fundamental concepts behind the most popular algorithmic approaches and about the existing software where these algorithms are implemented, for key numerical tasks of usefulness to a computer-aided stability analysis, is supplied in the remaining sections of this chapter.