Home Up Intro Contents Chapter 1 2 3 4 5 6 7 8 9 10 Design Assert Timing EBNF Report Pas Last Changed: July 12th, 1997
This is a conversion from Oberon text to HTML, and from German to English. The converter software is still under development, and some features or information may be missing in this converted version. HTML hypertext facilities are not yet active in this document. To exploit the interactive facilities, use Oberon System 3 and the source of this text, available for download using binary ftp as Oberon System 3 archive. The converter from German to English is still under development as well. A previous version is also available for Oberon V4. To access this and other additional material use ftp.
For the convenience of our students, most of this information and the related material is available in German as well.

Introduction to Oberon

The Oberon Programming Language

G. Sawitzki <gs@statlab.uni-heidelberg.de>



12 Assertions, Tests, Invariants
-- This chapter should be accompanied by a proper case study.
A recommended case study is Mess.Mod,
accessible at statlab.uni-heidelberg.de --

This chapter deals with one special language construct, assertions, and a very special interpretation of this construct.
Formally, assertions are supported by a predeclared procedure which may come in one of two forms
  ASSERT(x, n) x: Boolean expression; n: integer constant
    terminate program execution if not x
  ASSERT(x)  x: Boolean expression
    terminate program execution if not x
The first form is roughly equivalent to the statement
  IF ~x THEN HALT(n) END
The second form uses a system defined termination code, instead of the user defined value n. The difference between HALT and ASSERT is that ASSERT may be handled in two different ways by compilers or systems: In most systems, a loader or compiler option is provided to enable/disable code generation for ASSERT procedures. If ASSERT is switched off, ASSERT calls will be ignored and no code is generated for them.

The special interpretation we give to this construct is to view ASSERT as a means to specify necessary conditions for the correctness of a program. With this interpretation, ASSERT is a means to integrate aspects of the program specification in program code, allowing at least partial verification.

With this interpretation in mind, compiler options to switch ASSERTs on/off should be handled with care. If it can be guaranteed that the boolean control condition is always true, a compiler is free and welcome to suppress code generation for an ASSERT statement, as it is for the comparable HALT statement if in a "dead" branch. (Dead code removal, i.e. suppression of code which cannot be reached, is a common compiler technique, and no special construct is needed for this.) If, on the other hand, the ASSERT condition cannot be guaranteed, the alternative cannot be excluded and will be encountered if a necessary condition for successful program execution is not satisfied. An exception should be taken, and if it is not provided by reasonable program code, a HALT is a proper bail out. So the recommendation is: do not use ASSERT as "debugging statement" which is switched off in production code:
  Never disable ASSERT.

On the other hand, an ASSERT call is an invitation to the compiler for closer inspection whether the ASSERT code can be omitted. That is the compiler is challenged to prove that the ASSERT condition is always guaranteed. Unfortunately, the evaluation of the boolean expression x may have side effects. For example, the evaluation may imply the evaluation and possible change of some state variable, which may affect other parts of the program execution. This is allowed, but makes program verification and automatic dead code removal difficult. So the second recommendation is: Don't allow side effects.
  No side effects in ASSERT conditions.

Both rules are recommendations, no obligations. They reflect our special interpretation of ASSERT.

ASSERT may be used in several typical roles. We go through the most typical ones.

Environmental conditions
Environmental conditions specify requirements on the computing environment. For example, to use LONGINT based integer numbers, you may want to check that you have at least 32-bit LONGINTs. Using
  ASSERT(SIZE(LONGINT)>=4,100)
in the module initialization part, where "100" is an ad hoc error code, may be a convenient check. Most compilers will handle this ASSERT already during compilation. Either the constant value of SIZE(LONGINT) is guaranteed to exceed 4, and no code for the ASSERT will be generated, or SIZE(LONGINT)<4, and the compiler will issue an error message and possibly abort compilation. This however may affect the portability of a program. Asserts checked at compilation time reflect the compilation environment. If the run time environment does not coincide with the compilation environment, problems may occur.

Preconditions
Preconditions test for the legal input to procedures. Usual application preconditions require parameters to be in certain ranges. For example, to time a procedure, you may use a timing routine
  PROCEDURE TimeIt(proctotime:PROCEDURE; runs:INTEGER)
with a precondition expressed as
  ASSERT(runs>0,100)
where "100" is again an ad hoc error code. While in this example it might be feasible to write the code for TimeIt to handle unexpected input gracefully, e.g. by reporting an appropriate message, in other cases a check on preconditions may be critical. For example, COPYI is a piece of code taken from vyBase, the "portability glue" part of Voyager. The intention of the code is to provide a very quick copy routine for non-overlapping parts of integer vectors. A safe alternative would be to copy by element.

  PROCEDURE COPYISAFE*(VAR src:ARRAY OF INTEGER;
    srcoffs,len:LONGINT;
    VAR dest:ARRAY OF INTEGER;destoffs:LONGINT);
  VAR from:LONGINT;
  BEGIN  FOR from:=srcoffs TO srcoffs+len DO
    dest[destoffs]:=src[srcoffs];
    INC(destoffs)
    END
  END COPYISAFE;

COPYISAFE would ignore negative length, and run into an index error if src or dest is too short. To gain speed, SYSTEM.MOVE can be used. But SYSTEM.MOVE is inherently unsafe. So it must be guarded by adequate checks.

  PROCEDURE COPYI*(VAR src:ARRAY OF INTEGER;
    srcoffs,len:LONGINT;
    VAR dest:ARRAY OF INTEGER;
    destoffs:LONGINT);
  VAR from:LONGINT;
  CONST sz=SIZE(INTEGER);
  BEGIN  ASSERT(LEN(src)>=srcoffs+len-1,100);
    ASSERT(LEN(dest)>=destoffs+len-1,101);
    ASSERT((len>=0)&(len<(MAX(LONGINT) DIV sz)),102);
    from:=SYSTEM.ADR(src[srcoffs]);
    SYSTEM.MOVE(from,SYSTEM.ADR(dest[destoffs]),len*sz)
  END COPYI;

In this example, it is almost compulsory to use ASSERT. It would be hazardous to use SYSTEM.MOVE without testing for a legal input state.

Postconditions
Postconditions check the outcome at the end of a procedure. This may be a strict check, or it may be a reduced condition. For example, a procedure calculating the norm of a vector may end by checking the postcondition
  ASSERT(Norm>=0,120);
  RETURN Norm

just before exit. In many cases, reduced checks like this one may be sufficient. But in critical cases, the postcondition has to specify a strict condition for successful termination of an algorithm.

Strict postconditions tend to be rather complex and may require state information about the input or specific check routines. As an example, consider a sort algorithm operating on a chain of items
  ItemPtr=POINTER TO Item
  Item=RECORD
    ...
    next: ItemPtr;
  END;
using an interface
  PROCEDURE sort(VAR first:ItemPtr);
A postcondition should at least verify that the result is actually sorted, and guard against the most common mistakes in sorting algorithms. For example, it should guard against dropping elements, or at least check that the number of elements is preserved. The latter may overlook duplication of elements at the cost of loss of others, so it is no strict check. But it is a necessary condition to be satisfied upon completion. To implement this check, a memory of the initial count and a check procedure for monotony is needed.
An implementation may have the skeleton
  PROCEDURE sort(VAR first:ItemPtr);
  VAR initialcount:LONGINT;
  PROCEDURE GetCount(from:ItemPtr):LONGINT;
  BEGIN
    ... (* walk the chain and count *)
  END GetCount;
  PROCEDURE Sorted(from:ItemPtr):BOOLEAN;
  BEGIN
    ... (* walk the chain and check order *)
  END Sorted;
  BEGIN
    ASSERT(first#NIL,100);
    initialcount:=GetCount(first);
    ... (* do the sort *)
    ASSERT(initialcount=GetCount(first),120);
    ASSERT(Sorted(first),121);
  END sort;
At this point, we have to relax our recommendation:
  Never disable ASSERT
  unless you know why.

Even if the algorithm were correct and the ASSERT conditions can be proven to be always satisfied, it is very difficult for a compiler to identify that the variable initialcount and the procedures GetCount and Sorted are not needed. Given present compiler technology, it is a friendly gesture to the compiler to disable ASSERT and allow removal of dead code once the ASSERT can be considered proven.

For practical purposes, it may be easier to have test variables and test procedures in a "scaffold", and keep checks of the postconditions separate from the target code. For example, you can have a procedure sorttest, possibly in a separate module, with scaffold
  PROCEDURE sorttest(VAR first:ItemPtr);
  VAR initialcount:LONGINT;
  PROCEDURE GetCount(from:ItemPtr):LONGINT;
  BEGIN
    ... (* walk the chain and count *)
  END GetCount;
  PROCEDURE Sorted(from:ItemPtr):BOOLEAN;
  BEGIN
    ... (* walk the chain and check order *)
  END Sorted;
  BEGIN
    ASSERT(first#NIL,100);
    initialcount:=GetCount(first);
    sort(first);
    ASSERT(initialcount=GetCount(first),120);
    ASSERT(Sorted(first),121);
  END sorttest;
where "sort" is reduced to the proper sorting code. But this is just a pragmatic separation.

Loop invariants and loop variants
The postconditions in the example above have two components: a check on the count of the chain, to check consistency, and a check on the sorting order, to check termination. Neither check has been strict, but it is sufficient for this discussion. We will stay with this informal level.
As consistency and termination conditions are necessary at the end of a procedure, their counterparts are necessary conditions during execution. However, they take different roles.
We return to the sorting example. The consistency condition to keep the number of elements is necessary at all steps of the sorting process. Typically, sorting is an iteration through the chain to be sorted, possibly implemented in recursion. At each step, the condition
  initialcount=GetCount(first)
needs to be satisfied. If we want to safeguard against loss of information, an appropriate ASSERT
  ASSERT(initialcount=GetCount(first),110)
has to be added in the iteration loop. Conditions which have to be satisfied in each "iteration", or rather corresponding check variables, are usually called loop invariants.

On the other side, the termination condition
  Sorted(first)
need only be satisfied at the end of the sorting process. During the sorting process, not all elements are sorted, i.e. there is a count "nrswaps" of unordered pairs in sequence which is zero if the queue is sorted, and is initialcount-1 at worst.

Sorting usually is implemented as a number of swaps of pairs or segments, and by looking at the algorithm it is not always clear that the algorithm will ever stop, let alone stop successfully. If termination is to be checked, we need a criterion to guarantee eventual termination. We can borrow a concept from other areas of mathematics. The general idea is that of a fixed point theorem. We need only the poor man's version: if we have a finite state space, and a transformation operating on this space, leaving it invariant, and an index which is bounded from below and reduced by applying a transformation, then an iterated transformation eventually stops at a fixed state. In short: if you walk down a finite ladder, you eventually hit the ground.
The general recipe for verifying that a process terminates is: establish that there is an invariant state space; that is, if you have to add subsidiary information, it does not grow beyond all bounds. Find an index which is never increasing, and guaranteed to be eventually decreasing. Under those conditions, the algorithm is guaranteed to end. The index used here is called a loop variant.

Ignoring the problem that an algorithm might introduce "fake elements", for our sorting problem the loop invariant
    initialcount=GetCount(first)
guarantees that our state space is finite: it corresponds to at most initialcount! permutations of our initial data set. If we have an algorithm which reduces the number of swaps, we have
    nrswaps<nrswapsold
for some ancillary variable nrswapsold, and if 0<= nrswaps we know that we can guarantee to stop eventually. Sometimes nrswaps may be not the best index. We may take into account of intermediate decrease of sorting order for some efficiency trade-off. But to get the general picture, this example may suffice.



Exercise:
Specify a condition which checks for stray duplication in a sorting process. If you cannot specify a strict condition, give a relaxed version.



Type invariants
Using object oriented programming, control flow may be more intricate than in classical procedural programming. While loop invariants/variants catch conditions in a step by step scenario, a corresponding strategy is needed for type specific invariants. A recommended way is to declare type specific procedures which check a type's invariant conditions, and use this in an ASSERT at the end of every procedure which must assert this invariant.

An example using ordinary procedures is to define
  PROCEDURE Invariant*(t:Type):BOOLEAN;
  BEGIN
    ...
    RETURN condition
  END Invariant;
and use it as e.g.
  PROCEDURE Recalc*(obj: MyObject);
  BEGIN
    ...
    ASSERT(Invariant(obj),130)
  END Recalc;
Using Oberon-2 type-bound procedures, the model is
  PROCEDURE (t:Type) Invariant()*:BOOLEAN;
  BEGIN
    ...
    RETURN condition
  END Invariant;
and uses it as e.g.
  PROCEDURE (obj: MyObject) Recalc*;
  BEGIN
    ...
    ASSERT(obj.Invariant(),130)
  END Recalc;


Variable modes
A related problem should be mentioned here, although no general implementation model can be given. A common pre/postcondition of procedures refers to the role of variables. Oberon supports passing parameters by value and or by reference. Usually, the intended semantics is more specific.
  - parameters passed by value may be intended as input-only
    parameter, although Oberon would allow to use
    parameters passed by value as local (temporary)
    variables. For input-only parameters, the value
    may not be changed during procedure execution.
  - parameters passed by reference may be intended as
    output-only parameter. For output-only
    parameters, the execution of the procedure
    should not depend on the parameter values upon
    procedure entry; no initialization should be
    assumed.
(For efficiency reason, it is common to pass more complex structures by reference, where actually passing by value is intended. So the remark about input-only parameters applies to parameters passed by reference as well.)

There is no easy model to check this sort of conditions so far. At present, a recommendation is to add comments expressing the intended role, as in
  PROCEDURE abc ( (*in*) xyz: TypeIn);
  ...
  PROCEDURE cde (VAR (*out*) zyx: TypeOut);

Assertions: the big picture
On a low level, assertions can be considered a special debugging feature. With our interpretation, assertions aim at a more demanding target: to provide information to judge the correctness of a code. As such, they represent parts of the specification of a program. This representation is on code level and allows verification of these parts.
Ideally, a program can be seen as a series of transformations, transforming an input state satisfying certain conditions to a well-defined output state. If the postconditions define the output state, i.e. if we have strict postconditions, and if the postconditions can be derived formally for any input satisfying the preconditions, then we can prove that the program is correct. Loop invariants are landmarks to assert that we have not left the path, and loop variants provide milestones to see if we make any progress (or run in circles, or worse).
At present, we are far from reaching this goal of having proven correct programs. Nevertheless, this is a big picture to keep in mind as a goal.

Documentation
In a program, we see both an intent and its implementation. The implementation is restricted by the formal rules of the programming language and the skill of the programmer. A traditional sideway is to add the intent as comments, escaping to a meta level. As the saying goes, comments are either wrong, or outdated, or both. Assertions offer a different possibly to express the programming intent in a way that guarantees consistency.
Considered this way, assertions are a possibility to imbed information about programming intent in the code in a way that can be recovered for documentation. Recovering the role from the code is too complicated for now. Instead, we have to rely on conventions. Unfortunately, these conventions are not universal. For example, Oberon System 3 and Oberon/F (aka Black Box) use conventions bases on the termination code:

  Oberon S3          Oberon/F
      free        0..19
  100..109  preconditions      20..59
  110..119  invariants      100..120
  120..129  postconditions      60..99
      reserved      121..125
      not yet implemented    126
      interface procedure called  127





Project exercise:
Write a "pretty printer" which is fail-safe. The pretty printer should work correctly on a correct Oberon source input, reasonable for "cut&paste" mess as an input, and allow extensions to cope with input from other languages.

You can take Mess.Mod from statlab.uni-heidelberg.de as a starting point. The general process is to parse the input text into low-level tokens, combine these to higher level tokens, and to rewrite them.
The strategy for fail-safe rewriting is to allow scratch parts. Any token may have a pre- or post-part which may be tokens, comments, or scratch.
The first role of assertions is to guarantee that no material is lost. So the combination part needs to be guarded by assertions.
The second role of assertions is to stabilize extensions. Mess.Mod allows the addition of "plug-ins". These plug-ins may install call-back functions.
As a master module, Mess has to safeguard against protocol violations of plug-ins which must work with source code other than Oberon source. So far, plug-ins are available for C and Pascal.

Allow one month for each of these tasks.
Allow half a year if working from scratch.





Part of this chapter has been adapted from S. H.-M. Ludwig's "Conventions for Programming". Thanks to G. Valiente and A. Fischer for hints and comments.


Introduction to the Oberon language. ItO/Ch12.Text
gs (c) G. Sawitzki, StatLab Heidelberg
<http://statlab.uni-heidelberg.de/projects/oberon/intro/>


Einführung in die Programmiersprache Oberon. Kurs/Kap12.Text
gs (c) G. Sawitzki, StatLab Heidelberg
<http://statlab.uni-heidelberg.de/projects/oberon/kurs/>

Home Up Intro Contents Chapter 1 2 3 4 5 6 7 8 9 10 Design Assert Timing EBNF Report Pas