Home Up Intro Contents Chapter 1 2 3 4 5 6 7 8 9 10 Design Assert Timing EBNF Report Pas Last Changed: July 12th, 1997
This is a conversion from Oberon text to HTML, and from German to English. The converter software is still under development, and some features or information may be missing in this converted version. HTML hypertext facilities are not yet active in this document. To exploit the interactive facilities, use Oberon System 3 and the source of this text, available for download using binary ftp as Oberon System 3 archive. The converter from German to English is still under development as well. A previous version is also available for Oberon V4. To access this and other additional material use ftp.
For the convenience of our students, most of this information and the related material is available in German as well.

Introduction to Oberon

The Oberon Programming Language

G. Sawitzki <gs@statlab.uni-heidelberg.de>



11: Program Development

With Oberon, the single program is replaced by a concert of components. The place of classical program design is taken by the choice and adapted evolution of components and by their effective combination. Extensibility and reusability are no coincidental side effects, but must be designed in. This requires suitable methods for program design and program development.

Methods of program design are at best aids that can support creative work; they cannot replace creativity. Nevertheless, some guiding principles should be kept in mind.

- Do not re-invent the wheel! The treasure chest of available modules, developments already made and usable implementations is enormous. In Oberon, the services of available modules can be accessed freely. Make use of this possibility.

- Plan for reusability! Reusability of modules requires flexible usefulness. This should be considered in the design of modules. A generally usable solution is to be preferred to a solution for just one special case. A special solution can still be meaningful, for instance for efficiency, but this needs to be justified for each individual case. A compromise strategy is to ask for suitable general abstractions and then to divide an implementation into a general, re-usable section and an extension for the special case.

- Keep different functionalities separate! Better implementations for individual functionalities can become available over time. The architecture should permit exchanging individual modules without changing the total system. Different functionalities should not be mixed: a module should have only one functionality. Here too, compromises are sometimes necessary. For example, for source management it can be helpful to bundle different functionalities even if they are separable by design and implementation.

- Plan for testability! Each module should have only well-defined "internal state" values. Inputs and outputs should be testable using explicitly defined interfaces.

Definition/relaxation strategy
We present here a special design strategy. This strategy operates iteratively and is related to the classical method of stepwise refinement. In each round we take a series of detail steps:
- problem definition.
  This is a more or less strict specification.
- relaxation.
  We replace the defined problem by a simpler related problem.
- deficit analysis.
  Determine the difference between the relaxed problem and
  the full problem containing it.
- implementation of the relaxed solution.
  This solution can be given ad hoc, or we iterate the strategy.
- deficit recovery.
  This is a new detail problem, where the solution can assume that
  the relaxed problem has been solved.
As a routine, as soon as a new data structure is introduced we examine whether reusability recommends a more general structure, or whether separation of modules by functionality suggests splitting modules.

To keep our modules re-usable, we examine with each step whether a generalization or an extension is meaningful.

By routine we do a clean-up:
- we check the program sources for outdated components, in particular for portions which cannot be reached in the program flow any more (dead code analysis).
- we check the program sources for repeated structures and for components of the same functionality which are implemented more than once.
We do this clean-up whenever a defined function has found a stable implementation.

Case Study: Report Generator
Example implementations for this case study have names beginning
ItO... (Introduction to Oberon).
As an example we take the task of writing a "report generator". The purpose is to give a remedy for a chronic problem: keeping programs and documentation consistent. If documentation and program are separate, then each program modification must be reflected in another place, in the documentation. To reduce the problem, we want to embed the documentation in the program. We use comments for this. A tool is wanted to help extract the documentation from the commented program source.

The tool should also be usable during program development. We must identify which questions are still open and where detail steps still have to be made. The documentation thus has different roles. We use marked comments to indicate these roles. For these marks, we reserve the first character of the comment.

Comments which begin with an asterisk are exported comments. These are to appear in the documentation.
  (** ..... *)  exported comment
To control the appearance, we use variants: repeated asterisks mean higher weight. The corresponding comments should be highlighted increasingly, for instance corresponding to "header levels" in HTML. To allow for some additional layout in the source format, we ignore trailing asterisk marks, i.e.
  (**** ..... *)
    exported comment, weight +3

could as well be marked
  (**** ..... ****).
Comments starting with a = are typically explanations. These are exported for the documentation, but not highlighted.
  (*= ..... *)  exported comment, explanatory.
In particular during program development we use comments for additional purposes.
  (*! ..... *)   "To do" comment.
    Marks a necessary extension or fix.
  (*? ..... *)   Open question or point for discussion.
  (*: ..... *)   Synchronisation mark for
    source code management
If the comment starts with a character different from those listed here, it is ignored for the purposes of the documentation.
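The classification described above can be sketched as a small function. This is a hypothetical Python illustration, not part of the ItO sources (which are Oberon modules); the role names are invented for this sketch.

```python
def classify_comment(text):
    """Return (role, weight) for the body of an Oberon comment,
    i.e. `text` is the content between '(*' and '*)'.

    Roles: 'exported', 'explanatory', 'todo', 'question', 'sync',
    or 'ignored' for any unmarked comment."""
    if text.startswith("*"):
        # Count leading asterisks; each one raises the weight.
        # (Trailing asterisks would be stripped when extracting the body.)
        weight = len(text) - len(text.lstrip("*"))
        return ("exported", weight)
    marks = {"=": "explanatory", "!": "todo", "?": "question", ":": "sync"}
    if text and text[0] in marks:
        return (marks[text[0]], 0)
    return ("ignored", 0)
```

For instance, the body of "(**** heading *)" arrives here as "*** heading" and is classified as an exported comment of weight 3.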

As command syntax we chose
  ItOReport.Do source name

We start with an extreme relaxation. In a first step, we just want to make sure that we can interpret the command input correctly. At this level, an obvious generalization is to report any directly accessible information about the file which is interpreted as source. We write this information to the Oberon log.

An implementation is in
  ItO/ItOReport.01.Mod
and calling
  ItOReport.Do ItO/ItOReport.01.Mod
is a first test case.

The obvious deficit is that this implementation supplies no information about the contents. In the second round we extract comments. In the relaxed solution, we report indiscriminately all comments. The deficit is that the different roles are not considered yet. This deficit is to be recovered in the next round.

For the second round we must identify comments. The formal definition is determined by the syntax of Oberon: comments begin with the character sequence (* and end with a closing *). In Oberon, comments can be nested, and only when the complete nesting is closed is the comment closed. The nesting level can be tracked using a counter; a counter reading of 0 means that we are not inside a comment.
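The counter rule just described can be sketched as follows. This is a Python illustration of the idea only, not the actual Oberon Scan module; it ignores strings, exactly like the relaxed solution in the text.

```python
def comment_spans(src):
    """Return (start, end) index pairs of complete, possibly nested
    Oberon comments in `src`; `end` is one past the closing '*)'."""
    spans = []
    depth = 0          # counter reading 0: not inside a comment
    start = 0
    i = 0
    while i < len(src) - 1:
        pair = src[i:i+2]
        if pair == "(*":
            if depth == 0:
                start = i
            depth += 1
            i += 2
        elif pair == "*)" and depth > 0:
            depth -= 1
            if depth == 0:
                spans.append((start, i + 2))
            i += 2
        else:
            i += 1
    return spans
```

A nested comment such as "(* x (* y *) z *)" is reported as a single span, since the counter only returns to 0 at the outermost closing delimiter.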

A relaxation of this problem is the search for character strings which are delimited by (* ... *). For this relaxed problem a new data type is meaningful: if we represent the entire program source with the pre-defined type text, then we now have text segments, characterized by starting position and length, with an additional attribute. This attribute can mark a segment as a comment, or as something else - not of interest for us and marked as "stuff".

Segmenting a program source is potentially of more general importance. Therefore we implement this structure in a separate module. Segmenting leads to a separation into tokens - for now, we need only the two token classes "comment" and "stuff/source text". Looking ahead, we select terms and definitions which correspond to the more general framework.
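The segment structure described here can be sketched as a small record type plus a routine that covers the whole source with alternating segments. Again a Python illustration under assumed names; the actual module works with Oberon records over the text type.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """A text segment: starting position, length, and a
    token-class attribute ('comment' or 'stuff')."""
    pos: int
    len: int
    kind: str

def segments_of(src, comment_spans):
    """Turn comment (start, end) spans into an alternating list of
    'stuff' and 'comment' segments covering the whole source."""
    result, cursor = [], 0
    for start, end in comment_spans:
        if start > cursor:
            result.append(Segment(cursor, start - cursor, "stuff"))
        result.append(Segment(start, end - start, "comment"))
        cursor = end
    if cursor < len(src):
        result.append(Segment(cursor, len(src) - cursor, "stuff"))
    return result
```

Keeping position and length (rather than copied text) means the segments stay valid references into the one source text, which matches the separation of the scanner from the reporting command.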

An implementation is in
  ItO/ItOScan.01.Mod
with an implementation of the corresponding command in
  ItO/ItOReport.02.Mod

The deficit of this implementation is: comment delimiters are effective only on the symbol level. Comment delimiters in strings must be ignored. This deficit is still to be recovered. It is an internal deficit of the Scan module. We can recover it there by introducing strings as a new token variant.
An implementation is in
ItO/ItOScan.02.Mod
Because this implementation changes only the Scan module internally, the commands in ItOReport.02 remain valid.
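The recovery step can be sketched by extending the counter scan so that string literals are skipped as a unit, making any delimiters inside them inert. A Python illustration of the idea only; the actual fix lives inside the Scan module as a new token variant.

```python
def comment_spans_v2(src):
    """Like the naive nesting scan, but comment delimiters inside
    Oberon string literals ("..." or '...') are ignored."""
    spans, depth, start, i = [], 0, 0, 0
    while i < len(src):
        ch = src[i]
        if depth == 0 and ch in "\"'":
            # Skip over a string literal as one unit; delimiters
            # inside it must not affect the comment counter.
            quote = ch
            i += 1
            while i < len(src) and src[i] != quote:
                i += 1
            i += 1
        elif src[i:i+2] == "(*":
            if depth == 0:
                start = i
            depth += 1
            i += 2
        elif src[i:i+2] == "*)" and depth > 0:
            depth -= 1
            if depth == 0:
                spans.append((start, i + 2))
            i += 2
        else:
            i += 1
    return spans
```

On input such as s := "(*"; (* real *) only the genuine comment is reported; the delimiter inside the string no longer opens a comment.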

With this round we have a "report generator" which extracts the comments reliably. The open deficit is: the different roles of the comments must still be analysed. Before we recover this deficit, it is time for a clean-up step. We examine the source code, remove unreachable/unused components, and standardize repeated implementations of equivalent functionality.
Cleaned up versions are in
ItO/ItOScan.03.Mod
and
ItO/ItOReport.03.Mod

We proceed starting with these cleaned-up versions. We have a reliable identification of comments. Now we want to reflect their different roles. To do this, we must identify the different comment types and then represent them in an appropriate way. First we extend the data type tCommentToken to allow for the additional information. Again this is an internal modification of the Scan module and does not influence any calling module.
The extended Scan module is in
  ItO/ItOScan.04.Mod
and the commands are still in
  ItO/ItOReport.03.Mod
After this round we can check whether we can identify the comment types correctly.

In the next step we handle the layout. Comments with different roles need a distinct appearance. We must consider different aspects: for many users, differences in color are clear signals, but color blindness is far too common. Differences in color should be complemented by another stylistic feature. For printing, color print is not yet generally available, so we must support a different marking in any case. In practice, these aspects require a complex handling of style attributes.

For our purposes we restrict ourselves to a simplified solution and use only color identification. This is implemented in
  ItO/ItOScan.05.Mod
with the appropriate command calls supported in
  ItO/ItOReport.04.Mod
Example:
  ItOReport.Do ItO/ItOReport.04.Mod ~
  ItOReport.Do ItO/ItOScan.05.Mod ~


Project Exercise:
Check and correct the report program. It should fulfill the named specifications reliably. Set up a list of the assumptions about the source structure and the comment types. Check the program for these cases:
a) the assumptions are fulfilled
b) one of the assumptions is not fulfilled, but the others apply
c) with sample programs (should operate correctly)
d) with arbitrary example texts
(should at least not lead to system failure)
Correct the program.


We continue to work with the model implementations in ItO/ItOScan.05.Mod and ItO/ItOReport.04.Mod and try to generalize the solution: we want to analyze the source text itself. This is not part of the specifications, but an obvious extension of general importance. The context of the comments can help us give more meaningful reports than relying only on the comments.

To solve this partial problem, we must analyze the source code itself. We have assigned this task to the scan module. But for now, the scan module can only give us a very rough segmentation.

The functionality we need here is already present in various modules, for example the compiler, Watson and related tools and utilities. Unfortunately, the compiler and related modules are not yet sufficiently designed for re-usability. They do contain the functionality we need, but constants and other defining components are not exported. We have to take a decision at this point. We can restrict ourselves to the implementation we have at hand, and use the compiler components accessible to us as they are (using all undocumented features we can). Or we want an implementation-independent version; unfortunately, in this case we must implement sections of the compiler again.

We decide on an implementation-independent version. We extend the data type tToken of our scanner by a class tSymbolToken. This corresponds roughly to the tokens used by the compiler, but for our purposes we need to detect only some constituents, such as procedure declarations. As far as we can foresee, we do not need to analyse e.g. arithmetic expressions. In particular we (still) can avoid all difficulties of the recognition of real numbers. Our type tSymbolToken may also represent "raw tokens", which would need a further processing step to extract compiler tokens.

For us these tokens are just text segments, which we use as input unit for further steps. For sources of Oberon the start of these segments can be detected by their (left) context and their first character. Outside of strings and comments the important start signals are:
  A-Z, a-z  starting name or reserved word
  0-9  starting number
  <,=,>,&,...  starting symbol
We adapt our symbol codes to the current version (S3 R2.2) of Oberon. We use a table to detect the token start. A token decoding is implemented in
  ItO/ItOScan.06.Mod
In principle this is equivalent to our previous scanner - with the difference that this scanner can detect the additional token classes sLParen, sIdent, sNumber. Since we do not evaluate these token classes yet, we receive no additional information.
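The start-signal table above amounts to a character classification. The following Python sketch illustrates the idea; the symbol set is an assumption for illustration, not the actual S3 R2.2 symbol codes.

```python
def token_start_class(ch):
    """Classify a character as a token start signal, following the
    table in the text: 'ident', 'number', 'symbol', or None."""
    if ch.isalpha():                  # A-Z, a-z: name or reserved word
        return "ident"
    if ch.isdigit():                  # 0-9: number
        return "number"
    if ch in "<=>&#~+-*/(){}[];:,.^|":  # assumed symbol characters
        return "symbol"
    return None                       # whitespace etc.: no token start
```

Outside of strings and comments, the scanner only has to look at one character (plus, for some symbols, its left context) to decide which token class begins.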

The problem at hand is to include information from procedure declarations in our report. We relax the problem to the simpler problem of reporting appearances of the keyword PROCEDURE followed by a name, and of integrating this with our report. It is now a routine decision to fix where this functionality should be implemented. Identification of tokens is a functionality which can be implemented on the scanner level, and a simplified version is already available in ItO/ItOScan.06.Mod as procedure IsTokenText. ItO/ItOScan.07.Mod contains a refinement which is suitable, in principle, for detecting reserved words and symbols. We use these in ItO/ItOReport.05.Mod to obtain a testable (and usable) version of our report generator. This version does not detect procedure declarations correctly: it looks only for the pattern "PROCEDURE <name>".
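The relaxed pattern match can be sketched over a token stream. A Python illustration under assumed names; `tokens` stands in for the scanner's symbol tokens as simple (kind, text) pairs.

```python
def procedure_names(tokens):
    """Scan a token list for the pattern 'PROCEDURE <name>' and
    return the names found. Like the relaxed version in the text,
    this is only a pattern match, not a recognition of declarations:
    e.g. PROCEDURE occurrences in other contexts also match."""
    names = []
    for (k1, t1), (k2, t2) in zip(tokens, tokens[1:]):
        if k1 == "ident" and t1 == "PROCEDURE" and k2 == "ident":
            names.append(t2)
    return names
```

The deficit noted in the text is visible here: distinguishing real procedure declarations (and e.g. procedure types or forward declarations) would require parsing context, not just adjacent tokens.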


Project Exercise:
Write a "Report" program which extracts the procedure declarations from a program source and embeds them in the report.
Time: about one week.


This chapter should illustrate a design strategy which is related to "stepwise refinement", as used with earlier languages. For an extensible system such as Oberon this strategy must be modified to a strategy of gradual prototypes, which we have presented here by an example. It is now left as an exercise to develop a working report generator - an idle exercise, because if necessary there are enough good report generators ready for use. The strategy was the important issue here, and we leave the exercise section with a semi-finished result.

In closing, we want to add one detail. The handling of formatting is still rudimentary, and it asks for a new data type which represents an abstraction of the output style. This is again of more general importance and should be implemented in a separate module.

A point of detail may still be discussed, because we meet a typical problem here. Output in Oberon has an iconographical model: a character set is a library of characters. However, we need characters with attributes (emphasized, critical, in processing). We meet the task of fusing two different models, which is not trivial. In ItO/ItOStyles.Mod an abstraction is implemented which translates between an attribute system (as required) and a library system (as available). ItO/ItOReport.06.Mod contains a slight modification which uses this attribute system at least for the heading.
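The translation between the two models can be sketched as a lookup from attribute sets to concrete character libraries. This Python sketch only illustrates the shape of the abstraction; the font names and attribute names are assumptions, not the actual contents of ItOStyles.

```python
# Attribute model: a set of style attributes per text run.
# Library model: one concrete font (character library) per run.
# The table and the fallback together form the translation.
FONT_TABLE = {
    frozenset():                "Syntax10.Scn.Fnt",   # plain text
    frozenset({"emphasized"}):  "Syntax10i.Scn.Fnt",  # assumed italic font
    frozenset({"heading"}):     "Syntax14b.Scn.Fnt",  # assumed bold font
}

def font_for(attributes):
    """Map an attribute set to an available font, falling back to
    the plain font when no exact match exists in the library."""
    return FONT_TABLE.get(frozenset(attributes), FONT_TABLE[frozenset()])
```

The fallback is the essential design point: the attribute model can express combinations the library does not provide, so the translation must always produce some usable font.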

To install this closing version, use
  System.Free ItOReport ItOScan ItOStyles~
  Compiler.Compile ItO/ItOStyles.Mod \s ~
  Compiler.Compile ItO/ItOScan.07.Mod \s ~
  Compiler.Compile ItO/ItOReport.06.Mod \s ~
and for a test, use
  ItOReport.Do ItO/ItOScan.07.Mod ~
  ItOReport.Do ItO/ItOReport.06.Mod ~
  ItOReport.Do ItO/ItOStyles.Mod ~


Project Exercise:
Write a "Report" program which supports marking with color, font and style. An abstract data type for "Styles" should be introduced. Consult ItO/ItOStyles.Mod as a model. In particular, guarantee that the output style is preserved independently of the input style.
Time: about one month.



More literature: Reiser&Wirth, Ch 10
More exercises: Reiser&Wirth 10.2, 10.3, 10.4


Introduction to the Oberon programming language. ItO/Ch11.Text
gs (c) G. Sawitzki, StatLab Heidelberg
<http://statlab.uni-heidelberg.de/projects/oberon/intro/>
