Semantic Inspection of Software Artifacts From Theory to Practice


Linköping Studies in Science and Technology, Dissertation No. 725

by

Tim Heyer

Department of Computer and Information Science, Linköpings universitet

SE–581 83 Linköping, Sweden

Linköping 2001


Abstract

Providing means for the development of correct software remains a central challenge of computer science. In this thesis we present a novel approach to tool-based inspection focusing on the functional correctness of software artifacts. The approach is based on conventional inspection in the style of Fagan, but extended with elements of formal verification in the style of Hoare. In Hoare’s approach a program is annotated with assertions. Assertions express conditions on program variables and are used to specify the intended behavior of the program. Hoare introduced a logic for formally proving the correctness of a program with respect to the assertions.

Our main contribution concerns the predicates used to express assertions. In contrast to Hoare, we allow an incomplete axiomatization of those predicates, even beyond the point where a formal proof of the correctness of the program may no longer be possible. In our approach predicates may be defined in a completely informal manner (e.g. using natural language). Our hypothesis is that relaxing the requirements on formal rigor makes it easier for the average developer to express and reason about software artifacts while still allowing the automatic generation of relevant, focused questions that help in finding defects. The questions are addressed in the inspection, thus filling the somewhat loosely defined steps of conventional inspection with very concrete content. As a side effect, our approach facilitates a novel systematic, asynchronous inspection process based on collecting and assessing the answers to the questions.

We have adapted the method to the inspection of code as well as the inspection of early designs. More precisely, we developed prototype tools for the inspection of programs written in a subset of Java and early designs expressed in a subset of UML. We claim that the method can be adapted to other notations and (intermediate) steps of the software process. Technically, our approach works and has been applied successfully to small but non-trivial code (up to 1000 lines) and designs (up to five objects and ten messages). An in-depth industrial evaluation requires an investment of substantial resources over many years and has not been conducted. Despite this lack of extensive assessment, our experience shows that our approach indeed makes it easier to express and reason about assertions at a high level of abstraction.


Acknowledgements

First of all, I would like to thank my supervisors Ulf Nilsson, Anders Törne, and Staffan Bonnier, for their inspiring ideas and many fruitful discussions. Furthermore, I thank the rest of the members of the Theoretical Computer Science Laboratory (TCSLAB) and the Real-Time Systems Laboratory (RTSLAB) for the good and stimulating working environment.

I am grateful to Ulf Hammar and Stefan Frennemo at ABB Industrial Systems AB for remarks that have led to improvements of the semantic code inspection approach. Many thanks also to Pär Emanuelson, Tony Olsson, and Jan Lindgren at Ericsson SoftLab AB. Their experience in the development of large-scale, industrial software helped to improve the semantic design inspection approach. Furthermore, I would like to express my gratitude to NUTEK and VINNOVA for their financial support.

Last, but certainly not least, I thank my wife Katrin Wand and my family for their patience and for enduring our long separation.


1. Introduction

Providing means for the development of high-quality software artifacts (e.g. a design or an implementation) remains a central challenge of computer science. The overall aim of the work described in this thesis is to develop a new method for increasing the confidence in the correctness of software. Our focus is on functional properties (and not on real-time properties, performance, reliability, etc.). To achieve our goal we integrate formal and informal verification methods. The result is a novel approach to the inspection of software artifacts.

Today’s most important method to ensure the correct behavior of software is testing (see Sect. 2.1). That is, the software is executed with the intention of detecting defects. However, executable artifacts become available rather late in software development. Therefore, testing (and especially removing the defects discovered) is usually considered an expensive and time-consuming activity. Other approaches to increasing the confidence in the correctness of the software rely on artifacts available earlier than executable artifacts. These approaches are:

Formal verification. Formal verification (see Sect. 2.3) is based on mathematical principles to demonstrate the correctness of a formal artifact with respect to a formal specification. The required knowledge of the formal notation and its proof system is often experienced as a significant barrier against its industrial use. Hence formal verification is typically used only for especially critical components.

Inspection. Inspection (see Sect. 2.2) is the process of finding defects in the artifact by human examination. Several techniques for the inspection phase are in use. Inspection is reported to be very effective both with respect to defect detection and cost (see Sect. 7.2). However, generally the focus is on style rather than on functionality. Since functionality changes from artifact to artifact it is difficult to give general guidelines for how to systematically check an artifact for defects with respect to its intended functionality. Thus the effectiveness of inspection varies very much with the experience and discipline of the personnel involved.

In this thesis we present novel principles for tool-supported inspections. The inspections focus on the functional correctness of software artifacts. The approach combines today’s inspection with elements of formal verification to yield a practical and systematic inspection process. We believe that there are several advantages of integrating formal and informal approaches (see e.g. [17]):

• Formal and informal approaches complement each other. Formal approaches contribute with their precise semantic basis (enabling advanced tool support) whereas informal approaches provide a simpler and more intuitive view.

• Integrated approaches are easier to introduce in an industrial organization because changes to the development process in place are smaller.

• Specification and verification are possible at varying levels of formality depending on the desired confidence level.

Our idea is to annotate software artifacts with assertions describing presumed and desired behavior of the software. These assertions may partly be expressed in an informal manner, e.g. in natural language. Nevertheless, if the semantics of the notation (apart from the assertions) used to describe each artifact is sufficiently formal then it is possible to automatically generate the questions that need to be addressed to ensure the (functional) correctness of the artifact. These questions form the basis for our novel type of inspection. We fill the somewhat loosely defined steps of the conventional inspection with very concrete content, specifying a highly structured and tool-supported protocol for inspecting software artifacts. Our approach was developed over the last six years in the scope of two projects:

Verification Automation in Software Development. The project started in November 1995 and ended in November 1997. It was one of several projects within the competence center for Information Systems for Industrial Control and Supervision (ISIS), and was carried out in cooperation between ABB Industrial Systems and the Real-Time Systems Laboratory (RTSLAB) at the Department of Computer and Information Science (IDA), Linköping University. The activities in ISIS are funded by the University, industry, and the Swedish National Board for Industrial and Technical Development (NUTEK).

The general goal of the project was to develop practical means for the early verification of code. Our strategy was to provide semi-formal support both for the development of code and for its inspection. To achieve our goal, we suggested Compass, a comprehensible assertion method (see [13]). The method supports the automatic generation of those questions which are relevant for the correctness of the code, and whose answers hence provide a systematic explanation of why the code works as intended. Such an explanation helps either in pinpointing errors, or in convincing the inspection team of the correctness of the code.


programming. An assertion specifies a condition that program variables must satisfy each time a certain point in the program execution is reached. Thus, assertions may be used to specify the intended result of executing a piece of code. In 1969 Hoare (see [53]) introduced a logic for reasoning with assertions. His theory provides means for formally proving the correctness of a program with respect to given assertions. Dijkstra (see [29]) noted that verifying code after it has been developed is not entirely realistic. He proposed that code should be developed along with the arguments for its correctness and suggested a discipline of programming based on Hoare Logic.

Both Hoare’s method and Dijkstra’s program development discipline have long been well established within academia. However, the methods are hardly known in industry and even less used. We believe this lack of understanding of Hoare’s and Dijkstra’s ideas is due to the fact that most expositions approach the subject from a quite formalistic point of view.

In Compass we relax the requirements on formal rigor in a controlled manner to achieve a method which is more easily used in practice but which still remains partially mechanizable. The key idea in Compass concerns the predicates which are used for expressing assertions: in Compass we allow such predicates to be used without an associated formal definition, even when this rules out a formal proof. Instead we expect that the predicates are defined in natural language. We developed a prototype tool for the inspection of programs written in a subset of Java and successfully applied it to small but non-trivial programs. Even though we have not conducted an industrial field study, our experience indicates that our approach enables expressing and reasoning about assertions at a high level which makes the algorithmic content of a program explicit.

Tool Support for Design Inspection. The project started in April 1999 and ended in August 2001. It was funded by the Swedish Agency for Innovation Systems (VINNOVA) and was carried out in cooperation between Ericsson SoftLab AB and the Theoretical Computer Science Laboratory (TCSLAB) at IDA, Linköping University.

The goal of the project was to develop methods that facilitate tool support for the inspection of early designs expressed in a subset of UML. To achieve our goal we applied the same principles used for the inspection of Java programs to the inspection of early UML designs. However, since early UML designs are typically incomplete, scenario-based, graphical, and on a high level of abstraction we face new challenges. Nevertheless, the basic idea is to automatically generate those questions that need to be addressed during the inspection from a specification and a design. Both the specification and the design are expressed in a subset of UML extended with assertions.

To evaluate our approach we implemented a prototype and applied it to small but non-trivial designs. Thus, we demonstrate that it is possible to automatically generate those questions that have to be addressed in the inspection to check early, incomplete designs for functional correctness. We have not conducted an in-depth industrial evaluation but our experience shows that our approach makes it easier to express and reason about assertions at a high level of abstraction which emphasizes the algorithmic content of the design.

The remainder of the thesis is organized as follows: in Chap. 2 we describe existing approaches to verify software artifacts (namely testing, conventional inspection, and formal verification). The strong and weak points of the different approaches are discussed. How our approach combines conventional inspection with elements of formal verification to enable the automatic generation of questions to be asked during the inspection is presented in Chap. 3. The application of the general principles to code artifacts is demonstrated in Chap. 4 and the application to design artifacts in Chap. 5. Both chapters include detailed descriptions of the involved notations, methods, prototype tool implementations, and examples. In the following Chap. 6 we briefly introduce an inspection process that exploits our automatic generation of questions. Chapter 7 presents related work. Finally, Chap. 8 contains conclusions and opportunities for future work.


2. Background

As mentioned, the general objective of this work was to suggest improvements to the current software development practice, enabling more rigorous but still practically acceptable methods for the verification of software. In this chapter we briefly describe and compare three existing approaches to verify software (focusing on functional properties). In Sect. 2.1 we discuss testing, in Sect. 2.2 conventional inspection, and in Sect. 2.3 formal verification.

2.1. Testing

The single most commonly used verification approach of today is certainly testing. Testing is the process of executing software to detect the potential presence of defects. Testing is closely related to debugging, i.e. the process of locating and correcting these defects. Even though testing is today’s most important verification method it has several important drawbacks compared to conventional inspection and formal verification:

• Testing relies on an executable artifact. Therefore, it is used later than inspection and formal verification. Correcting defects that are detected late is often expensive.

• Testing is in general incomplete. Except for very small programs it is not possible to test the software for all possible inputs. Thus, testing as the only means to guarantee the correctness of the software cannot be completely relied upon.

• Testing usually is on a very low level of abstraction, making it cumbersome, for example, to specify test cases.

• Testing does not provide much guidance for the development of correct software, in particular when compared to formal verification in the style of Hoare. Test cases specify what the software has to accomplish but they give no hints on how.

Because of these drawbacks the costs of testing software systems are rapidly increasing compared to the overall development costs. Testing must therefore be complemented with other means to help increase the confidence in the correctness of the software. These alternative means should preferably be applicable much earlier in the software development process and they should provide some support for how to develop correct software artifacts right from the beginning.

2.2. Inspection

In contrast to testing, inspection allows the early detection of defects. Inspection, first introduced by Fagan in 1976 (see [34]), is the process of finding defects in the artifact by human examination. Typically, the inspection process consists of several phases (a detailed description of software inspection can be found e.g. in [43] and in [57]):

1. Planning and overview phase

The phase starts when the author of a software artifact requests its inspection. A selected inspection leader then checks whether the artifact is ready for inspection. If the artifact is ready for inspection the leader determines who should participate and schedules a kick-off meeting. The meeting is intended to ensure that all participants understand the artifact to inspect as well as their roles in the inspection.

2. Inspection phase

The participants (equipped with rules, procedures, and checklists intended to help to discover defects) then individually check the artifact. Afterwards the members of the inspection team meet to discuss their findings with each other. The (presumably) found defects are logged and it is determined who is responsible for resolving each defect.

3. Rework and follow-up phase

The inspection leader checks that the defects are repaired or dealt with in some other way.

The most significant phase is the inspection (defect-detection) phase. Several techniques for the inspection phase are in use (see Sect. 7.2). Most approaches rely on checklists of some form to facilitate the inspection phase. However, generally the focus of checklists is on style rather than on functionality. For example, in code inspections the inspection team usually checks that all naming conventions have been followed, that the indentation is as required etc. This is certainly important in order for the code to be readable and uniform (e.g. for maintainability), but it is not sufficient for ensuring correctness. In most cases there is a very systematic way to carry out these checks. Indeed, the checks could in principle often be automated. In addition, code inspection checklists sometimes require the inspector to check that the code behaves as intended. But the checklist provides no means for checking functional correctness in a systematic manner. Since functionality changes from artifact to artifact it is difficult to give general guidelines for how to systematically check the artifact for defects with respect to its intended functionality. Despite its problems, inspection is reported to be very effective both with respect to defect detection and cost (see Sect. 7.2).

Nevertheless, the effectiveness of inspection depends heavily on the experience and discipline of the personnel involved. Several tools to support the inspection process are available. For example, Macdonald and Miller compared 16 tools in 1999 (see [79]). However, the focus of these tools is almost entirely on administrative tasks like scheduling meetings and collecting defect reports. What is missing are guidelines that focus on the functional correctness of the particular artifact, i.e. guidelines that precisely state which questions to address to find defects. Ideally, such questions should be generated automatically from a given artifact. The semantic inspection approach introduced in this thesis has been designed to fill this need.

2.3. Formal Verification

Another approach that allows the early detection of defects is formal verification. Formal verification is based on mathematical principles to demonstrate the correctness of a formal software artifact with respect to a formal specification.

A code verification approach that has received a great deal of attention is Hoare Logic, introduced by Hoare (see Sect. 7.5.1). This approach has had a significant impact on later formal methods for designing and verifying imperative, sequential computer programs. Hoare Logic is an axiomatic method for proving programs correct, also known as the partial correctness assertion method. An assertion specifies a condition that program variables must satisfy each time a certain point in the program execution is reached. By associating with an operation (i.e., a method of an object) two special assertions called the pre- and the postcondition of the operation, assertions may be used to specify the effect that executing the operation is intended to have on the computation state. In 1969 Hoare (see [53]) introduced a logic for reasoning with assertions. His theory provides means for formally proving the correctness of a program with respect to given assertions which have been added to the program after its development. The formulae of Hoare Logic are so-called Hoare triples {P} S {Q}, where P and Q are assertions, and S is a piece of code. Such a triple is to be read “if S starts executing in a state where P is satisfied, and if the execution of S terminates, then Q is satisfied upon termination”. The method proposed by Hoare for proving such formulae presupposes the existence of proof rules for the programming language under consideration. The method may be considered to consist of the following three phases (see Fig. 2.1):

[Figure: diagram with the elements Program, Assertions, Automatic generation of verification conditions, Formal proof, Accept, Reject]

Fig. 2.1. The three steps of Hoare’s partial correctness assertion method

1. Development of code with assertions inserted at appropriate places

In the first phase the code is developed and corresponding assertions are added. The assertions specify the intended behavior of the code.

Example 2.1

The following code with assertions swaps the values of two integer variables without using an additional temporary variable:

{x = x@pre ∧ y = y@pre}

x = x + y; y = x − y; x = x − y;

{x = y@pre ∧ y = x@pre}

The postcondition expresses that, after the execution of the three assignments, x equals the initial value of y (denoted as y@pre) and y equals the initial value of x (denoted as x@pre). The precondition simply expresses that before the execution both x and y equal their initial values.

Although the program consists only of three assignments it is not obvious that it actually is correct.

Dijkstra (see [29]) noted that verifying code after it has been developed is not entirely realistic. Dijkstra instead proposed that code should be developed along with arguments for its correctness. For this purpose he suggested a discipline of programming, based on Hoare Logic, where one first states the assertions the code is to establish, and then uses the assertions to guide the development of the code.

2. Generation of a set of logical formulae (called verification conditions)

In the second phase so-called verification conditions are generated from the code and the assertions. Verification conditions are logical formulae. If each of them can be proven then all assertions except the precondition are valid, i.e. the code is correct with respect to the assertions. Verification conditions are generated using the axioms and inference rules (axiomatic semantics) of the programming language under consideration.

Example 2.2

Consider again the program from Example 2.1:

{x = x@pre ∧ y = y@pre}

x = x + y; y = x − y; x = x − y;

{x = y@pre ∧ y = x@pre}

Informally, the verification condition is generated like this:

1. For the postcondition to be satisfied after the execution of the third assignment, everything that is supposed to be valid for x in the postcondition must be valid for x − y immediately before the execution of the third assignment. That is, the following condition must be satisfied before the final assignment: x − y = y@pre ∧ y = x@pre.

2. Accordingly, before the second assignment the following condition must be satisfied: x − (x − y) = y@pre ∧ x − y = x@pre.

3. Finally, before the first assignment the following condition has to be satisfied: (x + y) − ((x + y) − y) = y@pre ∧ ((x + y) − y) = x@pre.

The verification condition is that the condition just derived must be implied by the precondition:

x = x@pre ∧ y = y@pre ⟹

(x + y) − ((x + y) − y) = y@pre ∧ ((x + y) − y) = x@pre

Verification conditions for more complex programs are generated in a similar way.

It should be noted that the meaning of “+”, “−”, and the predicate “=” is not involved in the above generation of verification conditions. In fact, the generation of verification conditions comprises mainly the textual substitution of variables by expressions and can be performed automatically without taking the meaning of involved predicates and operations into account.
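The purely textual nature of this step can be illustrated with a small sketch. The following Python fragment is not the tool described in this thesis; the tuple representation of expressions and the names substitute, precondition_of, and show are assumptions made only for this illustration. It pushes the postcondition of Example 2.1 backwards through the three assignments by substituting expressions for variables, without interpreting “+”, “−”, or “=”.

```python
# Hypothetical representation: an expression is either a variable name (str)
# or a tuple (operator, left, right). Operators are never interpreted here.
# The ASCII "-" stands for the minus sign used in the examples.

def substitute(expr, var, replacement):
    """Replace every occurrence of the variable `var` in `expr` by `replacement`."""
    if isinstance(expr, str):
        return replacement if expr == var else expr
    op, left, right = expr
    return (op,
            substitute(left, var, replacement),
            substitute(right, var, replacement))


def precondition_of(assignments, postcondition):
    """Push a postcondition backwards through a list of (variable, expression) assignments."""
    condition = postcondition
    for var, expr in reversed(assignments):
        condition = substitute(condition, var, expr)
    return condition


def show(expr):
    """Render an expression as fully parenthesized text."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    return f"({show(left)} {op} {show(right)})"


# The swap program and its assertions from Example 2.1.
swap = [("x", ("+", "x", "y")),
        ("y", ("-", "x", "y")),
        ("x", ("-", "x", "y"))]
pre = ("and", ("=", "x", "x@pre"), ("=", "y", "y@pre"))
post = ("and", ("=", "x", "y@pre"), ("=", "y", "x@pre"))

derived = precondition_of(swap, post)
verification_condition = ("==>", pre, derived)
print(show(verification_condition))
```

The names x@pre and y@pre are treated as ordinary variables that no assignment changes, which is why they survive the substitution untouched.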

3. A formal proof of the verification conditions

The third phase is a formal proof of the verification conditions, using axioms and proof rules for the domain over which the program variables range.


Example 2.3

The verification condition generated in Example 2.2 is:

x = x@pre ∧ y = y@pre ⟹

(x + y) − ((x + y) − y) = y@pre ∧ ((x + y) − y) = x@pre

Using conventional arithmetic laws the above formula can easily be simplified to:

x = x@pre ∧ y = y@pre ⟹ x = x@pre ∧ y = y@pre

The proof of the above formula is trivial. Hence the code is correct with respect to the assertions, i.e. the code in Example 2.1 indeed swaps the values of two integer variables.
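For the special case of expressions built only from variables, “+”, and “−”, the simplification used in Example 2.3 can be sketched mechanically. The following Python fragment is an illustration under assumptions, not a general prover: the tuple representation of expressions and the function name linear are hypothetical, and integers are taken to be unbounded. It collects the coefficient of each variable so that occurrences which cancel disappear.

```python
from collections import Counter


def linear(expr):
    """Normalize an expression over +, - and variables to {variable: coefficient}.

    An expression is a variable name (str) or a tuple (operator, left, right).
    """
    if isinstance(expr, str):
        return {expr: 1}
    op, left, right = expr
    if op not in ("+", "-"):
        raise ValueError(f"unsupported operator: {op}")
    sign = 1 if op == "+" else -1
    total = Counter(linear(left))
    for var, coeff in linear(right).items():
        total[var] += sign * coeff
    # Drop variables whose occurrences cancel out.
    return {var: coeff for var, coeff in total.items() if coeff != 0}


x_plus_y = ("+", "x", "y")
# Left-hand sides of the two conjuncts derived in Example 2.2.
first = ("-", x_plus_y, ("-", x_plus_y, "y"))   # (x + y) - ((x + y) - y)
second = ("-", x_plus_y, "y")                    # (x + y) - y

print(linear(first))    # {'y': 1}
print(linear(second))   # {'x': 1}
```

The first conjunct therefore reduces to y = y@pre and the second to x = x@pre, which together are exactly the precondition.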

Both Hoare’s method and Dijkstra’s program development discipline have long been well established within academia, and are also recognized to be the predominant methods for formal development and verification of sequential programs in imperative languages (see Chap. 7). From this perspective, it is quite remarkable that the methods are hardly known in industry, and even less used. Indeed, to the extent that asserted programs are developed at all, the assertions are mostly used as run-time checks during debugging and testing (see Sect. 7.5.2). We believe that the lack of understanding of Hoare’s and Dijkstra’s ideas is due to the fact that most expositions approach the subject from a quite formalistic point of view, and thus give the feeling that full formality is a requirement for its applicability. The demand for full formality arises from the third phase of Hoare’s method: to be able to perform a formal proof of the verification conditions, a formal axiomatization is required for each predicate and function symbol which occurs in the assertions and which hence is used to express the intended behavior of the program. However, such axiomatizations often have a non-obvious connection to the intuitive understanding of the property the predicate is to represent (see Example 3.2). As a consequence, they are difficult both to state and to reason with. Even for other formal approaches the required knowledge of the formal notation and its proof system is often a significant barrier to industrial use. Hence formal verification is typically used only for especially critical components (see e.g. [106] for a discussion on why formal methods have not been adopted by industry to the extent one would expect).
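The use of assertions as run-time checks mentioned above can be illustrated with the swap program of Example 2.1. The following Python sketch is only an illustration; the function name swap_without_temp and the snapshot variables x_pre and y_pre (standing in for x@pre and y@pre) are assumptions. In contrast to a proof of the verification condition, such checks say something only about the inputs that are actually executed.

```python
def swap_without_temp(x, y):
    """Swap two integers without a temporary variable, checking the assertions at run time."""
    x_pre, y_pre = x, y                 # snapshots standing in for x@pre and y@pre
    assert x == x_pre and y == y_pre    # precondition of Example 2.1
    x = x + y
    y = x - y
    x = x - y
    assert x == y_pre and y == x_pre    # postcondition of Example 2.1
    return x, y


print(swap_without_temp(3, 5))  # (5, 3)
```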


3. Our Approach

The general aim of our work is to develop practical means for increasing confidence in software correctness. The approach presented in this thesis combines conventional inspection with elements of formal verification. Traditional inspection lacks guidelines that precisely state which questions need to be addressed to find defects. Formal verification is often perceived as very difficult due to the required knowledge of the formal notation and its proof system. Our basic hypothesis is that a method which is easier to use than formal verification and which remains partially mechanizable may be achieved by relaxing the requirements on formal rigor in a controlled manner. The idea is to automatically generate the questions to be answered during the inspection. The questions are generated from a given annotated software artifact. We are not aware of other approaches that extend conventional inspection with elements of formal verification to yield a systematic software inspection. We call our inspection semantic inspection for two reasons:

• We are interested in functional properties of the software artifact, i.e., we are interested in what the artifact means and what it does.

• Our approach requires that the meaning of the underlying notation (which is used to describe the inspected artifact) is formally defined. For example, we need to know the semantics of the programming language Java and of the UML diagrams for early designs since we apply our semantic inspection later on artifacts expressed in these notations (see Chap. 4 and Chap. 5).

The cornerstones on which our semantic inspection is based are Hoare’s method for proving programs correct (see Sect. 2.3) and Fagan’s work on code inspection (see Sect. 2.2). Our semantic inspection method is similar to Hoare’s partial correctness assertion method. However, our approach differs in two important ways from Hoare’s approach:

• The key idea in the semantic inspection approach concerns the predicates and functions ranging over the domain of artifact variables, and which hence are used for expressing assertions. In our approach such predicates and functions are allowed to appear without an associated formal definition (axiomatization) beyond the point where a formal proof of the correctness of the artifact may no longer be possible. Instead the predicate or function is defined in any way suitable for humans to understand its meaning (e.g. in natural language). The point is that such predicates and functions may still have a perfectly legal informal interpretation, and by expressing it informally rather than formally its meaning is more directly accessible to the human reader. For convenience we refer to predicates and functions without formal axiomatization as informal predicates and informal functions.

• We apply the idea of generating verification conditions from code and assertions to other artifacts as well. For example, in Chap. 5 we describe how verification conditions can be generated from annotated early designs expressed in a subset of UML.

With the introduction of informal predicates and functions we relax the requirements on formal rigor in a controlled manner, obtaining a method which is more easily used in practice but which still allows automatic tool support. As mentioned, we allow informal functions and predicates to such an extent that a formal proof of the correctness of the artifact may no longer be possible. Therefore, we replace the formal proof with a human inspection and get the following steps (see Fig. 3.1):

1. Development of software artifact and assertions

The software artifact is developed as usual (i.e., with an existing development process). However, the artifact is annotated with assertions that may contain informal functions and informal predicates. Informal functions and predicates make it easy to express conditions in terms of assertions. Of course, using natural language increases the risk of ambiguous function and predicate definitions. Therefore, our inspection may be considered “weaker” than formal verification. However, even formal verification faces that challenge since natural language is the most widely used notation for stating requirements which may later be translated into formal specifications (see e.g. [61] on how ambiguities in requirements can be detected).

Example 3.1

An assertion that states that an array of integers is sorted in the interval from left to right could be expressed with the help of the informal predicate symbol sorted. The definition of the symbol is then supplied e.g. in a special comment in the code:

define sorted(array, left, right) informally

The array “array” is sorted in increasing order between “left” and “right”.

end define ;

The predicate symbol sorted may now be used in assertions either by itself or combined with other predicate symbols (both formally and informally defined ones).


It should be noted that assertions provide a structured documentation of the artifact and thus improve the communication between different developers (and possibly customers). The importance and role of accurate, structured documentation has been described e.g. by Parnas in [94]. Apart from being used for documentation purposes, the assertions may be exploited to drive the development of the artifact in the manner of Dijkstra’s discipline of programming (see [29]).

2. Automatic generation of verification conditions

The generation of veriﬁcation conditions is not dependent on the meaning of the predicates appearing in the assertions. Thus it is possible to automatically generate the veriﬁcation conditions necessary to verify the artifact.

3. Human inspection of the verification conditions

The verification conditions typically contain informal functions and predicates, which generally rules out a formal proof. Therefore the formal proof of Hoare’s method is replaced by human inspection. During the inspection the verification conditions are informally justified by the inspectors. In return, informal functions and predicates facilitate human reasoning with assertions and allow verification conditions to be checked at a high level of abstraction, making the algorithmic content of the artifact visible.

Example 3.2

In the formal veriﬁcation of a program that sorts an array of integers the following veriﬁcation condition may arise:

(∀i, 1 ≤ i ≤ n − 1 : a[i] ≤ a[i + 1]) ∧ (∀i, 1 ≤ i ≤ n : a[n + 1] ≥ a[i]) =⇒ (∀i, 1 ≤ i ≤ n : a[i] ≤ a[i + 1])

Using a predicate symbol sorted and a function symbol max the same verification condition may be presented in the following form instead:

Assume:

1. sorted(a, 1, n)

2. a[n + 1] ≥ max(a, 1, n)

Then:

1. sorted(a, 1, n + 1)

Is the conclusion satisfied?

The predicate symbol sorted (and the function symbol max) describes an informal predicate (and function) as defined in Example 3.1. Thus, the first condition with the formal axiomatization can formally be proven while the second condition requires a human to decide whether it is satisfied or not. However, the informal version is much easier to read and to understand; a human reader is able to answer the question easily without having to struggle with the formal axiomatization. The second verification condition is also presented in a simpler (Horn clause) form avoiding nested formulae and quantifiers.

Fig. 3.1. The three steps of the semantic inspection approach (artifact and assertions; automatic generation of a set of questions; human inspection with the outcome accept or reject)
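The question of Example 3.2 can also be tried out on concrete data. The following Java sketch gives sorted and max one possible formal reading (an assumption, since both are meant to be informal) and evaluates the question for every small array over the values 0, 1 and 2. Arrays are used 1-based as in the example; all class and method names are hypothetical.

```java
// Hypothetical check of the question from Example 3.2 on concrete data.
// Arrays are used 1-based as in the example; index 0 is ignored.
final class SortedQuestion {
    static boolean sorted(int[] a, int left, int right) {
        for (int i = left; i < right; i++) {
            if (a[i] > a[i + 1]) return false;
        }
        return true;
    }

    // Largest value in a[left..right]; requires left <= right.
    static int max(int[] a, int left, int right) {
        int m = a[left];
        for (int i = left + 1; i <= right; i++) {
            if (a[i] > m) m = a[i];
        }
        return m;
    }

    // "Assume sorted(a, 1, n) and a[n + 1] >= max(a, 1, n);
    //  is sorted(a, 1, n + 1) satisfied?" answered for one array.
    static boolean questionHolds(int[] a, int n) {
        boolean assumptions = sorted(a, 1, n) && a[n + 1] >= max(a, 1, n);
        return !assumptions || sorted(a, 1, n + 1);
    }

    // Tries every array over {0, 1, 2} with n + 1 used elements.
    static boolean holdsForAllSmallArrays(int n) {
        int len = n + 1;
        int total = 1;
        for (int k = 0; k < len; k++) total *= 3;
        for (int code = 0; code < total; code++) {
            int[] a = new int[len + 1];
            int c = code;
            for (int i = 1; i <= len; i++) {
                a[i] = c % 3;
                c /= 3;
            }
            if (!questionHolds(a, n)) return false;
        }
        return true;
    }
}
```

Testing of this kind can only reveal counterexamples; it does not replace the informal justification that the inspector gives for the general case.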

Each verification condition is presented as a set of questions of the form “Assume that P is satisfied, is then Q satisfied as well?” where P and Q are formulae as described later in this chapter. If all questions can be answered with “yes” then the artifact is assumed to be correct with respect to the assertions (see also Chap. 6). In later chapters we describe how the semantic inspection approach can be applied to Java programs (see Chap. 4) and also to early UML designs (see Chap. 5).

Certain types of assertions may be attached both to code and to designs (e.g. operation pre- and postconditions), whereas other types may not. However, common to all assertions is that they are expressed in a notation similar to first-order predicate logic. In a first-order theory some set of objects is selected, and all the statements of the theory are statements about these objects. The objects of primary concern here are the abstract data structures of the software artifact (i.e., assertions specify conditions on the state of the modeled system).

The initial focus of our work has been on the application areas of our industrial partners (see Chap. 1), i.e., process control and telecommunications. The software artifacts we have studied are implementations expressed in a subset of the programming language Java (see Chap. 4) and designs expressed in a subset of UML (see Chap. 5). Both the design and code notations are based on object-orientation. Our assertion notation, which is presented in this section, has been developed to support the application areas and software artifacts mentioned.

An assertion is a well-formed formula. Well-formed formulae may contain predicates which have arguments. The arguments of predicates are well-formed expressions which we define first. To define expressions we use the following syntactic categories and meta-variables which range over constructs of each category:

• n will range over numerals, Num

• x will range over variables, Var

• E will range over expressions, Exp

• f will range over (user-defined) function symbols, Fun

All variables that occur in an artifact can be used in assertions (e.g. for Java programs this includes qualified variables). These variables are called artifact variables (or program variables if the artifact is a program and design variables if the artifact is a design). In principle, assertions in the style of Hoare specify a condition on the state of the software as captured in the program variables. The program variables occur as free variables in the assertions. However, often it is necessary to refer to initial values of artifact variables. For example, if we want to specify what an operation accomplishes we may need to refer to the initial values of its parameters, e.g. to express that a car has half its initial speed after braking briefly. Therefore, in Hoare Logic values such as the initial values of variables are captured in logical variables which do not appear in the artifact. Unlike artifact variables, logical variables do not change their values during the execution of the software. The logical variables are (implicitly) universally quantified. To distinguish logical variables from artifact variables they have a special appendage “@pre” or “#n” (where n is a natural number). For convenience we implicitly define for each artifact variable a corresponding logical variable with the same name and the appendage “@pre”. These logical variables refer to the initial value of the artifact variable with respect to the context of the assertion. For example, an assertion that expresses that a car has half its initial speed after the “brake-shortly” operation could be written as 2 ∗ speed = speed@pre. With logical variables, assertions (in particular postconditions) specify a relation between two states rather than one state. More details on the role of logical variables and the relationship between Hoare Logic and VDM (the Vienna Development Method, see e.g. [60]) can be found in [63].
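The role of a logical variable such as speed@pre can be illustrated with a run-time snapshot. The Java sketch below is only an analogy under stated assumptions: the class BrakingCar, its field, and the method brakeShortly are hypothetical, and the snapshot variable speedPre stands in for the logical variable, which in the approach described here exists only in assertions and not in the code.

```java
// Hypothetical car class; the field and method names are illustrative.
final class BrakingCar {
    private int speed;

    BrakingCar(int speed) {
        this.speed = speed;
    }

    int speed() {
        return speed;
    }

    // Intended postcondition: 2 * speed = speed@pre.
    void brakeShortly() {
        int speedPre = speed;   // plays the role of the logical variable speed@pre
        speed = speed / 2;
        // With integer division the relation only holds for even initial speeds.
        if (2 * speed != speedPre) {
            throw new IllegalStateException("postcondition 2 * speed = speed@pre violated");
        }
    }
}
```

For an odd initial speed the check fails, which shows how a postcondition relating two states can expose a mismatch between the specification and an implementation based on truncating division.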

3.1 Definition (Well-formed expression)

Numerals and variables are expressions. Complex expressions can be obtained by combining expressions with predefined operators or user-defined functions. The abstract syntax of expressions is:

E ::= n
    | x | this | result
    | −E | E − E | E + E | E/E | E ∗ E
    | f(E, . . . , E)


truncating division is the standard one. We actually require our language to be typed with a type system that depends on the artifact. Since we are dealing with several kinds of artifacts here, we refrain from developing such type systems and assume that all expressions are type correct.

Some additional comments on expressions: the variables this and result are special artifact variables. The variable this generically refers to each single instance of a class, i.e., it is a place-holder for the name of that instance (often when assertions are specified the name of the instance is not known). By using this it is possible to express a condition on the state of each single instance of a class (see also Sect. 4.1 and Sect. 5.3). The variable result refers to the return value of an operation in case it has one (see also Sect. 4.1). Moreover, all expressions are classified according to their type, i.e., the kind of values they can assume. Possible types are integer (i.e., negative and non-negative whole numbers) and classes. The notion of a class is the cornerstone of object-orientation. A class can be seen as an abstract data type with attributes and operations to manipulate these attributes. As mentioned, we assume for the remainder of this thesis that all expressions etc. are type correct. In fact, when it comes to predicate and function definitions we do not even list the types of parameters as is done e.g. in Java.
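The abstract syntax of expressions maps naturally onto a small class hierarchy. The following Java sketch is one possible representation, not the data structure of the prototype tools: the type names Exp, Num, Var, Neg and BinOp are assumptions, the state is modeled as a map from variable names to integers, and user-defined functions f(E, . . . , E) are left out to keep the sketch short.

```java
import java.util.Map;

// Hypothetical abstract syntax for the integer expressions of Def. 3.1.
interface Exp {
    int eval(Map<String, Integer> state);
}

final class Num implements Exp {
    private final int value;
    Num(int value) { this.value = value; }
    public int eval(Map<String, Integer> state) { return value; }
}

final class Var implements Exp {
    private final String name;   // may carry an appendage such as "@pre"
    Var(String name) { this.name = name; }
    public int eval(Map<String, Integer> state) {
        Integer v = state.get(name);
        if (v == null) throw new IllegalArgumentException("unbound variable " + name);
        return v;
    }
}

final class Neg implements Exp {
    private final Exp operand;
    Neg(Exp operand) { this.operand = operand; }
    public int eval(Map<String, Integer> state) { return -operand.eval(state); }
}

final class BinOp implements Exp {
    private final char op;       // one of '+', '-', '*', '/'
    private final Exp left;
    private final Exp right;
    BinOp(char op, Exp left, Exp right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    public int eval(Map<String, Integer> state) {
        int l = left.eval(state);
        int r = right.eval(state);
        switch (op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;   // Java's int division truncates toward zero
            default: throw new IllegalArgumentException("unknown operator " + op);
        }
    }
}
```

With a state that binds both speed and speed@pre, the two sides of an assertion such as 2 ∗ speed = speed@pre can be built as Exp objects and compared by evaluating them in that state.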

To define formulae we use the following additional syntactic categories and meta-variables which range over constructs of each category:

• p will range over (user-defined) predicate symbols, Pre

• F will range over formulae, For

3.2 Definition (Well-formed formula)

The boolean constants true and false are formulae. Predefined and user-defined predicates are atomic formulae. Complex formulae can be obtained by combining formulae with predefined operators. The abstract syntax of formulae is:

F ::= true | false
    | ¬F | F ∧ F | F cand F | F ∨ F | F cor F | F =⇒ F
    | E < E | E ≤ E | E > E | E ≥ E | E = E | E ≠ E
    | if F then F else F
    | p(E, . . . , E)

The meaning of the predefined predicates and of negation, disjunction, conjunction, and implication is as in conventional logic. The connectives cand and cor are conditional variants of ∧ and ∨ to handle partial functions. The result of a conditional conjunction is false if the first argument is false and the result of a conditional disjunction is true if the first argument is true; in both cases the second argument is then not evaluated. The meaning of if F then F₁ else F₂ is (F =⇒ F₁) ∧ (¬F =⇒ F₂). Moreover, we assume that all well-formed formulae are type-correct.


Partial functions are common in software engineering. For example, the value obtained by addressing an array a with an index i outside the index set of a is typically undefined. Several approaches to treat partial functions have been presented. We use Dijkstra’s (see [29]) approach of conditional (asymmetric) conjunction and disjunction. Dijkstra’s approach imposes an evaluation procedure to allow the use of partial functions. A symmetric approach to treat partial functions without using a three-valued logic is described by Parnas in [91]. Programming languages typically describe in which order expressions are evaluated. Since questions generated from code usually contain programming language expressions, we chose Dijkstra’s approach. However, other approaches could be used as well.
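Java itself offers a close counterpart to conditional conjunction and disjunction: the operators && and || evaluate their right operand only when the left operand does not already determine the result. The sketch below contrasts this with the operator &, which always evaluates both operands. The class and method names are hypothetical.

```java
// Conditional conjunction guarding a partial function (array indexing).
final class ConditionalConnectives {
    // True if i is a valid index of a and a[i] equals x.
    // The element access is only evaluated when the index test succeeds.
    static boolean holdsAt(int[] a, int i, int x) {
        return 0 <= i && i < a.length && a[i] == x;
    }

    // Same test with the non-short-circuit operator &: both operands are
    // always evaluated, so an index outside the array raises an exception.
    static boolean holdsAtStrict(int[] a, int i, int x) {
        return (0 <= i & i < a.length) & a[i] == x;
    }
}
```

For an index outside the array, holdsAt yields false, whereas holdsAtStrict fails with an ArrayIndexOutOfBoundsException because the undefined access is attempted anyway.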

Some comments on equality: if both arguments of the equality relationship have the same value then they are equal. It is not necessary that both arguments have the same name (or address when it comes to implementations). For example, if we have two variables joesCar and johnsCar then joesCar = johnsCar if and only if both cars are equal with respect to all their attributes (i.e., if they have the same color, the same age, the same horsepower, . . . ). If the car owner is an attribute of a car then joesCar and johnsCar are not equal, since their owners differ.
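In Java terms, this notion of equality corresponds to a value-based equals method rather than to the reference comparison ==. The following sketch uses a hypothetical class ValueCar with three illustrative attributes to show the difference.

```java
import java.util.Objects;

// Hypothetical car class with value equality over all attributes.
final class ValueCar {
    private final String color;
    private final int age;
    private final int horsepower;

    ValueCar(String color, int age, int horsepower) {
        this.color = color;
        this.age = age;
        this.horsepower = horsepower;
    }

    // Two cars are equal if all attributes agree, regardless of identity.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof ValueCar)) return false;
        ValueCar c = (ValueCar) other;
        return color.equals(c.color) && age == c.age && horsepower == c.horsepower;
    }

    @Override
    public int hashCode() {
        return Objects.hash(color, age, horsepower);
    }
}
```

Two distinct ValueCar objects with identical attributes are equal under equals although they are different objects under ==; adding an owner attribute with different values would make them unequal.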

Example 3.3

A formula expressing that a phone phoneA is off-hook and connected to another phone phoneB over a network net could look like this:

offHook(phoneA) ∧ connected(phoneA, net, phoneB)

The formula uses two user-defined predicate symbols, namely offHook and connected. The symbols are defined formally or informally e.g. as shown in Example 3.4 on the next page.

A function or predicate symbol may be defined formally (using an expression or a formula, respectively) or informally (using e.g. natural language). The point is that any means is possible as long as it provides a unique interpretation of the function or predicate to the human reader. In this thesis function and predicate symbols are usually defined in natural language. An informal definition is given in the following form:

define p(parameter-list) informally

Informal function or predicate definition

end define ;

A function or predicate symbol may be defined formally using an expression as defined in Def. 3.1 or a formula as defined in Def. 3.2, respectively. Since formulae may not contain quantifiers, it is in general not possible to completely axiomatize a predicate. That is, the formula used to define a predicate usually contains informally defined functions or predicates. A formal definition is given in the following form:

define p(parameter-list) formally

Formal function or predicate definition

end define ;

In both cases (informal and formal deﬁnition), p is the name of the function or predicate and parameter-list is the comma-separated list of the formal parameters of the function or predicate.

Example 3.4

The informal deﬁnition of a predicate symbol connected that expresses that two phones are connected over a network could look like this:

define connected(a, n, b) informally

The phone “a” and phone “b” are connected over a dedicated, full-duplex line through the telephone network “n”.

end define ;

As shown the predicate symbol connected has three parameters. The parameters are substituted with arguments in a formula where the predicate symbol is used (see e.g. Example 3.3 on the preceding page).

Example 3.5

That a program computes an integer approximation of the square root of a number may be expressed with the predicate symbol maxApproxSquareRoot. An informal deﬁnition may look as follows:

define maxApproxSquareRoot(n, x) informally

“x” is the largest natural number that is smaller than or equal to the positive square root of “n”.

end define ;

An alternative, formal definition of maxApproxSquareRoot could look like this:

define maxApproxSquareRoot(n, x) formally

0 ≤ x ∧ x ∗ x ≤ n ∧ n < (x + 1) ∗ (x + 1)

end define ;
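Because the formal definition of maxApproxSquareRoot contains no informal symbols, it can be transcribed directly into a boolean Java method. In the sketch below the predicate follows the formal definition of Example 3.5, while the class name SquareRootSpec and the simple implementation intSqrt that is checked against the predicate are assumptions added for illustration.

```java
final class SquareRootSpec {
    // Formal definition from Example 3.5:
    // 0 <= x and x * x <= n and n < (x + 1) * (x + 1)
    static boolean maxApproxSquareRoot(long n, long x) {
        return 0 <= x && x * x <= n && n < (x + 1) * (x + 1);
    }

    // Hypothetical implementation to be checked against the predicate;
    // requires n >= 0.
    static long intSqrt(long n) {
        long x = 0;
        while ((x + 1) * (x + 1) <= n) {
            x = x + 1;
        }
        return x;
    }
}
```

A call such as maxApproxSquareRoot(n, intSqrt(n)) then plays the role of the postcondition result = f for this function.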


4. Code Inspection

In this chapter we will demonstrate how the principles of semantic inspection introduced in Chap. 3 can be applied to the inspection of code. For the inspection of code written in an annotated subset of the Java programming language we developed Compass, a comprehensible assertion method (see [13]). Our method supports the automatic generation of those questions which are relevant for the correctness of the code, and whose answers hence provide a systematic explanation of why the code works as intended. Such an explanation constitutes the heart of code inspection, and helps either in pinpointing errors, or in convincing the inspection team of the correctness of the code. The method consists of the following three steps (see Fig. 4.1 on the next page):

1. Development of code with assertions inserted at specific places

Assertions express conditions that are assumed or required to be valid at specific points during the computation. In particular, assertions specify both what a piece of code is supposed to establish and what it may expect to be valid. Our assertions may contain predicates without associated formal definition. This enables the formulation of and reasoning with assertions on a high level of abstraction. Moreover, the assertions provide a structured documentation of the code even if they are not used for the generation of the questions.

2. Generation of a set of questions (i.e., verification conditions)

The generation of the questions is not dependent on the meaning of the predicates appearing in the assertions. However, the meaning (i.e., the semantics) of the programming language has to be defined. Then the questions may be automatically generated from the assertions and the code. The questions are the only questions needed to be addressed in the code inspection (for functional correctness).

3. Human inspection of the questions

The questions generated in the second step in general contain informal predicates, thus disabling the possibility of a formal proof. Therefore the questions are presented to a human inspector who will answer them (and informally justify the answer). If all questions can be answered positively then the code is assumed to be correct with respect to the assertions.

Fig. 4.1. The three steps of the semantic code inspection method (code and assertions; automatic generation of a set of questions; human inspection with the outcome accept or reject)

It should be clear that not every program and not every property is amenable to verification along the principles of the Compass method. We mentioned earlier that we are interested in functional properties alone. To be more precise, we focus on partial correctness as opposed to total correctness. That is, we verify that the program is correct if it terminates but we do not verify that it terminates. To prove termination, a so-called bound function (or variant function) is provided by the programmer for each loop. Then it is verified for each loop that the bound function is bounded from below as long as the loop has not terminated, and that each loop iteration decreases the bound function. If both conditions are satisfied it can be concluded that the loop terminates. Both conditions can be expressed as Hoare triples. Thus, it is in principle possible to verify termination with our approach. Recursive procedures are treated in a similar manner (see e.g. [51], [1], and [82]). However, our attention is not on termination but on high level functional properties representable as relations between program variables. Furthermore, the code has to be well-structured in order for the method to be applicable (i.e., essentially it has to be developed according to Dijkstra’s discipline of programming).

We continue in Sect. 4.1 with a description of the various types of assertions that are used to specify the intended behavior of code (in the Java programming language). This is related to the first step of our semantic code inspection method introduced earlier. In Sect. 4.2 we give an overview of Dijkstra’s discipline of programming (i.e., how to develop code by first specifying in assertions what the code should accomplish, and then by exploiting the assertions to write the code). The second step, how the questions are generated from an annotated program, is presented in Sect. 4.3. We have implemented a prototype tool to demonstrate the feasibility of our approach. This tool is described in Sect. 4.4. Finally, Sect. 4.5 contains a small but non-trivial example to illustrate our approach. The third step of the semantic code inspection method is the same as for the semantic design inspection method described in Chap. 5. What the human inspection of the questions generated in the second step may look like is therefore described in Chap. 6.
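The termination argument with a bound function mentioned in this chapter introduction can be made concrete with run-time checks. The Java sketch below uses a hypothetical example, integer division by repeated subtraction, in which the remainder serves as the bound function; the class name, the method, and the choice of example are assumptions and are not taken from the thesis.

```java
final class BoundFunctionCheck {
    // Hypothetical example: integer division by repeated subtraction.
    // Requires dividend >= 0 and divisor > 0. Bound function: remainder.
    static int divide(int dividend, int divisor) {
        int quotient = 0;
        int remainder = dividend;
        while (remainder >= divisor) {
            int boundBefore = remainder;
            // Condition 1: the bound is bounded from below while the loop runs.
            if (boundBefore < 0) {
                throw new IllegalStateException("bound function below zero");
            }
            remainder = remainder - divisor;
            quotient = quotient + 1;
            // Condition 2: each iteration decreases the bound.
            if (remainder >= boundBefore) {
                throw new IllegalStateException("bound function not decreased");
            }
        }
        return quotient;
    }
}
```

In the inspection setting the two conditions would be posed as questions to the inspector instead of being checked during execution.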


4.1. Specification of Java Programs

The notion of informal predicates has been introduced to enable reasoning about the high level algorithmic content of a program. Therefore, the use of data abstraction is central to our Compass method. By formulating assertions in terms of objects of an abstract data type, veriﬁcation of a high level algorithmic nature can be separated from the veriﬁcation of low level invariants of the representation of the data type (such as e.g. non-corruption of the data representation).

To be able to evaluate the Compass method in an industrial setting, it was first adapted to the programming language C. However, data abstraction is not supported in C. Although it is possible to apply data abstraction without support in the language, it is not realistic to expect such a discipline to be followed. Another property of C that counteracts the intention behind informal predicates is C’s primitive memory management, which makes it difficult to reason about program correctness without taking low level properties into consideration. Even though it might theoretically be possible to include such properties in the verification, the complexity becomes unmanageable in practice. Thus, the programming language to which the Compass method is applied should have the following properties for a smooth and practical application of the method:

1. Support for data abstraction and

2. Simple memory management (e.g. some form of automatic garbage collection).

With respect to the above considerations all examples in this thesis concern code written in a subset of the Java programming language. Java supports data abstraction (object-oriented), has automatic garbage collection and is hence an appropriate candidate for our Compass approach.

As mentioned, assertions are used to specify the intended behavior of a program. To be able to easily compile the Java code with its specification included, all additional constructs that are introduced for our Compass method are encapsulated in special comments. These comments start with /*+ and end with */. In the following Sect. 4.1.1 we describe our subset of the Java programming language in general terms. The notion of class invariants is introduced in Sect. 4.1.2. Finally, in Sect. 4.1.3 we explain how the Java function or method interfaces are specified.

4.1.1. Our Java Subset

We chose a subset of Java to simplify the implementation of a prototype tool to support code inspection. Some of the restrictions are of a syntactic nature and do not limit the expressiveness of the language. Other restrictions are real semantic restrictions and have partly been introduced to keep the prototype tool simple and partly to avoid some of the difficulties described later in Sect. 4.3.1. Some of the major limitations are:

• Single thread assumption. Why our focus has been on sequential programs is explained in Sect. 5.4.1.

• Class variables have to be declared as private and methods as public. We explain the reasons for this restriction in Sect. 5.4.1.

• Java interfaces and class inheritance are not supported. Some principal difficulties related to inheritance are presented in Sect. 4.3.1. However, the main reason for this restriction is to simplify the development and implementation of our approach.

• Functions, i.e. methods that return a value, are neither allowed to modify any of their arguments nor class variables. It is not difficult to handle functions which have side-effects, but such functions make the questions more complex. The restriction allows us to substitute every occurrence of a function invocation with the result specified in the postcondition of the function.

• Local variable declarations are allowed only directly after the beginning bracket of the body of a method. This restriction is introduced for practical reasons alone and not because of principal diﬃculties.

• No static variables are allowed. A static variable could be seen as a class variable that is visible only in a single method (i.e., there are no principal difficulties to handle static variables). We do not support static variables for practical reasons.

• While loops are permitted, but not for and repeat loops. There are no principal difficulties to handle for and repeat loops; the reasons for not supporting them are purely practical.

• Assignment in expressions is not supported. Assignments in expressions are like functions with side-effects (see above).

• Java’s exception handling is not considered because it would make the development of our approach more complex. However, there are no principal difficulties.

It may seem that many Java programming language constructs are not supported. However, in many cases it is rather straightforward to express an unsupported construct with the help of the remaining ones.
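As an illustration of expressing an unsupported construct with the remaining ones, the following Java sketch rewrites a for loop as a while loop with declarations at the start of the method body and without compound assignments. The class and method names are hypothetical, and whether the prototype tool accepts exactly this form (for example with respect to static methods) is an assumption.

```java
final class LoopRewrite {
    // Version using a for loop with a declaration in the loop header,
    // neither of which is part of the subset described above.
    static int sumWithFor(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Equivalent version restricted to local declarations at the start of
    // the body, a while loop, and plain assignment statements.
    static int sumWithWhile(int[] a) {
        int sum;
        int i;
        sum = 0;
        i = 0;
        while (i < a.length) {
            sum = sum + a[i];
            i = i + 1;
        }
        return sum;
    }
}
```

Both methods compute the same result; only the second one stays within the listed restrictions on loops and declarations.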

4.1.2. Class Invariants

We support the specification of a class invariant, i.e., a property shared by all instances of a class. This property is supposed to be preserved by the methods of the class. This means that if the class invariant is satisfied for an object before the invocation of a method then it is satisfied after its completion (for the constructor method the invariant only needs to be satisfied after the execution). The notation for a class invariant is:

maintains

Speciﬁcation of class invariant ;

The class invariant is a formula as described in Chap. 3. It is supposed to be satisfied whenever the object is accessed. Only the program variable this (or functions on this) is allowed to occur in the class invariant (since class attributes are not visible outside the class but the invariant is). A class invariant is written in a Compass comment directly after the class declaration. If the invariant (or any other assertion) contains user-defined predicates or functions, then their definitions are provided in the same file as the invariant in the textual form presented earlier in Chap. 3.

Example 4.1

A class for simulating traﬃc lights may have an invariant which states that an instance of the class always shows a valid combination of lights:

public class TrafficLight
/*+ maintains
      validSignal(this) ;
*/

To access particular properties of a class, functions on this have to be used (the properties may be attributes or values derived from attributes). For example, we could use user-defined functions to access the individual lights of a traffic light. An invariant using a 3-ary predicate symbol validSignal could look like this: validSignal(redOn(this), yellowOn(this), greenOn(this)).
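A class invariant of this kind can be mirrored by a run-time check that is executed at the end of the constructor and of every method. The sketch below is an illustration only: the class TrafficLightSketch, the signal sequence, and in particular the formal reading of validSignal are assumptions, since the text leaves validSignal as a user-defined (possibly informal) predicate.

```java
// Hypothetical traffic light; which combinations count as valid is an assumption.
final class TrafficLightSketch {
    private boolean red;
    private boolean yellow;
    private boolean green;

    TrafficLightSketch() {
        red = true;          // the constructor establishes the invariant
        checkInvariant();
    }

    // One possible formal reading of validSignal(redOn, yellowOn, greenOn).
    static boolean validSignal(boolean redOn, boolean yellowOn, boolean greenOn) {
        return (redOn && !greenOn)                    // red, or red and yellow
            || (!redOn && yellowOn && !greenOn)       // yellow alone
            || (!redOn && !yellowOn && greenOn);      // green alone
    }

    boolean redOn() { return red; }
    boolean yellowOn() { return yellow; }
    boolean greenOn() { return green; }

    // Advances red -> red and yellow -> green -> yellow -> red.
    void next() {
        if (red && !yellow) {
            yellow = true;
        } else if (red) {
            red = false;
            yellow = false;
            green = true;
        } else if (green) {
            green = false;
            yellow = true;
        } else {
            yellow = false;
            red = true;
        }
        checkInvariant();    // every method preserves the invariant
    }

    private void checkInvariant() {
        if (!validSignal(red, yellow, green)) {
            throw new IllegalStateException("class invariant violated");
        }
    }
}
```

The accessor methods redOn, yellowOn and greenOn correspond to the functions on this used in the 3-ary form of the invariant.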

4.1.3. Interface Specifications

An interface specification describes the name and the formal parameters of a function or method. In addition, it specifies under which conditions the function or method may be invoked and what result it is intended to deliver. An interface specification consists of the function or method head followed by a special comment containing a precondition, a postcondition, and a so called modifies-clause (for methods only). The function or method head is written in standard Java notation:


For functions the type describes the type of the returned value. For methods the type is void. The name is the name of the function or method, and parameter-list is a comma-separated list of formal parameters (type and name).

Formal parameters may be defined to be constant by writing the keyword const in a Compass comment in front of the parameter’s type. A constant parameter is assumed to remain completely unchanged during the execution of the function or method. However, we do not verify that constant parameters actually remain unchanged. If the constant parameter is a compound object then it is assumed that no subcomponent is changed either. Constant parameters simplify the specification of assertions. Without constant parameters it may be necessary to explicitly state that some parameters remain unchanged, i.e. that their current value always equals their initial value. The problem that it is in general not sufficient to specify what a program changes, but that one must also specify what it does not change, is known as the frame problem. For example, that a car car has reached a certain speed s after accelerating could be expressed like this: speed(car) = s. However, methods which ensure that this condition is satisfied after their execution may also turn on the car’s headlights. Intuitively, we would want that only those variables necessary to establish the condition are changed. Therefore, it is necessary to specify not only what changes but also what does not. Several approaches to deal with this problem have been suggested (see e.g. [7] and [14]). We tackle the frame problem with constant declarations, the modifies-clause introduced later, and by explicitly stating in predicates which variables do not change.

Example 4.2

The following could be the head of a function that searches for the ﬁrst position of an element of array a with the same value as x:

public int search(/*+const*/ int x, /*+const*/ int[ ] a)

Since both x and a are constants it is assumed that they do not change their value during the execution of the body, i.e. it is assumed that x = x@pre and a = a@pre always are valid. This means that neither a itself nor any of its elements are modiﬁed during the execution of the method’s body.

In a Compass comment following directly after the head of the function or the method, the precondition (requires-clause), the postcondition (ensures-clause) and a list of variables which are modified by the method (modifies-clause) are given. The precondition and postcondition have the following forms:

requires

Specification of method precondition ;

ensures

Specification of method postcondition ;

Both the method precondition and postcondition are formulae as described earlier in Chap. 3.

The precondition expresses a condition that is supposed to be satisfied before the execution of the body of the method. Thus, the formula is only allowed to contain logical variables, namely the ones referring to the initial values of the parameters (and the object).

The postcondition, on the other hand, expresses a condition on the program variables that is supposed to be satisﬁed when the method invocation returns. That is, the assertion expresses what the method is intended to accomplish. If the method actually is a function, the postcondition always has the form result = f, where result denotes the return value of the function and f represents the (formally or informally deﬁned) function that is performed by the method.

Example 4.3

In a class Computer we may have a function uptime that returns the time the computer has been continuously on. A corresponding interface speciﬁcation may look like this:

public Time uptime()
/*+ ensures
      result = upTime(this@pre) ;
*/

The function upTime is deﬁned elsewhere in a Compass code comment. For example, an informal deﬁnition could look like this:

define upTime(computer) informally

The function returns the time since the last boot of the computer “computer” in seconds.

end define ;

The requires-clause and the ensures-clause are followed by the modiﬁes-clause which has the following form:

modifies

variable-list ;

The variable-list is a comma separated list of the (program) variables which are modiﬁed by the method.


The modifies-clause describes the variables that may be altered due to execution of the method’s body. The variable list may only contain formal parameters and/or a reference to the object itself using this. Only elements of a structured type may be modified by a method, since only structured types are handled by reference in Java. Objects and arrays are structured types. Primitive types are handled by value, i.e. the actual values are passed to methods, and changes to these values are not reflected in the actual arguments. The variable this has to appear in the variable list when class variables may be modified by the method (see Sect. 4.3.1 and Sect. 4.1.1). Finally, functions do not have a modifies-clause, since they are not allowed to modify the arguments or the state of the object (see Sect. 4.1.1).
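The distinction between structured and primitive parameters can be observed directly in Java. In the sketch below, which uses hypothetical names, the array argument refers to the caller's array, so a change to an element is visible to the caller and would have to be listed in a modifies-clause, whereas the primitive argument is a copy.

```java
final class ModifiesDemo {
    // Would need "modifies v" in its specification: the caller sees the change.
    static void clearFirst(int[] v) {
        v[0] = 0;
    }

    // Changes only the local copy of the primitive argument; the caller's
    // variable keeps its value, so nothing has to be listed as modified.
    static void increment(int n) {
        n = n + 1;
    }
}
```

After calling clearFirst the first element of the caller's array is 0, while a variable passed to increment still holds its original value.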

Any of the three clauses may be omitted. If the precondition is omitted then it is assumed to be true. If the postcondition is omitted it defaults to true as well. The modiﬁes-clause defaults to the empty list, i.e., it is assumed that the method is neither altering any parameters nor the object itself.

Example 4.4

An interface speciﬁcation for a method that sorts an array between a left and a right index may look as follows:

public void sort(int[ ] v, /*+const*/ int left, /*+const*/ int right)
/*+ ensures
      sorted(v@pre, v, left@pre, right@pre) ;
    modifies
      v ;
*/

The requires-clause is omitted since the method has no restrictions on the state prior to the execution of its body (except of course type correctness which is checked by the compiler).
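The postcondition of Example 4.4 relates the array before and after the call. The text does not define the 4-ary predicate sorted, so the Java sketch below adopts one plausible reading as an assumption: the interval is in non-decreasing order, it is a permutation of the same interval in v@pre, and all elements outside the interval are unchanged. The class SortSpec and the use of java.util.Arrays for the implementation under test are likewise illustrative.

```java
import java.util.Arrays;

final class SortSpec {
    // One possible reading of sorted(v@pre, v, left, right) (an assumption).
    // Requires 0 <= left <= right < v.length.
    static boolean sorted(int[] vPre, int[] v, int left, int right) {
        if (vPre.length != v.length) return false;
        // Elements outside the interval are unchanged.
        for (int i = 0; i < v.length; i++) {
            if ((i < left || i > right) && v[i] != vPre[i]) return false;
        }
        // The interval is in non-decreasing order.
        for (int i = left; i < right; i++) {
            if (v[i] > v[i + 1]) return false;
        }
        // The interval is a permutation of the original interval.
        int[] before = Arrays.copyOfRange(vPre, left, right + 1);
        int[] after = Arrays.copyOfRange(v, left, right + 1);
        Arrays.sort(before);
        Arrays.sort(after);
        return Arrays.equals(before, after);
    }

    // Hypothetical implementation under test.
    static void sort(int[] v, int left, int right) {
        Arrays.sort(v, left, right + 1);
    }

    // Runs sort and evaluates the postcondition against a snapshot of v.
    static boolean sortMeetsSpec(int[] v, int left, int right) {
        int[] vPre = v.clone();   // snapshot playing the role of v@pre
        sort(v, left, right);
        return sorted(vPre, v, left, right);
    }
}
```

The permutation and frame conditions show how much a single informal predicate symbol can carry; spelling them out formally is what the informal definition spares the developer.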

4.1.4. Other Assertions

Apart from the assertions needed to specify the class invariant and the method interface, two other assertions, namely loop invariants and intermediate assertions, may appear within the code in a method’s body. Loop invariants are speciﬁed like this:

maintains

Speciﬁcation of loop invariant ;

A loop invariant is a formula as described in Chap. 3. It expresses a condition on the state of the computation which is satisfied after each execution of the body of the loop, assuming that it is satisfied before the execution. More about loop invariants may be found in Sect. 4.2 and Sect. 4.3.2. Loop invariants have to be provided for the generation of the verification conditions.
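The idea of a loop invariant can be illustrated with run-time checks before the loop and after each execution of its body. The Java sketch below uses a hypothetical summation loop with a predicate partialSum that states that s is the sum of the first i elements; the names and the example are assumptions chosen for brevity.

```java
final class LoopInvariantDemo {
    // Hypothetical predicate: s is the sum of a[0..i-1].
    static boolean partialSum(int[] a, int i, int s) {
        int expected = 0;
        for (int k = 0; k < i; k++) {
            expected += a[k];
        }
        return s == expected;
    }

    static int sum(int[] a) {
        int s = 0;
        int i = 0;
        check(partialSum(a, i, s));       // invariant holds before the loop
        while (i < a.length) {
            s = s + a[i];
            i = i + 1;
            check(partialSum(a, i, s));   // and is re-established by the body
        }
        // Invariant plus negated guard (i == a.length) give the result.
        return s;
    }

    private static void check(boolean ok) {
        if (!ok) {
            throw new IllegalStateException("loop invariant violated");
        }
    }
}
```

In the inspection method the same two obligations, establishing the invariant and preserving it, appear as generated questions rather than as executed checks.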

Example 4.5

A program that given x and array a determines the ﬁrst occurrence of x in a (it is assumed that such an element exists) may be implemented like this:

while (a[i] != x)
/*+ maintains
      inSecondPart(x@pre, a@pre, i) ;
*/
  i = i + 1;

The invariant expresses that the element x@pre is not among the first i − 1 elements of a@pre but among the rest of the elements (i.e., it is in the second part of the array).

The other additional assertions that may be specified along with the code in a method’s body are intermediate assertions. Intermediate assertions look like this:

ensures

Speciﬁcation of intermediate assertion ;

An intermediate assertion is a formula as explained in Chap. 3. It is used to express a condition which is expected to be satisﬁed each time execution reaches the point of the assertion.

Example 4.6

The following piece of code may be part of an implementation of the quicksort sorting algorithm:

split(v, left, right, i);
/*+
ensures
    partition(v@pre, v, left@pre, right@pre, i) ;
*/
quicksort(v, left, i.val() - 1);
quicksort(v, i.val() + 1, right);

The intermediate assertion states that, after the invocation of the method split, the array v is partitioned in a particular way (namely, all elements to the left of i are less than or equal to v[i] and all elements to the right of i are greater than or equal to v[i]).
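To see the intermediate assertion at work, the fragment can be completed into a small runnable program. Everything beyond the three statements of the example is an assumption made for this sketch: the holder class IntRef (the thesis only shows the call i.val()), the Lomuto-style body of split, and the concrete definition of partition, which here also requires that v is a permutation of its value on entry.

```java
import java.util.Arrays;

final class QuicksortSketch {

    // Hypothetical mutable integer holder; only val() appears in the text.
    static final class IntRef {
        private int value;
        int val() { return value; }
        void set(int v) { value = v; }
    }

    private static void swap(int[] v, int a, int b) {
        int t = v[a];
        v[a] = v[b];
        v[b] = t;
    }

    // Assumed body of split: Lomuto partitioning around v[right].
    static void split(int[] v, int left, int right, IntRef i) {
        int pivot = v[right];
        int p = left;
        for (int k = left; k < right; k++) {
            if (v[k] <= pivot) {
                swap(v, k, p);
                p++;
            }
        }
        swap(v, p, right); // move the pivot to its final position
        i.set(p);
    }

    // Assumed reading of partition(v@pre, v, left@pre, right@pre, i).
    static boolean partition(int[] vPre, int[] v, int left, int right, IntRef i) {
        int p = i.val();
        if (p < left || p > right) {
            return false;
        }
        for (int k = left; k < p; k++) {
            if (v[k] > v[p]) return false; // left part must be <= v[p]
        }
        for (int k = p + 1; k <= right; k++) {
            if (v[k] < v[p]) return false; // right part must be >= v[p]
        }
        int[] a = vPre.clone();
        int[] b = v.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b); // v is a permutation of v@pre
    }

    static void quicksort(int[] v, int left, int right) {
        if (left >= right) {
            return; // ranges with fewer than two elements are sorted
        }
        int[] vPre = v.clone(); // snapshot playing the role of v@pre
        IntRef i = new IntRef();
        split(v, left, right, i);
        // Run-time check of the intermediate assertion.
        if (!partition(vPre, v, left, right, i)) {
            throw new AssertionError("intermediate assertion violated");
        }
        quicksort(v, left, i.val() - 1);
        quicksort(v, i.val() + 1, right);
    }
}
```

The check sits exactly where the assertion appears in the example, so it is evaluated each time execution reaches that point, once per recursive invocation that actually partitions.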


4.2. Program Development

The first phase of the Compass method consists of the development of code with embedded assertions. In 1976 Dijkstra suggested a discipline for developing code with assertions along with arguments for its correctness (see [29] and [28]). The development method that is part of our Compass method is the same as proposed by Dijkstra except for the informal predicates that are allowed to occur in Compass assertions. Since the program development method is described at length in the literature (see e.g. [46] and [5]) only a brief overview will be given here.

Programming is considered a goal-oriented activity, i.e., the desired result (postcondition) plays a more important role than the precondition. Therefore, before trying to solve a problem, one should become familiar with the problem and develop the corresponding pre- and postconditions. Then, given the postcondition (and precondition), the aim is to develop a program that terminates in a state satisfying the postcondition. The correctness of such refinement steps from a specification to a program has been studied by Back (see e.g. [2] and [3]).

Two common building blocks for a program are the conditional statement and the iterative statement:

• To invent a conditional statement (e.g. an if or switch statement in Java), a statement S has to be found that establishes the desired postcondition Q in at least some cases. A boolean expression that is a precondition for the statement S and postcondition Q may be used as a guard (i.e., as the condition under which the corresponding statement is executed) for the conditional statement. This process has to be continued until the precondition implies that at least one guard (or the default case) is true.

• Given pre-, postcondition, and a loop invariant, an iterative statement may be developed as follows:

1. The loop invariant has to be established before the ﬁrst execution of the loop by appropriately initializing the involved variables.

2. The guard must be developed. A boolean expression whose negation in conjunction with the loop invariant implies the postcondition may be used as the guard.

3. Finally, the loop body is developed in such a way that it makes progress towards termination while reestablishing the loop invariant.
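Both building blocks can be illustrated with two very small Java methods. The sketch below is not taken from the thesis: the class name DevelopmentSketch, the problems chosen (maximum of two numbers, sum of an array), and the postconditions and invariant given in the comments are assumptions picked to make the steps visible.

```java
final class DevelopmentSketch {

    // Conditional statement.
    // Postcondition Q: m == max(x, y).
    // The statement "m = x" establishes Q whenever x >= y, so x >= y
    // serves as its guard. The statement "m = y" establishes Q whenever
    // y >= x. Since x >= y || y >= x always holds, the guards cover all
    // cases and the development stops here.
    static int max(int x, int y) {
        int m;
        if (x >= y) {
            m = x;
        } else { // default case: y > x, which implies y >= x
            m = y;
        }
        return m;
    }

    // Iterative statement.
    // Postcondition Q: s == a[0] + ... + a[n-1], where n == a.length.
    // Loop invariant I: 0 <= i <= n and s == a[0] + ... + a[i-1].
    static int sum(int[] a) {
        int n = a.length;
        // Step 1: the initialization establishes I (empty sum is 0).
        int i = 0;
        int s = 0;
        // Step 2: the guard is i != n; its negation (i == n) together
        // with I implies Q.
        while (i != n) {
            // Step 3: the body re-establishes I for i + 1 and makes
            // progress towards termination by decreasing n - i.
            s = s + a[i];
            i = i + 1;
        }
        return s;
    }
}
```

In both methods the code was read off from the postcondition rather than the other way round, which is the goal-oriented style described above.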

The problem of how to discover a loop invariant remains. However, in general one already has a certain algorithm in mind when developing a program. Writing down
