CHAPTER 6
Consider, for example, a maintainer engaged in discovery on a large C program. The maintainer is looking at a particular line in one of the source files for this program, and sees a variable. Most information about this variable will not be local; that is, it will not be contained in the next or previous line of the source text. In many cases, the information won't even be visible on the screen. Again, only control flow is local. Information about the variable, such as its data-type, the slots, methods, or superclasses of that data-type, and the places where the variable is accessed or changed, is someplace else. Thus we say this information is delocalized in the programming language source code.
In the KBEDS CSIS representation, specifically the code-level ontology, all this information can be made local. In the source text only control flow is local, but in the code-level ontology control flow is just another role. The data-type of a variable (as well as references to the variable, changes, etc.) is represented in such a way that it is as easy to find as the next action in the control flow.
This chapter will cover the parts of the ontology that serve to further localize useful information, and the ways in which the user interface can facilitate access to this information. Specific scenarios in which this information is used to recognize vestigial code and a delocalized plan are found in Section 7.2.1.1 and Section 7.2.1.2.
It is important to note that all this localization comes as a result of the representation, even though a developer is not required to specify any more information than would be expressed in a conventional programming language.
For the code-level ontology it is important to note that it is not the invocation of a procedure that is part of the decomposition, but the invoked procedure itself. Therefore the same procedure may appear in the decomposition many times. All procedure invocations within the implementations of the second-level procedures make up the decomposition of the procedure, and are the third level of decomposition of the program.
Clearly, determining the actual decomposition of a system would be a tedious process for a maintainer. If a software system has been represented using the code-level ontology described here, the decomposition of a program or of any method is computed automatically with one simple rule:
code-block--> has-decomposition (has-implementation call-method)
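That is, the decomposition of a code-block consists of the methods reached by following the call-method role from each filler of its has-implementation role.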
This rule may seem odd, however, because there are many kinds of code-level actions that could fill the has-implementation role: assignments, switches, returns. None of these kinds of actions invoke methods themselves (they may do so indirectly, by using the return value of a message as a value, but this is covered in the discussion of hidden methods in Section 4.2.2). We need a way to say "get only the fillers for the has-implementation role that are individuals of message, and for each of those get the fillers of their call-method role." This is taken care of by the path facility (discussed in Section 2.2.4.2), since every code-level concept in the ontology except message has an (at-most 0 call-method) restriction. When the path facility gets to an intermediate individual in a path that has no value for the next role in the path, it simply ignores that individual. This rule, then, will ignore individuals of other actions because there can be no fillers for the call-method roles of any actions other than messages.
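The path-following behaviour described above can be illustrated with a minimal Python sketch. It is not the Classic path facility itself: the `Individual` class, the `follow_path` function, and the individual names are hypothetical stand-ins, assumed here only to show how an individual with no filler for the next role simply drops out of the result.

```python
from collections import defaultdict


class Individual:
    """Hypothetical stand-in for a Classic individual: a name plus role fillers."""

    def __init__(self, name):
        self.name = name
        self.roles = defaultdict(list)  # role name -> list of filler Individuals

    def fill(self, role, filler):
        self.roles[role].append(filler)


def follow_path(start, path):
    """Return the individuals reached from `start` by following each role in `path`.

    An intermediate individual with no filler for the next role contributes
    nothing, so it is silently ignored.
    """
    frontier = [start]
    for role in path:
        next_frontier = []
        for individual in frontier:
            next_frontier.extend(individual.roles.get(role, []))
        frontier = next_frontier
    return frontier


# A code-block whose implementation holds an assignment, a message and a return.
code_block = Individual("code-block-01")
called_method = Individual("method-57")
assignment = Individual("assignment-01")  # no call-method filler
message = Individual("message-11")        # the only action that calls a method
returns = Individual("return-01")         # no call-method filler
message.fill("call-method", called_method)
for action in (assignment, message, returns):
    code_block.fill("has-implementation", action)

# Equivalent of: code-block --> has-decomposition (has-implementation call-method)
decomposition = [m.name for m in follow_path(code_block, ["has-implementation", "call-method"])]
print(decomposition)
```

Only the message contributes a filler, so the assignment and the return need no explicit type test to be excluded.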
This rule will generate one level of decomposition. The complete decomposition, or more accurately, a list of every method in the decomposition of a code-block, could be derived by adding immediate and inherited subroles of has-decomposition (has-immediate-decomposition and has-inherited-decomposition), modifying the above rule to infer has-immediate-decomposition instead of has-decomposition, and adding:
code-block--> has-inherited-decomposition (has-immediate-decomposition has-decomposition)
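As a rough sketch of what the immediate and inherited roles compute together, the following Python function collects every method reachable through the immediate decomposition. It assumes the immediate decomposition is already available as a plain mapping; the function name and the sample call graph are invented for illustration, and in KBEDS the result would be derived by Classic from the rules rather than by a hand-written traversal.

```python
def complete_decomposition(method, immediate):
    """Return every method in the decomposition of `method`.

    `immediate` maps a method name to the methods it invokes directly
    (the has-immediate-decomposition fillers). The result is the union of
    the immediate fillers and the decompositions of each of them.
    """
    seen = set()
    pending = list(immediate.get(method, []))
    while pending:
        callee = pending.pop()
        if callee in seen:
            continue  # already expanded; also guards against recursive calls
        seen.add(callee)
        pending.extend(immediate.get(callee, []))
    return seen


# Hypothetical call graph; "parse" and "read-token" call each other.
immediate = {
    "main": ["parse", "report"],
    "parse": ["read-token"],
    "read-token": ["parse"],
    "report": [],
}
print(sorted(complete_decomposition("main", immediate)))
```

The `seen` set keeps the traversal finite when methods are mutually recursive.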
self-variable--> has-data-type (self-variable-of method-of)
Again, we see an example of how localizing some information helps to improve discovery. In this case we are dealing with information that would be extremely delocalized in source code (every message that invokes a method).
For example, an assignment statement can be described as giving a new value to the variable (or slot) that fills its changes role. That new value is whatever fills its new-value role. Again, as with the rule of thumb for self-variable data-types, this is too complex for a simple rule or path expression, because it involves the combination of two different paths into one filler. In other words, for assignment we want to fill the description role with a string of the form: "Assigns <variable> <- <new-value>," where:
<variable> is the name of the variable (or slot) that fills the changes role of the assignment. This name is not the Classic name of the individual that represents the variable, but a string which fills the name slot. In other words, it is the string at the end of the path (changes name).
<new-value> is the name of the software-value that fills the new-value role of the assignment. The path for this string is (new-value name).

A more concrete example of this inference is shown in Figure 6.1. The assignment assignment-01 changes variable-11, giving it the value of a constant (constant-08). The name of variable-11 is "counter" and the name of constant-08 is "zero". These two strings are concatenated together to form the string "Assigns counter <- zero", which is derived by the function to be the filler for the description role of the assignment.
The Classic code for the general rules that derive the descriptions of each kind of action is given in Appendix C, and the LISP code for the functions that actually derive these description strings is shown in Appendix D. Briefly, this is what the functions compute for each action type:
message: "Sends <method> to <instance>", where <method> is a string at the end of the path (call-method name), and <instance> is a string at the end of the path (send-to name).
return: "Returns <value>", where <value> is a string at the end of the path (return-value name) or (return-value description). Note again that a software value can be a data-type instance or a message; the key role of the former is name, but for the latter it is description.
switch: "Switch on <value>", where <value> is a string at the end of the path (switch-on-value name) or (switch-on-value description).
select-switch-case: "Case <condition> of <switch>", where <condition> is a string at the end of the path (case-condition name), and <switch> is a string at the end of the path (case-of description).
pass-parameter: "Pass <value> as <parameter> to <method>", where <value> is a string at the end of the path (argument-value name), <parameter> is a string at the end of the path (pass-as name), and <method> is a string at the end of the path (passed-from call-method name).
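The description strings listed above could be derived along the following lines. This is a Python sketch, not the LISP code of Appendix D: the dictionary representation of individuals, the `kind` key, and the function names are assumptions made for illustration, and only four of the action types are covered.

```python
def label(value):
    """Return the display string for a software value.

    A data-type instance is assumed to carry a name; a message carries no
    name, so its description is derived instead.
    """
    if "name" in value:
        return value["name"]
    return describe(value)


def describe(action):
    """Build the description string for one code-level action."""
    kind = action["kind"]
    if kind == "assignment":
        # Paths (changes name) and (new-value name).
        return f"Assigns {action['changes']['name']} <- {action['new-value']['name']}"
    if kind == "message":
        # Paths (call-method name) and (send-to name).
        return f"Sends {action['call-method']['name']} to {action['send-to']['name']}"
    if kind == "return":
        # Path (return-value name) or (return-value description).
        return f"Returns {label(action['return-value'])}"
    if kind == "switch":
        # Path (switch-on-value name) or (switch-on-value description).
        return f"Switch on {label(action['switch-on-value'])}"
    raise ValueError(f"no description rule for action kind {kind!r}")


# The assignment of Figure 6.1: variable "counter" receives constant "zero".
assignment = {
    "kind": "assignment",
    "changes": {"name": "counter"},
    "new-value": {"name": "zero"},
}
# A switch whose value is a message, so the message description is used.
message = {
    "kind": "message",
    "call-method": {"name": "Void?"},
    "send-to": {"name": "Recipient"},
}
switch = {"kind": "switch", "switch-on-value": message}

print(describe(assignment))
print(describe(switch))
```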
describe-implementation function takes over and simply prints the description of each statement in the implementation of the method, in order of the control flow. When it reaches a switch, it pursues each resulting branch in the control flow to the end before proceeding to the next. When it reaches a return or detects a loop, it terminates the current branch. The simplicity of these functions is made possible by the representation, a point that will be stressed again in Section 6.2. Below is the sample output of the describe-method function when applied to a method in the KBEDS Application (a full description of the domain and the meaning of the objects in this description are given in Chapter 5). Even without a complete explanation, the description reads fairly clearly:
[METHOD-3], Classify Message, is a method attached to [DATA-TYPE-6: Mail Message],
and all its subclasses: ([DATA-TYPE-76: Message to Group] [DATA-TYPE-57: Message
to Person]).
The method has 1 parameter: ([PARAMETER-12: Knowledge Base]), 2 local variables:
([SELF-VARIABLE-14] [VARIABLE-1: Recipient email string]), and returns a Void.
The method is implemented as follows:
Assigns [VARIABLE-1: Recipient email string] <-
    [MESSAGE-11: Sends [METHOD-57: Find Recipient Field in String] to
    [SLOT-12: Header]].
Assigns [SLOT-57: Recipient] <-
    [MESSAGE-13: Sends [METHOD-13: Find Recipient by email] to
    [PARAMETER-12: Knowledge Base]].
Switch on [MESSAGE-14: Sends [METHOD-58: Void?] to [SLOT-57: Recipient]]:
  Case [CONSTANT-63: False]:
    Sends [METHOD-59: Classify Individual] to [SELF-VARIABLE-14].
    Switch on [MESSAGE-16: Sends [METHOD-4: Valid Message?] to [SELF-VARIABLE-14]]:
      Case [CONSTANT-62: True]:
        Sends [METHOD-5: Process Message] to [SELF-VARIABLE-14].
        Returns [CONSTANT-55: VOID].
      Case [CONSTANT-63: False]:
        Returns [CONSTANT-55: VOID].
  Case [CONSTANT-62: True]:
    Assigns [SLOT-14: Error] <- [CONSTANT-14: No Such Recipient].
    Returns [CONSTANT-55: VOID].
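The traversal that produces a listing like the one above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual describe-implementation function: statements are modelled as dictionaries with hypothetical `kind`, `description`, `next`, and `cases` keys, and each description string is assumed to have been derived already.

```python
def describe_implementation(statement, depth=0, on_branch=None, out=None):
    """Collect statement descriptions in control-flow order.

    Each branch of a switch is followed to its end before the next branch
    is started. A branch ends at a return, at the end of the control flow,
    or when a statement already seen on this branch recurs (a loop).
    """
    out = [] if out is None else out
    on_branch = set() if on_branch is None else on_branch
    while statement is not None:
        if id(statement) in on_branch:
            break  # loop detected: terminate the current branch
        on_branch = on_branch | {id(statement)}  # copy, so sibling branches stay independent
        out.append("  " * depth + statement["description"])
        if statement["kind"] == "return":
            break
        if statement["kind"] == "switch":
            for condition, first in statement["cases"]:
                out.append("  " * (depth + 1) + f"Case {condition}:")
                describe_implementation(first, depth + 2, on_branch, out)
            break
        statement = statement.get("next")
    return out


# A small implementation: one switch with two branches sharing a return.
ret = {"kind": "return", "description": "Returns VOID"}
error = {"kind": "assignment", "description": "Assigns Error <- No Such Recipient", "next": ret}
classify = {"kind": "message", "description": "Sends Classify Individual to self", "next": ret}
switch = {
    "kind": "switch",
    "description": "Switch on Sends Void? to Recipient",
    "cases": [("False", classify), ("True", error)],
}
lines = describe_implementation(switch)
print("\n".join(lines))
```

Because control flow is just a role to follow, the traversal needs no parsing of source text, which is what keeps such functions short.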
data-slot hierarchy, has an inverse. Figure 3.6 on page 66 shows the seven role hierarchies in the ontology. While any of the role inverses can potentially provide information that is useful during discovery, the most significant of these hierarchies is accessed-by. The accessed-by roles are all the inverses of the roles that link code-level actions to software values. This is at once an immensely simple and immensely powerful notion, and is the place where the most localization of information takes place.
There are basically four kinds of access to a software-value: reading the value, passing a value to a parameter of a method, changing the value (not all individuals of software-value can be changed, only individuals of changeable-instance), and sending a message to the value (only individuals of data-type-instance).
Reading a software value, represented by the reads sub-hierarchy (the inverse of read-by), occurs when the value is used in an action with any of these roles: case-condition, return-value, new-value, switch-on-value, or argument-value. These roles must be filled in by the developer in order for the various actions to be complete. The inverses of these roles are the roles listed at the bottom of the read-by hierarchy shown in Figure 3.6 on page 66.
An example is shown in Figure 6.2.
The developer has implemented a method and has filled the roles represented by solid lines, and Classic has derived all the rest through inverses and the role hierarchy. Consider the significance of all this derived information during discovery: Classic automatically keeps track of every access to a value. For a particular variable, a maintainer simply has to retrieve the values that fill the accessed-by role to see every place the variable is used. To restrict this list to only the places where the variable is changed, the maintainer retrieves the fillers for the changed-by role in the variable, as shown in Figure 6.2. Similarly, the read-by role records every place the variable's value is used.
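The derivation of accessed-by information from told roles can be approximated in a few lines of Python. This sketch only mimics the effect of role inverses and the role hierarchy; it is not how Classic implements them. The `KnowledgeBase` class and the inverse role name `new-value-of` are hypothetical, since the actual names at the bottom of the read-by hierarchy appear only in Figure 3.6.

```python
from collections import defaultdict

# Told role -> its inverse. "new-value-of" is an assumed name.
INVERSE = {"changes": "changed-by", "new-value": "new-value-of"}
# Role -> parent role in the accessed-by hierarchy.
PARENT = {
    "changed-by": "accessed-by",
    "new-value-of": "read-by",
    "read-by": "accessed-by",
}


class KnowledgeBase:
    """Hypothetical store that derives inverse and inherited role fillers."""

    def __init__(self):
        self.fillers = defaultdict(set)  # (individual, role) -> set of fillers

    def tell(self, subject, role, filler):
        """Record a told role, then derive its inverse and every parent of it."""
        self.fillers[(subject, role)].add(filler)
        derived = INVERSE.get(role)
        while derived is not None:
            self.fillers[(filler, derived)].add(subject)
            derived = PARENT.get(derived)

    def ask(self, individual, role):
        return sorted(self.fillers.get((individual, role), set()))


kb = KnowledgeBase()
kb.tell("assignment-01", "changes", "variable-11")    # counter <- zero
kb.tell("assignment-01", "new-value", "constant-08")
kb.tell("assignment-02", "new-value", "variable-11")  # counter is read elsewhere

print(kb.ask("variable-11", "accessed-by"))
print(kb.ask("variable-11", "changed-by"))
print(kb.ask("variable-11", "read-by"))
```

The developer only calls `tell` for the roles of each action; every cross-reference on the variable is a by-product.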
All this added information, all this localization of information, comes essentially for free in this representation - that is, no extra work is required on the part of the developer to provide this cross-referencing, yet it is tremendously useful. Understanding a variable in a program involves determining how it is used, and the first step in that determination is finding where it is used and in what ways (read, change, etc.). In source code, especially when dealing with global variables, information about how a variable is used is very delocalized, and this is typically the source of delocalized plans.
By far the most derived information comes from role inverses and the role hierarchy, as should be clear from Figure 6.2 (and all the other figures in this document where these role inferences are shown), where each told role results in five derived roles. Rules, by contrast, typically result in the addition of a single link.
It should be clear that the KBEDS CSIS representation solves this problem and localizes inherited information. Each data-type of a variable (there can be only one immediate data-type, but there may be many inherited data-types) is in the role has-data-type. Examining the immediate data-type of a variable (again, there is only one) will show all the slots (including the inherited ones) in the has-slots role, and all the methods in the has-methods role. In addition, the next section describes mechanisms which can put all this information in graphical form on the screen for any object at the click of a button.
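To make the idea of localized inherited information concrete, the following Python sketch gathers the slots of a data-type together with those of its superclasses. It assumes a single superclass per data-type and uses invented slot assignments; the data-type names are taken from the sample output in this chapter, but which slots belong to which data-type is not stated there and is purely illustrative.

```python
def all_slots(data_type, superclass, own_slots):
    """Return the slots of `data_type`, own slots first, then inherited ones.

    `superclass` maps a data-type to its immediate superclass (assumed to be
    at most one); `own_slots` maps a data-type to the slots it declares.
    """
    slots = []
    current = data_type
    while current is not None:
        slots.extend(own_slots.get(current, []))
        current = superclass.get(current)
    return slots


# Hypothetical hierarchy and slot assignments.
superclass = {"Message to Person": "Mail Message", "Message to Group": "Mail Message"}
own_slots = {
    "Mail Message": ["Header", "Recipient", "Error"],
    "Message to Person": ["Person"],
}

print(all_slots("Message to Person", superclass, own_slots))
```

In the representation itself no such walk is needed at query time, since the has-slots role of the immediate data-type already holds the inherited fillers.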