CHAPTER 7
There are several well-known examples of delocalized plans. Two of them are debugging, where information indicating certain internal states is printed if some global variable is set to a specific value, and dirty-bit, in which a global variable records whether some data held in memory has changed since it was last read from disk, and therefore whether the data should be written back to disk before being erased from memory. The former plan is fairly simple to recognize, but the latter is not [Soloway and Letovsky, 1986]. Every procedure that changes the data will have a single line in it that seems to have nothing to do with accomplishing the goal of that procedure, because it sets the dirty bit.
The KBEDS CSIS provides various levels of support for detecting delocalized plans. First of all, most delocalized plans involve the use of global variables. The representation provides immediate access to all uses of any variable. Next, a library of recognition procedures can be built that recognize the elements of known delocalized plans. Currently the library can recognize the debugging, dirty bit, and the error logging plans.
The dirty-bit plan has several attributes by which it can be recognized. There is a global variable that governs its behavior, which is typically a boolean. The variable is initialized to false, and is typically only ever changed to a false value in one procedure, a procedure that writes to a file. The variable is set to true in any number of procedures, all of which make other global changes. The code that implements a search for a dirty-bit variable is fairly complex, and it should be mentioned that this search finds global variables that are potential dirty-bit variables; it is heuristic, not conclusive.
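The attributes listed above can be turned into a small filter. The following is only a minimal sketch in Python, not the KBEDS recognition procedure: the `Assignment` and `Procedure` records, their fields, and the example procedure names are all hypothetical stand-ins for facts the representation would supply.

```python
from dataclasses import dataclass


@dataclass
class Assignment:
    procedure: str   # name of the procedure containing the assignment
    variable: str    # global variable being assigned
    value: object    # constant assigned (True or False)


@dataclass
class Procedure:
    name: str
    writes_file: bool = False        # does the procedure write to a file?
    other_globals_changed: int = 0   # how many other global changes it makes


def dirty_bit_candidates(global_bools, initial_values, assignments, procedures):
    """Return global booleans that look like dirty bits (heuristic only)."""
    candidates = []
    for var in global_bools:
        # Attribute 1: initialized to false.
        if initial_values.get(var) is not False:
            continue
        sets = [a for a in assignments if a.variable == var]
        clearing = {a.procedure for a in sets if a.value is False}
        setting = {a.procedure for a in sets if a.value is True}
        # Attribute 2: set to false in exactly one procedure, which writes a file.
        if len(clearing) != 1:
            continue
        (clearer,) = clearing
        if not procedures[clearer].writes_file:
            continue
        # Attribute 3: set to true only in procedures that make other global changes.
        if setting and all(procedures[p].other_globals_changed > 0 for p in setting):
            candidates.append(var)
    return candidates


# Hypothetical program facts, for illustration only.
procedures = {
    "add_item": Procedure("add_item", other_globals_changed=1),
    "remove_item": Procedure("remove_item", other_globals_changed=1),
    "save": Procedure("save", writes_file=True),
    "toggle_debug": Procedure("toggle_debug"),
}
assignments = [
    Assignment("add_item", "changed", True),
    Assignment("remove_item", "changed", True),
    Assignment("save", "changed", False),
    Assignment("toggle_debug", "debug", True),
]
initial_values = {"changed": False, "debug": False}
result = dirty_bit_candidates(["changed", "debug"], initial_values,
                              assignments, procedures)
print(result)
```

As in the text, a variable that passes the filter is only a potential dirty bit; the maintainer still has to confirm it by inspection.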
The dirty-bit plan is the only non-trivial delocalized plan in the KBEDS application. The scenario that follows illustrates how a maintainer would proceed in discovering a delocalized plan for which there was no pre-defined recognition function.
The maintainer is attempting to understand the method add-item-to-KB, which is a method of the data-type knowledge-base. A good place to start is the textual description of the method:
[METHOD-63], Add Item to KB, is a method attached to
[DATA-TYPE-2: KBEDS Knowledge-Base], and all its subclasses: ()
The method has 1 parameter: ([PARAMETER-63: Item to Add]),
no local variables, and returns a Void.
The method is implemented as follows:
Sends [METHOD-10: Add New Element] to [SLOT-3: Container].
Assigns [SLOT-23: KB-Changed?] <- [CONSTANT-62: True].
Returns [CONSTANT-55: VOID].
The key here is that normally a maintainer will attempt to form an understanding of a section of code based on purely local information [Soloway and Letovsky, 1986]. With the KBEDS CSIS, the amount of local information available at the click of the mouse button is expanded tremendously. At this point in the analysis, it is clear what the message does, but not clear what the assignment does, or more accurately, it is not clear why the assignment is necessary. What is the function of the KB-Changed? slot? Clearly it is true if the KB changed, and false otherwise, but why does the program need to know this? The purpose of the slot is not obvious from looking at this section of code because the goal of this assignment statement is implemented elsewhere.
To determine the purpose of the assignment statement, the maintainer must find all the places where the variable is used. It may seem like the next step would be to check the accessed-by role in the slot, but consider what this role represents: it is the aggregation of every action in which the slot is passed as a parameter, changed in an assignment, read, or sent a message. Knowing when it is passed as a parameter probably will not reveal much, though a check would show that it is not used in this way. Inspecting the places where a variable is changed often gives you information about what the variable represents, but not why. In this case the maintainer has probably already made the assumption that the variable is set to True whenever the KB is changed. The question to be answered is, "Why does the program need to know when the KB is changed?"

Typically the answers to "why?" questions lie in the places where the slot is read. The maintainer brings up a role-filler window for the slot, shown in Figure 7.6, and is now presented with a view of the slot. Realizing that the read-by role of the slot is the most relevant, the maintainer clicks on that role in the window, and the role-filler window for the only action that reads the slot pops up, as shown in Figure 7.7.

The maintainer sees that this action is a switch in the method Save KB, and then gets a description of that method:
[METHOD-25], Save KB, is a method attached to
[DATA-TYPE-2: KBEDS Knowledge-Base], and all its subclasses: ()
The method has no parameters, no local variables,
and returns a Void.
The method is implemented as follows:
Switch on [SLOT-23: KB-Changed?]:
Case [CONSTANT-63: False]:
Returns [CONSTANT-55: VOID].
Case [CONSTANT-62: True]:
Sends [METHOD-24: Internal Dump KB to File] to [SLOT-24: File].
Returns [CONSTANT-55: VOID].
When the variable is True, the KB is written to a file; when it is False, it is not. The purpose of the slot has been revealed with a few mouse clicks. The information necessary to understand delocalized plans is actually localized by the representation.
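Read together, the two method descriptions amount to the familiar dirty-bit pattern. A minimal Python sketch of that pattern follows; the class and attribute names are hypothetical renderings of the KBEDS names, the file is replaced by an in-memory stream, and resetting the flag after the write is an assumption, since the Save KB description shown does not include that step.

```python
import io


class KnowledgeBase:
    def __init__(self, file):
        self.container = []       # stands in for [SLOT-3: Container]
        self.file = file          # stands in for [SLOT-24: File]
        self.kb_changed = False   # stands in for [SLOT-23: KB-Changed?]

    def add_item(self, item):
        self.container.append(item)
        # The line that looks unrelated to the goal of adding an item.
        self.kb_changed = True

    def save(self):
        # The only place the flag is read: skip the write if nothing changed.
        if not self.kb_changed:
            return
        self.file.write("\n".join(map(str, self.container)) + "\n")
        # Assumption: the flag is cleared once the data is safely written.
        self.kb_changed = False


out = io.StringIO()
kb = KnowledgeBase(out)
kb.save()               # nothing has changed, so nothing is written
kb.add_item("rule-1")
kb.save()               # writes the single item
kb.save()               # flag was cleared, so no second write
print(repr(out.getvalue()))
```

The assignment in `add_item` only makes sense once `save` is in view, which is exactly the sense in which the plan is delocalized.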
The most common form of vestigial code is a change to a global variable that is not used to realize any plan, and as mentioned in the previous section, the realization of a plan can be found by inspecting the places where the variable is read. A vestigial variable, then, is one that is never read, and all variables that are not read can be found with the Classic expression (and changeable-instance (at-most 0 read-by)). Vestigial code is therefore any assignment statement that changes a vestigial variable, (and assignment (all changes (and changeable-instance (at-most 0 read-by)))).
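For readers unfamiliar with Classic, the two expressions can be read as simple filters over the representation. The sketch below restates them in Python under assumed data structures: `Variable`, `Assign`, and the example individuals are hypothetical, and only the `read_by` and `changes` role names come from the expressions themselves.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    name: str
    read_by: list = field(default_factory=list)   # actions that read the variable


@dataclass
class Assign:
    label: str
    changes: list   # variables changed by this assignment


def vestigial_variables(variables):
    # (and changeable-instance (at-most 0 read-by))
    return [v for v in variables if len(v.read_by) == 0]


def vestigial_assignments(assignments):
    # (and assignment (all changes (and changeable-instance (at-most 0 read-by))))
    # `all` requires every filler of `changes` to be a never-read variable.
    return [a for a in assignments
            if all(len(v.read_by) == 0 for v in a.changes)]


# Hypothetical individuals, for illustration only.
kb_changed = Variable("kb-changed?", read_by=["switch in save-kb"])
last_item = Variable("last-item")   # written but never read
a1 = Assign("assign-1", [kb_changed])
a2 = Assign("assign-2", [last_item])

print([v.name for v in vestigial_variables([kb_changed, last_item])])
print([a.label for a in vestigial_assignments([a1, a2])])
```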
Let us assume the KBEDS maintainer assigned this task is a novice to KBEDS. A logical first question to ask is, "What is a filter?" This is domain information, so the maintainer queries the KBEDS CSIS for all data-types that have the word "filter" in their name role. (This can be done through the user interface by selecting Find Individual from the CSIS menu. The scenario involves many accesses to the user interface, so not every step will be displayed.) This query turns up two data-types, information-filter and filter-action, and the trees resulting from the search are shown in Figure 7.8. The maintainer inspects the first data-type (by clicking on it in the tree window) and finds that it has two slots: antecedent, whose value is an instance of the data-type mail-message, and consequent, whose value is an instance of the data-type filter-action. The maintainer also sees that while information-filter has no subclasses, filter-action has four.
From this information and the initial bug report, the maintainer has probably concluded that an information filter is an object that under some conditions causes a mail message to be delivered in special ways. The next question to be answered is, "How does a filter get processed?" This is code-level information, and the maintainer sees that each of the two data-types has a method attached to it: information-filter has a method called check-for-activation and filter-action has a method called fire-filter. The maintainer remembers the bug report had to do with a filter for saving a message to a file not being activated, and decides to inspect the check-for-activation method.
Understanding methods is covered in greater detail in Section 6.4. There are many choices and paths a maintainer can take, and what may seem informative or intuitive to one may seem obscure to another [Redmiles, 1993]. This maintainer inspects the method and finds that all it does is compare the values in the slots of the mail message being processed to the mail message that is the filter antecedent (just a bunch of switches). If any slots match, the method fire-filter is called. The bug report mentioned that mail from a particular user wasn't being filtered correctly, and it is clear from the method that if the Sender slots match in the two messages, the filter should fire. The problem, then, must be in the way one of the two Sender slots gets its value.
Finding the places where a slot or variable gets a value is, of course, one of the key features of the KBEDS CSIS. The maintainer asks, "What are all the methods that change the sender slot?" (this simple query is really just a path: (changed-by implementation-of)). There is only one such method, register-sender.
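A path query of this kind is just a chain of role lookups. The sketch below shows the idea in Python over a toy table of role fillers; the `follow_path` helper, the table layout, and the individuals assign-17 and switch-4 are hypothetical, while the role and method names are taken from the text.

```python
def follow_path(start, path, roles):
    """Follow a chain of roles from `start`, returning the final set of fillers."""
    frontier = {start}
    for role in path:
        # Replace each individual with the fillers of `role` on that individual.
        frontier = {filler
                    for individual in frontier
                    for filler in roles.get((individual, role), ())}
    return frontier


# Hypothetical role fillers: (individual, role) -> list of fillers.
roles = {
    ("sender", "changed-by"): ["assign-17"],
    ("assign-17", "implementation-of"): ["register-sender"],
    ("sender", "read-by"): ["switch-4"],
    ("switch-4", "implementation-of"): ["check-for-activation"],
}

# "What are all the methods that change the sender slot?"
methods = follow_path("sender", ["changed-by", "implementation-of"], roles)
print(methods)
```

Swapping the first role for read-by answers the companion question of which methods read the slot, which is what makes the distinction between the roles useful.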
The maintainer now asks, "How does the sender slot get its value in this method?" The answer to this question is simply the filler for the new-value role of the assignment statement that changes the slot, and this is a message invoking the method find-sender-by-email-addr.
The maintainer now asks, "What does that method return?" This is a question that can be answered in increasing levels of detail. The simplest answer is the data-type of what is returned, which is the filler for the method's has-return-data-type role. In this case the method returns an instance of valid-mail-sender (there is an example on page 17 of a maintainer using the domain model to understand this concept). The maintainer may well have already assumed that the sender slot of a mail message is going to be filled with a valid mail sender, so the next level of detail is to retrieve all the return values themselves, which can be accessed with the path (has-implementation return-value)[4]. There are two such values, one is the constant void, and the other is the variable current-kb-entry. Neither of these gives the maintainer any clues, so the next step is to examine the return statements themselves.

One way to do this is to pop up a role-filler window. Figure 7.9 shows the process up to this point, with the role-filler window for the return in the forefront. Here the maintainer sees something interesting. The return statement that returns current-kb-entry is the case-action of a switch. In other words, it is only executed under some condition, and the maintainer now asks, "What is that condition?"
The answer to this question is found in the fillers of the case-condition role of the switch-case and the switch-on-value role of the switch. When these two values are equal, the case is activated and the control flow beginning with the case's case-action (which is the return statement the maintainer is trying to understand) is executed. The two values are the parameter email-address-string (which was passed into the method) and a message that invokes the method get-email-address on the variable current-kb-entry. The maintainer concludes that the current KB entry is returned only if its email address is the same as the email address of the sender of the message.
The next step would depend on the maintainer remembering that the problem had to do with a mail sender that had multiple email addresses, and the name of the method get-email-address seems to imply that it only considers one address. The maintainer seeks to verify this implication by asking the question, "What does get-email-address return?" This time, the answer to that question is very useful: a message that invokes the method first on the slot email-address-list. In other words, the find-sender-by-email-addr method only checks the first email address if there is more than one for a person. The bug has been found.
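The behaviour the maintainer has uncovered can be reproduced in a few lines. This Python sketch is an illustration only: the `Sender` class, the loop over KB entries, and the example addresses are assumptions, and `None` stands in for the constant void; only the method and slot names are taken from the scenario.

```python
class Sender:
    def __init__(self, name, email_address_list):
        self.name = name
        self.email_address_list = email_address_list

    def get_email_address(self):
        # Mirrors what the scenario found: `first` on email-address-list.
        return self.email_address_list[0]


def find_sender_by_email_addr(kb_entries, email_address_string):
    # Assumption: the method iterates over KB entries via current-kb-entry.
    for current_kb_entry in kb_entries:
        if current_kb_entry.get_email_address() == email_address_string:
            return current_kb_entry
    return None   # stands in for the constant void


# Hypothetical sender with two addresses.
entries = [Sender("pat", ["pat@work.example", "pat@home.example"])]
first = find_sender_by_email_addr(entries, "pat@work.example")
second = find_sender_by_email_addr(entries, "pat@home.example")
print(first.name if first else None)
print(second)
```

Mail from the second address finds no sender, so the Sender slots never match and the filter does not fire.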
This scenario may seem tedious and drawn out, but consider how much was done by the KBEDS CSIS. Each question that the maintainer asked had an immediate answer available, often at the click of a button. Without the KBEDS CSIS, the maintainer would have had to search through multiple source code files to find each of the methods that were being considered. Tracing variable usage is very hard using conventional tools, and tracing only the places where the variable is changed is nearly impossible. Every part of the software was at the fingertips of the maintainer, and the complexity of some of the searches, like "What are the names of all the methods that change the sender slot?", is beyond any text-based search and is something previous SISs were not capable of.
The overwhelming majority of software discovery problems occur in systems that are very old, very large, and written in languages like COBOL, FORTRAN, and Assembler. Making use of the KBEDS CSIS requires starting over from scratch, whereas LaSSIE generated its knowledge base, albeit a simple one, automatically from a huge C code repository. This fact is without a doubt the most serious criticism of the CSIS approach: you must re-engineer a software system into this representation in order to reap the benefits, and it is questionable whether the gain would outweigh the cost. This topic is covered in more detail in Section 8.2.1, and there is some hope. The KBEDS CSIS representation is the necessary first step: it defines an ontology in which the code and domain models can be merged.