Why Roles

Programming and Knowledge
Roles of Variables
Uses of the Role Concept

Programming and Knowledge

Knowledge about computer programming covers the following three categories:

programming language knowledge	the syntax and semantics of a particular language (e.g., how an assignment statement is written and what effect it has)
program knowledge	knowledge about a specific program
programming knowledge	how to construct programs from abstract concepts within the programming paradigm in use (e.g., objects, variables, inheritance, and iteration in object-oriented programming)

For a programmer, the most important type of knowledge is programming knowledge. However, teaching programming to novices has traditionally concentrated on:

programming language syntax and semantics,
some specific abstract concepts directly related to programming languages, e.g., recursion, and
example programs

As a result, students have had to mentally construct programming knowledge from the other types of knowledge by themselves.

Roles of variables are programming knowledge that can be explicitly taught to students. Roles are also easy to adopt in teaching: in one of our studies, computer science teachers learned in less than an hour to recognize roles in their typical uses with 90 % accuracy. Roles can also be used in analyzing large-scale programs, which makes them a useful concept for experts as well.

As an example of the effects of using roles in teaching, consider the following diagrams that depict results of a classroom experiment in an elementary programming course. For the experiment, students were divided into three groups that were instructed differently: in the traditional way in which the course had been given several times before, i.e., with no specific treatment of roles (the traditional group); using roles throughout the course (the roles group); and using roles together with the use of a role-based program animator, PlanAni (the animation group).

Students' mental representations of programs differed among the groups, as depicted by the figure below. Students in the traditional group described programs either in program terms or in application domain terms. In contrast, students who had been given role knowledge summarized programs by describing the connection between program constructs and application domain concepts ("cross-referenced"). Thus role knowledge enabled them to think about programs in a new and better way.

The following figure depicts the fluency of programming ("forward development") among students in the course. As shown in the figure, knowledge of roles improved programming skill, and elaborating this knowledge with role-based animation resulted in superior performance.

This page gives a short introduction to the role concept and discusses various ways of utilizing it.

Roles of Variables

In programming, variables are not used in a random or ad hoc way; instead, there are several standard use patterns that occur over and over again. For example, consider the following program:

    program doubles;
    var data, count, value: integer;
    begin
        repeat
            write('Enter count: '); readln(data)
        until data > 0;
        count := data;
        while count > 0 do begin
            write('Enter value: '); readln(value);
            writeln('Two times ', value, ' is ', 2*value);
            count := count - 1
        end
    end.

In this program, there are three variables: data, count, and value. In the first loop, a user is requested to enter the number of values to be later processed in the second loop. This number is requested repeatedly until the user gives a positive value. The variable data is used to store the latest input read, and there is no possibility for the program to guess what values the user will enter.

The variable value is used similarly in the second loop: it stores the latest input, and there is no known relation between its successive values.

The variable count, however, behaves very differently. Once it has been initialized its future values will be known exactly: it will step downwards one by one until it reaches its limiting value, i.e. zero.

The role concept captures this difference in the behaviour of these variables. The variables data and value are said to have the role most-recent holder (as they store the latest value in some value succession - user input in this case), and the variable count is said to be a stepper. These roles occur in programs again and again. In fact, only ten roles are needed to cover 99 % of all variables in novice-level programs.
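The same roles can be written down as comments next to each declaration. The following is a minimal Java sketch of the Pascal program above, written in a procedural style. It assumes that input arrives through a Scanner and that the output lines are returned rather than printed; the prompts are omitted, and the names Doubles and run are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class Doubles {
    // Reads integers until a positive count is found, then reads that many
    // values and returns one report line per value.
    static List<String> run(Scanner in) {
        int data;                       // most-recent holder: latest count read
        do {
            data = in.nextInt();
        } while (data <= 0);

        List<String> lines = new ArrayList<>();
        int count = data;               // stepper: data, data-1, ..., 0
        while (count > 0) {
            int value = in.nextInt();   // most-recent holder: latest value read
            lines.add("Two times " + value + " is " + 2 * value);
            count = count - 1;
        }
        return lines;
    }
}
```

Once count has been initialized, its remaining values are fully determined, whereas nothing can be said in advance about the successive values of data and value.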

Roles are not a special property of procedural programming; they apply to other programming paradigms as well. For example, consider the following Java class:

    public class Dog {
        String name;
        int age;
        public Dog (String n) {
            name = n;
            age = 0;
        }
        public void birthday () {
            age++;
        }
    }

Objects of this class have two attributes: name and age. The value of the attribute name does not change after initialization; it is a fixed value. The attribute age behaves similarly to the variable count in the Pascal program: it steps through a known sequence (0, 1, 2, ... in this case) and its role is stepper.

In order to see that the attribute age is increased repeatedly, we must assume that the method birthday is called repeatedly. This can be verified only by looking at other classes: do they call the method at all, and if so, does it happen several times? In object-oriented programming, control flow is harder to reveal than in procedural programming; consequently, roles of attributes may be harder to find out than roles of variables in procedural programming. This does not, however, mean that roles are less important in object-oriented programming. On the contrary, explicit role information in the form of comments written by the author of a program may help in program comprehension, e.g., in maintenance tasks: knowing that a variable is, say, a stepper and not a fixed value indicates the succession of values it may obtain and makes the program easier to understand.
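As a sketch of what such role comments might look like, the Dog class is repeated below with each attribute annotated, together with a caller that makes the repeated calls to birthday visible. The class Kennel and its method raise are hypothetical and exist only for illustration; the comment wording is one possible convention, not a prescribed format.

```java
class Dog {
    String name;   // role: fixed value (assigned once in the constructor)
    int age;       // role: stepper (0, 1, 2, ...; one step per birthday() call)

    Dog(String n) {
        name = n;
        age = 0;
    }

    void birthday() {
        age++;
    }
}

class Kennel {
    // Hypothetical caller: creates a dog and lets the given number of years pass.
    static Dog raise(String name, int years) {
        Dog dog = new Dog(name);
        for (int year = 0; year < years; year++) {   // year: stepper
            dog.birthday();
        }
        return dog;
    }
}
```

With the comments in place, a reader of Dog alone can tell that age is expected to advance step by step, without first locating a caller such as Kennel.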

In procedural programming, roles apply to variables and parameters. In object-oriented programming, roles also apply to attributes and to objects that encapsulate a single conceptual attribute, e.g., String in Java. In functional programming there are no variables. However, function parameters as well as return values of recursive functions have role-like behavior. For example, consider the following ML function:

    fun max(a, nil)     = a
    |   max(a, (h::t))  = if h>a then max(h,t)
                                 else max(a,t)

During recursive calls, the parameter h is the current element of the list, i.e., a most-recent holder. The parameter a is always the largest value found so far; its role is most-wanted holder.
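For comparison, the same two roles appear when the function is written as a loop. The following Java sketch is an iterative counterpart of the ML function, assuming a list of integers; the class name MaxDemo and the variable name best are illustrative.

```java
import java.util.List;

class MaxDemo {
    // Returns the largest of 'a' and the elements of 'list',
    // mirroring the two cases of the ML function.
    static int max(int a, List<Integer> list) {
        int best = a;              // most-wanted holder: largest value so far
        for (int h : list) {       // h: most-recent holder (current element)
            if (h > best) {
                best = h;
            }
        }
        return best;
    }
}
```

Here the loop variable h plays the part of the ML parameter h, and best plays the part of the parameter a that is passed along the recursive calls.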

Thus roles apply to procedural, object-oriented, and functional programming, but the set of entities that have roles differs between paradigms. In the following, the term "variable" is used for brevity to cover all the different cases in different programming paradigms.

For a more detailed description of the role concept, see the introduction to roles page and a separate page listing all the roles.

Uses of the Role Concept

Roles of variables can be utilized in many ways. This section lists some examples. Some of these ideas have been implemented but some of them are just research ideas. See the literature page for the current state.

Teaching programming to novices: Variable roles are programming knowledge that has traditionally been tacit. However, it can be made explicit and thus help students to understand the ways variables are used in programs. See a separate page on teaching.

Explaining errors in novice programs: Roles of variables describe how variables are used. Any deviation from standard use patterns can be an indication of an error. Automatic error analysis of novice programs can use such deviations to build explanations for errors.

Program visualization: Traditional program animation systems provide visualizations that operate at the programming language level, resulting in within-paradigm visualizations that are uninformative to students. Roles of variables can be used to provide role-specific images for variables and role-specific animation for operations, resulting in visualizations at the programming level (as opposed to the programming language level). As a result, such visualizations are able to provide information about the program, and not only about the language.

Comprehension of large-scale programs: Roles can be used to characterize variables for maintenance programmers trying to comprehend existing code. Thus the applicability of roles is not limited to novice programmers; experts can utilize them as well.
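The error-explanation idea can be illustrated with a small check on the values a variable takes during a run. The following Java sketch is hypothetical and not taken from any existing tool: it assumes that a variable declared as a stepper should change by a constant step, and it reports the first recorded value that breaks that pattern. The names StepperCheck and firstDeviation are illustrative.

```java
class StepperCheck {
    // Returns the index of the first value in 'trace' that deviates from the
    // constant step set by the first two values, or -1 if there is none.
    // Assumption: a stepper changes by the same amount at every step.
    static int firstDeviation(int[] trace) {
        if (trace.length < 3) {
            return -1;                         // too short to show a deviation
        }
        int step = trace[1] - trace[0];        // fixed value: the expected step
        for (int i = 2; i < trace.length; i++) {
            if (trace[i] - trace[i - 1] != step) {
                return i;
            }
        }
        return -1;
    }
}
```

For a countdown recorded as 5, 4, 3, 3, 2, the check points at index 3, where the value fails to decrease. An analysis tool could use such a position as the starting point of an explanation, for instance a missing update inside a loop body.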

An introductory article

Last updated: October 26, 2005

saja.fi@gmail.com