Some time ago, I discussed how the static chain pointer is used to access local variables of lexical parent functions, for languages like Pascal that support it. In a later article, I discussed how the mysterious second parameter to the x86 ENTER
instruction accomplishes the same task, but in a different way.
There is yet another popular way to implement this feature, known as the display, a name that comes from the final paragraph of the paper Recursive Programming by Edsger Dijkstra.¹
The idea is that you have a reserved array of pointers known as the display, its size equal to the maximum depth of lexically scoped functions. If a function has inner functions, then it stores a pointer to its stack frame in the display at an index corresponding to its depth, where the “depth” of a function is the number of enclosing lexical parent functions. For example, a global function has depth zero since it has no lexical parents. When the function returns it restores the original value to the display slot.
Let’s walk through this design with our example.
function Outer(n: integer) : integer; var i: integer; procedure Update(j: integer); begin i := i + j end; procedure Inner(m: integer); procedure MoreInner; begin Update(m) end; (* Inner body begins here *) begin MoreInner end; (* Outer body begins here *) begin i := 0; Inner(n); Outer := i end;
The function Outer
is not lexically enclosed in any other function, so its depth is zero. Its lexical children are then assigned a depth depending on how many frames they are away from that starting point.
Outer | Depth 0 | |||
↓ | ↘ | |||
Update | Inner | Depth 1 | ||
↓ | ||||
MoreInner | Depth 2 |
Let’s walk through a call to Outer
.
Initially, the display looks like this:
0 | mystery0 |
1 | mystery1 |
2 | mystery2 |
The Outer
function saves the value in display slot zero (mystery0
) and replaces it with a pointer to its stack frame.
0 | Outer frame ← |
1 | mystery1 |
2 | mystery2 |
When Outer
calls Inner
, the Inner
procedure saves the value in display slot one (mystery1
) and replaces it with a pointer to the Inner
stack frame.
0 | Outer frame |
1 | Inner frame ← |
2 | mystery2 |
Next, Inner
calls MoreInner
. This time, MoreInner
has no inner functions of its own, so it doesn’t have to do anything with the display.
Now things get interesting. How does MoreInner
load the value of m
? Well, m
is a parameter in Inner
‘s stack frame, and Inner
is at depth one, so MoreInner
reads the pointer at index 1 in the display pointers and uses it to read m
from the Inner
stack frame.
Okay, now we have a call to Update
, which is at depth one, so it saves the value in display slot one (which is a pointer to Inner
‘s frame) and replaces it with a pointer to Update
‘s frame.
0 | Outer frame |
1 | Update frame ← |
2 | mystery2 |
To find the i
variable in Outer
‘s frame, Update
realizes that Outer
is a depth zero function, so it looks for the pointer in display slot zero (which is a pointer to Outer
‘s frame) to locate the i
variable and add j
to it.
Now that the work is done, Update
restores the saved value to display slot one (Inner
frame) and returns to MoreInner
.
0 | Outer frame |
1 | Inner frame ← |
2 | mystery2 |
If MoreInner
had more work to do, it could use display slots 0 and 1 to access the variables of Outer
and Inner
. But it doesn’t. So it just returns.
0 | Outer frame |
1 | Inner frame |
2 | mystery2 |
Similarly Inner
is finished and restores its saved value to display slot 1 before returning.
0 | Outer frame |
1 | mystery1 ← |
2 | mystery2 |
Finally, Outer
fetches the value from its i
variable (which Update
conveniently updated) and uses it as the return value. Before returning, it restores display slot zero.
0 | mystery0 ← |
1 | mystery1 |
2 | mystery2 |
Notice that whenever a depth N function is active, only the first N+1 display slots are meaningful.
Originally, the display was a global variable with a fixed number of slots, which caps the depth of lexically nested functions.³ Accessing a variable from an outer frame requires only one added indirection (through the display). It does cost a load from a global variable, which depending on your processor architecture might be easy (if your processor supports absolute addressing or instruction pointer-relative addressing with a large enough reach), or annoying (if you have to calculate the address separately).
Making the display global precludes multi-threaded operations. Most implementations have the function at depth zero allocate the display in its own frame, then have all inner functions receive the address of that display as a hidden parameter.
¹ Reading that document has a similar feeling to reading centuries-old scientific documents.² The terminology and style of writing is so different that you spend a good amount of mental effort translating the text into modern terminology: “The program for subroutine A can only be executed correctly if the current value of the parameter pointer is at the disposal of the arithmetic unit, in other words, the parameter pointer can be regarded as one of the aspects of the state of the arithmetic unit, and it must therefore be recorded amongst the link data.” Translation into modern language: “The frame pointer must be kept in a CPU register and stored in the stack frame.”
² For example, here’s Proposition IV. Theorem IV from Isaac Newton’s Principia: “The centripetal forces of bodies, which by equoble motions describe different circles, tend to the centres of the same circles; and are one to the other, as the squares of the arcs described in equal times applied to the radii applied the circles.” In modern terms, this is saying that centripetal acceleration is toward the center of the circle and is proportional to s²/r, where s is the distance travelled over a fixed period of time, and r is the radius.
And that’s one of the easy ones! Corollary 6 reads, “If the periodic times are in the sesquiplicate ratio of the radii, and therefore the velocities reciprocally in the subduplicate ratio of the radii; the centripetal forces will be in the duplicate ratio of the radii inversely: and the contrary.” Good luck with that.
³ If each module provides its own display, then you can have each module’s display be exactly the size of the deepest lexical nesting in that module.
0 comments
Be the first to start the discussion.