Reflecting on Reflection

(A cheap title, I know)

Reflection is one of these concept that constantly move from “obvious” to “puzzling” in my head–you feel like you have a firm understanding of it, and then suddently it evades, you feel puzzled by certain aspects, and ultimately confused about the whole thing. Why?

Two Perspectives

One of the first source of confusion is that reflection can be apprehended with two perspective: code as data, and meta-objects.

The first one, “code as data”, corresponds to the ability to dynamically convert code of the running software into data, and to dynamically evaluate data representing code. The code and its representation are however not connected: modifications of the data have no impact on the running system until data are evaluated. This perspective is best represented with language of the lisp family.

The second one, “meta-objects”, corresponds to the ability to expose meta-objects that reifiy other aspects of the running system. Modifications of meta-objects impact the running system immediately since there is (ideally) a causal connection between the two. This perspective is best represented with language in the smalltalk family.

Understanding Reflection in Terms of Bindings

My take is that there are essentially two kinds reflective operations that can be understood in terms of bindings: (1) dynamic source code inspection and changes that reveal, update, or create bindings, and (2) dynamic code execution that exploit existing bindings.

Reflection can operate at two levels, the base and the interpreter level. The base level correspond to the program, and the interpreter-level to the execution of the program. (The interpreter-level is sometimes called meta-level) A program can reflect on its (a) own code and state, or (b) on the code of the interpreter and its state.

Category (1) corresponds to changes that could achieve by generating sources for the running software, persisting the state of the base and the meta level, modifing the source, restarting the software and restoring the state of the base and the meta level. Note that the state of the program might be transformed during the process. Category (2) corresponds to dynamic code execution that only exploit existing bindings. Unlike category 1, it can perform late binding using information known only at run-time, for which not source could be generated. Also, dynamic code execution can be used to jump across meta-level. Indeed, code at one level, is data for the level below. Note that when transfering data accross meta-level, data might be marshalled and unmarshalled.

Combining both categories (1)(2) and (a)(b) gives four kinds of reflective operations.

Both perspectives are equality powerfull, but each perspective represents a way of thinking about reflection that favors either (1) or (2).

Categorizing language features

Let us know walk through traditional reflective features and see how they can be categorized.

Enumerating fields — 1a
The program inspects its own source code. For class-based languages, enumeration is usually performed with a first-class class that reifies the structure and behavior of a class. Special language construct could exist for enumaration, think for instance of the for construct in Javascript.
Adding a field — 1a
The program modifies its own source code. It is similar to enumeration.
Reflectively updating a field — 2a
The program dynamically executes a statement that update an object, i.e. eval( "anObject.field = newValue"). The language might have special construst to do this, e.g. anObject at: "field" put: newValue in Smalltalk or might rely on first-class classes.
Reflective invocation of a method — 2a
The program dynamically executes a state that invokes the method, i.e. eval("anObject.message(params)"). The language might have special constructs to do this, e.g. anObject perform: "method" with: arguments in Smalltalk or might rely on first-class classes.
#doesNotUnderstand trap — 1b
The #doesNotUnderstand trap is a reification at the application level of code that run at the interpreter level and modify the semantics of message sends. This is conceptually equivalent to source code changes in the interpreter itself. Namely, instead of raising an error, message that can’t be delivered are for instance re-routed to another destination.
Method lookup — 1b
Certain platform have other traps to customize method lookup and dynamic method dispatch, e.g. DLR, Java, Smalltalk/X. They are similar to #doesNotUnderstand.
thisContext meta-object — 2b
the interpreter’s stack is reified at the applicative level via a meta-object. It corresponds conceptually to application code being evaluated at the interpreter level, so that it can inspect or modifies the internal state of the interpreter. The code of the application correspond to data that are evaluated one level below.
Continuations — 2b
Continuations are similar to thisContext reification. The interpreter evaluates code which can access its internal state, and returns a data structure (the continuation) to the application; or it evaluates code that consume a data structure (the continuation) that is used to update its internal state.
First-class function — 2a
First-class functions are gloryfied dynamic reflective invocations. A method, block, closure, function represents code. When invoked, it creates a new activation frame on the stack. Regular methods can be reified to be passed around for reflective invocations, e.g. aMethod.invoke( target, param ). Methods can also be simply reified as strings: target.perform( methodName, param ) or eval( target + "." + methodName + "(param)" ). A function that does not close over any variable is mostly an anonymous method. The main difference is that its reification as a first-class object already binds it to a given target, e.g. aBlock.value( params ). Functions with bound variables refer additionally to their outer context which is an activation frame.
Modification of behavior — 1a
Modifying or addition behavior to existing objects correspond to source code change. New behavior can be loaded by evaluation code that return first-class function, e.g. eval( "[ 40+2 ]" ). Behavior can be added or modified by changing the corresponding reified entity such as method dictionary via first-class classes. The whole process might be exposed via other interfaces. For instance, Java’s Hotspot accept raw bytecode to replace a method.
Dynamic class loading — 1a
Clearly, it correspond to dynamic source code changes. Dynamic class loading can be see as a one shot operation, or split into class creation, and incremental addition of structure and behavior. Class loading is a reflective operation that might be exposed via first-class classes (e.g. Smalltalk’s Class>>subclass:name:fields), or via specific interfaces (e.g. class loader).
Query all instances of a class — nothing
If the class is provided as a literal, e.g. MyClass.class, there’s nothing really reflective going on. It could be considered as reflective if it is eposed via first-class classes, and reifies internal information of the interpreter.
Eval — 1a and 1b
Since reflection is all about treating code as data, eval is the one superpower. We have shown in the list above several eval of types 1a and 1b. Its type depends on the data being evaluated. In its unrestricted form, it can be a mix of both.

Structural and Behavioral Reflection

Reflection is sometimes split into structural and behavioral. Structural reflection deals with reifying structure (classes and methods as objects), while behavioral reflection deals with reifying execution (#doesNotUnderstand, thisContext). Behavioral reflection map to category (b) and structural reflection to category (a).

Correlated use of Reflection

Interestingly, if an application does not leverage operations of kind 1a, it does not need 2a neither. Use of reflection (1a) and (2a) are correlated. Indeed, if reflection 1a is never leveraged, source never change, which means the entites are all known statically. While it might be usefull for consiceness to rely on operation of type 2a, such operation could be translated in an equivalent forms with non-reflective code. For instance, if methods are never added to a system, dynamic reflective invocations could be avoided with proper use interfaces, and exhaustive if-else cases (if( action="doIt" ) { doIt() } else if (action ="doThat" ) { doThat() } ...).

Conclusion

We tried to make sense of reflection and categorize existing features in a simple framework, and tried to relate two perspectives about reflection: code as data, and meta-objects. Going through a list of typcial features shows the advantage and disadvantage of both perspectives. For instance, reflective invocations are best understood in term of code as data, but stack reification is more easily though in term of meta-object. Ultimately, features can be explained in term of both perspectives, possibly with some mind twists.

references

my citeulike #reflection tag