Objects have not failed

Copyright 2002 by Guy L. Steele Jr. All rights reserved.

(Opening remarks by Guy L. Steele Jr., November 6, 2002)

Objects have clearly succeeded.

Here is some practical evidence: According to the most recent 2002 North American Developer Survey by Evans Data Corporation, over half the developers surveyed are using Java. About 1/7 of developers surveyed are using C#. Roughly half of those also use Java, so the fraction of surveyed developers using one or both is about 3/5. The numbers for both languages are expected to increase next year. Over 1/5 of developers surveyed are using Enterprise Java Beans; almost 2/5 are using COM+; and nearly 2/3 are using JavaScript (which at least tries to be object-oriented).

The main strengths of object-oriented programming are that it encourages the abstraction and encapsulation of state, and that objects are a good model for most entities in the real world.

Thirty years ago, most programming was procedural in nature. The unit of programming was the procedure, the subroutine, the function, the algorithm. Data declarations tended to be strewn about and were not abstracted. An array of integers might be any of several conceptual data structures, and it was not always apparent which of the many procedures accepting an integer array argument were actually intended to apply to that particular array. So it was necessary to document data structures fairly carefully, outside the programming language–in comments, for example.

Object-oriented programming clusters data with code that is appropriate and relevant to that data. The trade-off is that sometimes it is difficult to grasp and to envision entire algorithms, because the fragments of an algorithm are spread out among the many methods associated with various object types. So in object-oriented programming it is necessary to document methods fairly carefully, in comments, perhaps, but also in interface declarations.
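As a rough illustration of the contrast drawn in the last two paragraphs, here is a minimal Python sketch. The names `push`, `evaluate`, `Stack`, and `Polynomial` are invented for this example and are not taken from any particular system. In the procedural half, a bare list of integers might be a stack or a polynomial's coefficients, and only comments say which procedures are meant to apply to it. In the object-oriented half, each class clusters the data with the code relevant to it.

```python
# Procedural style: both procedures accept a plain list of integers,
# and only comments record which conceptual structure is intended.

def push(stack, item):
    """Treat the list as a stack; the top is the last element."""
    stack.append(item)

def evaluate(coefficients, x):
    """Treat the list as polynomial coefficients, lowest degree first."""
    result = 0
    for c in reversed(coefficients):
        result = result * x + c   # Horner's rule
    return result

data = [1, 2, 3]
push(data, 4)                 # intended use: data is a stack
mistake = evaluate(data, 2)   # also runs: nothing stops this misuse


# Object-oriented style: each class bundles the data with its code.

class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class Polynomial:
    def __init__(self, coefficients):
        self._coefficients = list(coefficients)

    def evaluate(self, x):
        result = 0
        for c in reversed(self._coefficients):
            result = result * x + c
        return result
```

In the second half, a `Stack` simply has no `evaluate` method, so the misuse shown in the first half cannot be written by accident. The cost, as noted above, is that a reader must now look in two classes to see all the code.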

Fred Brooks, in Chapter 9 of The Mythical Man-Month, said this:

Show me your flowchart and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowchart; it'll be obvious.

That was in 1975.

Eric Raymond, in The Cathedral and the Bazaar, paraphrased Brooks' remark into more modern language:

Show me your code and conceal your data structures, and I shall continue to be mystified. Show me your data structures, and I won't usually need your code; it'll be obvious.

That was in 1997, and Raymond was discussing a project coded in C, a procedural language. But for an object-oriented language, I think this aphorism should be reversed, with a twist:

Show me your interfaces, the contracts for your methods, and I won't usually need your field declarations and class hierarchy; they'll be irrelevant.

I think, however, that practitioners of both procedural and object-oriented languages can agree on Raymond's related point:

Smart data structures and dumb code works a lot better than the other way around.

This is especially true for object-oriented languages, where data structures can be smart by virtue of the fact that they can encapsulate the relevant snippets of "dumb code." Big classes with little methods–that's the way to go!

The Scheme programming language was born from an attempt in 1975 to explicate object-oriented programming in terms that Gerry Sussman and I could understand. In particular, we wanted to restate Carl Hewitt's theory of actors in words of one syllable, so to speak. One of the conclusions that we reached was that "object" need not be a primitive notion in a programming language; one can build objects and their behavior from little more than assignable value cells and good old lambda expressions. Moreover, most of the objects in Hewitt's theory were stateless and unchanging once created; for those, lambda expressions alone were sufficient.
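The observation can be sketched in a few lines. The example below is written in Python rather than Scheme, and it is only an illustration under stated assumptions: `make_counter`, `make_point`, and the message names are hypothetical, and a one-element list stands in for an assignable value cell.

```python
def make_counter(start):
    """Build a counter 'object' from a value cell and lambda expressions."""
    cell = [start]   # a one-element list stands in for an assignable value cell

    def increment():
        cell[0] = cell[0] + 1
        return cell[0]

    # The "object" is a procedure that dispatches on a message name.
    methods = {
        "increment": increment,
        "value": lambda: cell[0],
    }
    return lambda message: methods[message]()


def make_point(x, y):
    """A stateless, unchanging object: lambda expressions alone suffice."""
    return lambda message: {"x": x, "y": y}[message]


counter = make_counter(10)
counter("increment")
counter("increment")
current = counter("value")   # 12: the state lives in the captured cell
```

Each call to `make_counter` captures its own cell, so two counters never share state. `make_point` needs no cell at all, which matches the remark that stateless objects can be built from lambda expressions alone.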

That was a useful theoretical observation–and not original with us, though Scheme did help to spread the word–but it was not a good guide to designing practical programming languages. Soon both Scheme and Common Lisp felt the pressure to graft on facilities to make it easy, not merely possible, to program in an object-oriented style. A major source of this pressure was the displacement of character streams by windows as a model of terminal-screen interaction–this was made practical and desirable by the advent of high-resolution bit-mapped displays–but programmers quickly grasped the value of object-oriented programming for other purposes.

As I observed 20-odd years ago in my paper Lambda: The Ultimate Declarative, part of the value of object-oriented programming lies in a trade-off. It may be difficult to add a new method interface to a mature set of classes (at least, using an ordinary text editor, as was common practice then and still is today), because many individual method declarations must typically be coded, inheritance notwithstanding, and inserted into each relevant class. It is comparatively easy, however, to create a new class of object, a new data type. Procedural programming is just the opposite, the dual: it's easy to add a new procedure, but it can be difficult and time-consuming to introduce a new data type into a mature set of procedures, because code must be inserted into each relevant procedure.
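A small Python sketch may make the duality concrete. The shape types, the tuple tags, and the function names below are all invented for illustration. The first half organizes the code by data type, so a new type is one self-contained addition. The second half organizes it by operation, so a new operation is one self-contained addition.

```python
import math

# Object-oriented organization: one class per data type.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# A new data type is one self-contained addition; Circle and Square are untouched.
class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return 0.5 * self.base * self.height

# Adding a new method such as perimeter() would mean editing every class above.


# Procedural organization: one procedure per operation, dispatching on a tag.
def area(shape):
    kind = shape[0]
    if kind == "circle":
        return math.pi * shape[1] ** 2
    if kind == "square":
        return shape[1] ** 2
    raise ValueError("unknown shape: " + kind)


# A new operation is one self-contained addition; area() is untouched.
def perimeter(shape):
    kind = shape[0]
    if kind == "circle":
        return 2 * math.pi * shape[1]
    if kind == "square":
        return 4 * shape[1]
    raise ValueError("unknown shape: " + kind)

# Adding a new data type such as a triangle would mean editing every procedure above.
```

With only two operations and three types the cost on either side is small. In a mature system with many classes or many procedures, the edits on the expensive side multiply accordingly.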

The question, then, is this: in today's practice, is it more common to introduce new universal methods or new universal data types as a system is maintained? (I say "universal" to mean something that is widely used throughout a system. A universal method, such as equal or toString, is supported by many types of object, and a universal data type, such as String or Vector, must support most universal methods.) I assert that new universal object types arise more frequently than new universal methods–this is a consequence of Raymond's "smart data, dumb code" principle–and this is one reason that object-oriented programming has proved to be so successful: It reduces the effort of program maintenance when working with inadequate program development tools.

Another weakness of procedural and functional programming is that their viewpoint assumes a process by which "inputs" are transformed into "outputs"; there is equal concern for correctness and for termination (and proofs thereof). But as we have connected millions of computers to form the Internet and the World Wide Web, as we have caused large independent sets of state to interact–I am speaking of databases, automated sensors, mobile devices, and (most of all) people–in this highly interactive, distributed setting, the procedural and functional models have failed. That is another reason why objects have become the dominant model. Ongoing behavior, not completion, is now of primary interest. Indeed, object-oriented programming had its origins in efforts to simulate the ongoing behavior of interacting real-world entities–thus the programming language SIMULA was born.

Now, objects don't solve all the problems of programming. For example, they don't provide polymorphic type abstraction (that is, generic types). They don't provide syntactic abstraction (that is, macros). Procedural programming still has its place in the coding of methods. But to say that objects have failed because they don't solve all possible problems is like saying carbohydrates have failed because you can't live on pure sugar. Object-oriented programming is like money, as the old joke has it: It's not everything, but it's way ahead of whatever's in second place.

If you are an idealist, you may be disappointed with the current state of the object-oriented programming art. The ongoing evolution of object-oriented programming has not reached completion and perhaps never will. I do not claim that Java or C# is the apotheosis of object-oriented programming languages.

As for C++–well, it reminds me of the Soviet-era labor joke: "They pretend to pay us, and we pretend to work." C++ pretends to provide an object-oriented data model, C++ programmers pretend to respect it, and everyone pretends that the code will work. The actual data model of C++ is exactly that of C, a single two-dimensional array of bits, eight by four billion, and all the syntactic sugar of C++ fundamentally cannot mask the gaping holes in its object model left by the cast operator and unconstrained address arithmetic.

If, several years ago, with C++ at its most popular, with Smalltalk in decline and Squeak yet to appear, you had come to me, O worthy opponents, and proclaimed that objects had failed, I might well have agreed. But now that Java has become mainstream, popularizing not only object-oriented programming but related technologies such as garbage collection and remote method invocation, and now that the utility of object-oriented programming has been seconded by the sincere flattery of C#, we may now confidently assert that objects most certainly have not failed.