Deliberate Writing

Richard P. Gabriel

This paper was published in 1983

Introduction

Deliberate writing: When I sit down to write several paragraphs of English text for an audience which I cannot see and which cannot ask me questions, and if I care that the audience understands me perfectly the first time, I am engaged in 'deliberate writing.' Such writing is careful and considered, and for a computer to write deliberately many of the outstanding problems of artificial intelligence must be solved, at least partially.

I want to contrast deliberate writing with spontaneous writing and with speech. For the remainder of this presentation I will use the term 'writing' to refer to deliberate writing, and I will use the term 'speech' to refer to both casual writing and speech.

Vivid and Continuous Images

Whether you are writing fiction or non-fiction, good and careful writing has two important qualities: it must be vivid and it must be continuous. I have borrowed these terms from John Gardner [Gardner 1984], who wrote of them in the context of fiction writing, but I think they are appropriate in non-fiction writing as well.

In a vivid piece of writing the mental images that the writer presents are clear and unambiguous; what the writer writes about should appear in our 'mental dream' exactly as if we ourselves were thinking the thoughts he is describing. When the writing produces this clear image we can absorb what he writes with little effort.

In a continuous piece of writing there are no gaps or jumps from one topic to another. The image that is produced by the writing does not skip around. In non-fiction, especially in technical writing, the problems and questions we have about the subject are answered as soon as we formulate them in our minds. That is, as we read a piece of technical writing we are constantly imagining the details of the subject matter. Sometimes our image is confused because we are not sure how some newly presented detail fits in, or we are uncertain of the best consistent interpretation. At this point the writer is obligated to jump in and settle the matter or provide a clarification. This way we do not have to stop and think, or go back to re-read a passage or some passages.

Insofar as our image must be vivid, it must also be continuous. If our image is discontinuous it cannot be vivid - it is blurred or muddy at the point of discontinuity. Similarly, if our image is not vivid it must be discontinuous - we are apt to stop and wonder about the source of blurriness, and at that point our image stops being continuous.

Computers and Writing

I believe that writing is the ultimate problem for artificial intelligence research. Among the problems that must be solved for a computer to write well are: problem-solving, knowledge representation, language understanding, world-modeling, human-modeling, creativity, sensitivity, and judgment.

Problem-solving is important because some aspects of writing require the writer to place in a linear order facts or other statements which describe an object or an action that is inherently 'multi-dimensional.' Choosing the order of the facts, together with the techniques that prevent reader-worry about yet-unpresented facts, is as difficult a planning task as any robot planning problem.

Knowledge representation is important for being able to find and refer to facts about a topic rapidly and accurately. The interconnections in the writer's mind between facts must be such that connections that the reader will see are apparent to the writer. If a detailed and complex search must be undertaken by the writer to discover relevant connections, it may be that they will be missed, and the vivid and continuous image will be lost.

Language understanding is important because a human writer will re-read his writing in order to test its effectiveness. Later in this presentation I will talk about this more thoroughly and speculatively.

World-modeling is important because a writer must understand the consequences of his statements; if he talks about some aspect of the world or chooses to use a metaphor or an analogy, he must think carefully whether the correspondence between his subject and the metaphor or analogy is accurate, and whether consequences of his metaphorical or analogical situation are impossible or ridiculous.

Creativity, sensitivity, and judgment fall into the category of things that artificial intelligence has never really looked at seriously. To write with good taste, and, hence, effectively, requires the writer to write in new and interesting ways, to be sensitive to the sore spots that his reader might have, and to judge what is important and useful for his readers.

Aspects of Good Non-Fiction Writing

If I expect you to understand my non-fiction writing without problems, I must do two things: I must anticipate what you know about the topic of discussion, and I must anticipate the problems you will have comprehending how my sentences and paragraphs are constructed. As you read from left-to-right, every word must fit in properly; you must never be forced to re-read parts already seen, and you must never have to reflect on my sentences. The text must be transparent.

These two aspects form the ends of a spectrum of concerns that a writer who cares about good writing must consider each time he writes. At one end is the correct decision about what is shared information, and at the other end is the effortless transmission of new information and relationships between facts. I will illustrate these two aspects with an example.

Consider writing the directions on how to get from one place to another in a car. When I tell you how to get to my house, I must know how much you know about the area; I must be certain you know where the Locust Street Eisner's is. If you do not live in the area, then perhaps the specific landmarks I use will be impossible for you to recognize. But if you do live in the area, I can use phrases like, "go to the stadium on Welch Road, and then ...." In short, I must carefully reason about what shared information we have about the area and also about what information you will learn while you are traveling through the area following my directions.

If I have tried to explain the directions to you in the past, then I can refer to that conversation or to that document. In short, there can be some common context and shared information about my explanation. My writing of the directions to you must accurately refer to the knowledge I am sure you have. If I refer to something that you don't know or to something that you could find out with some difficulty as if it were something you knew, then my directions would be bad.

At the other end of the spectrum, I must anticipate where along the trip you will become uncertain that you are on the right track. If there is a long stretch of road to traverse after several tricky turns, I must tell you sights that will alert you that all is well. If I say to turn right at the third stop sign, and it is behind a bush, I must warn you of that, or else you will likely have to re-do that part of the trip.

My directions will not be more accurate for this extra information, but the information will help make them better directions.

If you are not certain that you understand my directions, then you will perhaps become confused and begin to doubt that landmarks that you see correspond to landmarks I describe in my directions. You will think, "would he describe this tree like that?" or "could this red house be the pink one to which he refers; his directions are so confused that maybe he's simply being sloppy here?"

If my decisions about what is shared information are bad enough, then you - the reader - will find that my writing is difficult to read; you will try to find the correct reading of the text that makes it all clear. And, if my text is simply confusing, then you will wonder whether we agree on the facts; you will think that, if you could only know what I - the writer - knew, then the text would become crystal clear.

Pragmatics of Good Non-Fiction Writing

There are many ways that shared information comes into play in good non-fiction writing. Obviously facts that I assume that the reader knows ought to be facts actually known to the reader. If the facts I assume the reader knows are not clear to the reader - if they are difficult concepts, or if the implications of the facts as they bear on my discussion are difficult to grasp - then it is my obligation as a writer to make the facts clear, even if that requires repetition and tutoring.

My text may introduce information that is crucial to understanding the rest of the piece. Not only must I carefully present that material, but in my subsequent references to it I must be sensitive to the fact that the information was recently learned - perhaps it was forgotten or even skipped over. I should never treat information that I have introduced the same way that I treat assumed facts. For one thing, if I treat the information I have introduced exactly as the information I assume the reader has known for a while, then the reader may believe that I am talking over his head by falsely assuming his knowledge is greater than it actually is; and maybe the reader skimmed the presentation of the new material and doesn't realize that the later, confusing reference to it is a reference to new and not old information.

It is often helpful for the reader if the writer, when he refers to possibly puzzling information, refers to the information in a clarifying way. If every reference adds to the comfort the reader has about the material, the new material will be better understood.

The writer has an obligation to the reader: The reader chooses to read the piece. It is rarely the case that a reader is truly forced into reading a piece of writing from beginning to end. The writer's obligation is to make the reader's task easy enough that the reader will want to read the entire piece.

The Language of Good Non-Fiction Writing

Beyond what I assume my reader to know, and beyond what I tell him, there are the actual words, phrases, sentences, and paragraphs with which I choose to pass that information to him. In bad writing the 'mental dream' is interrupted or chafed by some mistake or conscious ploy of the writer. Whenever a reader is forced to think about the writing, the words, the sentence structure, or the paragraph structure, or whenever the reader has to re-read a section of writing to understand how the words relate to each other, the transfer of information from the writer to the reader stops, and the dream that accompanies this transfer dies. The dream must be re-established, and this can take extra time that could be better spent continuing a line of thought.

A second effect of such bad writing is that if a sentence has incorrect syntax, or if it is clumsy and difficult to understand, then the reader is justified in losing respect for the writer, in questioning the intelligence of the writer and his judgment, and in lowering his estimate of the importance, significance, and correctness of the entire piece of writing.

Finally, all non-fiction, and especially technical writing, requires examples and concrete details to be understandable. When we write about a computer program, we probably have thought about that program for a long time, and we have internalized its characteristics to help our own mental processes. When the reader reads our description of it, he wants to build a mental image of the program and its operation, and we hope that his mental image is similar to ours. Without specific details the reader cannot imagine the program accurately, and it is even possible that his image is inconsistent with ours. In this case, the reader will have to adjust to the newer image once he discovers the discrepancy, if he ever discovers it.

Common Errors

Many common writing errors make writing difficult to read with respect to the aspects mentioned earlier. The most common are errors in the basic skills of writing. I will briefly catalogue some of these errors.

Many writers excessively use the passive voice. In the passive voice, the agent of the action is either placed at the end of the sentence ("His finger was bitten by the parrot") or left out altogether ("His finger was bitten"). Perhaps the writer intends to focus the reader's attention on the injury, but the natural tendency of the reader is to imagine this action, agent and all. If the agent is missing or introduced at the end of the sentence, the image is hazy or wrong; the reader must make a second attempt at the image, and that is when the distraction away from the dream occurs.

Beginning a sentence with an infinite-verb phrase is often a mistake; either the reader can have trouble understanding the time relationships between the elements of the sentence or else he may have difficulty understanding the logic of the statement. For example, in "taking an interrupt, the program executes the terminal handler," there may be some question about whether taking an interrupt happens concurrently with the program's executing the terminal handler or whether one event causes the other to occur. In "rapidly switching contexts, the scheduler carefully considers the next job request," there is a hint of illogicalness because 'rapidly' and 'carefully' are dissonant. The reader will pause over this statement, even if it is ultimately understandable.

Problems with diction are common. Diction refers to the choice of words and the appropriateness of that choice. Diction is the hallmark of many styles of writing. 'High diction' refers to high-brow, intellectual, or even snobbish writing. I have heard some people say that scholastic writing should not be fun to read. This reflects their attitude about the proper diction for scholastic writing. The main problems people have with diction are deciding on the proper tone for their writing task and then sticking consistently to that tone. The sentence "the essential ingredient of an efficient topological sorting procedure is a fine set of cute hacks and clever bums" shows an inconsistent level of diction - scientific and refined at the start and casual or computer-gutter at the end. Either level of diction may be fine within a specific context, but to mix them in this way is unforgivable.

Sentence variety is important because when sentences of the same type are strung together, the result is boredom and a lessening of the dream that keeps readers reading and understanding. Anticlimactic sentences - sentences with relative clauses at the end - can be distracting because they seem to taper off rather than getting to some point.

Unintentional rhymes and rhythms can also be distracting by causing the reader to stop absorbing the content of the writing and to focus on the writing itself. No reader of a serious paper on garbage collection would pass over this sentence without a chuckle: "Collecting spare records and gaining some space happens quite regularly and at quite a high pace."

Similarly, unintentional puns can be a problem. "The input/output bottleneck was broken by adding a separate DMA channel" contains such a pun. When the reader sees a pun or thinks of an interpretation of a sentence that is humorous, the writer is in trouble: The reader has lost the mental dream and is thinking about the pun.

Explaining events out of their natural order is a serious error: The reader must stop reading and piece together the actual sequence. This error is at the border of the language errors and the pragmatic errors, but it shares the theme of all of these errors: The reader must abandon his dream and concentrate on the writing.

Writing and Manners

Good writing is an act of communication between a writer and an unseen reader. Good writing is a courtesy that is expected by the reader, and if a reader puts my paper away because he cannot handle the writing, I have failed my duty to that reader. Similarly, I have no respect for a writer, regardless of his professional stature, if he will not take the time to think carefully about how he presents his work and results to me.

Writing Well

How can you write well? The key is to use yourself as a model of your audience. The reader does not know as much about the topic you are presenting as you do. To understand what people know you must be able to forget things you know. This forgetfulness is reasoned: You carefully reason about what you know that is not common knowledge. Then you use this reduced knowledge to see what in your text is unclear because it requires the knowledge you have that your reader may not have.

And you must be able to forget the structure of your text, so that as you read it you can successfully be a model of a reader coming upon your text afresh. One way to do this is to put time between you, the writer, and you, the later reader.

As we read, each new word causes us to move forward in parsing the text - we must understand the relationships between words in order to piece together the picture that the writer is presenting. It is possible to carefully choose words so that the reader can progress along an obstacle-free path through the words. Word choice is done when the first draft is being written and also during revisions.

You could accomplish this by reasoning about where parsing choices (and confusions) can arise and by picking words that tend to guide the reader one way over others. Sometimes it isn't possible to eliminate problems for the reader with such local choices, and global re-planning is necessary: You may have to rewrite an entire paragraph to avoid confusion in one part of a sentence.

Computer Writing

But I want to talk about computers doing deliberate writing. The above cautions are only a small fraction of the advice that could be given to human writers: What of computer writers? A program that writes deliberately must be able to plan and re-plan, to debug errors and inconveniences, to reason about knowledge, and to understand itself well enough to reason about why it makes certain decisions.

Writing well is difficult for a person to do, and it is also very difficult for a computer to do. As we have seen, there is an intimate relationship between the writer and his reader, although these two individuals may be separated by many miles and years.

A computer that has such a relationship with a reader is difficult to achieve. Because the writer has to share a common background with a reader (at some level) to be a successful writer, the computer must have this common background built in by the author of its writing programs. Artificial intelligence has a range of techniques for reasoning about shared knowledge and, to some degree, about plausible inferences from that knowledge.

However, the careful writer also is able to reason about the problems that the reader will have in parsing his writing. The writer is able to use himself as a model reader, after he has put his writing aside for a time. The computer cannot do this as easily, because there is no conception of 'forgetting' in current artificial intelligence paradigms, nor is there an easy way to use a natural language parser to find out where the parser has trouble with a sentence and how to correct the difficulty.

Nevertheless, there are programs that can write. I have written such a program, called Yh. Yh is a program that writes text, and it is one of the first attempts at a deliberate writing program. The texts it produces are explanations of the operation of simple programs. These programs have been synthesized from a description of an algorithm provided by a person during a conversation with an automatic programming system - in this case the PSI system [Green 1980].

To accomplish the writing behavior I described earlier, this program generates text from left-to-right, making locally good decisions at each step. As the generation proceeds, other parts of the program observe the generation process, and, because these parts of the program are able to make connections between distant parts of the text, they are able to criticize the result.

In other words, after the initial version of the text has been produced, further reasoning about the global nature of the choices is performed, the text is transformed, and complex sentence structure is introduced or eliminated.

There are several mechanisms in Yh designed to produce clear writing, but these mechanisms require detailed and extensive knowledge both about writing and about the subject matter of the writing to be effective. Yh has only a small amount of knowledge, but, even with this limited depth of knowledge, Yh has generated a great deal of text of good quality. However, Yh does not write uniformly well; as more knowledge is added or existing knowledge is refined, the quality of its writing will improve.

Background

In this section I will explain some of the philosophy behind the design of Yh. Yh is a fairly large program and has a complex control structure. Because the behavior of Yh is dependent on this structure and complexity, and because the structure and complexity are a result of this philosophy, I think it is important to spend a little time understanding it.

How to Make Computers Write

Researchers in artificial intelligence have been theorizing for many years about the mechanisms necessary for intelligent behavior in restricted domains, especially domains that are the realm of specialists and not laymen. One hope is that a uniform structure among these mechanisms will emerge and that this uniform structure will generalize into something which can perform a wide range of tasks.

The effect of this generalization, it is hoped, would be the creation - theoretical or actual - of a computer individual, a program or machine that has some of the qualities of a human mind and which encompasses nearly the full range of abilities of a normal, average person. Such a computer individual would comprise an immense body of machinery.

Rather than looking for this uniform structure within the individual mechanisms, perhaps the proper place to look for it is within the organization of an entire system. That is, perhaps individual solutions to specific problems of intelligence need be constrained to work only within their intended domain; the responsibility for selecting and relating various solutions would be left up to this uniform, overlying organization.

The driving force behind the ideas presented herein is the fluid domain, which will be introduced shortly.

Complexity versus Simplicity

One of the prevailing notions in all scientific endeavors is that simplicity is to be favored over complexity where there is a choice. When there is no other choice but a complex alternative, of course the complex alternative must be chosen.

Years ago, Herbert Simon [Simon 1969] gave us the parable of the ant on the beach, in which the complex behavior of the ant as it traverses the sand is viewed simply as the complexity of the environment reflecting in the actions of the ant. He says:

An ant, viewed as a behaving system, is quite simple. The apparent complexity of behavior over time is largely a reflection of the complexity of the environment in which it finds itself.

He goes on to substitute man for ant in the above quote and attempts to justify that statement. The goal of creating an intelligent machine has evolved, historically, from the initial sense that simple programs can demonstrate interesting behavior despite their simplicity. This sense comes, I think, from the speed of the machine in executing the steps in a program: Even though a computer may not deliberate deeply in its processing, it can explore very many alternatives, perhaps shallowly. For many problems this extensive, but shallow, analysis may substitute effectively for deep deliberation.

Complexity

I want to counter Simon's parable above with a quote from Lewis Thomas's "The Lives of a Cell" [Thomas 1974]; he says, when discussing the variety of things that go on in each cell:

My cells are no longer the pure line entities I was raised with; they are ecosystems more complex than Jamaica Bay.

He later goes on to compare the cell as an entity to the earth.

Each cell is incredibly complex, and our brains are composed of very large numbers of them, connected in complex ways. A conclusion consistent with this is that programs that behave like people, even in small domains, must be rather large and complex - certainly more complex than any program written by anyone so far. And to write such a large program requires an organizing principle that makes the creation of such a program possible.

Two Aspects of Programming

There is a useful dichotomy to help us understand how artificial intelligence programs differ from many other programs: algorithmic programming versus behavioral programming.

In algorithmic programming the point is to write a program that solves a problem that has a single solution; steps are taken to solve the problem in an efficient manner. One example of an algorithmic program is one that finds the largest prime pair smaller than 1,000,000. To be sure, writing algorithms is not simple, but it is quite different from what I call behavioral programming.
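To make the algorithmic style concrete, here is a minimal sketch of the prime-pair example in Python. It assumes that 'prime pair' means a pair of twin primes (p, p + 2) and that both members must be smaller than the limit; the function names are hypothetical and chosen only for illustration.

```python
from typing import Optional, Tuple


def prime_flags(limit: int) -> bytearray:
    """Sieve of Eratosthenes: flags[n] is 1 exactly when n is prime, for 0 <= n < limit."""
    flags = bytearray([1]) * limit
    for n in range(min(2, limit)):
        flags[n] = 0  # 0 and 1 are not prime
    p = 2
    while p * p < limit:
        if flags[p]:
            # Cross off every multiple of p, starting at p * p.
            flags[p * p:limit:p] = bytes(len(range(p * p, limit, p)))
        p += 1
    return flags


def largest_prime_pair_below(limit: int) -> Optional[Tuple[int, int]]:
    """Return the largest (p, p + 2) with both primes smaller than limit, or None.

    Assumption: 'prime pair' is read as 'twin primes'.
    """
    flags = prime_flags(limit)
    # Scan downward so that the first pair found is the largest one.
    for q in range(limit - 1, 4, -1):
        if flags[q] and flags[q - 2]:
            return (q - 2, q)
    return None


if __name__ == "__main__":
    print(largest_prime_pair_below(1_000_000))
```

The problem has one answer, and the only design question is how to reach it efficiently; nothing in the program depends on what it has done before.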

In behavioral programs the point is to produce a program that behaves in certain ways in response to various stimuli. A further requirement on such a program might be that the response is not simply a function of the current stimuli but also of all previous stimuli and responses. Examples of behavioral programs are operating systems and some artificial intelligence programs.

In writing, one could possibly write an algorithmic program to generate single sentences, but to generate a paragraph or some longer piece requires a program that can react to its previous prose output, much as people do when they write. I say 'requires,' but that obviously isn't correct, because it is certainly possible to write an algorithmic program to write paragraphs within selected domains - I simply mean that writing prose-generating programs is easier if there is a structure that supports the activities necessary for prose writing, and I believe that structure is a loosely connected network of experts.

Fluid versus Essential Domains

I want to make a distinction between two of the kinds of domains that one can work with in artificial intelligence research: fluid domains and essential domains. The qualifiers, fluid and essential, are meant to refer to the richness of these domains.

In an essential domain, there are very few objects and operations. A problem is given within this domain, and it must be solved by manipulating objects using the operations available. Generally speaking, an essential domain contains exactly the number of objects and operations needed to solve the problem, and usually a clever solution is required to get the right result.

As an example of an essential domain, consider the missionaries and cannibals problem. In this problem there are three missionaries, three cannibals, a boat, and a river; and the problem is to get the six people across the river. The boat can hold two people, and if the cannibals ever outnumber the missionaries in a situation, the result is dinner for those cannibals.

If this problem actually were to occur in real life, it probably would be solved by the missionaries looking for a bridge, calling for reinforcements, or making the cannibals swim next to the boat.

An important feature of a problem posed within an essential domain is that it takes great cleverness to solve it; an essential domain is called essential because everything that is not essential is pruned away, and we are left with a distilled situation.
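How distilled this domain is can be seen in how little is needed to describe it to a program. The following Python sketch is only an illustration: it searches the state space breadth-first, leaves the boat capacity as a parameter, and assumes that the 'outnumber' rule applies on each bank and not inside the boat. The state encoding and the function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# State: (missionaries on the starting bank, cannibals on the starting bank,
#         boat side), where boat side 0 is the starting bank and 1 the far bank.
State = Tuple[int, int, int]


def banks_safe(m: int, c: int, total: int) -> bool:
    """A bank is safe if it has no missionaries or at least as many missionaries as cannibals."""
    near_ok = m == 0 or m >= c
    far_m, far_c = total - m, total - c
    far_ok = far_m == 0 or far_m >= far_c
    return near_ok and far_ok


def solve(total: int, capacity: int) -> Optional[List[State]]:
    """Breadth-first search; returns a shortest sequence of states, or None if unsolvable."""
    start: State = (total, total, 0)
    goal: State = (0, 0, 1)
    parent: Dict[State, Optional[State]] = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path: List[State] = []
            cursor: Optional[State] = state
            while cursor is not None:
                path.append(cursor)
                cursor = parent[cursor]
            return path[::-1]
        m, c, side = state
        # People standing on the same bank as the boat.
        avail_m = m if side == 0 else total - m
        avail_c = c if side == 0 else total - c
        sign = -1 if side == 0 else 1
        for dm in range(min(avail_m, capacity) + 1):
            for dc in range(min(avail_c, capacity - dm) + 1):
                if dm + dc == 0:
                    continue  # the boat cannot cross empty
                nxt: State = (m + sign * dm, c + sign * dc, 1 - side)
                if nxt not in parent and banks_safe(nxt[0], nxt[1], total):
                    parent[nxt] = state
                    queue.append(nxt)
    return None


if __name__ == "__main__":
    # Capacity 2 is the value used in the classic statement of the puzzle.
    for step in solve(total=3, capacity=2):
        print(step)
```

Everything the solver knows is in the state tuple and the safety test; there is no bridge to look for and no reinforcements to call.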

In a fluid domain, there are a large number of objects and a large number of applicable operations. A problem that is posed within the context of a fluid domain is typically the result of a long and complex chain of events. Generally, there are a lot of plausible-looking alternatives available, and many different courses of action can result in a satisfactory solution. Problems posed in this type of domain are usually open-ended and sometimes there is no clearly recognizable goal.

A typical fluid domain is writing. In this domain there are a large number of ways of expressing things, beginning with inventing phraseology out of whole cloth and progressing towards idioms. As I noted earlier, writing is a process of constant revision, and often that revision is centered on word choice and how those choices affect the overall structure of a piece of writing. Therefore, it seems unlikely that a computer program could avoid the same sorts of revision and still be as effective as a program that also revised after choosing its words.

A key feature of writing, and of fluid domains in general, is that judgment is often more important than cleverness, and frequently the crux to solving a difficult problem is recognizing that a situation is familiar. As we write more and more, situations in which wording or fact-introduction is a problem become easier for us to spot and to repair. We use our judgment to improve the clarity of our writing.

A successful approach to take in order to solve problems in fluid domains is to plan out a sequence of steps that lead from where we start towards what appears to be the goal. These steps are islands, where each island is a description of a situation that we believe can be achieved adequately. We will refer to a description of the situation as the situation description. At each island we then apply the best techniques available for achieving that situation, given the previously achieved situations.

There are two problems we need to solve to make this method work: 1) We must be able to plan these islands without doing very much backtracking, and during the planning stage we must not be required to perform any actions that might need to be performed during the later execution stage; and 2) once the plan is completed and we are executing it, we must be able to effectively bring to bear the appropriate techniques at each island so that the island is actually achieved.

I will consider each problem a little more carefully; as I do so, I will illustrate points concerning the general problems with their realization in the writing domain.

The first problem is to build a path, perhaps a graph, of nodes where each node is a situation that we wish to establish. We hope that if we traverse the graph, establishing the corresponding situation by taking some actions, then the final situation matches the goal towards which we were aiming. Building this graph is called coarse planning.

In writing, each node - one of the islands above - could be a sentence's worth of facts that must be conveyed, or it could be several sentences' worth of facts. The point is that each island represents a set of facts which ought to be expressed as a unit - locally in some section of the text, if possible. We hope that if we could adequately express in sentences each fact in the plan, then the entire text would adequately express all the facts.

Because we may need to consider many possible graphs of nodes before a plan emerges and because we may not wish to take all of the actions to establish the situations at each node while planning, we will need to operate on abstract descriptions of the possible actions that can be taken to decide whether a node can possibly be established by future actions.

In writing, this planning stage involves making a list of the propositional contents of sentences that might be written - the graph that is produced is simply a linear list. During this planning stage a sequence of sets of predicate calculus formulas is created, where each set of formulas represents the propositional content that should be conveyed at that stage in the text. The propositional content in each set might be expressed in a single English sentence or in several. We will want to consider whether saying these sentence-contents in a given order will convey the meaning we intend before we commit ourselves to the plan.
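A coarse plan of this kind can be sketched as a data structure. The Python below is an illustration under stated assumptions, not a description of Yh's actual representation: formulas are plain tuples, an island is a set of formulas, and the plan is a linear list of islands. The predicate name "isa" and the ordering check (every entity is introduced before it is used) are hypothetical stand-ins for the much richer reasoning about order described above.

```python
from typing import FrozenSet, List, Set, Tuple

Formula = Tuple[str, ...]      # (predicate, argument, argument, ...)
Island = FrozenSet[Formula]    # propositional content for one stage of the text
Plan = List[Island]            # the coarse plan: a linear list of islands

INTRODUCES = "isa"  # hypothetical predicate: ("isa", x, kind) introduces entity x


def unintroduced_references(plan: Plan, shared: Set[str]) -> List[Tuple[int, str]]:
    """Return (island index, entity) for each entity used before it is introduced.

    `shared` holds the entities the reader is assumed to know already.
    """
    known = set(shared)
    problems: List[Tuple[int, str]] = []
    for index, island in enumerate(plan):
        # Entities introduced anywhere in an island count as known within it.
        known |= {f[1] for f in island if f[0] == INTRODUCES}
        used = {arg for f in island if f[0] != INTRODUCES for arg in f[1:]}
        problems.extend((index, arg) for arg in sorted(used - known))
    return problems


if __name__ == "__main__":
    plan: Plan = [
        frozenset({("isa", "scheduler", "program")}),
        frozenset({("isa", "queue", "list"), ("reads", "scheduler", "queue")}),
    ]
    print(unintroduced_references(plan, shared=set()))        # plan in its given order
    print(unintroduced_references(plan[::-1], shared=set()))  # same islands, reversed
```

The point of the sketch is that the order can be examined before any sentence is written: the check works on the propositional content alone.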

The second problem is to find those actions that will actually establish the situations called for in our plan. Fine-grained planning may be required to establish smaller islands - or islets - within the larger given islands. The actions that are determined to be appropriate at each island are executed in order to flesh out the plan, establishing the particulars of the plan. However, once these particulars have been established, we may find that the fleshed-out plan does not work well, and then we are faced with the problem of modifying what we have so as to accomplish our goal as nearly as possible.

In writing, we will actually propose words and sentences to accomplish the propositional contents in our plan. After this stage there exists a first draft text. We might find that the words chosen do not fit well with the planned structure of the text, and that a different structure might be better. Or it might be that the structure of an earlier part of the text prevents the structure for the later part from being realized.

In order to determine that the executed plan accomplishes our goal, we must be able to observe and criticize the actions of the system as it performs the steps of the plan. This is a good way to discover the inadequacies of the techniques brought to bear at each island.

In writing, we will observe our word and sentence-structure choices to determine whether we are effectively conveying intended meaning or whether we have unintentionally expressed an unwanted meaning or connotation. One might say that the program 'reads' what it has written, although in Yh there is no parser - the program reviews a representation of the text, which is simply an elaborate parse tree for that text.

Thus there are three essentials to our method: 1) draw up a coarse plan; 2) implement the details of the plan as well as can be done; and 3) observe the processes carrying out the details of the plan in order to criticize its effectiveness and to propose changes to correct any deficiencies.

Intelligence and Communication - Object-oriented Programming

Suppose we had a program that exhibited a degree of intelligence; whence would this exhibited intelligence emerge? Certainly the program code by itself is not 'intelligent,' although the intelligence of the system must emerge from that code somehow. The intelligence emerges from the program code as it is running. But, to go one step further, what in the running of that code is the source of the intelligence?

Consider a team of specialists. If a problem the team is working on is not entirely within any one person's specialty, then one might expect that they could solve it after a dialogue. This dialogue would be an exchange of information about strategies, techniques, and knowledge as well as an exchange of information about what each specialist knows, why it might be important, and why some non-obvious course of action might be appropriate.

One could say that the collective intelligence of this team emerges from their interactions as much as from each individual's expertise; some particular expertise may not be able to address very much of the problem directly, but the combination of expertise plus an overall organizing principle might better address the problem as a whole.

Also, from a practical programming point of view, if a system can be expanded mainly through the addition of another individual piece of knowledge or expertise, with responsibility for organizing that new piece of knowledge left up to the system somehow, then a large system composed of many pieces of knowledge could be created and managed.

The question, then, becomes one of supporting communication well. In Lisp, for example, in order to use a function written for a specific purpose, one has to know the name of the function and its calling sequence. This will not do for the scenario I have outlined above: Being able to address the correct or most appropriate function or expert must be accomplished flexibly.

Object-Oriented Programming

Object-oriented programming addresses some of the needs of the system I have outlined. In object-oriented programming one builds systems by defining objects, and the interactions between these objects are expressed in terms of messages that the objects send to each other.

A standard example of this style of programming is the definition of 'addition.' In a traditional programming language we can define addition as an operation performed on two numbers. The function that adds two numbers might be able to look at the types of numbers (integers, floating-point, or complex, for instance) and then decide how to add the numbers, perhaps by coercing one number into the type of the other (we add a floating-point number to a complex number by coercing the floating-point number to a complex number where the real part is the given floating-point number and the imaginary part is 0).

An orthogonal - the object-oriented - way of doing this is to consider numbers as objects which know how to add other numbers to themselves. Therefore a complex number might be sent a message saying to add to itself a floating-point number and to return the value to the sender. The complex number would then look at the type of the number sent to it and take the correct steps.

When we want to modify these systems to be able to add new types of numbers, we do different things in each system. In the traditional system we need to improve the addition program so it knows about the new types of numbers that can be added. In the object-oriented system we need to create a new type of object - the new type of number - and to provide information about how to add other sorts of numbers to it.

To make this work easily, though, it is necessary to have provided a fallback or error handler to each object. For example, if a number, x, is sent a message requesting that another number, y, be added to that number, then if x does not know what to make of y, x could send y a message to add x to it, but cautioning y that x has already tried. If y is also puzzled, then y can try another error procedure rather than simply throwing the question back to x.

With such a fallback position, we can add new data types to an object-oriented system more easily than to a traditional system.
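The fallback protocol just described can be sketched in a few lines. This is a minimal Python illustration, not code from Yh; the class names Num, Float, and Complex, the method names add and try_add, and the already_tried flag are all assumptions made for the example.

```python
class Num:
    """An object that answers an 'add' message and has a fallback position."""

    def add(self, other, already_tried=False):
        result = self.try_add(other)
        if result is not None:
            return result
        if not already_tried:
            # Ask the other object to do the addition, cautioning it
            # that this object has already tried and failed.
            return other.add(self, already_tried=True)
        # Both objects are puzzled: use a different error procedure
        # rather than throwing the question back again.
        raise TypeError(
            f"cannot add {type(self).__name__} and {type(other).__name__}")

    def try_add(self, other):
        return None  # a bare Num knows nothing about addition


class Float(Num):
    def __init__(self, value):
        self.value = value

    def try_add(self, other):
        if isinstance(other, Float):
            return Float(self.value + other.value)
        return None  # Float was written before Complex existed


class Complex(Num):
    def __init__(self, re, im):
        self.re, self.im = re, im

    def try_add(self, other):
        if isinstance(other, Complex):
            return Complex(self.re + other.re, self.im + other.im)
        if isinstance(other, Float):
            # Coerce the float to a complex number with imaginary part 0.
            return Complex(self.re + other.value, self.im)
        return None
```

Here Float(1.5).add(Complex(2.0, 3.0)) succeeds even though Float was never told about Complex: the question is handed to the complex number, which knows how to coerce. The sketch relies on addition being commutative; an operation such as subtraction would need the message to record which operand came first.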

Yh is an object-oriented system, and I will call the objects in it experts.

Overview of the System

Experts

Yh, the writing program, is organized as expert pieces of code that can perform certain tasks in writing: Some can construct sentences, some can construct phrases, some can supply words or idioms, and some can observe and criticize the operation of the rest of the system. These expert pieces of code are objects in an object-oriented system.

These experts are capable of taking some action, and each one has an associated description of what it does. This description is in a description language which can express features and attributes, as well as their relative importances.

Let me be a little more specific. Yh comprises a number of experts held in a database, each of which is a small program. When Yh has a task to do, it finds an expert to invoke. Each expert has an associated description of what sorts of tasks it can do, and this description is used as an index for that expert. To find an expert to do a certain task, Yh formulates a description of the task and matches that description against the description of each expert in its database of experts. The description that best matches the task description corresponds to the expert that Yh will invoke.

Yh uses these descriptions when it is planning: Yh can use the descriptions of each expert to simulate the actions of that expert and can thereby propose a sequence of experts to invoke that will accomplish some goal. The descriptions are not represented procedurally, but they are structured in such a way that an 'interpreter' can be applied to a situation description and an expert description, and produce the situation description that would result if the expert were applied in a context where the first situation description held. This interpreter assumes that the expert would do exactly what its description claims it would.
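The role of this 'interpreter' can be illustrated with a small sketch. It is a Python approximation under strong assumptions: situations are reduced to sets of atomic facts, and the names Situation, ExpertDescription, and simulate are hypothetical. Yh's own descriptions are sets of weighted first-order sentences, which this sketch does not attempt to model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpertDescription:
    name: str
    goals: frozenset = frozenset()         # what the expert claims to establish
    countergoals: frozenset = frozenset()  # what the expert claims to undo
    added_goals: frozenset = frozenset()   # new goals it may post


@dataclass(frozen=True)
class Situation:
    holds: frozenset    # facts currently established
    pending: frozenset  # goals still to be achieved


def simulate(situation, expert):
    """Predict the situation after the expert runs, trusting its description."""
    holds = (situation.holds - expert.countergoals) | expert.goals
    pending = (situation.pending - expert.goals) | expert.added_goals
    return Situation(holds, pending)
```

No expert code runs during simulate; only the description is consulted, so the planner can explore sequences of experts without taking any of the actions themselves.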

In summary, these descriptions are used during coarse planning to help determine islands and during the execution of the plan to find appropriate experts to invoke. During both of these activities a pattern matcher is used to identify the appropriate experts.

Control Structure

Yh is agenda-driven using a priority agenda. That is, the agenda contains items that have priorities attached to them. Periodically Yh scans this agenda to determine what to do next, invoking the highest-priority agenda item.
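A priority agenda of this kind can be sketched with a heap. The Python below is an illustration only; the class name Agenda, the method names post and run, and the numeric priorities are assumptions, since the text does not say how Yh stores or orders its entries beyond 'highest priority first'.

```python
import heapq
import itertools


class Agenda:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: earlier posts run first

    def post(self, priority, name, action):
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), name, action))

    def run(self):
        """Invoke items in priority order; an action may post further items."""
        log = []
        while self._heap:
            _, _, name, action = heapq.heappop(self._heap)
            log.append(name)
            action(self)
        return log
```

Because each action receives the agenda, an item that runs early can post follow-up work, and that work is interleaved with the remaining items according to its own priority.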

Descriptions

Descriptions are matched against other descriptions. The matching process is soft or hybrid, which I define to mean that matches result in pairings of attributes, bindings of variables, and a numeric measure of the strength or closeness of the match.

More specifically, each description is a set of ordered pairs called descriptors; the first element of each descriptor is a sentence in a simple first-order logic, and the second element is the 'measure of importance' of that sentence. This simple first-order logic contains constants, variables, functions, predicates, and some quantifiers. I will refer to the first element of a descriptor as the propositional content of the descriptor. The propositional content of the description is the concatenation of the sentences within the description.

Here is a partial list of the sorts of entries in the description of an expert and how they affect a match:

GOALS: These are the main actions performed by the expert, expressed as sentences with associated measures of strength. The primary matching operations consider only these sentences.

PRECONDITIONS: These are the pre-conditions the expert expects to be true when it is invoked. These are also expressed as sentences with associated measures of strength.

CONSTRAINTS: These are predicates that must be true in order for the expert to be invoked.

PREFERENCES: These are predicates with associated measures of strength. For each predicate that is true in the context of a potential match, the associated measure of strength is added to the strength of the match. If the measure of strength is a number, it can be periodically decayed.

ADDED-GOALS: These are the new goals that the actions of an expert may post when that expert is invoked. The goals are stated as sentences and have associated measures of strength. For each goal, the associated measure of strength indicates how important it is to achieve that goal.

SOFT-CONSTRAINTS: These predicates are exactly like CONSTRAINTS above, but each has an associated measure of strength that affects only the strength of the match.

INFLUENCES: These are GOALS-like entries that only affect the strength of a match. These entries are unified against all descriptors (entries in a description) in the description being matched. If an entry unifies with another, then the measure of strength is added to the strength of the match.

COUNTERGOALS: These are like GOALS above, but they represent things that are undone by the action of an expert when invoked.

The expert description as well as the situation description are expressed in this language.

Influences

There are two major intentions behind this style of description: inexact matching and influencing a match.

This first intention was formulated after observing that it may not always be possible to find experts with descriptions that match perfectly, and that experts whose descriptions are relevant to the goals may be able to help accomplish those goals. For example, an expert that can write a passive sentence is able to accomplish the following two goals: The expert can put the direct object at the front of the sentence, making that object more prominent; and it can keep the agent of the sentence anonymous, which is useful if the writer doesn't know the agent, for instance. One can argue that neither of these goals is more important than the other, and it can be the case that if one wanted to accomplish one or the other of these goals, the passive sentence expert might represent the best means.

The second intention was formulated after observing that there may be very many experts whose descriptions match a given situation description and which could be used to take some useful actions. For instance, there will be quite a few words that could be used to express a concept or an object, perhaps equally well. We want to be able to influence which expert is invoked. In the word-choice example, perhaps we want to avoid recently used words (using PREFERENCES), or we want to encourage the use of words with certain connotations (using INFLUENCES). If the writing program wishes to avoid sentence constructs that have been used recently in a passage, SOFT-CONSTRAINTS can be used to influence the choice of sentence constructs.

Pattern-matching

This section describes the pattern-matching process in moderate detail and, in particular, outlines the mechanisms in the pattern matcher that give rise to the behavior described above. The casual reader can skip this section.

To match two descriptions, a pairing of the descriptors of one with the descriptors of the other is produced. A pair of descriptors, d1 and d2, is placed in the pairing if the propositional content of d1 unifies with the propositional content of d2. During this pairing process, the entries labelled GOALS are the only ones considered. It may or may not be the case that each descriptor of each description is paired with one from the other, but no descriptor can be paired with more than one other descriptor. A match exists if there is a non-empty pairing.

More formally, suppose we have two descriptions, P and D, and suppose that the GOALS part of the description of P is:

{(p1,s1)...(pn,sn)}

where each pi is a sentence in the first-order logic and each si is an integer. Suppose that the GOALS part of the description of D is:

{(d1,t1)...(dm,tm)}

where each di is a sentence and each ti is an integer. Let U be a predicate on sentences where U(f1,f2) is true iff f1 and f2 unify. Then, P and D match iff:

there exist i, j with 1 ≤ i ≤ n and 1 ≤ j ≤ m such that U(pi,dj)

Let Pairing be a set of pairs that result from a match. If [(pi,si),(dj,tj)] belongs to Pairing and [(pi,si),(dk,tk)] belongs to Pairing, then j=k.

For every two descriptions there may be several pairings of GOALS entries.

Once the pairing is produced, the strength of the match is computed using the measures of importance. The basic strength of the match is computed from the pairings obtained as described above. If

Pairing=[(pi_1,si_1),(dj_1,tj_1)],...,[(pi_k,si_k),(dj_k,tj_k)]

then the basic strength of this match is a function of si_1,tj_1,...,si_k,tj_k

Define the strength of match between P and D, Strength(P,D), to be the maximum of the strengths of match of all the possible pairings of descriptors in P and D.
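The definitions above can be made concrete with a small sketch. This is Python written for illustration, not Yh's matcher, and it rests on several assumptions: sentences are nested tuples, variables are strings beginning with '?', each pair of sentences is unified independently of the others, the unifier omits the occurs check, and the basic strength of a pairing is taken to be the sum of the paired importances. The text says only that the basic strength is 'a function of' those values, so the sum is a stand-in.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst=None):
    """Return a substitution unifying a and b, or None (no occurs check)."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def pairings(P, D):
    """Yield every non-empty one-to-one pairing of GOALS descriptors that unify."""
    def extend(i, used, acc):
        if i == len(P):
            if acc:
                yield list(acc)
            return
        yield from extend(i + 1, used, acc)  # leave P[i] unpaired
        for j, (d, t) in enumerate(D):
            if j not in used and unify(P[i][0], d) is not None:
                yield from extend(i + 1, used | {j}, acc + [(P[i], (d, t))])
    yield from extend(0, frozenset(), [])


def basic_strength(pairing):
    # Assumed combining function: add up the importances on both sides.
    return sum(s + t for (_, s), (_, t) in pairing)


def strength(P, D):
    """Strength(P,D): the maximum over all pairings, or None if there is no match."""
    return max((basic_strength(p) for p in pairings(P, D)), default=None)
```

Enumerating every pairing is exponential in the worst case, which is acceptable for a sketch with a handful of descriptors but would need a proper assignment algorithm for larger descriptions.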

The remainder of the entries (such as INFLUENCES) are also paired with entries from the other description, and the measures of importance are used to modify the strength of the match.

To be more specific in the case of INFLUENCES, let (I,s)_P be an influence, where I is a sentence and s is a measure of strength. If

there exists a (d,t) in D such that U(I,d)

then the strength of the match is altered by a function of s and t.

The effect of all but the GOALS portion of the description is to affect the strength of the match and not the validity of the match.
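The distinction between validity and strength can be shown with a short sketch. The Python below is illustrative only: the function names flat_unify and apply_influences are hypothetical, sentences are restricted to flat tuples, and each influence that unifies with some descriptor simply adds its own measure s to the strength, following the description of INFLUENCES given earlier. How the measure t of the matched descriptor should enter the adjustment is not specified in the text and is ignored here.

```python
def flat_unify(a, b):
    """Unify two flat tuples; strings beginning with '?' are variables."""
    if len(a) != len(b):
        return None
    bindings = {}
    for x, y in zip(a, b):
        x, y = bindings.get(x, x), bindings.get(y, y)
        if x == y:
            continue
        if isinstance(x, str) and x.startswith("?"):
            bindings[x] = y
        elif isinstance(y, str) and y.startswith("?"):
            bindings[y] = x
        else:
            return None
    return bindings


def apply_influences(base_strength, influences, descriptors):
    """Adjust the strength of an existing match; never create or destroy one."""
    if base_strength is None:  # no GOALS pairing, so there is no match to adjust
        return None
    total = base_strength
    for sentence, s in influences:
        if any(flat_unify(sentence, d) is not None for d, _t in descriptors):
            total += s
    return total
```

An influence that unifies with nothing leaves the strength unchanged, and a description with no GOALS pairing stays unmatched however many influences it satisfies.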

Performance of the Matcher

The pattern matcher performs operations on cross products. Because of this, the performance of the pattern matcher is potentially quite bad. Many of the operations, however, can be formulated in such a way that a parallel processor could greatly increase the performance of the matcher.

On the other hand, during the generation of the paragraph of text that will be shown in the example that follows, the pattern matcher was invoked approximately 5,000 times and the underlying unifier approximately 300,000 times. The total time for the generation of the paragraph was only 15 CPU minutes on a DEC KL-10A.

Planning

Planning is done quite simply. We start with an initial situation and a goal situation. The initial situation and the goal situation are each represented by a description, and the pattern matcher is able to use these descriptions to determine the degree of progress towards the goal.

The planner tries to find a sequence of experts that will transform the initial situation into the goal situation. To do this, the planner finds an expert whose description indicates that the expert will transform the initial situation into a situation that is 'closer' to the goal situation; P1 is closer to D than P2 if Strength(P2,D) < Strength(P1,D). Yh uses the description of the chosen expert to transform the initial situation into a new situation, and the process is repeated with this new situation in place of the initial situation.

This transformation may add further goals, and it may also add entries that indicate preferences or influences over the remainder of the planning process. In this way, the current part of the plan can influence the later parts.

If the search does not appear to be proceeding towards the goal, backtracking occurs.
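The search just described amounts to hill-climbing with backtracking, which can be sketched as follows. This Python is an illustration under stated assumptions: situations are sets of atomic facts, closeness is the number of goal facts that hold (standing in for Strength), each expert is a hypothetical tuple of name, required facts, added facts, and removed facts, and the depth bound is arbitrary.

```python
def closeness(situation, goal):
    # Stand-in for Strength(situation, goal): count the goal facts that hold.
    return len(situation & goal)


def plan(situation, goal, experts, depth=10):
    """Return a list of expert names reaching the goal, or None on failure."""
    if goal <= situation:
        return []
    if depth == 0:
        return None
    candidates = []
    for name, requires, adds, removes in experts:
        if requires <= situation:
            predicted = (situation - removes) | adds  # trust the description
            score = closeness(predicted, goal)
            if score > closeness(situation, goal):
                candidates.append((score, name, predicted))
    # Try the expert that gets closest to the goal first.
    for _score, name, predicted in sorted(candidates, key=lambda c: -c[0]):
        rest = plan(predicted, goal, experts, depth - 1)
        if rest is not None:
            return [name] + rest
    return None  # dead end: the caller backtracks to its next candidate


EXPERTS = [
    ("long-intro", frozenset(), frozenset({"intro", "size"}), frozenset({"room"})),
    ("markers-clause", frozenset({"room"}), frozenset({"markers"}), frozenset()),
    ("short-intro", frozenset(), frozenset({"intro"}), frozenset()),
    ("size-clause", frozenset({"intro"}), frozenset({"size"}), frozenset()),
]
```

With the toy experts above, starting from the situation {room} and aiming at {intro, size, markers}, the planner first tries long-intro because it looks closest to the goal, finds that nothing can then establish markers, backtracks, and settles on markers-clause followed by long-intro.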

The initial and goal situations may also be pairs of islands within a larger plan, and often this is the case. That is, when Yh is planning some paragraphs of text on some topic, it uses other, simple, planning heuristics to lay out the sequence of topics or facts to be expressed within each paragraph. Finer-grained planning then fleshes out this plan as best it can.

Action

Once the plan is in place, the first expert in the list of experts found in the planning process is applied, and that expert actually updates the situation description accurately. The updating done by the planning process using the description of the expert may have been only an approximation to a detailed analysis that the expert needed to perform to decide the expert's exact effect.

At this point, if the situation that was expected to be established by the actions of this expert is, in fact, not established, Yh may apply other applicable experts to the situation until the desired state is reached.

Applicable experts are found by the pattern-matcher by searching and matching in the database of experts. An applicable expert is one whose strength of match is above a certain threshold. If no expert is above the threshold, the planner is called again to do finer-grained planning to solve the problem.

As an example from writing, suppose the initial situation is a certain state of the text and the goal state is to explain a simple additional fact. There may be some means of expressing this fact, but the means selected may have created a noun phrase that is ambiguous in its context, perhaps by using similar words to those used earlier to refer to a different object. When the situation description is updated, this ambiguity may become apparent, and some other actions might be taken to modify the text or to add further text to help disambiguate the wording.

A general-purpose function-calling mechanism is based on pattern matching descriptions - call by description. It is possible for an expert to ask another expert to perform some actions: The first expert puts together a situation description - which describes the current state of affairs along with the desired state of affairs - and asks the pattern-matcher and planner to find an expert or a sequence of experts to accomplish the desired state of affairs.
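Call by description, together with the threshold test for applicable experts, can be sketched briefly. The Python below is illustrative: match_strength is a toy stand-in for the pattern matcher (it sums the importances of shared features), and the names call_by_description, threshold, and planner are assumptions made for the example.

```python
def match_strength(task, description):
    """Toy matcher: sum the importances of the features the task shares."""
    return sum(weight for feature, weight in description.items() if feature in task)


def call_by_description(task, experts, threshold, planner):
    """Invoke the best-matching expert, or fall back to finer-grained planning."""
    scored = [(match_strength(task, description), name, action)
              for name, description, action in experts]
    applicable = [entry for entry in scored if entry[0] > threshold]
    if not applicable:
        # No expert is above the threshold: hand the problem to the planner.
        return planner(task)
    _score, _name, action = max(applicable, key=lambda entry: entry[0])
    return action(task)
```

The caller never names the expert it wants. It states what should be accomplished, and the choice of expert is left to the matcher.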

The Writing Process

With the above simple overview description of the operation of Yh, I will now explain in more detail how writing is done by Yh.

Given a set of facts to explain, Yh applies some simple heuristics to the facts to determine the order of presentation of those facts. For writing about programs the heuristics simply examine the program to determine the data structures and the flow of control.

These ordered facts are the initial islands in the planning process. A finer-grained plan is produced which partitions the facts into sentences. That is, the finer-grained plan is a sequence of sentence schemata (declarative, declarative with certain relative clauses, etc.) along with the facts that each expresses.

At this point the writing begins with text being produced from left-to-right, all the way down to words. As the actual writing proceeds, a simple observation mechanism is used to flag possible improvements in the text. For instance, if sentences with the same subject or verb phrase appear, this is noted. The mechanism for observation is to use the pattern-matcher to locate experts that are designed to react to specific situations, such as the same subject appearing in two different sentences.

The text is represented as a parse tree. Each node of the tree contains two annotations: One annotation states the syntactic category of the subtree rooted at that node; and the other annotation contains the situation description which caused that part of the tree to be created. In general, Yh is able to randomly access any part of the tree, using as indices the syntactic annotations, the situation description (semantic) annotations, or the contents of the nodes.

When the text is complete, the experts that were triggered by interesting events - such as the same verb phrase appearing in several places - are allowed to modify the text. While this is happening, further observations are made. The process continues until a threshold of improvement is reached - that is, until there is little discernible improvement to the text.
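The observe-and-revise cycle can be sketched for one kind of observation, adjacent sentences that share a predicate. The Python below is a deliberately small illustration and not Yh's mechanism: a sentence is reduced to a list of subjects plus a complement, the names render, observe, and revise are hypothetical, and the loop stops when nothing is flagged rather than at a threshold of improvement.

```python
def render(sentence):
    subjects, complement = sentence
    verb = "is" if len(subjects) == 1 else "are"
    # Joining with 'and' is adequate for two subjects; longer lists would
    # need commas, which this sketch does not handle.
    return f"{' and '.join(subjects)} {verb} {complement}."


def observe(draft):
    """Flag the first adjacent pair of sentences with the same complement."""
    for i in range(len(draft) - 1):
        if draft[i][1] == draft[i + 1][1]:
            return i
    return None


def revise(draft):
    """Merge flagged sentences until no further observation is made."""
    draft = list(draft)
    while (i := observe(draft)) is not None:
        merged = (draft[i][0] + draft[i + 1][0], draft[i][1])
        draft[i:i + 2] = [merged]
    return draft
```

Given the three sentences 'L is initialized to 0.', 'M is initialized to 0.', and 'R is initialized to n-1.', the first two are merged into 'L and M are initialized to 0.' and the third is left alone.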

The effect of the observation experts can be to move facts between planning islands. The initial planning stage can be regarded as only a first approximation in a series of better approximations to a satisfactory plan for expressing a set of facts in English.

When Yh starts writing there are three agenda entries, which cause the above actions to happen: 1) a coarse planning entry; 2) a plan execution entry; and 3) an observation-expert activation entry. The first entry causes the coarse planning to happen, and the second entry causes the plan to be executed (the first draft to be written). The third entry is more complicated. While the first draft is being written, observers watch the process and make suggestions. These suggestions are simply entries in a database of such entries. The third agenda item causes these entries to be processed, and any actions that need to be taken based on the suggestions contained there will be initiated by this agenda entry.

Example of Writing

The next few pages will present an example of Yh writing about a simple Lisp program.

Dutch National Flag

The Dutch National Flag problem is as follows: Assume there is a sequence of colored objects in a row, where each of the objects can be either red, white, or blue; place all red objects to the left, all white objects in the middle, and all blue objects to the right.

Given the initial sequence:

B R W B R W B

the result is:

R R W W B B B

The problem is a sorting problem, and it can be done in linear time using an array and three markers into that array. The following is a simple MacLisp program that solves the problem, where an array is used to store the elements in the sequence:

;;; Dutch National Flag

(declare
 (array* (notype flag 1)) ;represents the Flag
                          ;can be r,b, or w.
                          ;r = red, w = white, b = blue
         (special n))     ;represents the length of the Array

;;;exchanges (flag x)  and (flag y)
(defmacro exchange (x y)
 `(let ((q (flag ,y)))
    (store (flag ,y) (flag ,x))
    (store (flag ,x) q)))

;;;tests if (flag x) is red
(defmacro redp (x) `(eq (flag ,x) 'r))

;;;tests if (flag x) is blue
(defmacro bluep (x) `(eq (flag ,x) 'b))

;;;tests if (flag x) is white
(defmacro whitep (x) `(eq (flag ,x) 'w))

;;;increments x by 1
(defmacro incr (x) `(setq ,x (1+ ,x)))

;;;decrements x by 1
(defmacro decr (x) `(setq ,x (1- ,x)))


(defun dnf ()
 (let ((l 0)(m 0)(r (1- n)))     ;initialize l,m, & r
  (while (not (> m r))
   (cond ((redp m)
          (exchange l m)
          (incr l)(incr m))
         ((bluep m)
          (exchange m r)
          (decr r))
         (t (incr m))))
  t))

The flag is represented by a 1-dimensional, 0-based array of n elements, FLAG. There are three array markers, L, M, and R, standing for Left, Middle, and Right, respectively. L and M are initialized to 0; R is initialized to n-1.

While M is not bigger than R the program does the following: If (flag m) is red, it exchanges (flag l) and (flag m), incrementing L and M by 1. If (flag m) is blue, it exchanges (flag r) and (flag m), decrementing R by 1. Otherwise, it increments M by 1.

In order to exchange (flag x) and (flag y), the program saves the value of (flag y), stores the value of (flag x) in (flag y) and then stores the value of the temporary in (flag x). An element of FLAG is red if it contains R, blue if it contains B, and white if it contains W.

The above three paragraphs were written by Yh from an internal representation that captures exactly what is in the above code plus the comments. The representation is similar to that which a compiler would use to represent the above computation. Rest assured, Yh is not capable of reasoning about programs - every deduction made about the program while writing the above text was trivial.
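For readers who do not read MacLisp, the same three-marker loop can be restated in Python. This is a transliteration offered for illustration; it takes the sequence as a list argument instead of using the global array FLAG and the special variable n, and it returns the list, whereas the MacLisp function returns t.

```python
def dnf(flag):
    """Sort a list of 'r', 'w', and 'b' in place using three markers."""
    l, m, r = 0, 0, len(flag) - 1
    while m <= r:
        if flag[m] == "r":
            # Red: swap it into the left region and advance both markers.
            flag[l], flag[m] = flag[m], flag[l]
            l += 1
            m += 1
        elif flag[m] == "b":
            # Blue: swap it into the right region; the element now at m
            # has not been examined yet, so m stays where it is.
            flag[m], flag[r] = flag[r], flag[m]
            r -= 1
        else:
            m += 1  # white: already in the middle region
    return flag
```

Each iteration either advances M or moves R toward it, so the loop runs at most n times, which is the linear-time bound mentioned earlier.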

The rest of this section will explain the generation of the first paragraph.

Overview

Yh is started with the task of explaining the Dutch National Flag program. Yh will first produce a plan to accomplish that, which is as follows:

1) discuss the data structures;
2) discuss the main program; and
3) discuss the macros.

The remainder of the paper will present a detailed discussion of how the paragraph which accomplishes step 1 of the plan is written. The only data structure is an array with some array markers; the plan for this portion of the text, after it has been fleshed out during the writing process, is:

discuss what the array represents;
discuss the dimensionality of the array;
discuss the base of the array;
discuss the size of the array;
discuss the array markers; and
discuss the initialization of the array markers.

At the end of this part of the writing process, the paragraph is:

The one-dimensional, zero-based array of n elements, FLAG, represents the flag. There are three array markers, L, M, and R, standing for left, middle and right, respectively. L is initialized to 0. M is initialized to 0. R is initialized to n-1.

While writing this first draft, experts notice that some changes should be made. After making those changes, the paragraph will be:

The flag is represented by a 1-dimensional, 0-based array of n elements, FLAG. There are three array markers, L, M, and R, standing for Left, Middle, and Right, respectively. L and M are initialized to 0; R is initialized to n-1.

Starting the Process

Yh is started on the task of writing about the program described above. It is given three pieces of advice in the form of influences. Recall that an influence is a descriptor which is used to bias the pattern matcher towards choosing an expert that has a description with that descriptor in it; if a negative influence is used, then the pattern matcher will be biased towards choosing an expert that has a description with the negation of that descriptor in it.

These three pieces of advice are: 1) do not use too many adjectives; 2) collapse all sentences that share either a common subject or a common predicate as soon as they are identified; and 3) do not allow very complex sentences. This last piece of advice uses a simple complexity measure which considers sentence length, adjectival phrase length, and relative clause depth.

Conducting an Inspection

I will refer to all of the functions and data structures in the above program as the program.

First the program is examined. Initial planning islands are established for all of the major parts of the program: the data structures and the functions. This is done by an expert that has a checklist of general features of a program that are important to discuss when explaining or describing programs. The items in the checklist have associated priorities, and the plan is to discuss the items in the checklist in priority order unless some other order is specified. That is, this checklist represents heuristics about how to write about programs.

In the case at hand, the data structures come first because they are given a higher priority by these heuristics. The functions come next with the main function, DNF, first and the macros after that. A paragraph break will be inserted between the discussion of the data structures and the discussion of the program code.

The initial plan is: 1) explain the data structures, in this case the array; 2) explain the main program, DNF; and 3) explain the macros - EXCHANGE, INCR, DECR, REDP, BLUEP, and WHITEP - in this order.
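The way a prioritized checklist yields this initial plan can be sketched in a few lines. The Python below is illustrative; the numeric priorities, the dictionary layout, and the function name initial_plan are assumptions, and the macro names are taken from the program listing.

```python
def initial_plan(checklist, program_parts):
    """List the program's parts in checklist-priority order, highest first."""
    ordered = sorted(checklist.items(), key=lambda entry: -entry[1])
    return [f"explain the {item}: {', '.join(program_parts[item])}"
            for item, _priority in ordered]


# Hypothetical priorities; only their relative order matters.
CHECKLIST = {"macros": 1, "data structures": 3, "main program": 2}

PARTS = {
    "data structures": ["FLAG"],
    "main program": ["DNF"],
    "macros": ["EXCHANGE", "INCR", "DECR", "REDP", "BLUEP", "WHITEP"],
}
```

Calling initial_plan(CHECKLIST, PARTS) lists the data structures first, then the main program, then the macros, whatever order the checklist itself was written in.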

Yh now begins to execute that plan, and the first data structure is then examined. It is the array that represents the flag. A simple examination causes the array-describing expert to be directly invoked using call by description. This array expert knows about interesting things concerning arrays.

All relevant features of the array are retrieved. In addition, the functions that use this data structure are retrieved. All other arrays in the program are found.

The facts discovered about this array by the array expert are: 1) it represents an object to the user; 2) each of its elements is one of R, B, and W, which represent the colors of the flag; 3) three array markers are used to point to places in the array, and these markers are moved; and 4) there are no other arrays defined in the program.

Because there are no other arrays in this program, an influence is added to the initial situation description stating that the array is unique. This may result in a noun-phrase generator choosing to say 'the array' rather than some other descriptive phrase.

Now that the relevant facts about the array are known, the discussion of the array will be written.

The things that are important to talk about for an array are: what it represents, its name, its length, its base, its first element, its last element, its dimension, and the types of its elements. The array expert's overall strategy is to introduce the array as a topic of discussion and to follow up that introduction with facts about its features. The options for introducing the array are:

The <array> represents <something>.
There is <array description>.
The <array> has n elements.
The <array> is m dimensional.
The first element of <array> is <first element>.

The notation, <form>, means that form is to be expressed as a phrase or a clause, and the text is substituted for <form> in the appropriate schema. In order to write any of these sentences, the array expert will invoke the simple declarative sentence expert using call by description.
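The substitution of generated text into a schema can be sketched as simple template filling. This Python is an illustration only: the function name fill and the dictionary of phrases are assumptions, and in Yh the phrases would be produced by other experts rather than supplied by hand.

```python
import re


def fill(schema, phrases):
    """Replace each <form> in the schema with the phrase generated for it."""
    def substitute(match):
        # A missing phrase raises KeyError rather than leaving a gap in the text.
        return phrases[match.group(1)]
    return re.sub(r"<([^<>]+)>", substitute, schema)
```

For example, fill("The <array> represents <something>.", {"array": "array", "something": "the flag"}) produces the sentence 'The array represents the flag.'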

Simple Declaratives

The simple declarative sentence expert knows that exophoric references - references to objects in the external world with which the reader is familiar - should be placed early in a passage. The array expert knows that the array represents the flag in the program, and this representation forms the basis of an exophoric reference. The array expert chooses to introduce the array using the first of the list of schemata above: The <array> represents <something>.

The array is an object in the Dutch National Flag program, and the flag is an object in the world. Because the flag is more concrete to the reader, the reference to the flag ought to be placed first in the sentence. To accomplish this, the simple declarative sentence expert posts a request to transform this sentence to the passive voice. This would move the noun phrase referring to the flag to the beginning of the sentence, making it the topic and formal subject.

Requests to perform passive transformations are sent to a special expert that keeps track of the various proposed passives and has the responsibility of deciding whether there would be too many passive sentences too close together.
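A rough Python sketch of such a bookkeeping expert follows. The class name, the window size, and the limit of one passive per window are all assumptions made for illustration; the paper does not say how Yh measures 'too close together.'

```python
class PassiveTracker:
    """Keeps track of accepted passives and refuses ones that crowd together."""

    def __init__(self, window=3, max_in_window=1):
        self.window = window                # assumed: distance counted in sentences
        self.max_in_window = max_in_window  # assumed: passives allowed per window
        self.accepted = []                  # indices of sentences made passive

    def request(self, sentence_index):
        """Return True if the passive transformation may go ahead."""
        nearby = [i for i in self.accepted
                  if abs(i - sentence_index) < self.window]
        if len(nearby) >= self.max_in_window:
            return False
        self.accepted.append(sentence_index)
        return True

tracker = PassiveTracker()
first = tracker.request(0)    # nothing nearby yet
second = tracker.request(1)   # too close to sentence 0
third = tracker.request(4)    # far enough away
```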

The first sentence generated will be: The array represents the flag.

The simple declarative sentence expert generates the sentence left-to-right by generating the noun phrase for the subject, then the verb phrase, and finally the noun phrase for the direct object. These phrases are written by experts invoked by the simple declarative sentence expert using call by description.

Every sentence generated is annotated with the situation descriptions that were used to generate the sentence as well as the experts that were used in the process. In fact, each phrase and each word is also so annotated.

The situation descriptions used in finding the experts that generate each sentence and phrase are among the primary factors that determine wording and sentence structure. Thus, there is a correspondence between the situation descriptions and the words and sentences. While writing the first draft, these situation descriptions, along with the influences, are the sole determiners of the wording and sentence structure, and, while revisions are being made to the text, the situation descriptions are updated so that the correspondences between the situation descriptions and the words and sentences are maintained as well as possible.

If the annotations are examined in left-to-right order, a good idea of the structure and wording of the passage can be gained. This is an approximation to re-reading, which was mentioned earlier as an important aspect of how good writing is done by people.

Although this does not have the same effect as an actual re-reading under forgetfulness, the performance of Yh demonstrates that it is adequate for many writing tasks.
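The annotated text and its left-to-right review might be pictured with the following Python sketch. The Node class and the label strings are hypothetical; Yh's situation descriptions are richer structures than the short labels used here.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str        # the words at this node ("" for an interior node)
    situation: str   # label standing in for the situation description
    expert: str      # the expert that wrote this node
    children: list = field(default_factory=list)

def annotations_left_to_right(node):
    """Yield (situation, expert) pairs in the order the text would be read."""
    yield (node.situation, node.expert)
    for child in node.children:
        yield from annotations_left_to_right(child)

sentence = Node("", "introduce-array", "simple-declarative", [
    Node("The array", "refer-to-array", "noun-phrase"),
    Node("represents", "state-representation", "verb-phrase"),
    Node("the flag", "refer-to-flag", "noun-phrase"),
])

review = list(annotations_left_to_right(sentence))
```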

Noun Phrases

The noun phrase expert is fairly robust and generates interesting and appropriate noun phrases. It also generates all of the modifiers called for by the situation description. These modifiers include the determiner, adjectives, relative clauses, and post-noun modifiers such as prepositional phrases.

If the number of modifiers of a noun would make the noun phrase too long or too complex, the noun phrase generator can post further requests in the current situation description that possibly would cause other sentences to be generated in order to present the modifiers.

In the first sentence about the array, noun phrases for the array and the flag must be generated.

If the representation of an object to be generated has a unique name associated with it, the noun phrase generator will use that, unless it is necessary to add other descriptive material. As an example from fiction writing, in introducing a character to a story it is often not sufficient to use the character's name - usually the reader wants to know some simple facts about him.

In this first sentence, the name is not used because we are in the situation of introducing a new 'character,' the array. However, this name, which is FLAG, will be introduced later.

Because Yh is being used recursively to generate the subject noun phrase, there is no a priori reason to expect that the noun phrase will be a single word, or even a simple noun phrase. Had a phrase been generated, that entire phrase would be treated as a noun; if it were a verb phrase - as would be the case for an action - the result would be a gerund.

Uniqueness of the Noun Phrase

There are two considerations to be made when a noun phrase is being generated: 1) whether the noun phrase is being generated to refer to an object already referred to in the text written so far; 2) whether the noun phrase is being generated in such a way that the noun phrase itself is similar to one already used in the text to refer to an object not equal to the object being referred to now.

In order to locate the first type of reference, the annotations for all previous noun phrases are searched to find references to the same object. The second type of reference is more difficult to find. Simply stated, Yh searches all previous noun phrases and tries to match the description for the current phrase against those for all previous noun phrases. It is hoped that the internal descriptions are such that if two noun phrases have closely matching descriptions, then the noun phrases generated are similar, and hence ambiguous. This activity is the analogue of re-reading a passage.

In general, when Yh discovers that two parts of a text clash - due to ambiguities or coincidentally similar wording - Yh is capable of repairing the problem at either site, or it could choose to reformulate parts of the text to remove one or both of the offending parts.

If Yh decides that two references are the same (Case 1 above) Yh makes a request to consider combining the sentences in which they occur. In the case of ambiguous references (Case 2 above) Yh posts a request to find distinguishing descriptors. The descriptions will be scanned for the most important distinguishing descriptors, which will then be placed in prominence in the noun phrases. Other tactics such as increasing the distance between the two references might be tried also.
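The two searches can be sketched in Python as follows. The descriptor sets, the overlap measure, and the threshold are assumptions chosen for illustration, not Yh's pattern matcher, and the second array in the example is invented, because the Dutch National Flag program has only one.

```python
def jaccard(a, b):
    """Overlap between two descriptor sets, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def classify_reference(new, previous, threshold=0.5):
    """Return ('same-object' | 'ambiguous' | 'new', matching earlier phrase)."""
    for old in previous:                       # Case 1: same object
        if old["object"] == new["object"]:
            return ("same-object", old)
    for old in previous:                       # Case 2: similar description
        if jaccard(old["descriptors"], new["descriptors"]) >= threshold:
            return ("ambiguous", old)
    return ("new", None)

def distinguishing_descriptors(a, b):
    """Descriptors of a that b lacks; candidates for prominence."""
    return a["descriptors"] - b["descriptors"]

previous = [{"object": "FLAG",
             "descriptors": {"array", "one-dimensional", "zero-based"}}]
# Hypothetical second array, used only to show a clash.
new = {"object": "TEMP",
       "descriptors": {"array", "one-dimensional", "temporary"}}

kind, match = classify_reference(new, previous)
```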

The next decision in writing a noun phrase is which determiner to use, the or a, or whether to use no determiner at all. If it is specified that there should be no determiner, then none is used. If there are no other noun phrases referring to the same thing, then the is used. If the noun phrase is plural then a cannot be used.

Modifiers

Let us recall where we are in the writing process: The array expert has invoked the simple declarative sentence expert, which has invoked the noun phrase expert. The noun phrase expert may invoke the adjective expert.

Any of the experts in this chain of control can specify that no adjectives should be used. This would be the case if one of these experts wanted to add adjectives in some order other than that which the adjective expert would choose; if one of these experts wished to add some adjectives, it would invoke the adjective expert directly.

In the sentence, the array represents the flag, there are no adjectives to insert.

Under the heading of 'modifiers' are also prepositional phrases that appear after the noun phrase, as in the dog in the yard. The placement of all adjectives and prepositional modifiers in a noun phrase is controlled by the distance to the nearest noun phrase to the left that refers to the same object. That is, the further to the left there is a noun phrase referring to the same object, the weaker the negative influence against using these modifiers: The further away a reference to the same object, the more important it is to use a detailed noun phrase.
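The relation between distance and the influence against modifiers can be sketched in Python. The paper says only that the influence weakens with distance; the particular formula below, and the choice to apply no penalty to a first mention, are assumptions.

```python
def modifier_penalty(distance, base=1.0):
    """Negative influence against modifiers on a noun phrase.

    distance is how far to the left the nearest reference to the same
    object lies (the unit is left unspecified), or None for a first mention.
    """
    if distance is None:
        return 0.0                  # assumed: a first mention is unpenalized
    return base / (1 + distance)    # assumed shape: shrinks as distance grows

just_mentioned = modifier_penalty(0)
long_ago = modifier_penalty(9)
```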

Verb Phrases

Verb phrases are handled very much the same way as noun phrases.

The Array Lives

The sentence produced thus far is:

The array represents the flag.

When the initial plan was made, no attention was paid to the details of the array, and now that the array expert has examined the array, it is about to add a number of new goals to be achieved at this planning island. In particular, the array expert wants to write about (in order of importance): the length, the name, the base, the dimension, the array markers, the element-type, the first element, and the last element. Moreover, the operations that are performed on the array and any array markers into the array are important to discuss. In the Dutch National Flag program, exchanges are performed only at the array-marker points. Fortunately, Yh knows specifically about these operations and can talk about them intelligently.

Remember that there is a negative influence against being too verbose with adjectives, which will cause some of these modifiers to be left out.

In the world there are objects, and there are qualities of those objects, which may be necessary qualities or accidental ones. In writing, mentioning an object is typically done with a noun phrase, and the qualities are expressed as modifiers. In Yh, there are two ways to influence the writing about objects and their qualities: 1) One can add influences which will increase or decrease the importance of discussing the objects or their qualities, and 2) one can add influences which increase or decrease the importance of using the means of expressing the objects or their qualities.

For example, if the influence which controls the means of expressing a quality is negative and strong enough, the quality probably cannot be mentioned - Yh is prevented from using the means to do it. If the influence which controls the importance of discussing a quality is strong enough, the quality probably will be mentioned.

If the importance of mentioning a quality is high, and the importance of not using adjectives is high, then Yh may express the quality in a separate sentence in which the quality is not expressed as an adjective. If the importance of mentioning the quality is not very high, and the importance of not using adjectives remains high, then the quality will not be mentioned. This behavior is a product of the mechanisms in the pattern matcher.
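The interaction of the two kinds of influence can be laid out as a small Python decision function. This is a simplification: Yh weighs influences in its pattern matcher rather than comparing them with a fixed threshold, and the outcome for a weak influence against adjectives is an assumption, because the text above does not cover that case.

```python
def express_quality(mention_importance, adjective_penalty, threshold=0.5):
    """Decide how, or whether, a quality of an object gets expressed."""
    must_mention = mention_importance >= threshold
    avoid_adjectives = adjective_penalty >= threshold
    if must_mention and avoid_adjectives:
        return "separate sentence"   # say it, but not as an adjective
    if avoid_adjectives:
        return "omit"                # not worth working around the influence
    return "adjective"               # assumed default when adjectives are allowed

high_high = express_quality(0.9, 0.9)
low_high = express_quality(0.2, 0.9)
```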

In the case at hand, the importances of mentioning some of the qualities of the array are low.

The array expert goes through this ordered list and decides which facts to mention, knowing that the sentence it just generated contains the noun phrase the array. The first thing considered is the length; because there are a number of ways to talk about the length of an array, the decisions about how to mention the length may interact with some of the other information to discuss.

For instance, if the first and last elements are mentioned, or the base and the last element, then the length can be skipped. Which of these alternatives is used can depend on whatever aspects of the array have been discussed or whether there is some advice about what to discuss.

Given that the length is to be said directly, there are several ways to accomplish this. One is to say, the..., length n, ... array; another is ...array...of n elements. The first alternative is simply to add another adjective to the list of adjectives in the noun phrase so far. If this would result in an overly complex noun phrase, this alternative would be rejected.

Because I have specified to Yh that it ought not use a lot of adjectives, ...array... of n elements is chosen, but a negative preference is added to this method, which decays with time: The other methods of introducing modifiers to a noun phrase will tend to be used if the same request is made later. Human writers do the same thing: Recently used words and sentence constructions are avoided because they distract from the mental dream by their repetition.

Yh chooses this method using call by description. The influence that I added simply is weighed along with all the other considerations by the pattern matcher. The effect of this influence is to reduce the strengths of all adjective-adding methods.
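A decaying negative preference of this kind can be sketched in Python. The class name, the numeric strengths, and the halving decay are all assumptions; the sketch shows only that a method just used is avoided for a while and then returns to favor.

```python
class DecayingPreferences:
    """Penalizes recently used methods; the penalty fades as time passes."""

    def __init__(self, penalty=1.0, decay=0.5):
        self.penalty = penalty
        self.decay = decay
        self.penalties = {}

    def use(self, method):
        """Record that a method was used, attaching a negative preference."""
        self.penalties[method] = self.penalties.get(method, 0.0) + self.penalty

    def tick(self):
        """Let one unit of time pass; every penalty shrinks."""
        for method in self.penalties:
            self.penalties[method] *= self.decay

    def choose(self, strengths):
        """Pick the method whose strength, less its penalty, is greatest."""
        return max(strengths,
                   key=lambda m: strengths[m] - self.penalties.get(m, 0.0))

# Hypothetical strengths; the adjective method is weak because of my advice.
strengths = {"of-phrase": 1.0, "appositive": 0.8, "adjective": 0.2}
prefs = DecayingPreferences()

first = prefs.choose(strengths)
prefs.use(first)
second = prefs.choose(strengths)   # the method just used is now disfavored
for _ in range(3):
    prefs.tick()
later = prefs.choose(strengths)    # the penalty has mostly decayed
```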

Recall that the current sentence is:

The array of n elements represents the flag.

The name of the array is next, and there are several methods that can be used: Yh could say ..., named <name>, ... in the adjective list, Yh could add a second sentence, or Yh could use an appositive to the noun group itself. This last method is selected, again because Yh was told to avoid adjectives, yielding:

The array of n elements, FLAG, represents the flag.

Next comes the base; in this case FLAG is a zero-based array, which means that the first element has index 0. The possibilities are the same as for the name of the array. The appositive route is abandoned both because double appositives are not pleasing and because an appositive was just used, which resulted in a negative preference being attached to that technique.

Zero-based is considered a single word and is inserted by a specialist on adding adjectives to existing noun phrases. The list of adjectives is located, and the new one is added at the front of this list. Recall that the text is represented as a parse tree along with an annotation of what is at each node. To locate the adjectives, the tree is searched to find the noun phrase node for the array, and then the adjectives are located by looking at that part of the subtree.

Similarly, the modifier one-dimensional is appended to the front of the sequence of current adjectives. Thus far the sentence is:

The one-dimensional, zero-based array of n elements, FLAG, represents the flag.
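The two adjective insertions just described can be sketched in Python over a small dictionary-based parse tree. The tree layout and the function names are hypothetical; only the steps, finding the noun phrase node for the array and adding at the front of its adjective list, come from the description above.

```python
def find_noun_phrase(node, referent):
    """Depth-first search for the noun phrase node that refers to referent."""
    if node.get("type") == "NP" and node.get("referent") == referent:
        return node
    for child in node.get("children", []):
        found = find_noun_phrase(child, referent)
        if found is not None:
            return found
    return None

def add_adjective(tree, referent, adjective):
    """Insert an adjective at the front of the noun phrase's adjective list."""
    noun_phrase = find_noun_phrase(tree, referent)
    if noun_phrase is None:
        raise LookupError("no noun phrase refers to %r" % (referent,))
    noun_phrase.setdefault("adjectives", []).insert(0, adjective)

tree = {"type": "S", "children": [
    {"type": "NP", "referent": "FLAG", "head": "array", "adjectives": []},
    {"type": "VP", "head": "represents", "children": [
        {"type": "NP", "referent": "flag", "head": "flag"},
    ]},
]}

add_adjective(tree, "FLAG", "zero-based")
add_adjective(tree, "FLAG", "one-dimensional")
adjectives = find_noun_phrase(tree, "FLAG")["adjectives"]
```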

Recall that there is a transformation pending for moving the direct object of this sentence to a more prominent position in the sentence.

Those Pesky Array Markers

At this point the array expert is still controlling the writing process. It is turning its attention to the array markers by invoking an array marker expert.

When an expert is controlling the writing process, it can do one of several things: It can examine the situation description and the current text and decide on specific actions, like adding a word directly to the text; it can decide to invoke Yh recursively on a situation description and place the resulting text somewhere; or it can decide to invoke a sequence of experts using call by description.

An array marker is simply an index into an array which is used to keep one's place during a computation. In the Dutch National Flag program there are three array markers: one to mark the place to put red objects, which moves to the right; one to mark the place to put blue objects, which moves to the left; and one to scan through the array examining the color of things it finds, placing them in the right place.

Once it is decided to talk about one array marker, it is often wise to discuss all three in one place. Because these array markers are similar, it might be a good idea to talk about them similarly; perhaps the sentences can collapse to form one smooth, parallel sentence.

The array marker expert checks to see whether the array marker in question has already been discussed, which would have been posted as an influence. Then the array marker expert locates the array into which the array marker is an index; knowing this array, the expert locates all of its other array markers.

The expert sets up a dispreference for using the name of the array markers as the only referencing expression, and it calls Yh recursively to try to find a way to express the stereotyped phrase, there are n <objects>. The situation description that the expert uses is one which suggests a simple declarative sentence with subject (there), predicate noun (objects), and the modifier (n). Of course, representations of these components of the declarative sentence are used and not the words themselves. Writing this sentence is fairly straightforward: Yh adds the next sentence:

There are three array markers.

The array marker expert then decides to add the names to the right of this sentence as an ordered appositive, which will have a list of names and a parallel list of descriptions attached to the end. This is a standard way to introduce a list of objects with descriptions and names, attaching the names in a parallel construction, and, though it is very idiomatic, there is no good reason not to use this technique.

The array marker expert makes up an extended description of what it wants done: the stereotyped phrase description, the list of names and associated descriptions, and some other hints to the next writing expert to be called. Yh is then called recursively.

The stereotyped phrase expert adds the phrase,

, L, M, and R, standing for left, standing for middle, and standing for right, respectively.

Notice that the gerund form of the phrase, stands for x, must have been derived from the program text for the Dutch National Flag program. This derivation is performed by the parsing system in PSI, and the result is placed in Yh's dictionary.

Respectively is added to the end of the sentence. While inserting these non-finite verb phrases, a verb phrase collapsing expert notices that they are the same and notes a possible collapse. Because there is an influence that states that it is better to collapse immediately than to wait, the collapsing is attempted right away.

Up to this point Yh has been lucky in that all of the things that it needed to do regarding special phrases or circumstances have been handled by an expert in that area. But luck can run out, and in the situation of collapsing these parallel phrases, there is none left. In this case the verb phrase collapsing expert can only notice that collapsings are possible, but it does not know how to actually collapse sentences with the same verb phrase! When this expert looks for another expert to actually do the collapsing, it finds only a general phrase collapsing expert. This general collapsing expert simply tries to eliminate all the common words from each phrase except the first. Thus, given the phrases, standing for left, standing for middle, and standing for right this expert will try to get rid of the phrase, standing for, from the second two.

The transformation, however, does not eliminate the extra words per se, but simply hides the words from the sentence printer, leaving the original wording available in case it is needed later: Perhaps a transformation will wish to recover that wording.
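The general collapsing expert's behavior, including the hiding of words rather than their deletion, can be sketched in Python. The representation of a phrase as a list of (word, hidden) pairs is an assumption made for the sketch.

```python
def collapse(phrases):
    """Hide the words common to all phrases in every phrase except the first.

    phrases is a list of word lists; the result is a list of lists of
    (word, hidden) pairs, so the original wording is never lost.
    """
    if not phrases:
        return []
    common = set(phrases[0]).intersection(*phrases[1:])
    return [[(word, index > 0 and word in common) for word in phrase]
            for index, phrase in enumerate(phrases)]

def render(tokens):
    """What the sentence printer shows: only the words that are not hidden."""
    return " ".join(word for word, hidden in tokens if not hidden)

def recover(tokens):
    """The original wording, available to a later transformation."""
    return " ".join(word for word, _ in tokens)

phrases = [["standing", "for", "left"],
           ["standing", "for", "middle"],
           ["standing", "for", "right"]]
collapsed = collapse(phrases)
printed = [render(tokens) for tokens in collapsed]
```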

Observing

In the previous section I stated that one of Yh's experts 'notices' an event taking place, and earlier I stated that observing was an important part of the plan-execution part of Yh.

When any event takes place, the expert causing the event to occur formulates a description of the event and does a call by description on that description. The description states that an event is occurring, and the experts who observe events of that type are allowed to run.

For example, when noun phrases are added to the text, an announcement is made reporting that a noun phrase satisfying a certain situation description was inserted in the text by a certain expert and located at a particular place in the text. An expert that keeps track of all noun phrases is invoked and adds that information to its own database. A similar activity takes place for the verb and other sorts of phrases and words.

Recall that Yh is agenda-driven. One of the items on that agenda causes observation experts to perform activities based on what they have observed. If, for example, a noun phrase observer has noticed the same noun phrases being generated in different places, then an expert to consider merging the phrases will be eventually invoked. Additionally, observation experts are able to perform their actions right away, and this is what I asked Yh to do when I advised it to perform all collapsings immediately.
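The announce-and-observe cycle can be sketched in Python. This sketch replaces call by description with a plain lookup on an event-type string, which is far cruder than Yh's matching; the class and event names are hypothetical.

```python
from collections import defaultdict, deque

class EventBoard:
    """Announces events to observers and queues their follow-up actions."""

    def __init__(self, act_immediately=False):
        self.observers = defaultdict(list)
        self.agenda = deque()
        self.act_immediately = act_immediately

    def observe(self, event_type, observer):
        self.observers[event_type].append(observer)

    def announce(self, event_type, **details):
        for observer in self.observers[event_type]:
            action = observer(details)      # an observer may propose an action
            if action is None:
                continue
            if self.act_immediately:
                action()                    # the 'collapse immediately' advice
            else:
                self.agenda.append(action)  # otherwise it waits its turn

    def run_agenda(self):
        while self.agenda:
            self.agenda.popleft()()

seen = defaultdict(list)   # the noun phrase observer's own database
proposals = []

def noun_phrase_observer(details):
    places = seen[details["phrase"]]
    places.append(details["position"])
    if len(places) > 1:    # same phrase generated in different places
        return lambda: proposals.append((details["phrase"], tuple(places)))
    return None

board = EventBoard()
board.observe("noun-phrase-added", noun_phrase_observer)
board.announce("noun-phrase-added", phrase="the array", position=0)
board.announce("noun-phrase-added", phrase="the array", position=7)
pending = len(proposals)   # nothing has run yet; the action is on the agenda
board.run_agenda()
```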

Pondering the Issues

Sometimes a simple examination of the explicit properties of an object does not bring forth all of the interesting things that might prove useful in writing about it. For instance, one interesting thing about an array marker is the value to which it is initialized. In the representation above this fact is not mentioned outright, but is hidden in the program code.

The array marker specialist invokes an expert that reads the code and finds out to what the various markers are initialized. Because the annotated code states the purpose of the lambda binding, it is possible to specify which lambdas cause initialization rather than saving/restoring.

The line:

(let ((l 0)(m 0)(r (1- n)))     ;initialize l,m, & r

is annotated to state that the initialization values are 0, 0, and n-1.

Thus, three new sentences are proposed, one for each initialization:

L is initialized to 0. M is initialized to 0. R is initialized to n-1.

The names, L, M, and R, are used because there is no requirement to describe fully the markers, because they have already been introduced.

To express (1- n), a special routine is called that will convert the standard Lisp prefix notation to mathematical infix notation for external printing purposes.
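A conversion routine of this kind can be sketched in Python, with Lisp forms written as nested tuples. The set of operators handled and the rule of parenthesizing every nested form are assumptions; the actual routine is not described in any more detail than above.

```python
BINARY_OPERATORS = {"+", "-", "*", "/"}

def to_infix(expr, top=True):
    """Convert a Lisp-style prefix form, given as nested tuples, to infix text."""
    if not isinstance(expr, (list, tuple)):
        return str(expr)                      # a symbol or a number
    operator, *args = expr
    if operator == "1-" and len(args) == 1:
        text = to_infix(args[0], False) + "-1"
    elif operator == "1+" and len(args) == 1:
        text = to_infix(args[0], False) + "+1"
    elif operator in BINARY_OPERATORS and len(args) >= 2:
        text = operator.join(to_infix(arg, False) for arg in args)
    else:
        raise ValueError("cannot convert %r" % (expr,))
    # Nested forms are always parenthesized, which is crude but unambiguous.
    return text if top else "(" + text + ")"

initial_r = to_infix(("1-", "n"))
```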

As these last three sentences are generated, Yh notices that the first two have the same direct object, and all three have the same verb phrase. Additionally, the previous sentence about the array markers is noted to have used the names, L, M, and R.

The Fun Begins

Given that the initial paragraph looks like:

The one-dimensional, zero-based array of n elements, FLAG, represents the flag. There are three array markers, L, M, and R, standing for left, middle and right, respectively. L is initialized to 0. M is initialized to 0. R is initialized to n–1.

there are still some loose ends to tie up and some transformations to apply. At this stage, this paragraph is the best that Yh can do by making local decisions about paragraph structure, sentence structure, and word choice.

Let me number the sentences:

(1) The one-dimensional, zero-based array of n elements, FLAG, represents the flag. (2) There are three array markers, L, M, and R, standing for left, middle and right, respectively. (3) L is initialized to 0. (4) M is initialized to 0. (5) R is initialized to n–1.

First, it might be possible to collapse sentence (2) with some or all of sentences (3), (4), and (5) - L, M, and R are common noun phrases. But because in sentence (2) L, M, and R are used as direct objects and in sentences (3), (4), and (5) they are used as subjects, the only way to accomplish such a collapsing would be by making further relative clauses to the direct objects, which would result in a sentence like:

There are three array markers, L, which is initialized to 0, M, which is initialized to 0, and R, which is initialized to n-1, standing for left, middle, and right, respectively.

This would be a very complex sentence; this option is rejected on the grounds of complexity.

Sentences (3), (4), and (5) pose something of a problem because they are so closely related to each other. All three have the same verb phrase structure, and the first two have the same direct objects. The latter fact causes sentences (3) and (4) to be collapsed by merging the predicate parts. Therefore the subjects are conjoined with and, and the verb phrase is transformed into the plural. The direct object is left as it is. The situation description in the text annotations is patched to reflect the fact of the multiple noun phrase.

The new third sentence is then:

(3') L and M are initialized to 0.
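The merge of sentences (3) and (4) can be sketched in Python. Sentences are reduced here to a subject list, a verb phrase, and an object, and the plural form is looked up in a one-entry table; both simplifications are assumptions, since Yh works on annotated parse trees and patches the situation descriptions as well.

```python
# Hypothetical one-entry lexicon mapping a singular verb phrase to its plural.
PLURAL = {"is initialized to": "are initialized to"}

def merge_same_predicate(sentences):
    """Merge adjacent sentences that share both verb phrase and object."""
    merged = []
    for s in sentences:
        last = merged[-1] if merged else None
        if last and last["verb"] == s["verb"] and last["object"] == s["object"]:
            last["subjects"].extend(s["subjects"])   # conjoin the subjects
        else:
            merged.append({"subjects": list(s["subjects"]),
                           "verb": s["verb"], "object": s["object"]})
    return merged

def render(s):
    subjects = s["subjects"]
    if len(subjects) == 1:
        return "%s %s %s." % (subjects[0], s["verb"], s["object"])
    joined = ", ".join(subjects[:-1]) + " and " + subjects[-1]
    return "%s %s %s." % (joined, PLURAL[s["verb"]], s["object"])

drafts = [
    {"subjects": ["L"], "verb": "is initialized to", "object": "0"},
    {"subjects": ["M"], "verb": "is initialized to", "object": "0"},
    {"subjects": ["R"], "verb": "is initialized to", "object": "n-1"},
]
result = [render(s) for s in merge_same_predicate(drafts)]
```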

Sentence (5) has the same verb phrase as sentence (3'), but sentence (3') is fairly complex already, so Yh chooses to simply bring them closer together with a punctuation change. The last sentence of the paragraph, hence, becomes:

L and M are initialized to 0; R is initialized to n–1.

Another option for these last three sentences is to use the parallel construction:

L, M, and R are initialized to 0, 0, and n-1, respectively.

This is not done because it produces a sentence with the same structure as the one before it. This is determined by producing the description of the sentence that would result from the collapse of sentences (3), (4), and (5) and comparing that description with the description of sentence (2).

Finally, Yh has to transform the first sentence to the passive voice in order to change the focus from the array to the flag. The first sentence becomes:

The flag is represented by the one-dimensional, zero-based array of n elements, FLAG.
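The effect of the passive transformation on this sentence can be sketched in Python at the level of strings. Yh performs the transformation on the annotated parse tree; the function below, with its one-entry table of passive verb forms, is only a hypothetical illustration of the reordering.

```python
# Hypothetical lexicon entry: active verb -> passive verb phrase.
PASSIVE_FORMS = {"represents": "is represented by"}

def passivize(subject, verb, direct_object):
    """Move the direct object to the front, making it the formal subject."""
    def capitalize_first(text):
        return text[0].upper() + text[1:]

    def lower_leading_article(text):
        return text[0].lower() + text[1:] if text.startswith("The ") else text

    return "%s %s %s." % (capitalize_first(direct_object),
                          PASSIVE_FORMS[verb],
                          lower_leading_article(subject))

passive = passivize(
    "The one-dimensional, zero-based array of n elements, FLAG",
    "represents",
    "the flag",
)
```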

Alternative First Paragraphs

By increasing the dispreference of adjectives and adjusting the influences on how things such as modifiers can be introduced, the following paragraphs were generated in place of the first one:

The flag is represented by an array of n elements, FLAG. It is a 1-dimensional array. There are three array markers, L, M, and R, standing for Left, Middle, and Right, respectively. L and M are initialized to 0, the first element; R is initialized to n-1, the last element.

The flag is represented by an array of n elements, FLAG. It is a 1-dimensional array with N elements. There are three array markers, L, M, and R, standing for Left, Middle, and Right, respectively. L and M are initialized to 0, the first element; R is initialized to n–1.

A Weird Alternative

Suppose that all of the structure-producing experts were removed from Yh, leaving only the programming knowledge experts and the lexicon. What would Yh say? It would say:

Array N elements Flag Represent One-dimensional Zero-based Array markers Three L M R Standing for Left Middle Right L Initialize to 0 M Initialize to 0 R Initialize to n-1

Conclusion

Yh does a fair job of writing about a small class of programs, but it is not a production quality program. It does not even perform very many of the things that we saw go into good writing.

Yh does not do explicit reasoning about shared information, nor does it reason about the implications of facts introduced in the text it writes. However, in writing about simple programs very little reasoning is required, and, therefore, this is not much of a problem. There are commonsense reasoning programs that could easily be adapted for use in Yh [Creary 1984][Gabriel 1983].

Yh does not explicitly consider whether its writing produces vivid and continuous images in the reader. Certainly there is no mechanism for Yh to experience those images itself. And Yh never actually re-reads any of its writing, although it reviews its writing using the description mechanism. The level of success of this review process is encouraging, and, combined with a commonsense reasoning expert which could reason about knowledge and belief, this technique could be sufficient for many writing tasks.

Yh takes some actions aimed at producing good writing: Yh plans its text carefully, it deliberates over word choice, and it is sensitive to potential ambiguities in its wording.

As I stated at the beginning, a writer has an intimate relationship with his human reader. Judgment, sensitivity, humor, the human facts and experiences - especially the literary experiences that help give a writer his voice - are things that I believe are difficult to give to a computer, but maybe not impossible.

References

[Creary 1984] Creary, Lewis G., The Epistemic Structure of Commonsense Factual Reasoning, Stanford Computer Science Department Memo, to appear.
[Gabriel 1981] Gabriel, R. P., An Organization for Programs in Fluid Domains, Stanford Artificial Intelligence Memo 342 (STAN-CS-81-856), 1981.
[Gabriel 1983] Gabriel, R. P., Creary, Lewis G., a reasoning program written by Gabriel and Creary at Stanford University from 1982 - 1983, no documentation.
[Gardner 1984] Gardner, John, The Art of Fiction, Alfred A. Knopf, New York, 1984.
[Green 1977] Green, Cordell, A Summary of the PSI Program Synthesis System in the Fifth International Joint Conference on Artificial Intelligence - 1977, Cambridge, Mass, 1977.
[Simon 1969] Simon, Herbert A., The Sciences of the Artificial, MIT Press, Cambridge, 1969.
[Thomas 1974] Thomas, Lewis, The Lives of a Cell in The Lives of a Cell, Bantam Books, Inc., 1974.