| Second Generation Adventure Games | |
| Reprinted from | |
| The Journal of Computer Game Design | |
| Volume 1, number 2, (August 1987): pages 4-7 | |
| Copyright 1987 by David Graves | |
| dag@hpsemc.cup.hp.com | |
| Expanding the horizons of the traditional adventure game format has created a | |
| new genre in literary entertainment: the participant novel. In a | |
| participant novel, the player is more a character in a plot than a solver of | |
| puzzles. Implementing a participant novel demands special computer modeling | |
| techniques. As the program increases in complexity, use of the proper | |
| representation of the novel's simulated environment becomes critical. | |
| Without a strong model of the world and its physical laws, and without a | |
| dynamic interface to this world, the developer is doomed to producing a | |
| limited and unstimulating game. The field of Artificial Intelligence | |
| provides programming constructs useful for creating these games of greater | |
| complexity and sophistication. This paper will present a few ideas borrowed | |
| from the field of AI: use of lists and trees, attributes, natural language | |
| parsing, English generation, semantic frames, and goal-oriented routines, and | |
| show how they support the complex model that a participant novel requires. | |
| All of the ideas presented have been successfully applied to a working game | |
| system. | |
| LISTS AND TREES | |
| Given that a participant novel will be proliferated with a great many people, | |
| places, and things (hereafter called "objects"), there must be a dynamic | |
| method for tracking and altering the position and relationships of each | |
| physical object. One can create a model of the world as lists of containers: | |
| cities contain houses, houses contain rooms, rooms contain things, and things | |
| contain other things. The traditional method for representing complex | |
| relationships of objects is through lists of lists, also called trees. Under | |
| this arrangement, every object has pointers to its parent (the thing that | |
| contains it), its children (the things it contains), and its siblings (the | |
| things in the same place with it). This can be coded as three arrays: | |
| parent, child, and sibling. Each slot in the array contains either an | |
| object's identifying number or "nil" (a null pointer). An object that is | |
| empty would have nil in its child slot, and an object that is alone in a | |
| container would have nil in its sibling slot. (Object numbers would not be | |
| declared explicitly, of course; they would be assigned to each object's | |
| symbol name by a compiler). | |
| Objects containing several other objects are represented using the child and | |
| sibling arrays together. The container points to the first child and the | |
| first child points to his sibling, who points to the next sibling. The end | |
| of the list is indicated with a nil sibling pointer. For example, a chest | |
| containing a lamp and a rope would have the object number of the lamp in its | |
| child slot, and the object number of the rope would be in the lamp's sibling | |
| slot. The lamp's sibling slot would be nil. Both the lamp and rope would | |
| have the object number of the chest in their parent slots, and nil in their | |
| child slots. | |
| The sibling and child links define a formal tree. The links from a child | |
| back up to its parent result in a directed graph rather than a tree proper. | |
| The inclusion of the parent links allow easier traversal of the tree in | |
| either direction. | |
| Representing the world in this way makes it very easy to manipulate groups of | |
| objects. Moving a chest to new location involves changing only the pointers | |
| of the old and new locations and the chest. The contents of the chest never | |
| need be considered. No matter how large the object tree becomes, moving any | |
| object and all of its contents can be done by changing only a few pointers. | |
| Since people, places and things are all represented in the same type of data | |
| structure, only one routine is required for all types of object movement. | |
| It is important that the constructs used in the model be simple and | |
| consistent. The model should be general enough that special constructs will | |
| not be required each time the programmer thinks of a new type of object to | |
| implement. For example, supporting surfaces (such as tables and chairs) can | |
| be implemented using the same parent, child, sibling scheme. The objects | |
| resting on the table are its children. The internal representation of the | |
| table and its children is identical to the container representation, so | |
| supporting surfaces are easily added to the model. | |
| ATTRIBUTES OF OBJECTS | |
| All objects have characteristics or "attributes" that define that object's | |
| limitations. Some attributes are boolean (stating whether this object is | |
| flammable or immovable for example) and some may have a numeric value | |
| associated with them, such as the weight and volume of an object. Some | |
| attributes are unchangeable (such as flammability and mass) and some are | |
| variable, describing only the current state of an object (i.e. closed, | |
| locked, and burning). | |
| Using attributes to define how an object may be manipulated within the model | |
| world allows the programmer to define a much more consistent and dynamic | |
| world. Consider the case where the programmer creates a fireplace and some | |
| logs. He adds some special case logic to his program that will support | |
| burning the logs with the expectation that the player will attempt to burn | |
| the logs in the fireplace. However, if the player tries burning a chair in | |
| the fireplace, nothing happens because the luckless programmer didn't think | |
| of the player trying that. | |
| This mess can be avoided by building the game's verb routines around a system | |
| of attributes. By defining the attribute "flammable" and making the Burn | |
| verb check for the presence of this attribute, the programmer allows the | |
| player to burn down anything flammable, such as furniture, papers, doors, and | |
| fireplace logs. Thus, he defines the generalized phenomenon of "burning" | |
| rather than coding individual cases. | |
| Once the programmer has decided on the set of attributes he wishes to | |
| support, he need only write individual action routines that manipulate those | |
| attributes, then decide which attributes apply to each of his objects. Thus, | |
| some whiskey could be defined as having the attributes liquid, edible, and | |
| flammable. Groups of attributes can work together in subtle and wonderful | |
| ways, allowing the player to perform activities in ways not even considered | |
| by the programmer. Would you think of putting moonshine whiskey in a lantern | |
| if you had no kerosene? | |
| RECOGNIZING NATURAL LANGUAGE | |
| Understanding natural language has proven to be a difficult problem in AI, | |
| but much can be accomplished in a participant novel by recognizing relatively | |
| simple statements in the following syntactic form: | |
| [Subject] VerbClause [DirectObject] [Preposition IndirectObject] | |
| Optional sections of the syntax are enclosed in brackets. Each word in a | |
| command typed by the player is looked up by a dictionary routine and is | |
| identified as a verb, noun, adjective, or whatever. The most simple command | |
| or "statement of intent" is simply a verb. Instructions to another person | |
| are indicated by using that person's name as the subject of the sentence, as | |
| in "Sam, run!". Omitting the subject of the sentence implies that the player | |
| himself is the actor. Direct and indirect objects are noun clauses, which | |
| consist of some optional adjectives followed by a noun. Prepositions like | |
| "at," "in," and "on" separate a direct object from an indirect object. | |
| Conjunctions (such as "and"), punctuation, and articles (such as "a" and | |
| "the") can be tossed out by the parser as "noise", or verified to appear in | |
| the expected positions. | |
| This syntax will accept input strings such as "Put the gray stone into the | |
| large oak chest". By further extending the parser to allow a list of objects | |
| as a direct object and allowing sentences to be strung together, the program | |
| can recognize statements like "Open the chest, take the lamp, the knife, and | |
| the long rope, then climb the stairs". Since everyone has a slightly | |
| different way of saying what they mean, the program must have a very large | |
| dictionary with links between synonyms. Thus, "Set the lamp on the table" | |
| and "Place the lantern onto the table" could be recognized as having the same | |
| meaning. | |
| Travel is something that happens frequently in this type of game, so it | |
| should be very easy for the player to express what is desired. The parsing | |
| scheme above will recognize travel syntax such as "Climb the tree" or "Enter | |
| the cottage," but not "Go east" or its abbreviated forms, "East" and "e". | |
| Recognition of these few irregular forms can be added to the parser as a | |
| special case. | |
| The parser should be able to recognize modifiers for verbs. The simple case | |
| is the use of an adverb, as in "Carefully open the door". A more complex | |
| case is where the player uses a verb and a preposition together in a way that | |
| changes the meaning of the verb. For example "put down" means drop, "put | |
| out" means extinguish, and "put on" means wear. This can be implemented by | |
| creating a table with verb/preposition pairs followed by the resulting verb. | |
| When a verb/preposition pair in the command string matches an entry in the | |
| table, it can be replaced with the resulting verb. Parsing then continues as | |
| if no preposition has been seen yet. This technique also allows recognizing | |
| a command with a dangling preposition such as "Take the cloak off" which | |
| would otherwise be rejected by the parser as an incomplete sentence. | |
| We are all accustomed to using pronouns (such as "it") and expecting our | |
| listener to resolve what we mean by reviewing the preceding context. A | |
| natural language parser might be expected to do the same. A simple rule for | |
| implementing parsing of "it" could be "Replace the 'it' with the last direct | |
| object named". This allows proper parsing of "Put the lamp on the table and | |
| light it," but not "Put the lamp into the chest, then close it". In the | |
| second statement, the player probably meant to close the chest, not the lamp. | |
| Changing the rule to use the last direct or indirect object in place of "it" | |
| won't work in all cases either. The effect of this rule on the command "Put | |
| the lamp on the table and light it" would be to set the table on fire. While | |
| it is possible to define the semantics of "it" for each of these cases | |
| through complex programming, it might make more sense to choose a simple | |
| rule, such as "use the last direct object in place of 'it' in all cases," | |
| then inform the user of the limitations of the parser through the game | |
| documentation. | |
| The parser could be expanded to allow words such as "everything" or "all" in | |
| place of a direct object. This lets the player make statements like "Drop | |
| everything into the chest". Note that the semantics of "everything" change | |
| with the verb used. In "Drop everything," "everything" really means | |
| "everything that I'm holding," while in "Get everything," it means | |
| "everything that I'm not holding". For some verbs, such as "Examine," the | |
| scope of the word "everything" is "all visible objects, whether held or not". | |
| The player may also wish to set the scope of "everything" to a limited | |
| domain, as in "Look at everything on the table". Regardless of how | |
| "everything" is used in a sentence, it can be implemented by simply calling | |
| the verb action routine once for every object within the the command's scope. | |
| GENERATING ENGLISH | |
| While interacting with a participant novel, the player is going to read vast | |
| quantities of text. Some of this may be "canned text" and some will need to | |
| be generated. The constant attributes of a location could be described with | |
| fixed text, but certainly not the descriptions of the objects there. Since | |
| most objects can be moved around, placed in and on each other, and change | |
| state, the program must be capable of dynamically producing the text to | |
| describe them. The key to generating interesting text is to vary the length | |
| of the sentences generated, vary the sentence structure, and use different | |
| synonyms whenever possible. For example, to describe the "children" of a | |
| table the program might generate: "There is a knife, some bread, and a key | |
| on the table," or "A knife, some bread, and a key are on the table," or | |
| "Resting on the table are a knife, some bread, and a key". The description | |
| generator would have a list of synonymous phrases such as "resting on," | |
| "lying on," and "sitting on". | |
| A text generator must pay careful attention to the issue of grammar. When a | |
| sentence contains a list of objects, the objects must be counted so that the | |
| conjunction "and" can be inserted before the last object. Lists containing | |
| more than one object will use "is" instead of "are" (except when using the | |
| form "There is a <list>...."). Each object in a list should be followed by a | |
| comma (except the last two) and preceded by a the appropriate article ("a," | |
| "the" or "some"). Selecting the article for an object depends on the | |
| attributes of the object and the type of sentence. When introducing an | |
| ordinary object, the article is "a" or "an". An object that has the | |
| attribute of being "a quantity" (bread, wine, cheese) the article would be | |
| "some". Familiar people would be mentioned by name, without any preceding | |
| article. Objects that were mentioned recently would have the article "the". | |
| The attributes of an object will also dictate the type of preposition used in | |
| the generated text. For instance, describing the children of a surface (such | |
| as a table) would dictate the use of "on," while containers would have things | |
| "in" them. | |
| The result of the player's actions can be reported via a text generation | |
| facility as well. When the player enters a vague command such as "Drink," | |
| the program would give all the details of what happened, like "You drink some | |
| cool water from the stream," thus informing the player that his canteen has | |
| not been taxed. This reporting mechanism should make use of pronouns (like | |
| "he," "she," or "it") when an object is mentioned more than once in a | |
| sentence, to avoid generating text that sounds too mechanical. An example of | |
| this would be "You give the bowling ball to Sam, but he drops it". As | |
| before, the attributes of an object will aid in selecting the appropriate | |
| pronoun. | |
| FRAME-BASED EXCEPTIONS | |
| Any non-trivial model of reality will contain a large number of objects whose | |
| characteristics and behaviors differ from normal expectations. These | |
| exceptions might range from personality quirks to magical or "high- | |
| technology" physical properties that deviate from the norm. A computer model | |
| of such a world needs a simple method to define, organize, and deal with all | |
| these exceptions. One method is to attach to each exceptional object some | |
| information about the context in which an exception applies, and the effect | |
| of invoking that exceptional behavior. The effect is defined as a fragment | |
| of executable code. The context is defined in a semantic frame. In AI, a | |
| frame may be defined as "an instantiation of a context". In this scheme, a | |
| frame is a simple template for a single semantic action using a specific verb | |
| type and specific objects. Any object can point to a unique frame, which | |
| points to a code fragment. This way, a frame defines the semantic context in | |
| which an exception handling code fragment applies. | |
| [Addendum, August, 1991: Note that in object-oriented languages, frames are | |
| a built-in construct (usually called "selectors"). Thus an object-oriented | |
| language makes it very easy to implement a base of general rules with | |
| hierarchies of exceptional rules (methods)]. | |
| Assume we had modeled the mouth of a volcano as a large container. Normal | |
| semantics for an object tossed into container dictate that the object will | |
| come to rest inside the container. In this case, however, we want to have | |
| the object be completely destroyed, thus eliminating any possibility of | |
| getting it back. The volcano could then be defined with a frame that reads | |
| "Throw anything into the volcano", to provide the context in which the | |
| exception will be invoked. Whenever an actor throws any object into the | |
| volcano, this frame is selected as matching the actor's action. After the | |
| default processing for this action, the code fragment attached to the frame | |
| is executed. The code fragment would cause the object to be destroyed, and | |
| perhaps display a colorful description of the event. | |
| Frames may specify a specific context (by requiring a specific verb, direct | |
| object, and indirect object) or a more general context (with a range of | |
| verbs, and "wild card" matching of direct or indirect objects). Moreover, | |
| some frames must take precedence over others. Any action might match the | |
| frames attached to the direct object, the indirect object, the location, or | |
| the character performing the action. Since an action might invoke several | |
| conflicting exceptions, the game program must select the one frame that is | |
| the best match to this action. In most cases, "best match" means selecting | |
| the matching frame with the most specific statement of action. (Other | |
| criteria for precedence may be used as well, such as selecting the frame | |
| attached to the direct object rather than the indirect object). Consider the | |
| case of an explosive with a context frame stating "Throw the explosive into | |
| the volcano". This frame is more specific than the volcano's "Throw anything | |
| into the volcano", so the explosive's frame is selected. The code fragment | |
| can then define some effect that is an exception to the "normal" exception. | |
| An object may have any number of frames attached to it, and multiple frames | |
| may point to the same code fragment. A code fragment may override the | |
| default semantics for an action completely, or simply append additional | |
| semantics. Finally, objects may be defined which inherit frames from other | |
| objects, which speeds development and lowers the chance of coding error when | |
| defining objects with similar exceptional properties. It is important to | |
| note that the use of semantic frames is intended only for those actions or | |
| properties that are exceptional, and should therefore be used as sparingly as | |
| possible. Attaching the same exception frame to many objects can be an | |
| indication that it is not so exceptional after all. In this case, one may | |
| want to define a new object attribute for this "non-exceptional" exception, | |
| and handle those objects via the normal semantics routines. | |
| Incidentally, one can use a scheme of frames and code fragments to control | |
| behavior for simulated actors in the story, but the results are very limited. | |
| The actor becomes only a "stimulis-response" being; he responds only to | |
| actions or commands from the player, and speaks only when spoken to. For | |
| example, an actor named Sam might have a frame attached to him, stating | |
| "Throw anything at Sam". If the player (or another actor) throws any object | |
| at Sam, the attached code fragment is executed, which might put out a message | |
| like: "Sam is outraged by the insult, and draws his weapon". Clearly, the | |
| actor can only demonstrate those behaviors that were pre-programmed. Things | |
| become much more interesting when actors are capable of original behavior. | |
| Each computer-controlled actor should be able to recognize a need, then | |
| create a plan to fulfill it. Different personality types (based on | |
| attributes, once again) would give rise to different types of behavior. | |
| GOALS AND SUB-GOALS | |
| A standard feature in many adventure games is the requirement that the player | |
| remember long sequences of actions in minute detail and type these sequences | |
| when needed. Forgetting to perform one of the steps in a sequence results in | |
| an error message. For example, if the player sees a bottle of beer on the | |
| table and says "Drink the beer," he is likely to get an error message like | |
| "You aren't holding the beer" or "The beer isn't open". Memorizing detailed | |
| sequences of instructions is not what most people would call fun. Wouldn't | |
| it be much nicer if the computer could just "understand" the player's intent? | |
| It turns out that it is relatively easy to create a system where the | |
| programmer defines a corrective action for such an error, instead of giving | |
| an error message. By defining the prerequisite state attributes for any | |
| action, the system can automatically break a goal into sub-goals. Thus, when | |
| the Drink routine finds that the player is not holding the beer, a new sub- | |
| goal is set: get the beer. Upon resolving that sub-goal, the Drink routine | |
| is re-entered. It then checks to see if this container is open and if not, | |
| it sets a new goal: open the container. In the end the player is told of | |
| how all these events proceeded with a message like: "You take the beer from | |
| the table, open it, and drink from it". | |
| Filling in the implied prerequisite actions is a simple form of planning, | |
| called backwards chaining. Any new goals must be stored on a stack rather | |
| than in a static structure, since any goal can create new sub-goals, which | |
| may create others. Each goal must be resolved in order, the latest goal | |
| first. The game program simply attempts the action on the top of the stack, | |
| pops it off if successful, or pushes new sub-goals if needed. It is | |
| important to note that only errors concerning variable attributes are | |
| correctable. Errors caused by conflicts in fixed attributes (such as | |
| attempting to set fire to a stone) are not correctable by creating a new | |
| sub-goal. If a non-correctable error occurs, the player is informed and his | |
| goal stack is cleared. | |
| Note that once all of the verb routine have all their prerequisite states | |
| defined as sub-goals, it becomes very easy to simulate intelligent behavior | |
| by other actors in the story. For example, if the player asks another | |
| character to "Go out, get the rope and return," the character appears to make | |
| intelligent decisions like opening the door before attempting to exit, and | |
| untying the rope from a tree before attempting to walk away with it. The | |
| pace of the game remains brisk since the player need not specify each | |
| detailed step in a process. The software "understands" what is implied by | |
| the player. | |
| SUMMARY | |
| By selecting a simple, yet powerful representation for the system, the | |
| designer can create a much more dynamic, consistent model of the novel's | |
| fantasy world. Building each verb routine around a set of attributes instead | |
| of coding special cases allows the player to perform actions that were not | |
| thought of by the designer. A well defined parsing scheme, aided by a large | |
| dictionary, allows a player to form his commands many different ways, instead | |
| of just the one "right way". Dynamic generation of text in a variety of | |
| formats can make the computer generated text much more interesting for the | |
| player to read. Defining exceptions to normal semantics via frame-activated | |
| code fragments aids in managing the complexity of such exceptions. Routines | |
| that can detect simple state errors and create correcting sub-goals can fill | |
| in the implied actions, and thereby help the game keep pace. Of course, even | |
| the best software needs a well written story to make an interesting | |
| participant novel, but that a topic for another paper altogether. | |
Xet Storage Details
- Size:
- 23.2 kB
- Xet hash:
- 6c00218b7142c5c4d197a36a76ada66d8eae6944bebbca8d7147ce9a83acb7bc
·
Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.