Sunday, February 5, 2017

Parsing in ZILF, part 3: From noun phrase to object

This is the third in a series of posts describing ZILF's parser. Read part 2 here.

In the second post, we covered noun phrases -- phrases like "all cubes except red" and "lamp, food, and bottle" that the player can type to refer to objects.

We also covered how the parser recognizes them and how it represents them in memory: the PARSE-NOUN-PHRASE routine scans the player's command and fills in a NOUN-PHRASE structure, which holds a list of adjective/noun pairs called OBJSPECs.

But how does the parser identify which objects the player is referring to?

After MATCH-SYNTAX finds a syntax line that matches the player's command, the FIND-OBJECTS routine combines the noun phrases with the find flag and search options from the syntax line, and decides what needs to be done for each object required by the syntax line.

There are a few possibilities, depending on what the player typed. It can:

  • match a noun phrase,
  • expand a pronoun,
  • supply a missing object,
  • ask the player to clarify,
  • or fail, printing an error message.


Matching a noun phrase

The most obvious choice: if the player typed "shiny lamp", we want to find an object whose ADJECTIVE and SYNONYM properties contain the words SHINY and LAMP, somewhere within the player's reach.

The routine MATCH-NOUN-PHRASE does just that. It runs through the OBJSPECs included by the noun phrase -- called YSPECs, "Y" for "yes" -- and for each one, it uses a MAP-SCOPE loop to search for objects that match the YSPEC and aren't excluded by any NSPECs ("except" clauses). The objects that match are written into a table (P-PRSOS or P-PRSIS), where they'll eventually be used to perform the action.
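To make the YSPEC/NSPEC idea concrete, here is a minimal Python model of that loop. It is a sketch, not ZILF's code: the `Obj` and `ObjSpec` classes, `spec_matches`, and the plain list standing in for MAP-SCOPE are all assumptions made for illustration, and the matching test is deliberately simpler than the real one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Obj:
    name: str
    adjectives: frozenset
    synonyms: frozenset


@dataclass(frozen=True)
class ObjSpec:
    # Hypothetical stand-in for an OBJSPEC: an adjective/noun pair.
    adjective: Optional[str] = None
    noun: Optional[str] = None


def spec_matches(spec, obj):
    # Simplified test: every word the spec supplies must be on the object.
    adj_ok = spec.adjective is None or spec.adjective in obj.adjectives
    noun_ok = spec.noun is None or spec.noun in obj.synonyms
    return adj_ok and noun_ok


def match_noun_phrase(yspecs, nspecs, scope):
    # For each YSPEC, walk the scope and keep objects that match it
    # and are not excluded by any NSPEC ("except" clause).
    matches = []
    for yspec in yspecs:
        for obj in scope:
            if not spec_matches(yspec, obj):
                continue
            if any(spec_matches(nspec, obj) for nspec in nspecs):
                continue
            if obj not in matches:
                matches.append(obj)
    return matches


RED_CUBE = Obj("red cube", frozenset({"RED"}), frozenset({"CUBE"}))
BLUE_CUBE = Obj("blue cube", frozenset({"BLUE"}), frozenset({"CUBE"}))
LAMP = Obj("lamp", frozenset({"SHINY"}), frozenset({"LAMP"}))
SCOPE = [RED_CUBE, BLUE_CUBE, LAMP]

# "all cubes except red"
result = match_noun_phrase(
    [ObjSpec(noun="CUBE")], [ObjSpec(adjective="RED")], SCOPE
)
print([obj.name for obj in result])  # ['blue cube']
```

The returned list plays the role of the P-PRSOS or P-PRSIS table in this model.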

Scope stages

MAP-SCOPE is a powerful, flexible loop statement, which chooses a set of scope stages (from scope.zil) and runs the loop body for each object encountered in those stages.

In part 1, we saw that the syntax line contains search options telling the parser where to look for the direct and indirect objects. Those options correspond to scope stages: if we give MAP-SCOPE a set of search options, it'll only use the stages corresponding to those options.

When it comes to interpreting a player's command, though, the search options are really only guidelines. If the player types "pick up axe" when there are two axes available, they probably don't mean the axe they're holding, so we only want to match the one they aren't holding. But if there's only one axe and they're holding it, we want to settle for that one, so we can come up with a better error message than "You don't see that here."

So the parser might need to do more than one search for the same YSPEC: once with the scopes suggested by the syntax line, and again with a wider set of scopes if it didn't match any objects the first time.
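The simple version of that retry can be modeled in a few lines of Python. This is only a sketch of the idea: `search_with_fallback` and its arguments are hypothetical names, scopes are plain lists of object names, and the special cases mentioned next are left out entirely.

```python
def search_with_fallback(matches_spec, preferred_scope, wider_scope):
    # First pass: only the scopes suggested by the syntax line.
    found = [obj for obj in preferred_scope if matches_spec(obj)]
    if found:
        return found, False
    # Second pass: nothing matched, so retry with the wider set of scopes.
    return [obj for obj in wider_scope if matches_spec(obj)], True


def is_axe(name):
    return "axe" in name.split()


# "pick up axe" with one axe on the floor and one in hand:
print(search_with_fallback(
    is_axe, ["floor axe", "lamp"], ["floor axe", "lamp", "held axe"]))

# Same command when the only axe is the one being held:
print(search_with_fallback(is_axe, ["lamp"], ["lamp", "held axe"]))
```

The second element of the result tells the caller whether the match came from the widened search, which is what lets the verb produce a more specific complaint about an axe the player already holds.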

(That's an oversimplification. The way it actually works is... complicated... in order to address some special cases. It seemed like a good idea at the time.)

Match quality and INVISIBLE

OBJSPECs contain an adjective and a noun, and they can match an object using either or both. In fact, since a word can be both a noun and an adjective, an OBJSPEC that only contains a noun can match an object's adjective.

If the player types "get polish" when there's a Polish sausage and a can of shoe polish available, we assume they mean the can of shoe polish, because the command makes more grammatical sense with a noun than an adjective. But if only the sausage is available, we accept a match using only the adjective.

That logic is in the REFERS? routine, which checks an OBJSPEC against an object's ADJECTIVE and SYNONYM properties and returns a match score: 0 for no match, 1 for adjective-only, 2 for noun-only, and 3 if the adjective and noun both matched. Only the highest-scoring objects are kept as matches: a single noun match takes precedence over all the adjective-only matches.

As a special case, objects with the INVISIBLE flag can never match. The parser skips them without even calling REFERS?.
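Here is a small Python model of the scoring and the "keep only the best" rule, using the sausage and shoe polish from above. The function names and the `Obj` class are made up for this sketch, and one detail is an assumption on my part rather than something stated here: when a spec supplies both an adjective and a noun, the model requires both to match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    name: str
    adjectives: frozenset = frozenset()
    synonyms: frozenset = frozenset()
    invisible: bool = False


def refers(adj, noun, obj):
    # Scores follow the text: 0 none, 1 adjective-only, 2 noun-only, 3 both.
    if adj and noun:
        # Assumption: a two-word spec needs both words to match.
        return 3 if adj in obj.adjectives and noun in obj.synonyms else 0
    word = noun or adj
    if word in obj.synonyms:
        return 2
    if word in obj.adjectives:
        return 1
    return 0


def best_matches(adj, noun, scope):
    best, kept = 0, []
    for obj in scope:
        if obj.invisible:
            continue  # skipped before the scoring routine is even called
        score = refers(adj, noun, obj)
        if score == 0 or score < best:
            continue
        if score > best:
            best, kept = score, []  # a better score discards weaker matches
        kept.append(obj)
    return kept


SAUSAGE = Obj("Polish sausage", frozenset({"POLISH"}), frozenset({"SAUSAGE"}))
CAN = Obj("can of shoe polish", frozenset({"SHOE"}), frozenset({"CAN", "POLISH"}))

# "get polish" with both present, then with only the sausage present
print([o.name for o in best_matches(None, "POLISH", [SAUSAGE, CAN])])
print([o.name for o in best_matches(None, "POLISH", [SAUSAGE])])
```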

OBJSPECs that don't match

Usually, it's an error if any of the OBJSPECs in a NOUN-PHRASE returns no matches: if the player types "get lamp and sword" when there's no sword, we assume they won't settle for just the lamp.

But there's an exception for "except". If every object matched by a YSPEC is excluded by one of the NSPECs, the parser lets it slide, so "get all swords and shields except rusty" can succeed even if every sword available is rusty. If every shield is rusty too, though, the parser will complain "There are none at all available!"
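The difference between "matched nothing" and "matched, but everything was excluded" is easier to see in code. The Python below is a rough model under stated assumptions: specs are plain predicates on object names, `resolve_specs` and `word` are hypothetical helpers, and the text of the first error is a placeholder rather than a message the parser prints.

```python
def resolve_specs(yspecs, nspecs, scope):
    """Return (objects, error); error is None on success."""
    kept = []
    for yspec in yspecs:
        raw = [obj for obj in scope if yspec(obj)]
        if not raw:
            # A YSPEC that matches nothing at all is an error.
            return [], "no match for one of the YSPECs"  # placeholder text
        for obj in raw:
            # A YSPEC whose matches are all excluded is allowed to slide.
            if not any(nspec(obj) for nspec in nspecs) and obj not in kept:
                kept.append(obj)
    if not kept:
        return [], "There are none at all available!"
    return kept, None


def word(w):
    # Hypothetical helper: a spec that matches names containing the word.
    return lambda name: w in name.split()


specs = [word("sword"), word("shield")]
excluded = [word("rusty")]

# "get all swords and shields except rusty"
print(resolve_specs(specs, excluded, ["rusty sword", "rusty shield", "bronze shield"]))
print(resolve_specs(specs, excluded, ["rusty sword", "rusty shield"]))
```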

GENERIC-OBJECTS and pseudo-objects

GENERIC-OBJECTS is like GLOBAL-OBJECTS, but for objects that should only match as a last resort. Typically those are concepts like NUMBER, or conversation topics, or placeholders for objects the player can refer to when they're not present (like an NPC they might want to follow).

The parser never considers GENERIC-OBJECTS until a YSPEC fails to match in all the other scopes, at which point it enables "ludicrous scope" and starts over.

ALL by itself

Commands with no YSPECs, like "take all" and "drop all but lamp", form a special case that skips most of the steps above.

In this case, the parser calls MAP-SCOPE to find everything in the player's reach, but instead of REFERS?, it calls ALL-INCLUDES? to filter out objects that "all" shouldn't apply to. That means INVISIBLE objects and the WINNER for sure. For TAKE and DROP, it also skips any objects that need special care to pick up (ones with TRYTAKEBIT) or can't be picked up at all (ones missing TAKEBIT).

Also, in this case, GLOBAL-OBJECTS and GENERIC-OBJECTS are always skipped. We assume the player means all the nearby objects, not their hands, the sun, or conversation topics. LOCAL-GLOBALS can still match.
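The filter described above for "all" boils down to a handful of flag checks. This Python sketch models objects as dictionaries with a set of flag names; `all_includes` and the dictionary layout are assumptions for illustration, and the scope restrictions (skipping GLOBAL-OBJECTS and GENERIC-OBJECTS) are not modeled.

```python
def all_includes(obj, verb, winner):
    flags = obj["flags"]
    # Never include invisible objects or the actor themselves.
    if "INVISIBLE" in flags or obj is winner:
        return False
    if verb in ("TAKE", "DROP"):
        # Skip things that need special care or can't be picked up at all.
        if "TRYTAKEBIT" in flags or "TAKEBIT" not in flags:
            return False
    return True


PLAYER = {"name": "player", "flags": set()}
LAMP = {"name": "lamp", "flags": {"TAKEBIT"}}
CAT = {"name": "cat", "flags": {"TAKEBIT", "TRYTAKEBIT"}}
TABLE = {"name": "table", "flags": set()}
GHOST = {"name": "ghost", "flags": {"TAKEBIT", "INVISIBLE"}}

reach = [PLAYER, LAMP, CAT, TABLE, GHOST]
print([o["name"] for o in reach if all_includes(o, "TAKE", PLAYER)])
print([o["name"] for o in reach if all_includes(o, "EXAMINE", PLAYER)])
```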

Returning the matches

If an error occurs in any step above, MATCH-NOUN-PHRASE returns zero after printing an error message.

If a single object was matched, it returns that object.

If multiple objects were matched, a few things can happen:
  • If the mode is "all" or if the player clearly asked for more than one object (by providing multiple OBJSPECs), it returns the placeholder MANY-OBJECTS, which tells PERFORM to repeat the action for all the objects in P-PRSOS or P-PRSIS.
  • If the mode is "any", it picks a random object from the set of matches, prints a clarifying message, and returns that object.
  • If any of the matched objects have a GENERIC function, it calls them all until one of them returns an object.
  • If none of that worked, it asks the player to clarify which matched object they mean, and puts the command on hold until they answer. (This is what most languages call disambiguation. In ZIL, it's called orphaning, and we'll cover it in more detail in a later post.)
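The order of those decisions can be sketched in Python. Everything here is a model rather than ZILF's code: objects are plain strings, GENERIC functions are looked up in a dictionary, the `ORPHAN` marker stands in for asking the player, and the mode name "one" for the ordinary case is an invented label. The real parser also prints a clarifying message in the "any" case, which this sketch omits.

```python
import random

MANY_OBJECTS = "MANY-OBJECTS"
ORPHAN = "ORPHAN"  # hypothetical marker: ask the player which one they mean


def choose_result(matches, mode, spec_count, generics=None, rng=random):
    generics = generics or {}
    if not matches:
        return None  # stands in for returning zero after an error
    if len(matches) == 1:
        return matches[0]
    if mode == "all" or spec_count > 1:
        return MANY_OBJECTS  # the action is repeated for every match
    if mode == "any":
        return rng.choice(matches)
    for obj in matches:
        generic = generics.get(obj)
        if generic is not None:
            picked = generic(matches)
            if picked is not None:
                return picked
    return ORPHAN


keys = ["red key", "blue key"]
print(choose_result(keys, "all", 1))
print(choose_result(keys, "one", 1, {"blue key": lambda ms: "blue key"}))
print(choose_result(keys, "one", 1))
```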

That's all for MATCH-NOUN-PHRASE. But what if the player didn't name an object at all?

Expanding a pronoun

A noun phrase can also just be a pronoun. Before MATCH-NOUN-PHRASE is called, EXPAND-PRONOUN gets a chance to recognize the pronoun and load the appropriate list of objects.

The pronouns are defined in pronouns.zil, and the list of objects for each pronoun is saved either by the parser (which calls SET-PRONOUNS after a successful parse), or by explicitly calling THIS-IS-IT or CONTENTS-ARE-IT to make the pronouns refer to something the game mentioned.

Since the objects or the player may have moved since the list was written, EXPAND-PRONOUN also makes sure the player can still see them.
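As a rough Python model of that bookkeeping: a table maps each pronoun to the objects it last referred to, and expansion drops anything the player can no longer see. The table layout, the pronoun words "IT" and "THEM", and both function names are assumptions made for this sketch, and what the parser does when nothing survives the visibility check is not modeled.

```python
PRONOUNS = {"IT": [], "THEM": []}  # hypothetical pronoun table


def set_pronoun(word, objects):
    # Remember what the pronoun refers to, e.g. after a successful parse.
    PRONOUNS[word] = list(objects)


def expand_pronoun(word, visible):
    if word not in PRONOUNS:
        return None  # not a pronoun; fall through to ordinary matching
    # Things may have moved since the list was saved, so re-check visibility.
    return [obj for obj in PRONOUNS[word] if obj in visible]


set_pronoun("IT", ["lamp"])
set_pronoun("THEM", ["coins", "gems"])

print(expand_pronoun("IT", {"lamp", "sword"}))
print(expand_pronoun("THEM", {"gems"}))
```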

Supplying a missing object

Sometimes the syntax line that comes closest to matching the player's command still isn't a perfect match, because the command is incomplete.

For example, if the player types "enter" by itself, the only syntax line for that verb requires an object, so the parser needs to find an object to put there:

<SYNTAX ENTER OBJECT (FIND DOORBIT) (IN-ROOM) = V-ENTER>

The GWIM routine ("Get What I Mean") uses the find flag and search options to look for a single object that matches. If there's exactly one object with that flag (DOORBIT) in that scope (IN-ROOM), GWIM prints a clarifying message and uses that object as the missing noun. Otherwise, it fails, and FIND-OBJECTS orphans the command after asking "What do you want to enter?"

The find flag is optional, but important. If it's omitted, GWIM can succeed only if that scope contains exactly one object of any kind.
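The "exactly one candidate" rule is short enough to model directly. In this Python sketch the scope is a list of dictionaries with flag sets, and `gwim` is a hypothetical function that returns the chosen object or `None`; printing the clarifying message and orphaning the command are left to the caller.

```python
def gwim(find_flag, scope):
    # With no find flag, every object in the scope is a candidate.
    candidates = [
        obj for obj in scope
        if find_flag is None or find_flag in obj["flags"]
    ]
    # Succeed only when the choice is unambiguous.
    return candidates[0] if len(candidates) == 1 else None


OAK_DOOR = {"name": "oak door", "flags": {"DOORBIT"}}
TRAPDOOR = {"name": "trapdoor", "flags": {"DOORBIT"}}
RUG = {"name": "rug", "flags": set()}

print(gwim("DOORBIT", [OAK_DOOR, RUG]))            # the oak door
print(gwim("DOORBIT", [OAK_DOOR, TRAPDOOR, RUG]))  # None: two doors
print(gwim(None, [OAK_DOOR, RUG]))                 # None: no flag to narrow it
```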

KLUDGEBIT

GWIM is also called for perfectly good commands, when they happen to end with a preposition.

Recall from part 1 that every preposition in a syntax line has to be associated with a noun phrase. According to ZIL's primitive concept of sentences, "turn lamp off" isn't a valid sentence, and neither is "take inventory" ("inventory" isn't a noun phrase there; it's fixed syntax). But of course we want to support those syntaxes. The solution is to put the extra object slot in the syntax line, and mark it with a special find flag:

<SYNTAX TURN OBJECT (FIND DEVICEBIT) OFF OBJECT (FIND KLUDGEBIT) = V-TURN-OFF>

Now "turn lamp off" is parsed as if it were missing the second noun phrase, and when GWIM is called for it, it sees the KLUDGEBIT and silently fills in a special object for PRSI: ROOMS, which the player could never refer to otherwise.

This doesn't stop the player from providing a second noun phrase if they want to, which can be a mixed blessing. For example, the same syntax line that uses KLUDGEBIT to match "put brick down" could also match "put brick down chute"... but if the verb code isn't written to handle both uses, the second noun will simply be ignored.
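To tie the KLUDGEBIT case back to GWIM, here is a small Python model of the special case. It is a sketch under assumptions: `fill_missing_object` is an invented name, objects are dictionaries with flag sets, and the second element of the result simply records whether a clarifying message would be printed.

```python
ROOMS = {"name": "ROOMS", "flags": set()}  # stand-in for the special object


def fill_missing_object(find_flag, scope):
    """Return (object, announce) for a missing noun phrase."""
    if find_flag == "KLUDGEBIT":
        # The slot only exists to satisfy the syntax line: fill it silently.
        return ROOMS, False
    candidates = [obj for obj in scope if find_flag in obj["flags"]]
    if len(candidates) == 1:
        return candidates[0], True  # the player is told what was chosen
    return None, False  # the caller would orphan the command


LAMP = {"name": "lamp", "flags": {"DEVICEBIT"}}

# "turn lamp off": the second slot is filled without any message.
print(fill_missing_object("KLUDGEBIT", [LAMP]))

# "turn off" with only one device nearby: the first slot is inferred.
print(fill_missing_object("DEVICEBIT", [LAMP]))
```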
