Sunday, February 5, 2017

Parsing in ZILF, part 3: From noun phrase to object

This is the third in a series of posts describing ZILF's parser. Read part 2 here.

In the second post, we covered noun phrases -- phrases like "all cubes except red" and "lamp, food, and bottle" that the player can type to refer to objects.

We also covered how the parser recognizes them and how it represents them in memory: the PARSE-NOUN-PHRASE routine scans the player's command and fills in a NOUN-PHRASE structure, which holds a list of adjective/noun pairs called OBJSPECs.

But how does the parser identify which objects the player is referring to?

After MATCH-SYNTAX finds a syntax line that matches the player's command, the FIND-OBJECTS routine combines the noun phrases with the find flag and search options from the syntax line, and decides what needs to be done for each object required by the syntax line.

There are a few possibilities, depending on what the player typed. It can:

  • match a noun phrase,
  • expand a pronoun,
  • supply a missing object,
  • ask the player to clarify,
  • or fail, printing an error message.

Tuesday, July 19, 2016

It's an amazing smile, even the suit has teeth

(Note: vaporware: interactive fiction normally sticks to technical topics, but I'm making an exception to respond to a post directed at me that doesn't allow comments.)

In a way, it's flattering. Pillars of the IF community like Zarf and Emily Short have had detractors following them around for years, sniping at them in one forum after another. Now, I finally have one of my own.

And mine knows how to hold a grudge.

Last week, that grudge led him to write a number of false, demeaning things about me. Most of those were on a forum thread, which has since been expunged for code-of-conduct violations. Some were on a syndicated blog, which is why (with some regret) I'm responding here.

This all started about a year ago, when I made a flippant remark in an interactive fiction chat room run by this person in response to a bit of language policing: after one person made an announcement to the room, addressing it to "you guys [and] ladies", a second person took offense to this idiom for addressing a group and chided him to be more inclusive. I responded, sarcastically, that I too was offended because the second person's language theoretically still left some people out.

The moderator of this chat room warned me that such comments were unwelcome there, so I didn't repeat them. I wasn't banned, but I haven't returned there since last year, because there are a number of things I find unappealing about that environment, including a clumsy web interface. ifMUD is more welcoming, more usable, and it has a parrot.

I haven't really thought about that episode again until this weekend, when I saw his posts, which were prompted by a discussion in which other people mentioned feeling alienated in that chat room as a result of hostile interactions with the same moderator.

I guess he's been stewing over it all year, because today, his account of it is couched in so much hyperbole it's barely recognizable: misrepresenting the content of our exchange, assigning me motivations that are insulting and divorced from reality, and abusing words like "deliberately" and "explicitly" to dress up false assumptions in the language of impartial observations. He even accuses me of an unexplained "pattern" of "nasty behavior" that, from what I can tell, refers to a couple times over the years when I've defended critics and competition judges against attacks very similar to the ones he's making now.

I won't respond point-by-point here, partly because those attacks don't deserve any more airtime, and partly because I hope this was caused by misunderstanding rather than malice. I'd rather not burden him with being forever linked to those remarks if he chooses to retract them.

He could have reached out to me at any time in the past year to clear this up, but maybe he's been busy. I know when I get into the groove working on Guncho or ZILF, it can be hard to find time for anything else besides sleep and work, so this could be a sign that something great is in the works.

(Now back to the usual topics.)

Monday, November 30, 2015

New optimizations coming in ZILF 0.8

Left: Code generated by ZILF 0.7 (11 instructions).
Right: Code after new optimizations (1 instruction).


How ZILF turns the nasty code on the left into a single instruction, using three new optimizations:

==1==

?L14: BAND STACK,2048 >STACK
BAND STACK,32767 >STACK
ZERO? STACK /?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6:  PUSH 0
?L7: ZERO? STACK \FALSE

Combine sequential bitwise operations (this is new). BAND STACK,32767 goes away, because 2048 & 32767 == 2048.

==2==

?L14: BAND STACK,2048 >STACK
ZERO? STACK /?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7:  ZERO? STACK \FALSE

Convert single-bit BAND+ZERO? test to BTST (this is new). BAND STACK,2048 >STACK followed by ZERO? STACK /?L12 becomes BTST STACK,2048 \?L12, because 2048 is a power of two.

==3==

?L14: BTST STACK,2048 \?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7:  ZERO? STACK \FALSE

Bypass controlled conditional branch (this is new). PUSH 1 followed by a JUMP to ZERO? STACK \?L6 becomes JUMP ?L6, because we know the negative ZERO? branch will be taken.

==4==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6:  PUSH 0
?L7:  ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 0 followed by ZERO? STACK \?L6 becomes a JUMP to the new ?L15 (the instruction after ZERO?), because we know the negative branch won't be taken.

==5==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7: ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 1 followed by a JUMP to ZERO? STACK \FALSE becomes RFALSE, because we know the negative branch will be taken.

==6==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: RFALSE
JUMP ?L7
?L6: PUSH 0
?L7: ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 0 followed by ZERO? STACK \FALSE becomes a JUMP to the new ?L16, because we know the negative branch won't be taken.

==7==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: RFALSE
JUMP ?L7
?L6: JUMP ?L16
?L7: ZERO? STACK \FALSE
?L16: ...

Eliminate dead code and unused labels.

==8==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
?L12: JUMP ?L15
?L15: RFALSE
?L6:  JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Eliminate branch to next instruction. JUMP ?L15 goes away; ?L15 merges with ?L12.

==9==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Disintermediate branch. BTST STACK,2048 \?L12 becomes BTST STACK,2048 \FALSE, because the instruction at ?L12 is RFALSE.

==10==

?L14: BTST STACK,2048 \FALSE
JUMP ?L6
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Disintermediate branch. JUMP ?L6 becomes JUMP ?L16, because the instruction at ?L6 is JUMP ?L16.

==11==

?L14: BTST STACK,2048 \FALSE
JUMP ?L16
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...


Eliminate dead code and unused labels.

==12==

?L14: BTST STACK,2048 \FALSE
JUMP ?L16
?L16: ...

Eliminate branch to next instruction. JUMP ?L16 goes away.

==13==

?L14: BTST STACK,2048 \FALSE

Sunday, October 25, 2015

Parsing in ZIL, part 2: Noun phrases

This is the second in a series of posts describing ZILF's parser. Read part 1 here, or read part 3 here.

In the first post, we covered the basic structure of a command: verbs, prepositions, and noun phrases. Recognizing a verb or preposition is easy; it's just a word tagged with that part of speech. Once the parser has the verb and prepositions, it uses them to find a matching grammar line. But how does it recognize noun phrases, and what does it do with them?

What is a noun phrase, anyway?

Here are some sample commands, with the noun phrases in bold:
throw axe at dwarf
get shiny brass lamp, tasty food, and bottle
drop any
get set of keys
get all except sign
put all cubes except red and blue in the chute
A noun phrase is anything the player can type to identify the object of a command. It can refer to a single object, like "axe"; a list of objects, like "lamp, bottle, and food"; a set of objects answering to the same name, like "cubes"; or a set of objects implicitly defined by what the player can see, like "all" or "any". It can exclude objects with "except" or "but". And it can quantify what you mean when there's more than one similar object: do you want all of them, a specific one, or any random one?

The grammar of an acceptable noun phrase is, more or less:
[all/any] [article] [adjective...] [noun]
The individual parts are optional, but at least one must be given. Larger phrases can also be made by stringing small ones together with commas or words like "and", "except", "but", and "of":
[noun phrase], [noun phrase] and [noun phrase] of [noun phrase] except [noun phrase] and [noun phrase]
When the parser encounters a noun phrase, it records the important parts in a data structure called NOUN-PHRASE, which it uses later to find matches within the set of objects available to the player. The structure itself is very simple, containing only a "mode", a list of adjective/noun pairs (called OBJSPECs) to include, and another list of OBJSPECs to exclude.

Let's see what the parser does with some of the noun phrases from above:

"axe"

Mode default
Include (*, AXE)
Exclude none

Simple enough: we're just looking for an object that lists AXE in its SYNONYM property. We're using the default match mode, which means if there's more than one "axe" in sight, the parser will ask which one you mean.

"shiny brass lamp, tasty food, and bottle"

Mode default
Include (SHINY, LAMP)
(TASTY, FOOD)
(*, BOTTLE)
Exclude none

This time we have a few pairs in the Include list, and two of them have adjectives. The parser will look for objects with SHINY in their ADJECTIVE properties and LAMP in their SYNONYM properties. Likewise for TASTY and FOOD. Notice that BRASS is nowhere to be seen -- the parser only remembers one adjective per OBJSPEC.

But BOTTLE can be surprising in a game like Adventure where "bottle" and "bottled water" are both objects. Because of Z-machine limits, "bottle" and "bottled" are treated as the same word: if the player types "get bottled", referring to the bottled water by its adjective, to the parser it looks the same as "get bottle", referring to the bottle by its noun. The parser needs to support both. So even though the structure lists BOTTLE as a noun, the parser will see there's no adjective paired with it and notice that BOTTLE can also be used as an adjective, and it'll look for it in the ADJECTIVE property as well. This is a lower-quality match, so if another nearby object has the word in its SYNONYM property, the parser will pick that one instead.

"set of keys"

Mode default
Include (*, KEYS)
Exclude none

Hey, what happened to SET?

Object names containing "of" are special. SET and KEYS are both in the object's SYNONYM property, but the parser only thinks about one adjective and one noun at a time, so they can't both be used. Adjacent nouns normally belong to separate noun phrases, e.g. "give troll money".

However, when the parser sees "of" in a noun phrase, it forgets the last noun it saw! So "set of keys" becomes simply "keys". This makes the parser's job easier, but at the cost of being unable to know whether the player wants the set of keys, the pile of keys, or the stack of boxes of pictures of keys. (It can still tell them apart by asking questions, as we'll see in the post on orphaning.)

"any"

Mode any
Include none
Exclude none

This time, we didn't give any nouns or adjectives, only a mode. Since the Include list is empty, the parser will choose from all available objects, and since the match mode is "any", it'll choose a random object if there's more than one match.

"all except sign"

Mode all
Include none
Exclude (*, SIGN)

Again, the Include list is empty, so the parser chooses from all available objects, then excludes any object with the word SIGN in its SYNONYM property. Since the match mode is "all", if there's more than one "sign" available, the parser will include exclude all of them.

"all cubes except red and blue"

Mode all
Include (*, CUBES)
Exclude (RED, *)
(BLUE, *)

This time, the Include and Exclude lists are both present. The parser will find all available objects with CUBES in their SYNONYM property, then exclude any that have RED or BLUE in their ADJECTIVE property.

Thursday, October 1, 2015

ZILF 0.7 and Adventure

Update: IF Archive links: ZILF 0.7, Advent.z3.

The IF Archive upload is still being processed, but for now you can download ZILF 0.7 from Bitbucket. (Or read the release notes.)

Among other things, this version includes a new sample project: a port of Adventure, aka Colossal Cave. Download Advent.z3 to play!

Sunday, September 20, 2015

Parsing in ZILF, part 1: The ideal sentence

This is the first in a series of posts describing ZILF's parser. Read part 2 here.

Writing one's own interactive fiction language is a popular pastime, but people who attempt it quickly find that the language itself is the easy part. The hard part is fleshing out the parser and world model to the standards that players have come to expect.

We've come a long way since Colossal Cave and its two-word sentences ("get lamp", "throw axe"). Most modern games understand things like:
  • Prepositions (the "up" in "pick up lamp")
  • Different actions for the same verb word ("look at" vs. "look in" vs. "look under")
  • Commands with two or more noun phrases ("put coin in slot")
  • Complex noun phrases ("take key, lamp, and all cubes except the red cube")
  • Multiple commands per line ("north. west. open door then go in.")
  • Ordering ("watson, take the prints to the crime lab")
  • Disambiguation ("take polish" => "Which do you mean, the shoe polish or the Polish sausage?")
  • Implied nouns ("take" by itself when there's only one thing to pick up)
  • Correcting mistakes ("oops") and repeating commands ("again")
In fact, even in the 80s, Infocom's games understood those, so ZILF really should as well.

In this series, I'll explain how ZILF deals with all of the above, but for now let's look at a simple command like "pick up lamp". Here's the source code that defines the syntax for that command:
<SYNTAX PICK UP OBJECT (FIND TAKEBIT) (MANY ON-GROUND IN-ROOM) = V-TAKE>
Look at the parts in turn:
  • PICK: The verb word. For this syntax to take effect, the verb in the player's command has to be "pick" (or one of its synonyms, if it had any). The compiler assigns a new verb number to "pick", which is given the constant name ACT?PICK; this would also become the verb number of any synonyms.
  • UP: A preposition. The compiler assigns a preposition number to "up", the constant PR?UP.
  • OBJECT: Shows how many objects the command wants. OBJECT can appear zero, one, or two times in the syntax line; since it appears once here, this is a one-object command.
  • (FIND TAKEBIT): The "find flag", hinting at what kind of objects the command prefers.
  • (MANY ON-GROUND IN-ROOM): The "search options", hinting at where and how the command prefers to find its objects.
  • V-TAKE: The name of the action routine, which contains the code that implements taking. The compiler assigns an action number based (mostly) on the name of the action routine, in this case the constant V?TAKE; the action number identifies what the player wants to do. There might be many different syntaxes — "get lamp", "take lamp" — that all use different verbs to represent the same action. (And yes, unfortunately, V for action numbers and ACT for verb numbers is the opposite of what you'd expect.)
Notice that nothing in that line says exactly which words have to appear in which order, and in fact the parser will accept "lamp pick up"! Unlike Inform, ZILF doesn't define a precise grammar for each command; it has a general idea of what makes a sentence, and it uses its rudimentary knowledge of English to extract the parts of a command and fit them into a pattern.

So, what makes a sentence? A verb, a preposition, a noun phrase, another preposition, and another noun phrase. Everything except the verb is optional, but there are a few rules: you can't have a second noun phrase without a first noun phrase, each preposition needs a corresponding noun phrase, and two or more prepositions in a row collapse into one (so "look up in air" becomes "look up air").

The parser looks for a verb, extracts the prepositions and noun phrases from the player's command to fit them into the ideal sentence pattern, then searches through all the syntax lines for that verb, hoping to find one that has the same prepositions and number of noun phrases that the player typed. If it finds one, it can move on to identifying which "lamp" the player is talking about.

If that doesn't work, the parser may still be able to find a syntax line that can be made to work with more information, in which case it'll ask a clarifying question ("What do you want to pick up?") and orphan the command, keeping it around just long enough to see if the next input gives more information. Or it may decide that none of the syntaxes can possibly match and issue an error.

[Update: Corrected the name of the preposition number constant.]

Saturday, May 31, 2014

Generating better code in ZILF 0.5

I've done a lot of work on the code generator and optimizer for the upcoming version of ZILF. Here's one example of how the generated code has been improved.

Parser.zil has a routine called TAKE-CONT-SEARCH, which helps V-TAKE determine whether the object being taken is in a container:
<ROUTINE TAKE-CONT-SEARCH (A "AUX" H)  
    <OBJECTLOOP I .A
        <COND (<OR <FSET? .I ,CONTBIT> <==? .I ,WINNER>>
                <COND (<IN? ,PRSO .I>
                        <SET H .I>
                        <RETURN>)
                    (ELSE
                        <SET H <TAKE-CONT-SEARCH .I>>
                        <AND .H <RETURN>>)>)>>
    <AND .H <RETURN .H>>>
In ZILF 0.4, that turned into 14 instructions:
      .FUNCT TAKE-CONT-SEARCH,A,H,I FIRST? A >I /?L1 ?L1: ZERO? I /?L3 FSET? I,CONTBIT /?L8 EQUAL? I,WINNER \?L12 ?L8: IN? PRSO,I \?L9 SET 'H,I JUMP ?L3 ?L9: CALL TAKE-CONT-SEARCH,I >H ZERO? H \?L3 ?L12: NEXT? I >I /?L1 JUMP ?L1 ?L3: ZERO? H \?L18 RETURN H ?L18: RETURN H 
In ZILF 0.5, it's only 11 after eliminating a ZERO?, a JUMP, and the duplicate RETURN:
.FUNCT TAKE-CONT-SEARCH,A,H,I
FIRST? A >I \?L3
?L19: FSET? I,CONTBIT /?L8
EQUAL? I,WINNER \?L12
?L8: IN? PRSO,I \?L9
SET 'H,I
JUMP ?L3
?L9: CALL TAKE-CONT-SEARCH,I >H
ZERO? H \?L18
?L12: NEXT? I >I /?L19
?L3: ZERO? H /FALSE
?L18: RETURN H
Here's another example. The REFERS? routine checks whether a noun, or adjective-noun pair, matches the vocabulary of some object:
<ROUTINE REFERS? (A N O)
    <AND <OR <0? .A> <IN-PB/WTBL? .O ,P?ADJECTIVE .A>>
         <IN-PWTBL? .O ,P?SYNONYM .N>>>
The ZIL code is pretty simple, but ZILF 0.4 handled it poorly, needing 12 instructions and a temporary variable:
.FUNCT REFERS?,A,N,O,?TMP
ZERO? A /?L5
PUSH 0
JUMP ?L6
?L5: PUSH 1
?L6: POP '?TMP
ZERO? ?TMP /?L4
PUSH ?TMP
JUMP ?L3
?L4: CALL IN-PBTBL?,O,P?ADJECTIVE,A >STACK
?L3: ZERO? STACK /FALSE
CALL IN-PWTBL?,O,P?SYNONYM,N >STACK
RSTACK
Nasty! ZILF 0.5 does it in five instructions, eliminating the temp variable and a whole lot of nonsense:
.FUNCT REFERS?,A,N,O
ZERO? A /?L2
CALL IN-PBTBL?,O,P?ADJECTIVE,A >STACK
ZERO? STACK /FALSE
?L2: CALL IN-PWTBL?,O,P?SYNONYM,N >STACK
RSTACK
Much of the benefit comes from better understanding the relationship between values and conditions, such as knowing when a value is immediately going to be tested against zero, or which branch instructions are testing conditions that have already been tested. The Cloak example is smaller by more than 200 bytes thanks to these optimizations.