Monday, November 30, 2015

New optimizations coming in ZILF 0.8

Left: Code generated by ZILF 0.7 (11 instructions).
Right: Code after new optimizations (1 instruction).


How ZILF turns the nasty code on the left into a single instruction, using three new optimizations:

==1==

?L14: BAND STACK,2048 >STACK
BAND STACK,32767 >STACK
ZERO? STACK /?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6:  PUSH 0
?L7: ZERO? STACK \FALSE

Combine sequential bitwise operations (this is new). BAND STACK,32767 goes away, because 2048 & 32767 == 2048.

==2==

?L14: BAND STACK,2048 >STACK
ZERO? STACK /?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7:  ZERO? STACK \FALSE

Convert single-bit BAND+ZERO? test to BTST (this is new). BAND STACK,2048 >STACK followed by ZERO? STACK /?L12 becomes BTST STACK,2048 \?L12, because 2048 is a power of two.

==3==

?L14: BTST STACK,2048 \?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7:  ZERO? STACK \FALSE

Bypass controlled conditional branch (this is new). PUSH 1 followed by a JUMP to ZERO? STACK \?L6 becomes JUMP ?L6, because we know the negative ZERO? branch will be taken.

==4==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6:  PUSH 0
?L7:  ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 0 followed by ZERO? STACK \?L6 becomes a JUMP to the new ?L15 (the instruction after ZERO?), because we know the negative branch won't be taken.

==5==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7: ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 1 followed by a JUMP to ZERO? STACK \FALSE becomes RFALSE, because we know the negative branch will be taken.

==6==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: RFALSE
JUMP ?L7
?L6: PUSH 0
?L7: ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 0 followed by ZERO? STACK \FALSE becomes a JUMP to the new ?L16, because we know the negative branch won't be taken.

==7==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: RFALSE
JUMP ?L7
?L6: JUMP ?L16
?L7: ZERO? STACK \FALSE
?L16: ...

Eliminate dead code and unused labels.

==8==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
?L12: JUMP ?L15
?L15: RFALSE
?L6:  JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Eliminate branch to next instruction. JUMP ?L15 goes away; ?L15 merges with ?L12.

==9==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Disintermediate branch. BTST STACK,2048 \?L12 becomes BTST STACK,2048 \FALSE, because the instruction at ?L12 is RFALSE.

==10==

?L14: BTST STACK,2048 \FALSE
JUMP ?L6
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Disintermediate branch. JUMP ?L6 becomes JUMP ?L16, because the instruction at ?L6 is JUMP ?L16.

==11==

?L14: BTST STACK,2048 \FALSE
JUMP ?L16
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...


Eliminate dead code and unused labels.

==12==

?L14: BTST STACK,2048 \FALSE
JUMP ?L16
?L16: ...

Eliminate branch to next instruction. JUMP ?L16 goes away.

==13==

?L14: BTST STACK,2048 \FALSE

Sunday, October 25, 2015

Parsing in ZIL, part 2: Noun phrases

This is the second in a series of posts describing ZILF's parser. Read part 1 here, or read part 3 here.

In the first post, we covered the basic structure of a command: verbs, prepositions, and noun phrases. Recognizing a verb or preposition is easy; it's just a word tagged with that part of speech. Once the parser has the verb and prepositions, it uses them to find a matching grammar line. But how does it recognize noun phrases, and what does it do with them?

What is a noun phrase, anyway?

Here are some sample commands, with the noun phrases in bold:
throw axe at dwarf
get shiny brass lamp, tasty food, and bottle
drop any
get set of keys
get all except sign
put all cubes except red and blue in the chute
A noun phrase is anything the player can type to identify the object of a command. It can refer to a single object, like "axe"; a list of objects, like "lamp, bottle, and food"; a set of objects answering to the same name, like "cubes"; or a set of objects implicitly defined by what the player can see, like "all" or "any". It can exclude objects with "except" or "but". And it can quantify what you mean when there's more than one similar object: do you want all of them, a specific one, or any random one?

The grammar of an acceptable noun phrase is, more or less:
[all/any] [article] [adjective...] [noun]
The individual parts are optional, but at least one must be given. Larger phrases can also be made by stringing small ones together with commas or words like "and", "except", "but", and "of":
[noun phrase], [noun phrase] and [noun phrase] of [noun phrase] except [noun phrase] and [noun phrase]
When the parser encounters a noun phrase, it records the important parts in a data structure called NOUN-PHRASE, which it uses later to find matches within the set of objects available to the player. The structure itself is very simple, containing only a "mode", a list of adjective/noun pairs (called OBJSPECs) to include, and another list of OBJSPECs to exclude.

Let's see what the parser does with some of the noun phrases from above:

"axe"

Mode default
Include (*, AXE)
Exclude none

Simple enough: we're just looking for an object that lists AXE in its SYNONYM property. We're using the default match mode, which means if there's more than one "axe" in sight, the parser will ask which one you mean.

"shiny brass lamp, tasty food, and bottle"

Mode default
Include (SHINY, LAMP)
(TASTY, FOOD)
(*, BOTTLE)
Exclude none

This time we have a few pairs in the Include list, and two of them have adjectives. The parser will look for objects with SHINY in their ADJECTIVE properties and LAMP in their SYNONYM properties. Likewise for TASTY and FOOD. Notice that BRASS is nowhere to be seen -- the parser only remembers one adjective per OBJSPEC.

But BOTTLE can be surprising in a game like Adventure where "bottle" and "bottled water" are both objects. Because of Z-machine limits, "bottle" and "bottled" are treated as the same word: if the player types "get bottled", referring to the bottled water by its adjective, to the parser it looks the same as "get bottle", referring to the bottle by its noun. The parser needs to support both. So even though the structure lists BOTTLE as a noun, the parser will see there's no adjective paired with it and notice that BOTTLE can also be used as an adjective, and it'll look for it in the ADJECTIVE property as well. This is a lower-quality match, so if another nearby object has the word in its SYNONYM property, the parser will pick that one instead.

"set of keys"

Mode default
Include (*, KEYS)
Exclude none

Hey, what happened to SET?

Object names containing "of" are special. SET and KEYS are both in the object's SYNONYM property, but the parser only thinks about one adjective and one noun at a time, so they can't both be used. Adjacent nouns normally belong to separate noun phrases, e.g. "give troll money".

However, when the parser sees "of" in a noun phrase, it forgets the last noun it saw! So "set of keys" becomes simply "keys". This makes the parser's job easier, but at the cost of being unable to know whether the player wants the set of keys, the pile of keys, or the stack of boxes of pictures of keys. (It can still tell them apart by asking questions, as we'll see in the post on orphaning.)

"any"

Mode any
Include none
Exclude none

This time, we didn't give any nouns or adjectives, only a mode. Since the Include list is empty, the parser will choose from all available objects, and since the match mode is "any", it'll choose a random object if there's more than one match.

"all except sign"

Mode all
Include none
Exclude (*, SIGN)

Again, the Include list is empty, so the parser chooses from all available objects, then excludes any object with the word SIGN in its SYNONYM property. Since the match mode is "all", if there's more than one "sign" available, the parser will include exclude all of them.

"all cubes except red and blue"

Mode all
Include (*, CUBES)
Exclude (RED, *)
(BLUE, *)

This time, the Include and Exclude lists are both present. The parser will find all available objects with CUBES in their SYNONYM property, then exclude any that have RED or BLUE in their ADJECTIVE property.

Thursday, October 1, 2015

ZILF 0.7 and Adventure

Update: IF Archive links: ZILF 0.7, Advent.z3.

The IF Archive upload is still being processed, but for now you can download ZILF 0.7 from Bitbucket. (Or read the release notes.)

Among other things, this version includes a new sample project: a port of Adventure, aka Colossal Cave. Download Advent.z3 to play!

Sunday, September 20, 2015

Parsing in ZILF, part 1: The ideal sentence

This is the first in a series of posts describing ZILF's parser. Read part 2 here.

Writing one's own interactive fiction language is a popular pastime, but people who attempt it quickly find that the language itself is the easy part. The hard part is fleshing out the parser and world model to the standards that players have come to expect.

We've come a long way since Colossal Cave and its two-word sentences ("get lamp", "throw axe"). Most modern games understand things like:
  • Prepositions (the "up" in "pick up lamp")
  • Different actions for the same verb word ("look at" vs. "look in" vs. "look under")
  • Commands with two or more noun phrases ("put coin in slot")
  • Complex noun phrases ("take key, lamp, and all cubes except the red cube")
  • Multiple commands per line ("north. west. open door then go in.")
  • Ordering ("watson, take the prints to the crime lab")
  • Disambiguation ("take polish" => "Which do you mean, the shoe polish or the Polish sausage?")
  • Implied nouns ("take" by itself when there's only one thing to pick up)
  • Correcting mistakes ("oops") and repeating commands ("again")
In fact, even in the 80s, Infocom's games understood those, so ZILF really should as well.

In this series, I'll explain how ZILF deals with all of the above, but for now let's look at a simple command like "pick up lamp". Here's the source code that defines the syntax for that command:
<SYNTAX PICK UP OBJECT (FIND TAKEBIT) (MANY ON-GROUND IN-ROOM) = V-TAKE>
Look at the parts in turn:
  • PICK: The verb word. For this syntax to take effect, the verb in the player's command has to be "pick" (or one of its synonyms, if it had any). The compiler assigns a new verb number to "pick", which is given the constant name ACT?PICK; this would also become the verb number of any synonyms.
  • UP: A preposition. The compiler assigns a preposition number to "up", the constant PR?UP.
  • OBJECT: Shows how many objects the command wants. OBJECT can appear zero, one, or two times in the syntax line; since it appears once here, this is a one-object command.
  • (FIND TAKEBIT): The "find flag", hinting at what kind of objects the command prefers.
  • (MANY ON-GROUND IN-ROOM): The "search options", hinting at where and how the command prefers to find its objects.
  • V-TAKE: The name of the action routine, which contains the code that implements taking. The compiler assigns an action number based (mostly) on the name of the action routine, in this case the constant V?TAKE; the action number identifies what the player wants to do. There might be many different syntaxes — "get lamp", "take lamp" — that all use different verbs to represent the same action. (And yes, unfortunately, V for action numbers and ACT for verb numbers is the opposite of what you'd expect.)
Notice that nothing in that line says exactly which words have to appear in which order, and in fact the parser will accept "lamp pick up"! Unlike Inform, ZILF doesn't define a precise grammar for each command; it has a general idea of what makes a sentence, and it uses its rudimentary knowledge of English to extract the parts of a command and fit them into a pattern.

So, what makes a sentence? A verb, a preposition, a noun phrase, another preposition, and another noun phrase. Everything except the verb is optional, but there are a few rules: you can't have a second noun phrase without a first noun phrase, each preposition needs a corresponding noun phrase, and two or more prepositions in a row collapse into one (so "look up in air" becomes "look up air").

The parser looks for a verb, extracts the prepositions and noun phrases from the player's command to fit them into the ideal sentence pattern, then searches through all the syntax lines for that verb, hoping to find one that has the same prepositions and number of noun phrases that the player typed. If it finds one, it can move on to identifying which "lamp" the player is talking about.

If that doesn't work, the parser may still be able to find a syntax line that can be made to work with more information, in which case it'll ask a clarifying question ("What do you want to pick up?") and orphan the command, keeping it around just long enough to see if the next input gives more information. Or it may decide that none of the syntaxes can possibly match and issue an error.

[Update: Corrected the name of the preposition number constant.]