Tuesday, November 30, 2010

Branch optimization in ZAPF

I recently moved some unfinished changes from ZAPF's trunk into a development branch. Some have asked what these changes are all about, anyway, so here's an explanation.

The Z-machine's branch instructions, of which there are many, can take an offset in either "long" or "short" form. Long form takes two bytes and can hold any signed 14 bit offset, whereas short form takes one byte but can only hold an unsigned 6 bit offset. In order to save space, we always want to use the short form wherever possible: that is, whenever an instruction branches forward by up to 61 bytes. (Not 63: these offsets are biased by +2.)

But branching forward means we encounter the branch instruction before its target label, so how do we know how far it's branching? It's not easy, especially if there are other branch instructions in between this one and its target -- we can't know the true distance until we know what form those will be assembled in.

ZAPF's current approach is to make repeated passes over each function until all the label positions are discovered. When it sees a branch instruction targeting a label that hasn't been defined yet, it remembers the name of that label, then assembles the branch in long mode. Once the label's definition is encountered, ZAPF remembers its location and rewinds to the beginning of the function.

The next time it encounters the branch instruction, it can make a better judgment of the distance, which may allow it to use short mode. The next time it encounters the label definition, it compares the new location to the old location; if it has changed, it remembers the new position and rewinds again. It only reaches the end of the function once every label's final location has been discovered and every branch has been assembled correctly.

Unfortunately, this approach doesn't always work. Some inputs cause ZAPF to get stuck in a function and never reach the end. I haven't investigated too thoroughly because the algorithm is hard to reason about, inefficient, and silly enough that I decided to just rewrite it.

I found an interesting replacement algorithm in a 1978 paper by Thomas G. Szymanski called "Assembling Code for Machines with Span-Dependent Instructions". The paper proves that minimizing program length is NP-complete in the general case, when branch targets are allowed to be specified as arbitrary expressions of constants and labels, but presents an efficient algorithm for a restricted subset of expressions -- which works just fine for ZAPF.

The algorithm works by assuming in a first pass that all instructions can be assembled in short form, and remembering the location of every label and every branch instruction, then studying them after that pass to trace the dependencies between branch instructions, and forcing into long mode only the instructions that need it. The label locations are adjusted -- advanced forward by one byte for every preceding branch that was changed from short to long -- and the set of adjusted locations is used in a second pass to assemble each branch in the correct mode.

There's one more complication, though: the Z-machine's packed address system. Every routine must begin at an address divisible by 2, 4, or 8 (depending on the Z-code version), so there are usually a few padding bytes between routines. The algorithm works on all labels in the game at once, but because of the padding issue, forcing a branch into long mode doesn't actually push every following label ahead by one byte: for labels in subsequent routines, it's either zero bytes (when the extra byte can be absorbed by the next padding section) or 2/4/8 bytes (when it can't). And that's the part that isn't finished yet.

Monday, November 22, 2010

Hello World

I'm vaporware, and this is my IF blog.

Who am I, anyway? You may have encountered some of my IF-related projects:
  • FyreVM - The modified Glulx interpreter that runs Textfyre's games.
  • Guncho - An online multiplayer IF system based on Inform 7.
  • Snack - A language similar to TADS 2 that compiles to Glulx.
  • Yomin - An integrated development environment (IDE) for Inform 6 and ZIL.
  • ZAPF - A Z-machine assembler and disassembler.
  • ZILF - A compiler for Infocom's ZIL language.
  • ZLR - A Z-machine interpreter.
I've also contributed template code to Inform 7 and currently run the bug tracker site. I've written several extensions, and I was the technical editor on Aaron Reed's Creating Interactive Fiction with Inform 7.

Why this blog? Well, occasionally I come up with something cool that I want to talk about in greater detail than one or two lines on ifMUD. Don't expect a lot of comp reviews or storytelling tips: I intend to focus on more technical topics, from the status of the projects above to Inform tricks to VM implementation details.