Monday, November 30, 2015

New optimizations coming in ZILF 0.8

Left: Code generated by ZILF 0.7 (11 instructions).
Right: Code after new optimizations (1 instruction).


How ZILF turns the nasty code on the left into a single instruction, using three new optimizations:

==1==

?L14: BAND STACK,2048 >STACK
BAND STACK,32767 >STACK
ZERO? STACK /?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6:  PUSH 0
?L7: ZERO? STACK \FALSE

Combine sequential bitwise operations (this is new). BAND STACK,32767 goes away, because 2048 & 32767 == 2048.

==2==

?L14: BAND STACK,2048 >STACK
ZERO? STACK /?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7:  ZERO? STACK \FALSE

Convert single-bit BAND+ZERO? test to BTST (this is new). BAND STACK,2048 >STACK followed by ZERO? STACK /?L12 becomes BTST STACK,2048 \?L12, because 2048 is a power of two.

==3==

?L14: BTST STACK,2048 \?L12
PUSH 1
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7:  ZERO? STACK \FALSE

Bypass controlled conditional branch (this is new). PUSH 1 followed by a JUMP to ZERO? STACK \?L6 becomes JUMP ?L6, because we know the negative ZERO? branch will be taken.

==4==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: PUSH 0
?L13: ZERO? STACK \?L6
PUSH 1
JUMP ?L7
?L6:  PUSH 0
?L7:  ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 0 followed by ZERO? STACK \?L6 becomes a JUMP to the new ?L15 (the instruction after ZERO?), because we know the negative branch won't be taken.

==5==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: PUSH 1
JUMP ?L7
?L6: PUSH 0
?L7: ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 1 followed by a JUMP to ZERO? STACK \FALSE becomes RFALSE, because we know the negative branch will be taken.

==6==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: RFALSE
JUMP ?L7
?L6: PUSH 0
?L7: ZERO? STACK \FALSE

Bypass controlled conditional branch. PUSH 0 followed by ZERO? STACK \FALSE becomes a JUMP to the new ?L16, because we know the negative branch won't be taken.

==7==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
JUMP ?L13
?L12: JUMP ?L15
?L13: ZERO? STACK \?L6
?L15: RFALSE
JUMP ?L7
?L6: JUMP ?L16
?L7: ZERO? STACK \FALSE
?L16: ...

Eliminate dead code and unused labels.

==8==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
?L12: JUMP ?L15
?L15: RFALSE
?L6:  JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Eliminate branch to next instruction. JUMP ?L15 goes away; ?L15 merges with ?L12.

==9==

?L14: BTST STACK,2048 \?L12
JUMP ?L6
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Disintermediate branch. BTST STACK,2048 \?L12 becomes BTST STACK,2048 \FALSE, because the instruction at ?L12 is RFALSE.

==10==

?L14: BTST STACK,2048 \FALSE
JUMP ?L6
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...

Disintermediate branch. JUMP ?L6 becomes JUMP ?L16, because the instruction at ?L6 is JUMP ?L16.

==11==

?L14: BTST STACK,2048 \FALSE
JUMP ?L16
?L12: RFALSE
?L6: JUMP ?L16
ZERO? STACK \FALSE
?L16: ...


Eliminate dead code and unused labels.

==12==

?L14: BTST STACK,2048 \FALSE
JUMP ?L16
?L16: ...

Eliminate branch to next instruction. JUMP ?L16 goes away.

==13==

?L14: BTST STACK,2048 \FALSE

6 comments:

  1. Very cool. But what produced that code to start with?

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. (Ugh, Blogger ate my ZIL code. Let's try that again...)

      The tangle of PUSHes and ZERO?s comes from using predicates in value context and vice versa.

      In general, there are a lot of ways that can happen: whenever you use a boolean function like AND or EQUAL? in a context where a numeric value is expected, ZILF generates branching code to turn it into a 1 or a 0. And when you use a numeric value in a context where a conditional expression is expected, ZILF tests it with the ZERO? opcode. This maps directly to the difference between store and branch operations in the Z-machine.

      But when you nest conditions, sometimes you can accidentally make a predicate context look like a value context, and then the inner expression has to do a conversion that's immediately undone by the outer one. For example, this works as expected:

      ZIL:

      <COND (<==? .X .Y> <TELL "Equal.">)>

      ZAP:

      EQUAL? X,Y \?L1
      PRINTI "Equal."
      ?L1: ...

      But if you explicitly test whether the result of ==? is nonzero instead of letting COND do it, you get an unwanted conversion:

      ZIL:

      <COND (<N==? <==? .X .Y> 0> <TELL "Equal.">)>

      ZAP:

      EQUAL? X,Y /?L1
      PUSH 0
      JUMP ?L2
      ?L1: PUSH 1
      ?L2: ZERO? STACK /?L3
      PRINTI "Equal."
      ?L3: ...

      If you nest a couple of those, you end up with a rat's nest of PUSH, JUMP, and ZERO?. And when you're using macros, it's easy to nest them.

      It doesn't have to be something as explicit as comparing to zero, either. There are some things, like VERSION?, that probably should be macros but are instead implemented as intrinsic functions, and some of them are only implemented in value context. So when you write this:

      <COND (<VERSION (ZIP <==? .X .Y>)> <TELL "Equal.">)>

      You get the same result as above, because VERSION? forces the condition to produce a number and then COND has to test whether it's nonzero.

      (Although in that case, you could rewrite it as %<VERSION? (ZIP '<==? .X .Y>)> to substitute the condition before the routine is compiled.)

      Delete
    3. Ah. I was more wondering why something that resulted in @test would start out so different. Were you manually doing a bitmask test, is that what the first @and is for? And then the next and is typecasting? And then the rest is ?

      Delete
    4. It swallowed my ZIL too. The rest is < COND >?

      Delete
    5. The first two ANDs are a manual bitmask test, yeah. The high bit is special in this case, so it's masked off: if the two values being compared both have the high bit set, but don't share any other bits, that's not a match.

      The rest is COND, NOT, and some other functions forcing the mask result back and forth between predicate and value context.

      Delete