Comment by mort96
5 hours ago
This is cool! Makes me want to implement peephole optimizations in some of my own bytecode interpreters.
Though I'm wondering about the const_binary optimization:
> This optimisation refers to a binary instruction
> and both lhs and rhs are created via LoadImm directly
> in the window beforehand:
```
__entry:
  load_imm r0, #2
  load_imm r1, #3
  add r2, r0, r1 ; <-- this is a compile time known result := 2+3=5
  load_imm r1, #4
  load_imm r0, #1
  sub r3, r1, r0 ; <-- this is a compile time known result := 4-1=3
  mul r0, r2, r3
```
> In this example add, sub and mul are constant.
> The optimisation itself applies to all arithmetics.
> Thus after optimising:
```
__entry:
  load_imm r2, #5
  load_imm r3, #3
  mul r0, r2, r3
```
Is this valid? The first const_binary optimization changes the values of r0 and r1 (before the optimization they're 2 and 3, after the optimization they're unknown). The second const_binary optimization changes the values of r1 and r0 (before the optimization they're 4 and 1, after the optimization they're unknown). An optimizer which analyzes the whole block would be able to see that r0 and r1 are written to before they're read from, but the peephole optimizer only sees the 3 instructions, right?
I would get how this works in a stack based bytecode, as 'push 2; push 3; add' could be replaced with 'push 5' without affecting the machine state after the optimization. But why is the same optimization valid for this register machine? Does purple-garden impose some invariant that a register may be read at most once after being written to?
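The objection above can be made concrete with a tiny register-VM sketch (hypothetical Python, not purple-garden's actual implementation; the trailing `add r4, r0, r1` is an instruction I've added outside the peephole window to expose the clobbered temporaries — here unwritten registers read as 0 for illustration):

```python
# Minimal register-machine interpreter: enough ops to replay the example.
def run(program):
    regs = {}
    for op, *args in program:
        if op == "load_imm":
            dst, imm = args
            regs[dst] = imm
        else:  # binary arithmetic: dst, lhs, rhs
            dst, a, b = args
            lhs, rhs = regs.get(a, 0), regs.get(b, 0)  # unwritten regs read as 0
            regs[dst] = {"add": lhs + rhs,
                         "sub": lhs - rhs,
                         "mul": lhs * rhs}[op]
    return regs

original = [
    ("load_imm", "r0", 2),
    ("load_imm", "r1", 3),
    ("add", "r2", "r0", "r1"),
    ("load_imm", "r1", 4),
    ("load_imm", "r0", 1),
    ("sub", "r3", "r1", "r0"),
    ("mul", "r0", "r2", "r3"),
    ("add", "r4", "r0", "r1"),   # reads r1 *after* the window: 15 + 4 = 19
]

optimised = [
    ("load_imm", "r2", 5),
    ("load_imm", "r3", 3),
    ("mul", "r0", "r2", "r3"),
    ("add", "r4", "r0", "r1"),   # r1 was never written here: 15 + 0 = 15
]

print(run(original)["r4"], run(optimised)["r4"])  # diverge: 19 vs 15
```

Without the extra read of `r1`, the two programs agree on every register a later instruction could observe; with it, they diverge, which is exactly why the fold needs whole-block liveness information rather than a 3-instruction window.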
Hi, no it doesn't. This example of course breaks when the temporary registers are used after the optimisation. But that's why peephole optimisation is a fallback to IR optimisation, since peephole passes are only local.
I honestly didn't think of this possibility while building the optimisation. Since constant folding is already done as an IR pass, I will remove the peephole pass. Thanks for noticing :)
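For contrast, here is a sketch of why constant folding is sound as an IR pass (hypothetical Python over an SSA-like IR, not purple-garden's actual code; the tuple IR format and `fold_constants` helper are my own illustration): each IR value is defined exactly once, so replacing a binary op with its constant result can never clobber a value a later instruction still reads.

```python
OPS = {"add": lambda a, b: a + b,
       "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b}

def fold_constants(ir):
    """One forward pass over (dest, op, lhs, rhs) tuples.

    Because every %name is assigned once (SSA-style), tracking known
    constants in a dict is safe: no later definition can invalidate it.
    """
    consts = {}
    out = []
    for dest, op, lhs, rhs in ir:
        if op == "const":
            consts[dest] = lhs
            out.append((dest, op, lhs, rhs))
        elif lhs in consts and rhs in consts:
            value = OPS[op](consts[lhs], consts[rhs])
            consts[dest] = value                  # fold transitively
            out.append((dest, "const", value, None))
        else:
            out.append((dest, op, lhs, rhs))
    return out

ir = [
    ("%0", "const", 2, None),
    ("%1", "const", 3, None),
    ("%2", "add", "%0", "%1"),   # folds to const 5
    ("%3", "mul", "%2", "%1"),   # folds transitively to const 15
]
print(fold_constants(ir))
```

A dead-code pass can then drop the now-unused `%0` and `%1` definitions; crucially, neither pass needs a window, because the single-assignment property carries the liveness argument for free.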