
Comment by LegionMammal978

1 year ago

At least for classic textbook CPUs, it's common to run both sides of a decision in parallel while it's still being made, then finally use a transistor to select the desired result after the fact. No one wants the latency of powering everything up and down every cycle, except on the largest scales where power consumption makes a big difference.

I don't understand what you mean by decision. In a textbook CPU, with no speculative execution and no pipelining, the hardware runs one instruction at a time, the one at the instruction pointer. If that instruction is a conditional jump, it is a single computation that sets the instruction pointer to one of two values (the address of the next instruction, or the jump target). Once the new value of the instruction pointer is computed, the process begins again: the CPU fetches one instruction from the address in the instruction pointer register, decodes it, and executes it.

Even if you add pipelining, the basic textbook design stalls the pipeline when it encounters a conditional/computed jump instruction. And even if you add basic speculative execution, you don't necessarily get both branches executed at once: with a single ALU, only one branch executes at a time, and if the wrong one was predicted, you revert and execute the other once the condition has finished being computed.

  • > I don't understand what you mean by decision.

    I'm talking on a lower level than the clock cycle or instruction. Let's say circuit A takes X and outputs foo(X), and circuit B takes X and outputs bar(X). We want to build a circuit that computes baz(X, Y) = Y ? foo(X) : bar(X), where X is available before Y is. Instead of letting Y settle, powering up one of the circuits A or B, and sending X into it, we can instead send X to circuits A and B at the same time, then use Y to select whichever output we want.
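    This compute-both-then-select pattern can be modeled in software. Below is a minimal Python sketch; foo and bar are hypothetical stand-ins for the two combinational circuits, not anything from a real design:

    ```python
    def foo(x):
        return x + 1           # stand-in for circuit A's function

    def bar(x):
        return x << 1          # stand-in for circuit B's function

    def baz(x, y):
        a = foo(x)             # circuit A evaluates X regardless of Y
        b = bar(x)             # circuit B evaluates X regardless of Y
        return a if y else b   # Y only drives the final 2-to-1 mux
    ```

    Both intermediate results are computed unconditionally, just as both circuits settle in parallel in hardware; only the final select consults Y.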

    • Agreed, we can do that, and it is done for many kinds of circuits. But many others don't work like that, so it's wrong to say that hardware works exclusively like this.

      One other common pattern for implementing conditional logic is to compute a boolean expression in which the control signal is just another input variable. In that model, to compute Y ? foo(X) : bar(X), we actually compute a single function baz(X, Y) whose result is the same, with Y wired in as an ordinary input. This is very commonly how ALUs work.
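      As a concrete sketch of that pattern, here is a 1-bit mux written as one boolean expression in Python (a hypothetical helper operating on 0/1 values; the select line y is just another input to the formula):

      ```python
      def mux1(y, a, b):
          # out = (y AND a) OR (NOT y AND b), all within a single
          # boolean expression; no branching on y happens anywhere
          return (y & a) | ((~y & 1) & b)
      ```

      In hardware this whole expression is one blob of gates that always evaluates; changing y just changes which input propagates to the output.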

      And the other very common pattern is to split the work across multiple clock cycles and use registers for intermediate results. If you don't have two separate circuits A and B, but only one circuit that can compute either foo or bar based on a control signal (as in simple processors with a single ALU), this is the only option: in one clock cycle you put Y in a register, and in the next clock cycle you feed both Y and X into your single circuit, which will now compute either foo(X) or bar(X).
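      A sketch of that two-cycle flow in Python (foo, bar, and the register name are illustrative, not taken from any real processor):

      ```python
      def foo(x):
          return x + 1       # one operation the shared circuit can do

      def bar(x):
          return x << 1      # the other operation

      class SharedUnit:
          def __init__(self):
              self.y_reg = 0     # register holding the control signal

          def clock1(self, y):
              self.y_reg = y     # cycle 1: latch Y into the register

          def clock2(self, x):
              # cycle 2: X and the latched Y feed the single circuit,
              # which computes one result or the other
              return foo(x) if self.y_reg else bar(x)
      ```

      Only one datapath exists; the register carries the decision across the cycle boundary so the shared circuit knows which operation to perform.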