Comment by antonvs

2 days ago

Subroutine calls are for the weak :)

There's some more detail here: https://ed-thelen.org/comp-hist/CRAY-1-HardRefMan/CRAY-1-HRM...

The following quote gives some sense of how "manual" this was:

> "On execution of the return jump instruction (007), register B00 is set to the next instruction parcel address (P) and a branch to an address specified by ijkm occurs. Upon receiving control, the called routine will conventionally save (B00) so that the B00 register will be free for the called routine to initiate return jumps of its own. When a called routine wishes to return to its caller, it restores the saved address and executes a 005 instruction. This instruction, which is a branch to (Bjk), causes the address saved in Bjk to be entered into P as the address of the next instruction parcel to be executed."

Details were up to the compiler that produced the machine code.

Essentially, the B00 register is a Top Of (Return) Stack or TOS register. It’s great for leaf routines.

A non-leaf routine has to push it to your preferred stack before it makes its next call. You do the cycle-counting to decide if it’s a good ISA for your implementation, or not.
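To make the mechanism concrete, here is a rough Python sketch of the call/return convention described in the quote. It is a toy model, not an accurate Cray-1 simulation: the `ToyCpu` class, the addresses, and the one-parcel instruction size are all assumptions made up for illustration, and the "stack" is just a software convention, as it would be on a machine with no hardware stack.

```python
class ToyCpu:
    """Toy model of link-register calls (hypothetical, not a real Cray-1 model)."""

    def __init__(self):
        self.p = 0            # program counter (parcel address)
        self.b = [0] * 64     # B registers; b[0] plays the role of B00
        self.stack = []       # software-managed stack, purely a convention
        self.mem_writes = 0   # how many times a return address went to memory

    def return_jump(self, target):
        # Like the quoted 007 instruction: B00 gets the next address, then branch.
        # Assumption: every instruction is one parcel long, so "next" is p + 1.
        self.b[0] = self.p + 1
        self.p = target

    def jump_b(self, jk):
        # Like the quoted 005 instruction: branch to the address held in Bjk.
        self.p = self.b[jk]

    def save_link(self):
        # What a non-leaf routine does on entry: spill B00 to the stack.
        self.stack.append(self.b[0])
        self.mem_writes += 1

    def restore_link(self):
        # What a non-leaf routine does before returning.
        self.b[0] = self.stack.pop()


def demo():
    cpu = ToyCpu()
    trace = []
    cpu.p = 100
    cpu.return_jump(200)      # main calls a non-leaf routine at 200
    trace.append(cpu.b[0])    # B00 holds main's return address
    cpu.save_link()           # the routine frees B00 for its own calls
    cpu.p = 210
    cpu.return_jump(300)      # it calls a leaf routine at 300
    trace.append(cpu.b[0])    # B00 now holds the inner return address
    cpu.jump_b(0)             # the leaf returns without touching memory
    trace.append(cpu.p)
    cpu.restore_link()        # the non-leaf routine restores its saved link
    cpu.jump_b(0)             # and returns to main
    trace.append(cpu.p)
    return trace, cpu.mem_writes


print(demo())  # ([101, 211, 211, 101], 1)
```

Two calls happen, but only one return address is ever written to memory: the leaf routine gets by on the register alone.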

Obviously, ISAs with a JSR that pushes to stack are always using an extra ALU cycle for the SP math, then a memory write.

Doing it with a (maybe costless) register transfer followed by (only sometimes) a stack PUSH can work out to the same number of cycles.
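A quick way to see the "only sometimes" part is to count return-address writes to memory under both conventions for the same call tree. This is only a sketch: it counts memory writes, not cycles, and the `writes` function and the nested-list encoding of a call tree are invented here for illustration. Whether fewer writes means fewer cycles depends on the actual implementation.

```python
def writes(call):
    """Count return-address memory writes for one call and everything beneath it.

    `call` is a list of the calls that routine makes; an empty list is a leaf.
    Returns (push_on_call, link_register).
    """
    push = 1                   # JSR-style: every call writes its return address
    link = 1 if call else 0    # link-style: only a non-leaf callee spills the link
    for child in call:
        child_push, child_link = writes(child)
        push += child_push
        link += child_link
    return push, link


# A routine that calls two leaves and one routine that itself calls a leaf.
tree = [[], [], [[]]]
print(writes(tree))  # (5, 2)
```

For a call chain with no leaves to spare (each routine makes exactly one call until the last), the two counts differ by just one, which is the case where the conventions come out nearly even.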

With improvements in memory speed or CPU speed, that decision can flip.

Consider that in this era, your ALU also had the job of incrementing the PC during an idle pipeline stage (maybe the instruction decode). Doing a SP increment for a PUSH might compete with that, so separating the two might make the pipeline more uniform. I don’t know any of the Cray ISAs so this is just a guess.