Comment by p_l

12 days ago

The Motorola 68000 it used had 16 data lines and 24 address lines, so it took at least two bus cycles just to transfer a full 32-bit CPU word (disregarding timings on address latches etc.).
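To put rough numbers on that, here is a minimal sketch of the arithmetic. The 16-bit data bus and 24-bit addressing come from the comment above; the figure of 4 clocks per bus cycle is my own addition (the usual minimum for a 68000 with no wait states), and the function names are just illustrative.

```python
import math

DATA_BUS_BITS = 16        # external data bus width (from the comment)
ADDRESS_BITS = 24         # address width (from the comment)
CLOCKS_PER_BUS_CYCLE = 4  # assumption: minimum bus cycle, no wait states


def bus_cycles(transfer_bits):
    """Number of bus cycles needed to move `transfer_bits` over the data bus."""
    return math.ceil(transfer_bits / DATA_BUS_BITS)


def min_clocks(transfer_bits):
    """Lower bound on clock cycles spent on the bus for one transfer."""
    return bus_cycles(transfer_bits) * CLOCKS_PER_BUS_CYCLE


# Total address space reachable with 24 address bits, in bytes.
addressable_bytes = 2 ** ADDRESS_BITS

if __name__ == "__main__":
    print("bus cycles for a 32-bit value:", bus_cycles(32))
    print("minimum clocks for a 32-bit value:", min_clocks(32))
    print("addressable bytes:", addressable_bytes)
```

This only counts the data transfer itself; instruction fetches also go over the same bus, so real code costs more per word moved.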

AFAIK some of the graphics code used fancy multi-register copies to improve cycle efficiency.
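A rough sketch of why that helps, assuming "multi-register copies" means the MOVEM instruction (the comment does not name it). The cycle counts below are the ones I recall from the 68000 timing tables, so treat them as assumptions rather than gospel: MOVE.L (An)+,(An)+ at 20 clocks per long word, MOVEM.L load at 12 + 8n, and MOVEM.L store at 8 + 8n for n registers. Loop overhead, pointer fix-ups and any wait states are ignored.

```python
# Assumed 68000 timings, in clocks (from memory of the published tables).
MOVE_L_MEM_TO_MEM = 20   # MOVE.L (An)+,(An)+, one long word
MOVEM_LOAD_BASE = 12     # MOVEM.L (An)+,<regs>: 12 + 8n
MOVEM_STORE_BASE = 8     # MOVEM.L <regs>,(An):  8 + 8n
MOVEM_PER_REG = 8        # per-register cost for either direction


def move_l_copy_clocks(long_words):
    """Copy `long_words` 32-bit values with one MOVE.L each."""
    return MOVE_L_MEM_TO_MEM * long_words


def movem_copy_clocks(registers):
    """Copy `registers` 32-bit values with one MOVEM load plus one MOVEM store."""
    load = MOVEM_LOAD_BASE + MOVEM_PER_REG * registers
    store = MOVEM_STORE_BASE + MOVEM_PER_REG * registers
    return load + store


if __name__ == "__main__":
    for n in (1, 4, 8, 12):
        print(n, "long words:",
              "MOVE.L", move_l_copy_clocks(n),
              "vs MOVEM", movem_copy_clocks(n))
```

Under these assumptions MOVEM only wins once more than five registers are moved at a time, since its fixed cost is paid once per instruction rather than once per word.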

As for the screen, IIRC making it easy to correlate "what's on screen" and "what's on paper" was a major part of what drove the Mac to be nearly synonymous with DTP for years.