Comment by cmrdporcupine

14 hours ago

Both Apple and NeXT had machines prototyped around it, but it was initially very expensive, I believe, and I think Apple was easily convinced to go with PowerPC instead ... and rather than evolve it and push it further, Motorola dropped it in favour of going all in on PowerPC.

The sad thing is that Intel showed there was still life left in CISC, and Motorola themselves ended up circling back to 68k in the form of ColdFire, which proved you could do for 68k what Intel did w/ the Pentium. But by then all their customers had moved on from the 68k ISA.

68k was much harder to optimize than x86, being way more CISC-y

68k, like VAX, was seen as a dead avenue, and not only in comparison to RISC

  • Motorola had made a few design mistakes, like adding memory indirect addressing in the MC68020, which were removed much later, in Motorola's ColdFire CPUs (a rough sketch of what that addressing mode does is at the end of this comment).

    But Intel had made many more design mistakes in the x86 ISA.

    The truth is that the success of Intel and the failure of Motorola had absolutely nothing to do with the technical advantages or disadvantages of their CPU architectures.

    Intel won and Motorola failed simply because IBM had chosen the Intel 8088 for the IBM PC.

    Being chosen by IBM was partly due to luck and partly due to a bad commercial strategy on Motorola's part: they had chosen to develop two incompatible CPU architectures in parallel, the MC68000 intended for the high end of the market and the MC6809 for the low end.

    Perhaps more due to luck than to wise planning, Intel had chosen not to divert their efforts into developing two distinct architectures (they were already working in parallel on the 432 architecture for their future CPUs, which was a flop), so after developing the 8086 for the high end of the market they just crippled it a little into the 8088 for the low end.

    Both the 8086 and the MC68000 were considered too expensive by IBM, but the 8088 seemed a better choice than the Z80 or MC6809, mainly because it allowed more than 64 kB of memory, which was already rather limiting in 1980.

    In the following years, until the 80486, Motorola maintained a consistent lead in performance over Intel and always introduced various innovations a few years before Intel, but they never managed to match Intel again in price and manufacturing reliability, because Intel had the advantage of producing an order of magnitude more CPUs, which helped solve all kinds of problems.

    Eventually Intel matched and then exceeded the performance of the Motorola CPUs, despite the disadvantages of their architecture, because they had access to superior manufacturing, so Motorola had to restrict the use of their proprietary ISAs to the embedded markets, switching to IBM POWER for general-purpose computers.
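
    Coming back to the memory indirect addressing mentioned at the top: roughly, the MC68020 lets a single instruction fetch a pointer from memory, index off it, and then fetch the final operand. A loose C equivalent (my own illustration, with made-up names, not 68k code):

      #include <stdint.h>

      /* One 68020 instruction along the lines of  MOVE.L ([A0],D0.L*4),D1
         does all of this: read the pointer stored at (A0), add the scaled
         index, then read the final operand - two dependent memory accesses
         hidden inside one instruction, which is what later pipelines hated
         and what ColdFire dropped. */
      uint32_t load_via_table(uint32_t **table, uint32_t index) {
          uint32_t *inner = *table;   /* first memory access  */
          return inner[index];        /* second memory access */
      }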

    • Analysis of the difficulties in making 68k and VAX more performant was a major part of what led to RISC development, with complex addressing (even in the earliest 68000) being part of the problem. People think of x86 when reading about CISC vs RISC, but x86 was not much of a consideration when the industry was switching to RISC-style designs - it was hitting walls on complex ISAs, especially VAX (which was allowed to live for way too long), but also to an extent 68k.

      N.b. the 68000 was supposed to be a 16-bit extension of the 6800, which among other things resulted in a hilarious two layers of microcoding.

      As for the IBM PC, the 68000 had the major flaw of being newer, while the 8086 had been available for longer and had second sources - the 68000 was released at the same time as the reduced-capability 8088, while the equivalent reduced-capability model for the 68k (the 68008) arrived in 1982.

  • > 68k was much harder to optimize than x86

    Harder to optimize, or easier to write code for because of its orthogonal instruction set?

    • Harder to optimize at the microarchitectural level, because each individual instruction represents a way more complex execution model - even decoding what the CPU is supposed to do is harder.

      x86 is comparatively simple, with indirect addressing support limited to the point that it can be inlined into the execution pipeline, and many instructions are either actually "simple" to implement or acceptable to leave on a slow path. M68k (and VAX even more so) is comparatively harder to build a modern superscalar chip for - see the rough sketch below.
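
      A loose C model of the difference (my own sketch, made-up names, not real hardware or decoder code; alignment and aliasing are ignored for brevity):

        #include <stdint.h>

        /* x86-style operand: base + index*scale + disp. Every input is a
           register, so an address-generation unit can produce the single
           effective address without touching memory first. */
        uint32_t x86_style_load(const uint8_t *mem, uint32_t base,
                                uint32_t index, uint32_t scale, int32_t disp) {
            uint32_t ea = base + index * scale + (uint32_t)disp;
            return *(const uint32_t *)(mem + ea);
        }

        /* 68020-style memory-indirect operand: the pipeline must first load
           an intermediate pointer from memory, then index off it, then load
           again - two dependent memory accesses inside a single instruction. */
        uint32_t m68k_style_load(const uint8_t *mem, uint32_t an, int32_t bd,
                                 uint32_t xn, uint32_t scale, int32_t od) {
            uint32_t inner = *(const uint32_t *)(mem + an + (uint32_t)bd);
            uint32_t ea = inner + xn * scale + (uint32_t)od;
            return *(const uint32_t *)(mem + ea);
        }

      In the second function the first load has to complete before the effective address even exists, which is the kind of serialization a superscalar pipeline has to special-case.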

    • What matters is how easy it is to create an out-of-order implementation of an ISA; there isn't a 680x0 equivalent of the Pentium Pro.

  • Respectfully, this is nonsense.

    «More CISC-y» does not by itself mean «harder to optimise for». For compilers, what matters far more is how regular the ISA is: how uniform the register file is, how consistent the condition codes are, how predictable the addressing modes are, and how many nasty special cases the backend has to tiptoe around.

    The m68k family was certainly CISC, but it was also notably regular and fairly orthogonal (the legacy of the PDP-11 ISA, which was a major influence on m68k). Motorola’s own programming model gives you 16 programmer-visible 32-bit registers, with data and address registers used systematically, and consistent condition-code behaviour across instructions.

    Contrast that with old x86, which was full of irregularities and quirks that compilers hate: segmented addressing, fewer truly general registers (5 general-purpose registers), multiple implicit operands, and addressing rules tied to specific registers and modes. Even modern GCC documentation still has to mention x86 cases where a specific register role reduces register-allocation freedom, which is exactly the sort of target quirk that makes optimisation more awkward (a small example is at the end of this comment).

    So…

      68k: complex, but tidy
    
      x86: complex, and grubby
    

    What worked for x86, though, was the sheer size of the x86 market, which resulted in better compiler support, more tuning effort, and vastly more commercial optimisation work than m68k ever received. But that is not the same claim as «68k was harder to optimise because it was more CISC-y».
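
    To make the «quirks compilers hate» point concrete, a tiny hypothetical example (the function is made up; the register behaviour described in the comments is the classic ISA-defined behaviour):

      /* A variable shift - about as ordinary as code gets. */
      unsigned scale_by(unsigned value, unsigned amount) {
          return value << amount;
      }

      /* Classic x86: the variable shift count must be in CL, so the code
         generator has to shuffle 'amount' into CL wherever it happens to
         live, and CX/ECX is tied up for the duration. 8086 MUL and DIV
         similarly pin AX and DX. m68k: LSL.L Dn,Dm takes the count from any
         data register, so the allocator can leave 'amount' where it is. */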

    • Notice I didn't write harder to optimize for - I am not talking about optimizing code, but optimizing the actual internal microarchitecture.

      Turns out m68k orthogonality results in an explosion of complexity in the physical implementation and is way harder to optimize, especially since compilers did make use of it. The way more limited x86 was harder to write code generation for, but it meant simpler execution in silicon and less need to pander to slow-path-only instructions. And on top of that, Intel's scale meant they could have two or three teams working on separate x86 CPUs at the same time.