Comment by librasteve
13 hours ago
Occam (1982-ish) shared most of BEAM's ideas, but strongly enforced synchronous message passing on both channel output and input … so back pressure was just there in all code. The advantage was that most deadlock conditions were placed in the category of “if it can lock, then it will lock”, which meant that debugging done at small scale would preemptively resolve issues before scaling up process / processor count.
Once you were familiar with occam, you could see deadlocks in code very quickly. It was a productive way to build scaled concurrent systems. At the time we laughed at the idea of using C for the same task.
I spreadsheeted out how many T424 dies would fit on an Apple M2 die (TSMC 3nm process): that's 400,000 CPUs (about a 600x600 grid) at, say, 1 GIPS each, so 400 TIPS per M2-sized die. That's for 32-bit integer math. Inmos also had a 16-bit datapath, but these days you would probably up the RAM per CPU (8k, 16k?) and stick with a 32-bit datapath, but add 8- and 16-bit FP support. Happy to help with any VC pitches!
David May and his various PhD students over the years have retried this pitch repeatedly. And Graphcore had a related architecture. Unfortunately, while it’s great in theory, in practice the performance overall is miles off existing systems running existing code. There is no commercially feasible way that we’ve yet found to build a software ecosystem where all-new code has to be written just for this special theoretically-better processor. As a result, the business proposal dies before it even gets off the ground.
(I was one of David’s students, and I founded and ran a processor design startup that raised £4m in 2023 and went bust last year, based on a different idea with a much stronger software story.)