Comment by Aurornis

6 hours ago

Clean-slate, arbitrarily radical designs are easy when you don’t have to actually build them.

There are reasons that current architectures are mostly similar to each other: they have evolved over decades of learning and research.

> Perhaps a model where processing and memory are one: a very simple core per 1k of SRAM per 64k of DRAM per megabyte of flash,

To serve what goal? Such a design certainly wouldn’t be useful for general purpose computing and it wouldn’t even serve current GPU workloads well.

Any architecture that requires extreme overhauls of how software is designed and can only benefit unique workloads is destined to fail. See Itanium for a much milder example that still couldn’t work.

> machines with 2^n cores where each core has a direct data channel to every core whose n-bit core ID is one bit different (plus one for all bits different).

Software isn’t the only place where big-O scaling is relevant.

Fully connected graph topologies are great on paper, but the number of connections scales quadratically. For a 64-core fully connected CPU topology you would need 2,016 separate data buses.

Those data buses take up valuable space. Worse, the majority of them are going to be idle most of the time. It’s extremely wasteful. The die area would be better used for anything else.
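
For anyone who wants to check the 2,016 figure, here’s a minimal Python sketch (assuming one bidirectional bus per core pair):

    # Links in a fully connected graph of n cores: one per pair, n*(n-1)/2
    def fully_connected_links(n):
        return n * (n - 1) // 2

    print(fully_connected_links(64))   # 2016
    print(fully_connected_links(256))  # 32640 -- quadratic growth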

> An n=32 system would have four billion cores

A four billion core system would be the poster child for Amdahl’s law and a great example of how not to scale compute.
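
To put a rough number on it, here’s a sketch assuming Amdahl’s law and a workload that is 99% parallelizable:

    # Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction
    def amdahl_speedup(p, n):
        return 1.0 / ((1.0 - p) + p / n)

    print(amdahl_speedup(0.99, 2**32))  # ~100: the 1% serial part caps the speedup
    print(amdahl_speedup(0.99, 64))     # ~39: most of that ceiling is already reached at 64 cores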

Let’s not be so critical of companies trying to make practical designs.

> Software isn’t the only place where big-O scaling is relevant.

> Fully connected graph topologies are great on paper, but the number of connections scales quadratically. For a 64-core fully connected CPU topology you would need 2,016 separate data buses.

Nitpick: I don't think the comment you're replying to is proposing a fully-connected graph. It's proposing a hypercube topology, in which the number of connections per CPU scales logarithmically. (And with each node also connected to its diagonal opposite, but that doesn't significantly change the scaling.)

If my math is right, a 64-core system with this topology would have only 224 connections.
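
If you want to check that 224, here’s a small Python sketch (assuming an n-bit ID per core, one link per single-bit flip, plus the all-bits-different link):

    # Each of the 2**n nodes has n single-bit-flip neighbors plus one complement
    # neighbor; dividing by 2 avoids double-counting each link.
    def folded_hypercube_links(n):
        return 2 ** n * (n + 1) // 2

    print(folded_hypercube_links(6))   # 224 links for 64 cores
    print(folded_hypercube_links(32))  # ~71 billion links for ~4.3 billion cores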

  • This is what I meant. I also like the idea of optical, line-of-sight connections. If you do the hypercube topology, everything a node connects to has a different parity, so you can lay them out on two panels facing each other (quick check below).
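
A minimal Python check of the parity point for the plain hypercube links, where each edge flips exactly one ID bit (note the extra all-bits-flipped link pairs same-parity nodes when n is even):

    # Each single-bit flip changes the popcount parity of the core ID, so the
    # even-parity and odd-parity cores can sit on two facing panels.
    def parity(x):
        return bin(x).count("1") % 2

    n = 6
    print(all(parity(node) != parity(node ^ (1 << bit))
              for node in range(2 ** n) for bit in range(n)))  # True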

Perhaps not a true counterpoint, but there are systems like the GA144, an array of 144 Forth processors.

I think you're missing the point, and I don't think OP is "being critical of companies making practical designs."

Also, I think OP was imagining some kind of tree-based topology, not a fully connected graph, since he said:

> ...but it would take talking through up to 15 intermediaries to communicate between any two arbitrary cores.

  • Are you aware of anyone who has used that system outside of a hobbyist buying the dev board? I looked into it and the ideas were cool, but I had no clue how to actually do anything with it.