
Comment by andrewl-hn

1 day ago

I wish I could edit my post.

Somehow everyone wrote to me about blades. These are not the same, though. Blade servers were mounted into units of 4U, 8U, etc. They occupied a portion of the overall cabinet and still needed "plumbing" for power and networking behind the chassis to the rest of the cabinet or to the rest of the datacenter. A full-cabinet blade rig would have multiple 8U blade units and some off-the-shelf units for networking, storage, etc. Yes, you could mix and match different components based on your needs, but that also meant that there were extra wires, cables, mounting rails, and more importantly - all these different components ran a mix of software that had to integrate using common denominator protocols and speeds.

Steve rightly mentioned the integration below, and I didn't put it in my message because I kinda assumed that we include software in this discussion too.

HP in 2005 had an army of programmers writing all sorts of firmware and software and another army of hardware engineers, too. They could have made an Oxide computer back then, and it would have sold really well. But they didn't, and none of their competitors did despite this being an obvious product (in hindsight), and THIS is what I find interesting.

Cabling, plumbing, etc. aside, all the "blade servers" I've ever worked with were still glorified IBM PCs. They still have BMCs strapped to legacy interfaces pretending to be decades-old hardware, allowing for "headless" operation of a platform originally intended to be a single-user computer on a desk with a monitor and keyboard, etc.

That's what's so cool about Oxide's boxes to me-- the legacy garbage is gone and the strange undefined behavior part and parcel with overlapping edge cases will be minimized (and managed, as opposed to used as an excuse by a vendor). Dealing with incompatibilities and strange firmware interactions has made me come to see PC-based servers as a weird opposite of the "Swiss cheese" model. The various layers of interacting hardware, firmware, drivers, and OS act as a kind of "filter" for correct operation. When you swap or add one of these components you get one or more exciting new layers in the stack that, hopefully, have "holes" aligning with the existing ones.

  • FWIW the original IBM BladeCenter platform was developed before BMCs so it had a much smaller service processor per blade connected over an out of band RS-485 management bus to a chassis management module. It was closer to Oxide than what's being sold today. And then "Windows needs VGA" kicked in.

    • IBM gear, aside from ThinkPads and the EduQuest PCs, has never been a thing I've gotten to interact with. Their documentation was always very good (which I think implies a certain level of overall competence). Given their massive legacy in datacenters I can imagine their PC-based server gear probably benefited.

Forgot how I learned this but IIUC, blades failed because a single enclosure could use up a datacenter's per-rack weight and power budgets. It turned out that you could compress 48U worth of computers into 8U or so, as long as you did not fill the freed space back up with anything, because if you did, the cooling/cabling crawlspace would collapse and the circuit breakers would trip. It wasn't because they still needed cabling.

This sounds like less of a problem for DCs with bare concrete flooring, but blades did fall out of fashion, so I guess the fraction of DCs with multiple levels or free-access floors was higher than anticipated.

(Also, maybe I'm just being an amateur, but I'd be scared of tolerance stacking with a "grape bunch" design like this. Individually enclosed chassis with cables and cage nuts are a lot more robust against dimensional issues)

  • > but blades did fall out of fashion,

    I prefer to think of that as "datacenters weren't ready for denser workloads"; blades were ahead of their time, and their adoption wasn't great because of that. Since then, new datacenters have been built with more ambitious power/cooling/weight budgets.

  • Seems to me that making it hyperconverged, with all the storage included, is what makes it make sense. Ultra-dense compute alone isn't ideal. The Oxide rack has the features of blades but is hyperconverged with everything. Plus the better-integrated software.

Speaking of the army, it's not clear that the extremely narrow feature set of Oxide justifies the engineering effort required.

The storage market used to be dominated by Oxide-style vertical integration and bespoke engineering and almost every vendor has transitioned to modularity over the last 20 years. Pure Storage seems to be doing OK with custom hardware though, so maybe the rest of the industry just has a lack of courage.

  • > it's not clear that the extremely narrow feature set of Oxide justifies the engineering effort required.

    Their customers think so.

    (Think governments, security-serious applications, critical workloads, science, energy, financial, etc.)

    Some of it may even get sent to labs first for disassembly and inspection. Other times it may never be plugged into an internet connection.

    For these customers, the feature set, supply chain ownership, integration, and support are everything.