Comment by zie
7 days ago
In modern times, we have taken the "everything breaks all the time" approach: make redundancy and failover cheap or free at the fleet level.
VMS (and the hardware it runs on) takes the opposite approach: keep everything alive forever, even through hardware failures.
So the VMS machines of the day had dual-redundant everything: interconnected memory across machines, redundant SCSI interconnects, and just about anything else you could think of.
VMS clusters could be configured in a hot/hot standby setup, where two identical cabinets full of redundant hardware could fail over mid-instruction and keep going. You can't do that with the modern approach. The documentation filled an entire wall of office bookcases, nearly full of books. There was a lot of documentation.
These days, usually nothing inside the box is redundant; instead we duplicate the boxes and treat them as cheap and interchangeable, a dime a dozen.
Which approach is better? That's a great question. I'm not aware of any academic studies on the topic.
All that said, most people don't need decade-long uptimes. Even the big clouds don't bother trying to reach decade-long uptimes, as they regularly have outages.
One of the things that blew my mind in my early career was seeing my mentor open the side of a VMS machine (I can't remember the hardware model, sorry), slide out a giant board of RAM, slide in another board of the same physical size that carried a CPU instead, and then enable the CPU.
The daughterboards in that machine could carry RAM or CPUs in the same slot, and they were swappable without a reboot!
Exactly! One would never, ever do that with x86.
You may not have seen it, but vendors have been selling such hardware for ~20 years. Google for linux + hardware + cpu hotplug or memory hotplug. The PCI bus helps here.
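
For illustration, here's a minimal sketch of the sysfs interface Linux exposes for logical CPU hotplug. It assumes a kernel built with CONFIG_HOTPLUG_CPU, root privileges, and that cpu3 exists and can be taken offline (cpu0 often can't on x86); the helper names are mine, not a real library:

    #!/usr/bin/env python3
    """Sketch: toggling a CPU offline/online via the Linux sysfs hotplug interface."""
    from pathlib import Path

    SYSFS_CPU = Path("/sys/devices/system/cpu")

    def set_cpu_online(cpu: int, online: bool) -> None:
        """Write 1 or 0 to /sys/devices/system/cpu/cpu<N>/online (needs root)."""
        ctl = SYSFS_CPU / f"cpu{cpu}" / "online"
        ctl.write_text("1" if online else "0")

    def cpu_is_online(cpu: int) -> bool:
        ctl = SYSFS_CPU / f"cpu{cpu}" / "online"
        # CPUs that can never be offlined (often cpu0) have no 'online' file.
        return not ctl.exists() or ctl.read_text().strip() == "1"

    if __name__ == "__main__":
        # Take CPU 3 offline and bring it back, as a hot-remove/hot-add drill.
        set_cpu_online(3, False)
        print("cpu3 online?", cpu_is_online(3))
        set_cpu_online(3, True)
        print("cpu3 online?", cpu_is_online(3))

Memory hotplug works through a similar set of sysfs knobs under /sys/devices/system/memory/. The physical hot-add side (actually inserting a board) still depends on the platform's firmware and ACPI support.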
I've seen x86 servers with hot-plug memory.