Comment by toast0

4 years ago

There are plenty of network-controlled power outlets: enterprise/rackmount PDUs, consumer wifi outlets, or you can rig something up with a serial/parallel port and a relay. You'd use an always-on test runner computer to control the power state.

The computer under test would boot from PXE. On boot it reads from the drive to determine the last write and sends that to the test runner for analysis, then begins the write sequence and reports to the test runner ASAP after each flush. The test runner turns the power off at random, waits a minute (or 10 seconds, whatever), turns it back on, and the cycle starts again.
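Here's a rough sketch of what the machine under test might run, to make the ordering concrete. Everything in it is my assumption, not something from the linked code: the names `read_last_write` and `write_sequence`, the 8-byte record format, and using a plain file instead of a raw device. The `report` callback stands in for whatever sends the number to the test runner. The important part is only that the report happens after `os.fsync` returns, never before.

```python
import os
import struct

# Assumed record format: one 8-byte little-endian sequence number per write.
RECORD = struct.Struct("<Q")


def read_last_write(path):
    """Return the last complete sequence number on disk, or 0 if there is none."""
    try:
        size = os.path.getsize(path)
    except FileNotFoundError:
        return 0
    complete = size // RECORD.size  # ignore a torn trailing record
    if complete == 0:
        return 0
    with open(path, "rb") as f:
        f.seek((complete - 1) * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))[0]


def write_sequence(path, report, start, count):
    """Append `count` records after `start`; flush each one, then report it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # Drop any torn trailing record so new records stay aligned.
        whole = os.fstat(fd).st_size // RECORD.size * RECORD.size
        os.ftruncate(fd, whole)
        for seq in range(start + 1, start + count + 1):
            os.write(fd, RECORD.pack(seq))
            os.fsync(fd)  # ask the OS (and, hopefully, the drive) to flush
            report(seq)   # report only after the flush has returned
    finally:
        os.close(fd)
```

In a real run `count` would effectively be unbounded, since the power cut is what ends the loop, and `report` would need to be a low-latency network send so the runner's view lags the disk as little as possible.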

In a well-functioning system, you should often get back the last reported successful write, and sometimes get back a write beyond the last reported write (two generals and all), but never a write older than the last reported write. You can't use this testing to prove correct flushing, but if you run it for a week and it doesn't fail once, the drive probably isn't lying about flushes.
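The check on the test runner side boils down to one comparison per power cycle. A minimal sketch, assuming writes are numbered with increasing integers (the function name `classify` and the result strings are made up for illustration):

```python
def classify(recovered, last_reported):
    """Compare what the drive has after power-up with what was acknowledged.

    recovered:     sequence number read back from the drive after reboot
    last_reported: highest sequence number the runner received before the cut
    """
    if recovered < last_reported:
        # An acknowledged (flushed) write is gone: the flush was not durable.
        return "FAIL: lost acknowledged write"
    if recovered == last_reported:
        return "ok: exact match"
    # The write reached the disk but the report was cut off by the power loss.
    return "ok: disk is ahead of last report"


print(classify(41, 42))
print(classify(42, 42))
print(classify(43, 42))
```

Only the first case counts as a failure; the other two are both expected outcomes, so a run passes as long as no cycle ever lands in the first branch.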

I haven't evaluated the code, but here's a post from 2005 with a link to code that probably works for this. (Note: this doesn't include the PXE booting or the power control. It just covers what to write to the disk, how to report it to another machine, and how to check the results after a power cycle.)