Comment by dylan604

6 hours ago

And what recovery mechanisms do you have in place when the OTA flash goes wrong?

You can have two identical app partitions on the ESP. The OTA update flashes the inactive partition and signals the bootloader to try booting from it.

The device restarts. If the new firmware is working correctly, it signals the update process that everything is all right, and the new partition is set as the default.

If the device doesn't boot correctly, or your sanity checks don't pass, either you or the watchdog restarts the device and it boots from the known-working partition.
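
To make the flow above concrete, here is a toy Python model of it. This is only a sketch: the `Bootloader` class, the slot names, and the `self_test` callback are made up for illustration and are not ESP-IDF APIs. On real hardware with ESP-IDF app rollback enabled, I believe the "confirm" step corresponds to the new firmware calling `esp_ota_mark_app_valid_cancel_rollback()`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Bootloader:
    """Toy A/B bootloader model (hypothetical, not a real ESP-IDF API)."""
    slots: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"A": "v1", "B": None}
    )
    confirmed: str = "A"            # known-working slot
    pending: Optional[str] = None   # slot to try exactly once

    def inactive(self) -> str:
        return "B" if self.confirmed == "A" else "A"

    def flash_update(self, image: str) -> None:
        # OTA writes only to the inactive slot; the running one is untouched.
        slot = self.inactive()
        self.slots[slot] = image
        self.pending = slot

    def boot(self, self_test: Callable[[str], bool]) -> Optional[str]:
        # Try the pending slot once. The pending flag is cleared before the
        # trial, so a crash or watchdog reset falls back automatically.
        if self.pending is not None:
            trial, self.pending = self.pending, None
            if self_test(self.slots[trial]):
                self.confirmed = trial  # new firmware marks itself valid
        return self.slots[self.confirmed]
```

Flashing an image that fails `self_test` leaves `confirmed` pointing at the old slot, so the next boot returns the old firmware. Flashing one that passes flips `confirmed` to the other slot.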

  • I didn't ask what you can have. We could have whatever safety processes we wanted, with multiple levels of redundancy. However, that's not what's available on COTS IoT devices, so speculation does not help.

    Flashing the firmware of a cheap IoT device remotely OTA is not without risk.

    • Surely the basic flashing mechanisms used nowadays will first verify a checksum (and hopefully a device magic), and then there is a relatively short time window while it actually does the flashing, after which it reboots? Even small devices nowadays seem to have the memory for it. So there is a window of failure, but it's not a very long one.

      Well, that's in addition to the risk of flashing incorrect or buggy firmware.
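
      For what it's worth, the pre-flash check I have in mind looks roughly like the Python sketch below. The header layout (4-byte magic, payload length, CRC32), the `IOT1` magic value, and the function names are all assumptions for illustration, not any vendor's actual image format.

```python
import struct
import zlib

MAGIC = b"IOT1"                   # hypothetical device magic
HEADER = struct.Struct("<4sII")   # magic, payload length, CRC32 of payload


def pack_image(payload: bytes, magic: bytes = MAGIC) -> bytes:
    """Build an image: header followed by the firmware payload."""
    return HEADER.pack(magic, len(payload), zlib.crc32(payload)) + payload


def verify_image(blob: bytes) -> bool:
    """Return True only if magic, length, and checksum all match."""
    if len(blob) < HEADER.size:
        return False
    magic, length, crc = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:]
    return (
        magic == MAGIC
        and length == len(payload)
        and zlib.crc32(payload) == crc
    )
```

      A check like this rejects truncated, corrupted, or wrong-device images before anything is written. It does nothing about an intact image that is simply buggy, which is the second failure mode mentioned above.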