Comment by sitkack
12 hours ago
Last thing.
You will need a way to coordinate LM with users, because some of them are sensitive to LM blackouts. Not many workloads are, but the ones that are tend to be the kind of thing customers will just leave over.
If you are draining a host, make sure new VMs are on hosts that can be guaranteed to be maintenance free for the next x-days. This allows customers to restart their workloads on their schedule and have a guarantee that they won't be impacted. It also encourages good hygiene.
Allow customers to trigger migration.
Charge extra for a long running maintenance free host.
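To make the placement rule concrete, here is a minimal sketch of a scheduler filter. It assumes each host carries a known next-maintenance date; the names Host, eligible_hosts, place_vm and quiet_days are all made up for illustration and are not from any real scheduler.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Host:
    name: str
    free_slots: int
    # Hypothetical field: date of the next planned maintenance, None if nothing is scheduled.
    next_maintenance: Optional[date]


def eligible_hosts(hosts: List[Host], today: date, quiet_days: int) -> List[Host]:
    """Hosts with spare capacity and no maintenance inside the guarantee window."""
    horizon = today + timedelta(days=quiet_days)
    return [
        h for h in hosts
        if h.free_slots > 0
        and (h.next_maintenance is None or h.next_maintenance >= horizon)
    ]


def place_vm(hosts: List[Host], today: date, quiet_days: int) -> Optional[Host]:
    """Pick the eligible host with the longest maintenance-free period, or None."""
    candidates = eligible_hosts(hosts, today, quiet_days)
    if not candidates:
        return None
    # Hosts with nothing scheduled sort as the farthest-out maintenance date.
    best = max(candidates, key=lambda h: h.next_maintenance or date.max)
    best.free_slots -= 1
    return best
```

A customer-triggered migration could call the same place_vm with the customer's chosen quiet_days, and a longer window is the natural thing to charge extra for.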
It is good you are hooked into the PCM already. You will experience accidentally antagonistic workloads and the PCM will really help debug those issues.
If I were building a DC, I would put as many NICs into a host as possible and use SR-IOV to pass the NICs into the guests. The switches should be sized to allow full speed on all NICs. I know it sounds crazy, but if you design for a typical CRUD serving tree, you are saving a buck but making your software problem 100x harder.
Everything should have enough headroom so it never hits a knee of a contention curve.
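As a rough illustration of what the knee looks like, here is a sketch that assumes a single shared resource behaves like a textbook M/M/1 queue. That model is my simplification, not a claim about any particular NIC or switch, and mean_latency is a made-up helper name.

```python
def mean_latency(service_time: float, utilization: float) -> float:
    """Mean time in system for an M/M/1 queue: service_time / (1 - utilization)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    # Latency as a multiple of the unloaded service time at rising utilization.
    for rho in (0.50, 0.80, 0.90, 0.95, 0.99):
        print(f"utilization {rho:.2f}: {mean_latency(1.0, rho):.1f}x")
```

Under that assumption, latency grows slowly up to moderate utilization and then climbs steeply as utilization approaches 1, which is why headroom matters more than average load suggests.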