I mean, in software we have automated integration testing for this, but it's really difficult to reproduce what happens in prod because of a billion different factors. One way around this in the DC now is separating data from config. That's what I try to do. Distros like Fedora are coming out with new configuration-based deployments, so you can define your OS deployment as a config file that lives with the rest of your infra.
With HA hypervisors, you can deploy a new machine from scratch using a config file and mount shared data storage. Upgrades are then basically done from an IDE and can be easily tested before deployment. The hypervisor provides the abstraction layer, so the VM doesn't know or care what it's running on.
But on the hypervisor side the same issue appears, because now you're sitting on hardware again. There it's availability in numbers, like it always has been.
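To make the config-vs-data split a bit more concrete, here's a rough Python sketch. Everything in it is made up for illustration (`MachineConfig`, `deploy_plan`, the image names, the NFS path); it's not any real hypervisor API or Fedora's actual tooling, and it only builds a list of steps without running anything. The point is that the config fully describes the machine, while the data only shows up as a mount reference.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class MachineConfig:
    """Hypothetical machine description: enough to rebuild the VM, no data inside."""
    name: str
    os_image: str
    packages: Tuple[str, ...]
    data_mount: str  # path inside the VM where shared storage gets attached


def deploy_plan(cfg: MachineConfig, shared_volume: str) -> List[str]:
    """Return the ordered steps for a scratch deploy as plain strings (nothing is executed)."""
    steps = [f"create vm {cfg.name} from image {cfg.os_image}"]
    steps += [f"install {pkg}" for pkg in cfg.packages]
    # Data lives outside the machine; the config only says where to mount it.
    steps.append(f"mount {shared_volume} at {cfg.data_mount}")
    return steps


# Assumed example values, not taken from any real environment.
SHARED_VOLUME = "nfs:/exports/app"
current = MachineConfig("app01", "os-image-v1", ("nginx",), "/srv/data")

# An "upgrade" is just an edited config; the data volume is untouched.
upgraded = replace(current, os_image="os-image-v2")

if __name__ == "__main__":
    for step in deploy_plan(upgraded, SHARED_VOLUME):
        print(step)
```

Since the upgrade is only a changed config, you can diff the two plans or spin up the new one in a test environment before touching prod, and the mount step stays identical either way.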