> usually people spend their time with having DNS not working, having servers hanging, having etcd nodes not talking to each other without explaining why, having deployments say "SUCCESS" while actually not running
Not disagreeing with you. Those are all "noob problems" from my perspective, since I've gotten every one of them wrong myself, and I'm still a noob. But some of them are genuinely complex problems (and you can even hit them on managed platforms once in a while, too).
Self-healing is great when it works, and those items you listed are all real problems. But they are not problems that you should expect to encounter on a managed service, at least hopefully not more than once. (YMMV, amiright?)
I'm not saying k8s is free of problems.