Disconnected ESX Host
Got a call today.
Panic!!!!
All the VMs on an ESX host just went grey – all disconnected.
Troubleshooting steps:
-
Ping ESX host Service Console – All ok
-
Check the server in the VI client – NOT OK – all machines are greyed out (hey, that is what they said, wasn’t it?).
-
SSH into the Service console - All ok
-
Direct GUI management of the server – NOT OK. Could not load the inventory.
-
All VMs on the host were running and responding to ping.
-
No failover was initiated in the cluster.
-
On the console – I saw 7 vmware-hostd processes, each using a lot of RAM.
-
service mgmt-vmware stop
– to stop the service. GOT STUCK.
Off to this KB, which helped me stop the service and get the host responsive again.
cd /var/run/vmware
ls -l vmware-hostd.PID watchdog-hostd.PID
cat vmware-hostd.PID (to get the current PID of the process, e.g. 1234)
kill -9 <PID> (kill the process)
rm vmware-hostd.PID watchdog-hostd.PID (remove the files)
service mgmt-vmware start (restart the agent)
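For anyone who would rather not type those steps by hand in the middle of a panic, here is a rough sketch of the same procedure wrapped in a shell function. The function name kill_hung_hostd and the optional directory argument are my own inventions for illustration, not something from the KB. It only adds one safety check (the PID file must contain a plain number) and leaves the agent restart to you.

```shell
#!/bin/sh
# Sketch only: force-stop a hung vmware-hostd and clear its stale PID files.
# kill_hung_hostd and its directory argument are hypothetical helpers;
# the commands inside mirror the manual steps from the KB.

kill_hung_hostd() {
    run_dir="${1:-/var/run/vmware}"
    pid_file="$run_dir/vmware-hostd.PID"
    watchdog_file="$run_dir/watchdog-hostd.PID"

    if [ ! -f "$pid_file" ]; then
        echo "no PID file at $pid_file" >&2
        return 1
    fi

    pid=$(cat "$pid_file")

    # Refuse to run kill -9 unless the file holds a plain number
    case "$pid" in
        ''|*[!0-9]*)
            echo "unexpected PID file content: $pid" >&2
            return 1
            ;;
    esac

    # Kill the process; carry on with the cleanup even if it is already gone
    kill -9 "$pid" 2>/dev/null || echo "process $pid was not running" >&2

    # Remove both PID files so the agent can start cleanly
    rm -f "$pid_file" "$watchdog_file"

    # Print the PID that was targeted
    echo "$pid"
}
```

On the host it would be used as: kill_hung_hostd && service mgmt-vmware start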
-
The host came back online – all VMs were no longer grey.
Here start my questions.
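To compare the state of the management agent before and after a restart, the process check I did on the console can be sketched as a small filter. count_hostd is a hypothetical helper name, and it assumes the process shows up in ps under the name vmware-hostd, as it did on my host. It reads "RSS COMMAND" style lines and prints the number of matching processes followed by their total resident memory in KB.

```shell
#!/bin/sh
# Sketch only: count vmware-hostd processes and sum their resident memory.
# count_hostd is a hypothetical helper; it expects `ps -eo rss,comm` output on stdin.

count_hostd() {
    # Column 1 is RSS in KB, column 2 is the command name.
    # The header line ("RSS COMMAND") never matches, so it is skipped.
    awk '$2 == "vmware-hostd" { n++; rss += $1 }
         END { printf "%d %d\n", n, rss }'
}
```

Usage would be: ps -eo rss,comm | count_hostd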