vCenter is Still a Single Point of Failure - An Update
My previous post vCenter is Still a Single Point of Failure caused a decent amount of discussion on Twitter and comments on the blog. I am actually glad about that because it raised a point that I think has been neglected for quite some time – and is finally being addressed.
Today I learned that the Availability Guide was recently updated – and there were a number of changes (the original version is no longer available – but we are all lucky that I have screenshots). The biggest one is a whole new section added to the paper about clustering vCenter Services.
Firstly I would like to commend VMware on putting this out. Let’s go in and see what was added.
<sarcasm>
“vCenter can be a single point of failure” – I wonder where I have heard that before?
It is good to see that vCenter is now considered to be a critical component.
I wonder why VMware changed their tune as compared to the previous version? (below)

So what would such an architecture look like? This is described in the following diagram
You will need to set up a WSFC, which requires a shared disk between the VMs (only RDM is supported), and the document goes into some detail about how this should be set up. There are a large number of moving parts, including DFS, how to perform upgrades, registry settings, certificates – you name it. You will also need to protect the SQL / Oracle database – this is not in the scope of the document.
You can see that a decent amount of work has gone into documenting the process. It is also apparent that this will be the same method used to provide the same availability features for vSphere 6.0 when it is released.
But one small caveat. I am sure you are not the only one who has gotten the feeling that VMware have been pushing us all towards the appliance – for a number of reasons, good and bad.
So where is the clustering solution for the vCenter Appliance? All of this is Windows Server vCenter only!
The only mention of such a solution is a tweet from my good friend Niran Even-Chen
The second update I would like to go into was a performance paper released last week regarding the failover time needed for vCenter 6.0 in the case of a Host failure. The original article can be found here.
Now we have been down this path before – but I would like to point out two things.
- A VM protected by vSphere HA will be restarted approximately 30-50 seconds after a failure – which is pretty fast and what we would expect from HA.
- The amount of time required to restart a 6.0 vCenter is even more than that of a 5.5 vCenter. Quite a bit more. From the original version of the paper: 5 minutes and 12 seconds. From the performance document for vCenter 6.0 I would like to take the middle scenario – where we are dealing with 32 hosts and around 4,000 VMs (I think it is safe to call that a standard environment). The restart time has gone up to 7 minutes and 19 seconds – that is a 40% increase in downtime. That is a lot if you ask me – and probably due to the changes made in the architecture – hopefully for the better of us all.
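For anyone who wants to check the 40% figure, here is a minimal sketch of the arithmetic in Python. The two restart times are the ones quoted above from the papers; the helper name `to_seconds` and the variable names are just my own for illustration.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair into total seconds."""
    return minutes * 60 + seconds


# Restart times quoted from the two papers (variable names are illustrative).
restart_55 = to_seconds(5, 12)  # vCenter 5.5: 5 minutes 12 seconds
restart_60 = to_seconds(7, 19)  # vCenter 6.0: 7 minutes 19 seconds

# Relative increase in restart time, as a percentage of the 5.5 figure.
increase_pct = (restart_60 - restart_55) / restart_55 * 100

print(f"5.5 restart: {restart_55} s")
print(f"6.0 restart: {restart_60} s")
print(f"Increase:    {increase_pct:.1f}%")
```

That works out to 312 seconds versus 439 seconds, or an increase of roughly 40.7%, which is where the "40%" above comes from.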
Again I would like to thank the authors for their hard work, and also VMware for recognizing that this is something that was lacking in the previous whitepaper – and something they were quick to rectify.
Last minute update
It seems there was an article published yesterday (after I had written this post), vCenter Server 6 Deployment Topologies and High Availability, with some more information. It seems that even though the vCenter Server has been consolidated down into two components – they take even longer to start up.
It also seems that the Appliance will not be supported (at least not yet).
And one more question I had for VMware.
Why is Windows 2012 Datacenter Edition required? According to this document WSFC is included in Standard as well.
As always your thoughts and comments are welcome.
 
      





