Designing a highly available Windows Server 2012 Hyper-V Cluster

    Question

  • hi all,

    With all the hoopla around SMB 3.0, Storage Spaces and whatnot in Windows Server 2012, I need some help:

    We are designing our first production Hyper-V cluster now, and plan to run it on the RC version until the RTM ships. This is initially going to be a 2-node cluster, which will be scaled out to around 8 nodes later on. Storage will be on an IBM Storwize SAN, with HBAs in each cluster node.

    What I need to understand is: should I take the new storage technologies that Windows Server 2012 brings to the table into consideration? I mean, I could probably set up each Hyper-V node as a file server and let Hyper-V access its storage through SMB 3.0 instead of directly through CSVs, but would I gain anything?

    I'm having a hard time finding info on deploying Hyper-V with "good old" HBA/SAN-based storage because of all the new technology.

    (Later on, we might implement an SMB 3.0-based file cluster for a NetApp box and hook that up, but that's out of scope for now.)

    Any pointers appreciated!

    Friday, July 20, 2012 7:45 AM

Answers

  • So you want to move from a cluster providing highly available storage and VMs to standalone systems accessing the storage of the other host over the network instead of accessing the storage locally?  Or are you saying that you want to retain the cluster, but have the nodes of the cluster create SMB shares of their local storage to be used as the storage for VMs?  I'm not clear from your description exactly what you are trying to do.

    SMB 3.0 is neat stuff, no doubt about it.  For a cluster, though, what you need is a storage provider that offers SMB 3.0 in a highly available manner.  That can be provided by a typical storage vendor (I don't know where IBM's Storwize SAN is with this) that offers SMB in addition to iSCSI, FC, CIFS, or NFS.  I know other storage vendors plan on upgrading their CIFS implementations to SMB 3.0.  Most solutions like this have HA characteristics, so your SMB would be HA.  (If you are going to have a cluster so your VMs are HA, it only makes sense to ensure your storage is HA, too.)

    Another route would be to have Windows Server 2012 offer the SMB share itself.  I think this is what you are thinking.  You want that storage to be HA - able to continue offering services even though a single component is lost.  To do this, you need to create a 2012 cluster that is offering file services, which means you need a shared storage subsystem with dual power supplies.  One of the new features in 2012 is that you can have clustered PCI RAID controllers that reside in each 2012 node.  Previously, RAID controllers had to reside in the storage shelf, and those controllers had to provide multiple ports for access by the nodes.  The new PCI RAID controllers talk to each other and keep each other apprised of what is going on, so should one fail, the other can pick up where it left off.

    So, yes, it is possible to use 2012 as the host for providing HA SMB, but it will still require a shared storage component.  One of the really neat things with this solution is continuously available SMB file shares in the cluster - shares that keep serving clients through a node failure, a capability that was not available previously.
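
    As a rough illustration of that route, a continuously available share on a 2012 file server cluster can be set up along these lines (a minimal PowerShell sketch; the role, disk, share, path and account names are placeholders):

    # On a node of an existing 2012 failover cluster that already owns shared storage,
    # add the clustered File Server role (the classic file server variant in this sketch)
    Add-ClusterFileServerRole -Name "FS01" -Storage "Cluster Disk 2" -StaticAddress 10.0.0.60

    # Create a continuously available SMB 3.0 share for the Hyper-V hosts
    # (the Hyper-V host computer accounts also need matching NTFS permissions)
    New-SmbShare -Name "VMStore" -Path "E:\Shares\VMStore" `
        -FullAccess 'CONTOSO\HV01$','CONTOSO\HV02$' -ContinuouslyAvailable $true

    For Hyper-V workloads specifically, the Scale-Out File Server variant of the role is the one aimed at application data such as VHDs.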


    tim

    Friday, July 20, 2012 12:29 PM
  • The degree of degradation totally depends on your design and infrastructure.

    While it is technically a degradation of the storage I/O path, if you design well you can end up with greater throughput.

    And whether an HA SMB share for VHDs is a good fit depends on your workload.  If you have high read/write I/O workloads, it is probably not a good fit.

    At the same time, clustering now uses SMB under the hood for node-to-node traffic.

    And don't forget LBFO teaming.  Create that big pipe.
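
    For example, a host-level "big pipe" team with a converged virtual switch on top of it looks roughly like this (a sketch; adapter, team and switch names are placeholders):

    # Team two physical NICs at the host level (LBFO)
    New-NetLbfoTeam -Name "HostTeam" -TeamMembers "NIC1","NIC2" `
        -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort

    # Bind a Hyper-V virtual switch to the team and share it with the management OS
    New-VMSwitch -Name "ConvergedSwitch" -NetAdapterName "HostTeam" `
        -AllowManagementOS $true -MinimumBandwidthMode Weight

    # Carve out a host vNIC for live migration traffic and give it a bandwidth weight
    Add-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -SwitchName "ConvergedSwitch"
    Set-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -MinimumBandwidthWeight 30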

    There are lots of different pieces that combine and align to make it all work as a system - far more individual pieces than we have had available in the past.


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

    Friday, July 20, 2012 3:30 PM

All replies

  • hi all,

    With all the hoopla around SMB 3.0, Storage Spaces and whatnot in Windows Server 2012, I need some help:

    We are designing our first production Hyper-V cluster now, and plan to run it on the RC version until the RTM ships. This is initially going to be a 2-node cluster, which will be scaled out to around 8 nodes later on. Storage will be on an IBM Storwize SAN, with HBAs in each cluster node.

    What I need to understand is: should I take the new storage technologies that Windows Server 2012 brings to the table into consideration? I mean, I could probably set up each Hyper-V node as a file server and let Hyper-V access its storage through SMB 3.0 instead of directly through CSVs, but would I gain anything?

    I'm having a hard time finding info on deploying Hyper-V with "good old" HBA/SAN-based storage because of all the new technology.

    (Later on, we might implement an SMB 3.0-based file cluster for a NetApp box and hook that up, but that's out of scope for now.)

    Any pointers appreciated!

    Putting an SMB filer in front of a SAN will result in a longer I/O path, an extra network hop to fetch the actual data, increased latency, and degraded performance. "New" does not necessarily mean "good", and it definitely does not mean "throw away what you have and start fixing what's not actually broken".

    -nismo


    Friday, July 20, 2012 8:50 AM
  • So you want to move from a cluster providing highly available storage and VMs to standalone systems accessing the storage of the other host over the network instead of accessing the storage locally?  Or are you saying that you want to retain the cluster, but have the nodes of the cluster create SMB shares of their local storage to be used as the storage for VMs?  I'm not clear from your description exactly what you are trying to do.

    SMB 3.0 is neat stuff, no doubt about it.  For a cluster, though, what you need is a storage provider that offers SMB 3.0 in a highly available manner.  That can be provided by a typical storage vendor (I don't know where IBM's Storwize SAN is with this) that offers SMB in addition to iSCSI, FC, CIFS, or NFS.  I know other storage vendors plan on upgrading their CIFS implementations to SMB 3.0.  Most solutions like this have HA characteristics, so your SMB would be HA.  (If you are going to have a cluster so your VMs are HA, it only makes sense to ensure your storage is HA, too.)

    Another route would be to have Windows Server 2012 offer the SMB share itself.  I think this is what you are thinking.  You want that storage to be HA - able to continue offering services even though a single component is lost.  To do this, you need to create a 2012 cluster that is offering file services, which means you need a shared storage subsystem with dual power supplies.  One of the new features in 2012 is that you can have clustered PCI RAID controllers that reside in each 2012 node.  Previously, RAID controllers had to reside in the storage shelf, and those controllers had to provide multiple ports for access by the nodes.  The new PCI RAID controllers talk to each other and keep each other apprised of what is going on, so should one fail, the other can pick up where it left off.

    So, yes, it is possible to use 2012 as the host for providing HA SMB, but it will still require a shared storage component.  One of the really neat things with this solution is continuously available SMB file shares in the cluster - shares that keep serving clients through a node failure, a capability that was not available previously.


    tim

    Friday, July 20, 2012 12:29 PM
  • Tim, thanks for your reply.

    I guess my question is: should I go about designing a Server 2012 Hyper-V cluster based on LUNs and HBAs the same way I would design a 2008 R2 Hyper-V cluster? Or are there new components in Windows Server 2012 that would be reasons to move away from the regular CSV-based storage that is common on 2008 R2?

    I'm also intrigued by SMB 3.0, and down the road we'll probably hook onto some SMB storage. But for now, the scope of the design is strictly a 2-8 node Hyper-V cluster where all nodes are presented with LUNs through their HBAs.

    So, I'm not thinking of anything in particular, I just want to make sure I don't "miss out" on anything that could be used in our favour...
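
    (For reference, the "classic" CSV-based design I'm describing boils down to roughly this - a minimal sketch with placeholder node names, address and disk name:)

    # Validate and build the cluster from the two nodes that see the SAN LUNs
    Test-Cluster -Node "HV01","HV02"
    New-Cluster -Name "HVCLUSTER" -Node "HV01","HV02" -StaticAddress 10.0.0.50

    # Add the SAN LUNs as cluster disks and convert one to a Cluster Shared Volume;
    # VMs then live under C:\ClusterStorage\Volume1 on every node
    Get-ClusterAvailableDisk | Add-ClusterDisk
    Add-ClusterSharedVolume -Name "Cluster Disk 1"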

    Friday, July 20, 2012 12:37 PM
  • Migration is not happy if the Hyper-V servers are also SMB file servers.  It is not supported, and you cannot migrate to a Hyper-V server that is also the file server host.

    Beyond that, all the old cluster rules still apply, including HBAs and whatnot.  CSV is also greatly improved.

    Hosting the VHDs on an SMB share is simply a new option, and a great one for some situations, but not all.

    Considering that you now have storage migration, architect the way you are familiar with and then migrate the VM storage as your architecture changes.  It is now possible.
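
    For example, moving a running VM's files to a new location later is a single cmdlet (a sketch; the VM name and target path are placeholders):

    # Storage migration in 2012: relocate a VM's VHDs, configuration and snapshots
    # to another location (a CSV path or an SMB 3.0 share) while the VM keeps running
    Move-VMStorage -VMName "VM01" -DestinationStoragePath "\\FS01\VMStore\VM01"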


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

    Friday, July 20, 2012 3:26 PM
  • The degree of degradation totally depends on your design and infrastructure.

    While it is technically a degradation of the storage I/O path, if you design well you can end up with greater throughput.

    And whether an HA SMB share for VHDs is a good fit depends on your workload.  If you have high read/write I/O workloads, it is probably not a good fit.

    At the same time, clustering now uses SMB under the hood for node-to-node traffic.

    And don't forget LBFO teaming.  Create that big pipe.

    There are lots of different pieces that combine and align to make it all work as a system - far more individual pieces than we have had available in the past.


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

    Friday, July 20, 2012 3:30 PM
  • Thanks, good advice.

    As for NIC teaming, we're seeing some issues with clustered VM guests on 2012 Hyper-V hosts with teamed NICs, but that's a different story. Thanks for all the input, guys!

    Friday, July 20, 2012 5:31 PM
  • Yes.  There is LBFO at the Hyper-V / server level (the big pipe).

    And then there is teaming / LBFO at the VM level (redundancy) - there is a special setting required on the vNIC to support this.

    Combine that with SR-IOV and the configuration restrictions that make sense, and it gets complex.
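
    A rough sketch of those two settings, assuming a VM named GuestVM and a spare physical NIC (names are placeholders):

    # Allow the guest OS to team its vNICs (the special per-vNIC setting)
    Set-VMNetworkAdapter -VMName "GuestVM" -AllowTeaming On

    # SR-IOV has to be enabled when the virtual switch is created, on a physical NIC
    # (not on an LBFO team), and is then weighted on the vNIC
    New-VMSwitch -Name "IovSwitch" -NetAdapterName "NIC3" -EnableIov $true
    Set-VMNetworkAdapter -VMName "GuestVM" -IovWeight 50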


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

    Friday, July 20, 2012 5:37 PM
  • I'm actually talking about host-level NIC teaming on this one. I quickly wrote up what we saw on my blog here:

    http://hindenes.com/trondsworking/2012/07/16/windows-server-2012-hyper-v-clusters-network-teaming-converged-fabrics-and-vm-guest-clustering/

    Friday, July 20, 2012 5:49 PM
  • I have seen this issue of an Exchange DAG not migrating to other nodes as an HA VM mentioned in other places.

    I have no knowledge of the DAG clustering implementation, but from what I have heard and from my own experience I make the following comment:

    Personally, I don't believe it has anything to do with Hyper-V or LBFO but rather with the Exchange DAG clustering implementation itself.  It seems that the DAG cluster is the thing that breaks due to the migration / HA event, nothing else.

    I will forward a pointer to your post to that other forum and let's see if it gets some attention from the right places.  ;-)


    Brian Ehlert
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.

    Friday, July 20, 2012 5:58 PM
  • Thanks - I'm getting more blades next week, which means I can sit down and repro this. In the meantime: the minute we un-teamed our host NICs and instead used one dedicated NIC for the host OS and one dedicated NIC for the virtual switch, the problems went away. We're going off-topic here, but for the sake of contribution I'll keep this space updated with my findings.
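
    (What we ended up with is essentially this - a sketch with a placeholder adapter name:)

    # One physical NIC stays with the host OS for management; the virtual switch
    # gets its own dedicated NIC, not shared with the management OS
    New-VMSwitch -Name "VMSwitch" -NetAdapterName "NIC2" -AllowManagementOS $false
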
    Friday, July 20, 2012 6:10 PM
  • Where can someone find information on these "new PCI RAID controllers"?  Are there some specific SKUs that can be offered up?

    Wednesday, August 01, 2012 4:01 PM
  • Where can someone find information on these "new PCI RAID controllers"?  Are there some specific SKUs that can be offered up?

    Check out this whitepaper:

    http://www.lsi.com/downloads/Public/HA-DAS/docs/LSI_TB_HighAvailabilitySolutions.pdf

    Keep in mind there are other options besides clustered controllers: you can use a SAS or FC back end, or skip dedicated hardware and mirror DAS installed directly in the SoFS nodes. See this:

    http://www.starwindsoftware.com/sw-configuring-ha-shared-storage-on-scale-out-file-servers

    In a nutshell, you should end up with a configuration like this one (the CSV provider can be of any type):
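
    Roughly, in PowerShell terms (a sketch; the role, path, share and account names are placeholders):

    # On the file server cluster: add the Scale-Out File Server role
    Add-ClusterScaleOutFileServerRole -Name "SOFS"

    # Create a continuously available share for the VHDs on a CSV path; the Hyper-V
    # host computer accounts need both share and NTFS permissions
    New-SmbShare -Name "VMs" -Path "C:\ClusterStorage\Volume1\VMs" `
        -FullAccess 'CONTOSO\HV01$','CONTOSO\HV02$' -ContinuouslyAvailable $true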

    Hope this helped :)

    -nismo

    Wednesday, August 01, 2012 5:44 PM