SMB 3.0 – Working Hard In IT

Configuring Performance Options for Live Migration In Windows Server 2012 R2 Preview


New Options For Optimizing Live Migrations

In Windows Server 2012 R2 we have a whole range of options to optimize Live Migration for our environment and needs. Next to the new default (Compression) we can now also leverage SMB 3.0 (Multichannel, RDMA) for all forms of Live Migration, and not just for Shared Nothing Live Migration (see Shared Nothing Live Migration Leverages SMB 3.0 Under the Hood) or Storage Live Migration when both the source and the target are SMB 3.0 storage.

TCP/IP

Here you can use one NIC or a NIC team for bandwidth aggregation for live migration (see Teamed NIC Live Migrations Between Two Hosts In Windows Server 2012 Do Use All Members). This is the process you have known in Windows Server 2012. You can select multiple NICs or even teams of NICs, but only one of those (one NIC or one team) will be used. The other(s) will only be used when the first one is not available.

Compression

This option leverages spare CPU cycles to compress the memory contents of virtual machines being migrated; only then is it sent over the wire via a TCP/IP connection. This speeds up the live migration process. The process is CPU load aware, so it will only use idle cycles to protect the workloads on the hosts. This is the default setting in Hyper-V running on Windows Server 2012 R2 Preview.

SMB

This setting will leverage two SMB 3.0 features: Multichannel and, if supported by the NICs involved, RDMA (SMB Direct).

  • SMB Direct (RDMA) will be used when the network adapters of both the source and destination servers have Remote Direct Memory Access (RDMA) capabilities enabled.
  • SMB Multichannel will automatically detect and use multiple connections when a proper SMB Multichannel configuration is identified.

Where to set these options?

In Hyper-V Manager go to “Hyper-V Settings” in the Actions pane.

Expand the Live Migrations node under Server in the left pane (click the “+”) and select “Advanced Features”.

Select the desired option under “Performance Options”.

Happy testing!

 

EDIT: Aidan Finn posted the PowerShell commands to configure the performance options in Configuring WS2012 R2 Hyper-V Live Migration Performance Options Using PowerShell. The MVP community at work & it rocks!
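
For reference, a minimal PowerShell sketch (assuming the in-box Hyper-V module on Windows Server 2012 R2, run per host) of what setting the performance option looks like:

# Check the current live migration performance option on this host
Get-VMHost | Select-Object VirtualMachineMigrationPerformanceOption
# Set it to SMB (other accepted values are Compression and TCPIP)
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB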


Preventing Live Migration Over SMB Starving CSV Traffic in Windows Server 2012 R2 with Set-SmbBandwidthLimit


One of the big changes in Windows Server 2012 R2 is that all types of Live Migration can now leverage SMB 3.0 if the right conditions are met. That means that Multichannel & SMB Direct (RDMA) come into play more often and simultaneously. Shared Nothing Live Migration & certain forms of Storage Live Migration are often a lot more planned due to their nature, so one can mitigate the risk by planning. Good old standard Live Migration of virtual machines, however, is often less planned. It can be done via Cluster Aware Updating, to evacuate a host for hardware maintenance, or via Dynamic Optimization. This means it’s often automated as well. As we have demonstrated many times, Live Migration can (easily) fill 20Gbps of bandwidth. If you are sharing 2*10Gbps NICs for multiple purposes like CSV, LM, etc., Quality of Service (QoS) comes into play. There are many ways to achieve this, but in our example here I’ll be using DCB for SMB Direct with RoCE.

# Classify SMB Direct (port 445) traffic and tag it with priority 4
New-NetQosPolicy "CSV" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 4
# Enable Priority Flow Control (lossless behavior) for priority 4
Enable-NetQosFlowControl -Priority 4
# Reserve 40% of the bandwidth for priority 4 via ETS
New-NetQosTrafficClass "CSV" -Priority 4 -Algorithm ETS -Bandwidth 40
# Apply DCB/QoS on both RDMA-capable NICs
Enable-NetAdapterQos -InterfaceAlias SLOT41-CSV1+LM2
Enable-NetAdapterQos -InterfaceAlias SLOT42-LM1+CSV2
# The hosts dictate the DCB settings, they don't accept them from the switch
Set-NetQosDcbxSetting -Willing $False

Now as you can see I leverage 2*10Gbps NICs, non-teamed as I want RDMA. I have failover/redundancy/bandwidth aggregation thanks to SMB 3.0. This works like a charm. But when leveraging Live Migration over SMB in Windows Server 2012 R2 we note that the LM traffic also goes over port 445 and as such is dealt with by the same QoS policy on the server & in the switches (DCB/PFC/ETS). So when both CSV & LM traffic are flowing, how does one prevent LM from starving CSV traffic, for example? Especially in Scale-Out File Server scenarios this could be a real issue.

The Solution

To prevent LM traffic from hogging all the SMB bandwidth, starving CSV traffic and ruining the SOFS party, Microsoft introduced some new capabilities in Windows Server 2012 R2. In the SMBShare module you’ll find:

  • Set-SmbBandwidthLimit
  • Get-SmbBandwidthLimit
  • Remove-SmbBandwidthLimit

image

To use this you’ll need to install the Feature called SMB Bandwidth Limit via Server Manager or using PowerShell:  Add-WindowsFeature FS-SMBBW

You can limit SMB bandwidth per category: VirtualMachine (storage IO to a SOFS), LiveMigration & Default (all the rest). In the example below we set it to a maximum of 1000 MB/s, i.e. 8Gbps.

Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1000MB
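
As a quick companion sketch (the category name is the one used above), you can verify and later remove the cap with the other cmdlets from the list:

# Show the configured SMB bandwidth limits per category
Get-SmbBandwidthLimit
# Remove the live migration cap again when it is no longer needed
Remove-SmbBandwidthLimit -Category LiveMigration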

So there you go, we can prevent Live Migration from hogging all the bandwidth. Jose Baretto mentions this capability in his recent blog post on Windows Server 2012 R2 Storage: Step-by-step with Storage Spaces, SMB Scale-Out and Shared VHDX (Virtual). But what about Fibre Channel or iSCSI environments? It might not be the total killer there that it is in the SOFS scenario, but still. As it turns out, Set-SmbBandwidthLimit also works in those scenarios. I was put on the wrong track by thinking it was only for SOFS scenarios, but my fellow MVP Carsten Rachfahl kindly reminded me of my own mantra “Trust but verify” and as a result I can confirm it even works to cap off Live Migration traffic over SMB that leverages RDMA (RoCE). So don’t let the PowerShell module name (SMBShare) fool you, it’s about all SMB traffic within the categories.

So without a limit LM can use all the bandwidth (2*10Gbps).

image

With Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1250MB you can see we max out at 10Gbps (2*5Gbps).
image

Some Remarks

I’d love to see a minimum bandwidth implementation of this (that could include a safety buffer for spikes in CSV traffic with SOFS). The hard cap limit might lead to some wasted bandwidth. In other scenarios you could still get into trouble. What if you have 2*10Gbps available but one of those dies on you and you capped Live Migration traffic at 16Gbps? With one NIC gone you’re potentially in trouble until the NIC has been replaced. OK, this is not a daily occurrence & depending on your environment & setup this is more or less of a potential issue.

Don’t Let Live Migration Settings Behavior Confuse You


Configuring live migration settings on a cluster

In the cluster, under Networks, Live Migration Settings, you can select what networks are available for live migration and their order of preference.

image

Basically this setting determines what NICs can be used and in what order of preference the available networks will be used by live migration. It does not determine bandwidth aggregation or failover. All it does is provide the order in which the redundant networks will be used. It’s up to the cluster service (NETFT), SMB Multichannel or SMB Direct to provide the bandwidth aggregation if possible. As you can see we use LM/CSV over SMB and, as our two NICs are RDMA capable 10Gbps NICs, Multichannel will discover the RDMA capabilities & leverage SMB Direct if it can be established, otherwise it will just stick with Multichannel. If you team the NICs, that team shows up as just one network. Also note that if you lose a NIC during live migration it might fail for some VMs under certain scenarios, but your cluster nodes will maintain the capability & recover. The names of the networks reflect this: LM1+CSV2 & CSV1+LM2 will both be used, but the names also reflect the metrics of these networks in case Multichannel goes completely south. The lowest is CSV1+LM2 and the second lowest is LM1+CSV2, which is the order in which NETFT will automatically select them based on the metrics. I.e. it’s “self documentation” for human consumption.

image

Sometimes you might get surprising results. An example: you’ve selected SMB for live migration and you have selected only one of the NICs here. Still, when you look at Performance Monitor you might see both being used! What’s happening is that Multichannel will kick in, use two or more similar NICs when it finds them and, if applicable, move to RDMA.

image

So here we select SMB as the performance option for live migration and, with two equally capable 10Gbps NICs available for live migration, it will use both of them, even if you selected only one of them in the cluster network settings for live migration.

Why is that? Well, there is still another location where live migration settings are defined and that is in Hyper-V Manager. Take a look at the screenshot below.

image

The incoming live migration setting is set to “Use any available network for live migration”. If you have this on, it will still leverage both NICs: when one is used, Multichannel drags the other one into action anyway, no matter what you set in the network settings for live migration in the cluster GUI (there it is set to use only one and dimmed out).

Do note that in Hyper-V Manager the settings for live migration specify “Incoming live migrations”. That leads us to believe that it’s the target, the node the VMs are migrated to, that determines what NICs get used. But let’s put this to the test.
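
As a side note, here’s a minimal PowerShell sketch (assuming the in-box Hyper-V module) of how to inspect what each node is configured to do before running the test:

# Does this host accept any network for incoming live migrations, and which performance option is set?
Get-VMHost | Select-Object VirtualMachineMigrationEnabled, UseAnyNetworkForMigration, VirtualMachineMigrationPerformanceOption
# List the networks/IP addresses explicitly allowed for live migration on this host
Get-VMMigrationNetwork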

Testing A Hyper-V cluster with two nodes – Node A and Node B

On the cluster network settings you select only one network.

image

On Hyper-V cluster Node A you have configured live migration via Hyper-V Manager to “Use these IP addresses for live migration”. You cannot add or remove networks here; the networks used are defined by the cluster.

image

On Hyper-V cluster Node B you have configured live migration via Hyper-V Manager to “Use any available network for live migration”.

As we now kick off a live migration from Node A to Node B we’ll see both NICs being used. Why? Well, because Node B is the target and Node B has the setting “Use any available network for live migration”. So why then only these 2, why not pick up any other suitable NICs? Well, we’ve configured live migration on both nodes to use SMB. As this cluster is RDMA capable that means it will leverage Multichannel/SMB Direct. The auto configuration will select the best, equally capable NICs for this and that’s these two in our scenario. Remember, the capabilities of the NICs have to match. So no mixtures of 1*1Gbps and 1*10Gbps, or 1*Multichannel and 1*SMB Direct.

The confusion really sets in when we live migrate from Node B to Node A: it also uses both NICs! Hmm, so “Incoming live migrations” is not always “correct” it seems, at least not when using SMB as a performance option. Apparently Multichannel will kick in in both directions.

If you set both Node A and Node B to “Use these IP addresses for live migration” and leave the cluster network setting with only one network, it does only use one, even with SMB as a performance option for live migration. This is expected.

Note: I had one interesting hiccup while testing this configuration: when doing the latter, one of the VMs failed during live migration of the entire host. I ran it again and that one VM still used both networks. All the others went well during the host migration with just one network being used. That was a bit of a “huh?” moment and it sure tripped me up & kept me busy for a while. I blame RDMA and the position of the planets & constellations.

Things aren’t always what they seem at first and it’s good to keep that in mind. The moment you think you’ve got it figured out, you’re wrong. So look again & investigate.

Unable to retrieve all data needed to run the wizard. Error details: "Cannot retrieve information from server "Node A". Error occurred during enumeration of SMB shares: The WinRM protocol operation failed due to the following error: The WinRM client sent a request to an HTTP server and got a response saying the requested HTTP URL was not available. This is usually returned by a HTTP server that does not support the WS-Management protocol.


I was recently configuring a Windows Server 2012 file server cluster to provide SMB transparent failover with continuously available file shares for end users. So, we’re not talking about a Scale-Out File Server here.

All seemed to go pretty smoothly until we hit a problem. When the role is running on Node A and you are using the GUI on Node A, this is what you see:

image

When you try to add a share you get this

"Unable to retrieve all data needed to run the wizard. Error details: "Cannot retrieve information from server "Node A". Error occurred during enumeration of SMB shares: The WinRM protocol operation failed due to the following error: The WinRM client sent a request to an HTTP server and got a response saying the requested HTTP URL was not available. This is usually returned by a HTTP server that does not support the WS-Management protocol.”

image

When you fail over the file server role to the other node, things seem to work just fine. So this is when you run the GUI from Node A while the file server role resides on Node B.

image

You can add a share, it all works. You notice the exact same behavior on the other node. So as long as the role is running on a node other than the one on which you use Failover Cluster Manager you’re fine. Once you’re on the same node you run into this issue. So what’s going on?

So what to do? It’s related to WinRM so let’s investigate that.

image

So the WinRM config comes via a GPO. The local GPO for this is not configured, so that’s not the one, it must come from the domain. The IP addresses listed are the node IP and the two cluster networks. What’s not there is localhost 127.0.0.1, the cluster IP address or any of the IPv6 addresses.
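
A minimal sketch of how you can check this yourself with the standard WinRM tooling (output will differ per environment):

# List the WinRM listeners and the addresses they are bound to
winrm enumerate winrm/config/listener
# Inspect the IPv4/IPv6 filters that the (local or domain) GPO pushes down to the WinRM service
Get-ChildItem WSMan:\localhost\Service | Where-Object { $_.Name -like "IPv*Filter" }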

I experimented with a lot of settings. First we ended up creating an OU, inside the OU where the cluster nodes reside, on which we blocked inheritance. We then ran gpupdate /target:computer /force on both nodes to make sure WinRM was no longer configured by the domain GPO. As the local GPO was not configured it reverted back to the defaults. The listener now shows up as listening on all IPv4 and IPv6 addresses. Nice, but the GPO was now effectively disabled.

image

This is interesting, but things still don’t work. For that we needed to disable/enable WinRM:

Configure-SMRemoting -disable
Configure-SMRemoting -enable

or via server manager

image

That fixed it, and it seems to be a necessity. Do note that to disable/enable remote management it should not be configured via a GPO, or it throws an error like

image

or

image

Some more testing

We experimented by adding 127.0.0.0-172.0.0.1 and enabling the GPO again. We then saw the listener did show the localhost, cluster & file role IP addresses, but the issue was back. Using * in just IPv4 did not do the trick either.

image

What did the trick was to use * in the filter for IPv6 and keep our original filters on IPv4. The good news is that, having removed the GPO and disabled/enabled WinRM, the cluster IP address & file server role IP address are now in the list. That could be good for other use cases.

This is not ideal, but it all works now.

What we settled for

So we ended up still restricting the GPO settings for IPv4 to subnet ranges and allowing * for IPv6. This made sure that even when we run the Failover Cluster Manager GUI from the node that owns the file server role, everything still works.

One workaround is to work from a remote host, not from a cluster member, which is a good practice anyway.

The key takeaway is that when Microsoft says they test with IPv6 enabled they literally mean for everything.

Note

There is a TechNet article on WinRM GPO settings for SCVMM 2012 RC where they advise setting both IPv4 and IPv6 to * to avoid issues with SCVMM operations: How to Add Trusted Hyper-V Hosts and Host Clusters in VMM.

However, we found that IPv6 is the key requirement here; * for just IPv4 alone did not work.

Live Migration over SMB Direct leaves more CPU cycles for Virtual RSS (vRSS) in Windows Server 2012 R2


I recently (January 22nd 2014) gave a WebCast presentation for the Dutch Windows Management User Group (@WMUG_NL) in which I made the case for using SMB Direct with Live Migration to save CPU cycles for other (VM) workloads. There are several areas where the CPU cycles are better spent, but I used vRSS to showcase one scenario.

We’re using a 2 node Windows Server 2012 R2 Hyper-V cluster on Dell PowerEdge R720 servers with Mellanox ConnectX-3 (CSV  &  live migration) and Intel X520-DA (Hyper-V switch), all 10Gbps.

This is what a CPU bottleneck looks like that can be solved by using vRSS in Windows Server 2012 R2.

The host machines are Hyper-Threading enabled. The virtual switch is attached to a switch independent NIC team in dynamic mode. In this setup it’s normal that the sending VM is leveraging both members while the receiving VM traffic is coming in over one member of the host team.

Now let’s enable vRSS in the VM and see what this does to the picture.

Pretty impressive, isn’t it? DidierTest03 is the sending VM running on host A and DidierTest04 is the receiving VM that has vRSS enabled and is running on host B. For vRSS you need the hosts to run Windows Server 2012 R2 and the VMs to run Windows Server 2012 R2 or Windows 8.1. You can see the load is spread across 7 vCPUs in the VM. DidierTest04 has 8 vCPUs. I configured vRSS in the VM to be able to use 7 vCPUs and leave vCPU 0, the default one, alone to handle its other workloads.
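
As a rough sketch of what that in-guest vRSS configuration could look like (the adapter name and processor numbers are examples, not necessarily the exact values used here):

# Inside the VM: enable RSS on the virtual network adapter
Enable-NetAdapterRss -Name "Ethernet"
# Keep vCPU 0 free for the default workloads and spread RSS over the remaining 7 vCPUs
Set-NetAdapterRss -Name "Ethernet" -BaseProcessorNumber 1 -MaxProcessors 7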

image

Given multiple logical CPUs & vCPUs we can get 10Gbps line speed inside a virtual machine. This, ladies and gentlemen, is a thing of beauty.

Now tell me, if you have business related needs for those CPU cycles, why would you not offload the work that needs to be done for live migration to the NIC via SMB Direct? This is about getting maximum VM density, performance & ROI from your infrastructure, whilst saving on servers, power and cooling. When you see the smile on your client’s or boss’s face, just say “you’re welcome” and smile back.

RDMA Over RoCE With DCB Requires Tagged Non Default VLANs


It’s DCB That Requires This

For those of you who are experimenting with the RoCE variant of RDMA for SMB Direct in Windows Server 2012 (R2), make sure you have a VLAN tag in your configuration if this is more than a simple RDMA test over two NICs. The moment you get DCB with PFC & ETS involved you’ll need non-default tagged VLANs. Do note that PFC alone is good enough; ETS is strictly speaking not a requirement, but I’d consider doing it if you can.

With Enhanced Transmission Selection (ETS) the network traffic type is classified using the priority value in the VLAN tag of the Ethernet frame. The priority value is the Priority Code Point (PCP), which is described in the IEEE 802.1Q specification and uses a 3-bit field in the VLAN tag with eight possible priority values (0 to 7).

Priority-based Flow Control (PFC) allows priorities of tagged traffic to be paused individually and helps to provide lossless or “no drop” behavior for a certain priority at the receiving port. As above, each frame transmitted by a sending port is tagged with a priority value (0 to 7) in the VLAN tag. So for the traffic pause and resume functionality to work, we need a VLAN tag to carry the priority value.
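
As a minimal sketch (the adapter names and VLAN ID are just examples for illustration), tagging the RoCE NICs with a non-default VLAN can be done like this:

# Tag the RDMA-capable NICs with a non-default VLAN so the priority (PCP) bits have a VLAN header to travel in
Set-NetAdapter -Name "SLOT 4 Port 1" -VlanID 110
Set-NetAdapter -Name "SLOT 4 Port 2" -VlanID 110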

Does It Work Without?

But you’ll tell me that, as you may be lacking a DCB capable switch for lab purposes, you used a direct cable between your two RoCE NICs. And guess what, RoCE might indeed have worked for you without a VLAN tag. You can test & get a feel for what RoCE/RDMA can do for you with just the NICs. But as there is no switch involved you’re not using DCB for PFC/ETS, and without that the need for the tagged VLAN isn’t there. Also see https://workinghardinit.wordpress.com/2013/05/03/smb-direct-roce-does-not-work-without-dcbpfc/.

So there you go. Design your RoCE/RDMA network based on DCB with PFC (and ETS) and not just on tests with a direct cable, or you might miss a few details that are quite important. Happy testing!

I Can’t Afford 10GBps For Hyper-V And Other Lies


You’re wrong

There, I said it. Sure you can. Don’t think you need to be a big data center to make this happen. You just need to think and work outside the box a bit, and when you’re not a large enterprise that’s a bit easier to do. Don’t do it like a big name brand, traditionalist partner would do it (strip & refit the entire structural cabling in the server room, high end gear with big margins everywhere). You’re going for maximum results & value, not sales margins and bonuses.

I would even say you can’t afford to stay on 1Gbps much longer or you’ll be dealing with the fallout of being stuck in the past. Really, some of us are already looking at >10Gbps connections to the servers. You need to move from 1Gbps or you’ll be micro managing a way around issues, sucking all the fun out of your work with ever diminishing results and rising costs for both you and the business.

Give your Windows Server 2012 R2 Hyper-V environment the bandwidth it needs to shine and make the company some money. If all you want to do is to spend as little money as possible, I’m not quite sure what your goal is. Either you need it or you don’t. I’m convinced we need it. So we must get it. Do what it takes. Let me show you one way to get what you need.

Sounds great what do I do?

Take heart, be brave and of good courage! Combine it with skills, knowledge & experience to deliver a 10Gbps infrastructure as part of ongoing maintenance & projects. I just have to emphasize that some skills are indeed needed, pure guts alone won’t do it.

First of all you need to realize that you do not need to rip and replace your existing network infrastructure. That’s very hard to get approval for, takes too much time and rapidly becomes very expensive in both dollars and effort. Also, to be honest, quite often you don’t have that kind of pull. I for one certainly do not. And if I’d try to do it that way it would take way too many meetings, diplomacy, politics, ITIL, ITML & Change Approval Board actions to make it happen. This adds to the cost even more, both in time and money. So leave what you have in place; for this exercise we assume it’s working fine, but you can’t afford to wait for many hours while a host drains in a 6 node cluster and you need to drain all of them to add memory. So we have a need (OK, you’ll need a better business case than this, but don’t make too big a deal of it or you’ll draw unwanted attention) and we’ve taken away the fear factor of forklift replacing the existing network, which is a big risk & cost.

So how do I go about it?

Start out as part of regular upgrades, replacements or new deployments. The money is there for those projects. Make sure to add some networking budget and leverage those projects’ needs to justify the networking investment.

Get a starter budget for a POC of some sort; it will get you started acquiring some of the essential missing bits.

Buy reasonably cheap switches of reasonable port count that do all you need. If they’re readily available in a framework contract, great. You can get them as part of the normal procedures. But if you want to knock another 6% to 8% off the cost, order them directly from the vendor. Cut out the middle man.

Buy some gear as part of your normal refresh cycle. Adapt that cycle lifetime a bit to suit your needs where possible. Funding for operational maintenance & replacement should already be in place, right?

Negotiate hard with your vendor. Listen, just like in the storage world, the network world has arrived at a point where they’re not going to be making tons of money just because they are essential. They have lots of competition and it’s only increasing. There are deals to be made, and if you choose the right hardware it’s gear that won’t lock you into proprietary cabling, SFP+ modules and such. Or not too much anyway.

Design options and choices

Small but effective

If you’re really on a minimal budget just introduce redundant (independent) stand alone 10Gbps switches for the East-West traffic that only runs between the nodes in the data center: CSV, live migration, backup. You don’t even need to hook them up to the network for data traffic, you only need to be able to remotely manage them, and that’s what they invented Out Of Band (OOB) ports for. See also an old post of mine Introducing 10Gbps With A Dedicated CSV & Live Migration Network (Part 2/4). In the smallest, cheapest scenario I use just 2 independent switches. In the other scenario you build a 2 node spine and the leaf layer. In my examples I use DELL network gear. But use whatever works best for your needs and your environment. Just don’t go the “nobody ever got fired for buying XXX” route, that’s fear, not courage! Use cheaper NetGear switches if that fits your needs. Your call, see my recent blog post on this: 10Gbps Cheap & Without Risk In Even The Smallest Environments.

Medium sized excellence

First of all a disclaimer: medium sized isn’t a standardized way of measuring businesses and their IT needs. There will be large differences depending on your neck of the woods.

Build your 10Gbps infrastructure the way you want it and aim it to grow to where it might evolve. Keep it simple and shallow. Go wide where you need to. Use the spine/leaf design as a basis, even if what you’re building is smaller than what it’s normally used for. Borrow the concept. All 10Gbps traffic will be moving within that spine/leaf setup. Only client-server traffic will be going outside of it and it’s a small part of all traffic. This is how you get VM mobility and great network speeds in the server room while avoiding the existing core becoming a bandwidth bottleneck.

You might even consider doing Infiniband where the cost/Gbps is very attractive and it will serve you well for a long time. But it can be a hard sell as it’s “another technology”.

Don’t panic, you don’t need to buy a bunch of Nexus 7000’s or Force10 Z9000’s to do this in your moderately sized server room. In a medium sized environment I try to follow the spine/leaf concept even if it’s not true ECMP/CLOS, it’s the principle. For the spine choose the switches that fit your size, environment & growth. I’ve used the Force10 S4810 with great success and you can negotiate hard on the price. The reasons I went for the higher priced Force10 S4810 are:

  • It’s the spine so I need best performance in that layer so that’s where I spend my money.
  • I wanted VLT, stacking is a big no no here. With VLT I can do firmware upgrades without down time.
  • It scales out reasonably by leveraging eVLT if ever needed.

For the ToR switches I normally go with the PowerConnect 81XX F series or the N40XXF series, which is the current model. These provide great value for money and I can negotiate hard on price here while still getting 10Gbps with the features I need. I don’t need VLT as we do switch independent NIC teaming with Windows. That gives me the best scalability with DVMQ & vRSS and allows for firmware upgrades without any network downtime in the rack. I do sacrifice true redundant LACP within the rack, but for the few times I might really need that I could go cross racks & still maintain a rack as a failure domain as the ToRs are redundant. I avoid stacking, it’s a single point of failure during firmware upgrades and I don’t like that. Sure, I could leverage the rack as a failure domain to work around that, but that’s not very practical for ordinary routine maintenance. The N40XXF also gives me the DCB capabilities I need for SMB Direct.

Hook it up to the normal core switch of the existing network for just the client/server (North-South) traffic. I make sure that any VLANs used for CSV or live migration can’t even reach that part of the network. Even data traffic (between virtual machines, physical servers) goes East-West within your spine/leaf and never goes out anyway, unless you did something really weird and bad.

As said, you can scale out VLT using eVLT, which creates a port channel between 2 VLT domains. That’s nice. So in a medium sized business you’re pretty safe for growth. If you grow beyond this, we’ll be talking about a way larger deployment anyway and true ECMP/CLOS, and that’s not the scale I’m dealing with here. For most medium sized businesses or small ones with bigger needs this will do the job. An ECMP/CLOS spine/leaf actually requires layer 3 in the design and, as you might have noticed, I kind of avoid that. Again, the aim is to get to a good solution today instead of a really good solution next year, which won’t happen because really good is risky and expensive. Words they don’t like to hear above your pay grade.

The picture below is just for illustration of the concept. Basically I normally have only one VLT domain and two 10Gbps switches per rack. This gives me racks as failure domains and it allows me to forgo a lot of extra structural cabling work to neatly provide connectivity from the switches to the server racks.

You now have a scalable, capable & affordable 10Gbps or better infrastructure that will run any workload in style. After testing you simply start new deployments in the spine/leaf and slowly move over existing workloads. If you do all this as part of upgrades it won’t cause any downtime due to the network being renewed, just by upgrading or replacing current workloads.

The layer 3 core in the picture above is the uplink to your existing network and you don’t touch that. Just let it run until there’s nothing left in there and you can clean it up or take it out. Easy transition. The core can be left in place or replaced when needed due to age or capabilities.

To keep things extra affordable

While today the issues with (structural) 10Gbps copper CAT6A and NICs/switches seem solved, when I started doing 10Gbps, fibre cabling or Copper Twinax Direct Attach was the only way to go. 10GBaseT wasn’t an option yet and I still love the flexibility of fibre; it consumes less space and weighs less than CAT6A. Fibre also fits easily in existing cable infrastructure. Less hassle. But CAT6A will work fine today, no worries.

If you decide to do fibre, buy OM3. You can get decent, affordable cabling online. Order it as consumable supplies.

Spend some time on the internet and find the SFP+ modules that work with your switches to save a significant amount of money. Yup, some vendor switches work with compatible non OEM branded SFP+ modules. Order them as consumable supplies, but buy some first to TEST! Save money but do it smart, don’t be silly.

For patch cabling, 10Gbps Copper Twinax Direct Attach works great for short ranges and isn’t expensive, but the length is limited and the cables get thicker, more sturdy and thus unwieldy with length. It does have its place and I use them where appropriate.

Isn’t this dangerous?

Nope. Technology wise it’s perfectly sound and nothing new. Project wise it delivers results fast, effectively and without breaking the bank. Functionally you now have all the bandwidth you need to stop worrying and micromanaging stuff to work around those pesky bandwidth issues and focus on better ways of doing things. You’ve given yourself options & possibilities. Yay!

Perhaps you’ll say the approach to achieve this isn’t very conventional. I disagree. Look, anyone who’s been running projects & delivering results knows the world isn’t that black and white. We’ve been doing 10Gbps this way for 4 years now and with (repeated) great success, while others have to wait for the 1Gbps structural cabling to be replaced some day in the future … probably by 10Gbps copper in a 100Gbps world by the time it happens. You have to get the job done. Do you want results, improvements, progress and success, or just to avoid risk and cover your ass? Well then, choose & just make it happen. Remember, the business demands everything at the speed of light, delivered yesterday at no cost with 99.999% uptime. So this approach is what they want, albeit perhaps not what they say.

SMB 3, ODX, Windows Server 2012 R2 & Windows 8.1 perform magic in file sharing for both corporate & branch offices


SMB 3 for Transparent Failover File Shares

SMB 3 gives us lots of goodies and one of them is Transparent Failover which allows us to make file shares continuously available on a cluster. I have talked about this before in Transparent Failover & Node Fault Tolerance With SMB 2.2 Tested (yes, that was with the developer preview bits after BUILD 2011, I was hooked fast and early) and here Continuously Available File Shares Don’t Support Short File Names – "The request is not supported" & “CA failure – Failed to set continuously available property on a new or existing file share as Resume Key filter is not started.”

image

This is an awesome capability to have. This also made me decide to deploy Windows 8, and now 8.1, as the default client OS. The fact that maintenance (it’s the Resume Key filter that makes this possible) can now happen during day time and patches can be done via Cluster Aware Updating is such a win-win for everyone it’s a no brainer. Just do it. Even better, it’s continuous availability thanks to the Witness service!

When the node running the file share crashes, the clients will experience a somewhat long delay in responsiveness, but after about 10 seconds they continue where they left off once the role has resumed on the other node. Awesome! Learn more about this here: Continuously Available File Server: Under the Hood and SMB Transparent Failover – making file shares continuously available.

Windows Clients also benefits from ODX

But there is more: SMB 3 & ODX bring us even more goodness, the offloading of reads & writes to the SAN, saving CPU cycles and bandwidth. Especially in the case of branch offices this rocks. SMB 3 clients who copy data between file shares on Windows Server 2012 (R2) that has its storage on an ODX capable SAN get the benefit that the transfer request is translated to ODX by the server, which gets a token that represents the data. This token is used by Windows to do the copying and is delivered to the storage array, which internally does all the heavy lifting and tells the client the job is done. No more reading data from disk, translating it into TCP/IP, moving it across the wire to reassemble it on the other side and write it to disk.

image

To make ODX happen we need a decent SAN that supports this well. A DELL Compellent shines here. Next to that you can’t have any filter drivers on the volumes that don’t support offloaded read and write. This means that we need to make sure that features like data deduplication support this, but also that 3rd party vendors for anti-virus and backup don’t ruin the party.

image

In the screenshot above you can see that Windows data deduplication supports ODX. And if you run antivirus on the host you have to make sure that the filter driver supports ODX. In our case McAfee Enterprise does. So we’re good. Do make sure to exclude the cluster related folders & subfolders from on-access scans and scheduled scans.
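
A quick way to check this yourself is sketched below (based on the general ODX guidance; the volume letter is an example and the filter driver name is a placeholder you need to replace):

# List the file system filter drivers attached to a volume
fltmc instances -v T:
# A filter driver advertises ODX support via its SupportedFeatures registry value (3 = offloaded read & write supported)
Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\<FilterDriverName>" -Name "SupportedFeatures"
# Verify ODX itself is enabled on the host (0 = enabled)
Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem" -Name "FilterSupportedFeaturesMode"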

Do not run DFS Namespace servers on the cluster nodes. The DfsDriver does not support ODX!

image

The solution is easy, run your DFS Namespaces servers separate from your cluster hosts, somewhere else. That’s not a show stopper.

The user experience

What does it look like to a user? Totally normal, except for the speed at which the file copies happen.

Here’s me copying an ISO file from a file share on server A to a file share on server B from my Windows 8.1 workstation at the branch office in another city, 65 KM away from our data center and connected via a 200Mbps pipe (MPLS).

image

On average we get about 300 MB/s or 2.4 Gbps, which “over” a 200Mbps WAN is a kind of magic. I assure you that they’re not complaining and get used to this quite (too) fast.

The IT Pro experience

Leveraging SMB 3 and ODX means we avoid that people consume tons of bandwidth over the WAN and make copying large data sets a lot faster. On top of that the CPU cycles and bandwidth on the server are conserved for other needs as well. All this while we can failover the cluster nodes without our business users being impacted. Continuous to high availability, speed, less bandwidth & CPU cycles needed. What’s not to like?

Pretty cool huh! These improvements help out a lot and we’ve paid for them via software assurance so why not leverage them? Light up your IT infrastructure and make it shine.

What’s stopping you?

So what are your plans to leverage your software assurance benefits? What’s stopping you? When I asked that I got a couple of answers:

  • I don’t have money for new hardware. Well, my SAN is also pre-Windows 2012 (DELL Compellent SC40 controllers). I just chose based on my own research, not on what VARs like to sell to get maximal kickbacks. The servers I used are almost 4 years old but fully up to date DELL PowerEdge R710’s, recuperated from their duty as Hyper-V hosts. These servers easily last us 6 years and over time we collected some spare servers for parts or replacement after the support expires. DELL doesn’t take away your access to firmware & drivers like some do and their servers aren’t artificially crippled in feature set.
  • Skills? Study, learn, test! I mean it, no excuse!
  • Bad support from ISVs and OEMs for recent Windows versions holding you back? Buy other brands, vote with your money and do not accept their excuses. You pay them to deliver.

As IT professionals we must and can deliver. This is only possible as the result of sustained effort & planning. All the labs, testing and studying help out when I’m designing and deploying solutions. As I take the entire stack into account in designs and we do our due diligence, I know it will work. Being active in the community also helps me know early on which vendors & products have issues, so we can avoid the “marchitecture” solutions that don’t deliver when deployed. You can achieve this as well, you just have to make it happen. That’s not too expensive or time consuming, at least a lot less than being stuck after you’ve spent your money.


Migrate an old file server to a transparent failover file server with continuous availability


This is not a step by step “How to”, but we’ll address some things you need to do and the tips and tricks that might make things a bit smoother for you.

1) Disable Short file names & Strip existing old file names

Never mind that this is needed to be able to do continuous availability on a file share cluster; you should have done this a long time ago. For one, it enhances performance significantly. It also makes sure that no crappy apps that require short file names to function can be introduced into the environment. While I’m an advocate for mutual agreements, there are many cases where you need to protect the users and the business against themselves. Being too much of a politician as a technologist can be very bad for the company, as it allows bad workarounds and technology debt to be introduced. Stand tall!

Read up on this here: Windows Server 2012 File Server Tip: Disable 8.3 Naming (and strip those short names too). Next to Jose’s great blog, read Fsutil 8dot3name on how to do this.

If you still have applications that depend on short file names you need to isolate and virtualize them immediately. I feel sorry for you that this situation exists in your environment and I hope you get the necessary means to deal with it swiftly and decisively by getting rid of these applications. Please see The Zombie ISV® to be reminded why.

Some tips:

  • Only use the /F switch if it’s a non system disk and you can afford to do so, as you’re moving the data LUN to a new server anyway. Otherwise you might run into issues. See the example below.
  • If you stumble on paths that are too long, intervene. Talk to the owners. We got people to reduce “Human Resources Planning And Evaluations” sub folder & file names to HRMPlanEval. You get the gist, trim them down.
  • You’ll have great success on most files & folders, but not if they are open. Schedule a maintenance window to make sure you can run without anyone connected to the shares (stop the LanManServer service during that maintenance window).
  • Also verify no other processes are locking any files or folders (anti virus, backups, sync tools etc.)
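
To round off this step, here’s a minimal sketch of the fsutil commands involved (the drive letter and log paths are just examples):

# Disable 8.3 short name creation on the data volume
fsutil 8dot3name set L: 1
# Scan for existing short names first, then strip them (add /f only when you can afford to, see the tip above)
fsutil 8dot3name scan /s /l C:\Temp\8dot3scan.txt L:
fsutil 8dot3name strip /s /l C:\Temp\8dot3strip.txt L: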

2) Convert MBR disks to GPT if you can

With ever growing amounts of data to store and protect this makes sense. I’m not saying you need to start doing 64TB disks today, but making sure you can grow beyond 2TB is smart. It doesn’t cost anything when you start out with GPT disks. If you have older LUNs you might want to use the migration as an opportunity to convert MBR LUNs to GPT. That means copying the data and all NTFS permissions.

Please see  NTFS Permissions On A File Server From Hell Saved By SetACL.exe & SetACL Studio for some tools that might help you out when you run into NTFS/ACL permissions and for parsing logs during this operation.

Here’s a useful Robocopy command to start out with:

ROBOCOPY L:\ V:\ /MIR /SEC /Z /R:2 /W:1 /V /TS /FP /NP /XA:SH /MT:16 /XD "System Volume Information" *RECYCLE* /LOG:"D:\RoboCopyLogs\MBR2GPTLUNL2V.txt"

3) Dump the existing shares on the old file server into a text file for documentation and use on the new file server

Pre Windows Server 2012 the new SMB cmdlets don’t work, but no fear, we have some other tools to use. Using NET SHARE does work, and with it you can also show the hidden and system shares, but the layout is a bit of a mess. I prefer to use:

Get-WmiObject -Class Win32_Share > C:\temp\OldFileServerShares

It’s fast, complete and the layout is very user friendly, which is what I need for easy use with PowerShell on the W2K12R2 file server. Some of you might say: what about the share security settings? 1) We’re going to cluster, so exporting these from the registry doesn’t work, and 2) you should have kept this plain vanilla and done security via the NTFS permissions on the folder structure only. But hey, I’m a nice guy, so here’s a link to a community PowerShell script if you need to find out the share permissions: http://gallery.technet.microsoft.com/scriptcenter/List-Share-Permissions-83f8c419. I do however encourage you to use this time to consider just doing security on NTFS.

4) Create the clustered file shares

Amongst the many gems in Windows Server 2012 R2 are the new SMB PowerShell cmdlets. They are a simple and great way to create clustered file shares. Read up on these SMB Share Cmdlets and especially New-SmbShare.

When we’ve unmapped the LUNs from the old file server and exposed them to the new file server cluster you’re ready to go. You can even reorganize the shares, consolidate to fewer but bigger LUNs and, by just adapting the path to the share in the script, make sure the users are not confused or need to learn new shares and change how & what they connect to. Here it goes:

New-SmbShare -Name "TEST2" -path "T:\Shares\TEST2" -fullaccess Everyone -EncryptData $True -FolderEnumerationMode AccessBased -ConcurrentUserLimit 0 -ScopeName TF-FS-MIG
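
As a follow-up, a rough sketch of how the Win32_Share dump could drive the creation of the clustered shares (the old server name, paths and scope name are examples; adapt the paths to the new folder structure as needed):

# Read the plain disk shares from the old file server (Type 0 = disk share, skips admin/IPC$ shares)
$oldShares = Get-WmiObject -Class Win32_Share -ComputerName OldFileServer | Where-Object { $_.Type -eq 0 }
foreach ($share in $oldShares) {
    # Recreate each share on the clustered file server role, scoped to its client access point
    New-SmbShare -Name $share.Name -Path $share.Path -FullAccess Everyone -FolderEnumerationMode AccessBased -ScopeName TF-FS-MIG
}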

First and foremost, this is where the good practice of not micro managing file share permissions will pay back big time. If you have done security via NTFS permissions with the AG(U)DLP principle on your folder structure, granting access should be a breeze, right?

Before you ask: no, you can’t do the old trick of importing the registry export of the shares and their security settings from the old file server when you’re going to cluster the file shares. That might sound bad, but with some preparation and the PowerShell I demonstrated above you’ll find it easy enough.

5) Recuperate old file server name (Optional)

After you have decommissioned the old file server you could use a cluster alias to keep the old file server UNC path. This has the drawback that you will fall back to connecting to the SMB shares via NTLM, as aliases don’t support Kerberos authentication. But there is another trick. Once you’ve gotten rid of the old server object in AD you can rename the file server role. If you can do this you’ll be able to keep Kerberos for authentication.

So after you’ve gotten rid of the old server in Active Directory, go to the file server role, select properties and rename it to recuperate the old file server name.

image

Now look at the resources tab. Right click and select the properties tab of “Server Name”. Rename the DNS Name. That will update the server name and the DNS record. This will cause the role to go down temporarily.

image

Right click and select the properties tab of “File Server”. Rename the UNC path to reflect the older file server name.

For good measure and to test that everything works: stop and restart the cluster role, connect to the shares and voilà, life should be good. Users can access the transparent failover file server like they used to do with the old non clustered file server and they don’t sacrifice Kerberos to be able to do so!

image

Conclusion

I hope you enjoyed the tips and pointers on migrating an old file server to a  Windows Server 2012 R2 file share cluster. Remember that these tips apply for various permutations of P2V, V2V as well as for P2P migrations.

More Tips On Dealing With Removing Short File Names When Migrating To a SMB3 Transparent Failover File Server Cluster


You might have read my blog posts on the capabilities and the process of migrating to a Transparent Failover File Server. If not, here they are:

These are a good read with some advice from real world experience, and in this post I’ll offer some more tips. I’ve discussed the need to disable and get rid of short file names in my blog and offered other tips to prepare for your migration and get your file share LUNs in tip top, modern shape. But what if you run into short file name issues where you can’t seem to get rid of them?

Well, here are a few more things to check:

1) Get rid of the shadow copies used for Previous Versions

The reason you’d better get rid of them is that they can also contain short file names & way too long path or file names. We don’t want them to ruin the party, so we remove them all by disabling shadow copies on the LUNs to be copied. We can enable them again once the LUN is up and running in the new file cluster.
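
A minimal sketch of how to clear them (the drive letter is an example; re-enable shadow copies for Previous Versions on the new cluster afterwards):

# List the existing shadow copies for the data volume, then delete them all
vssadmin list shadows /for=L:
vssadmin delete shadows /for=L: /all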

2) The logs indicate there are short file names you don’t have access to

If the NTFS permissions on the folder & file structure are OK you should not have too many problems, bar some files being locked by being in use. Rerunning the fsutil command prior to migrating, with the server service stopped, will prevent any connectivity and use of the file shares by people ignoring the request to log off or shut down their clients, or by automated jobs that otherwise keep accessing them.

But you might still get some indications in the log file(s) that state you can’t remove certain file names.

image

There is the good old trick of running your command under SYSTEM. That does the job! It helps get rid of short file name instances in folders you normally don’t get access to. If SYSTEM has rights you’ll be fine, whether it’s a system folder or not. To do this the Sysinternals tools come in handy once again. You can launch a command prompt running under the NT AUTHORITY\SYSTEM account using psexec.exe by running the following from an elevated command prompt:

psexec -i -s cmd.exe or psexec  -s cmd.exe

image

The -s switch runs the process under the System account. Psexec temporarily installs a service (psexec running psexesvc.exe) on the remote computer (or locally if that’s what you’re doing) which is removed when the app or process that’s running is closed. It’s obvious now, I hope, why you need an elevated command prompt to run this command.

Now should you do this by default? Nope. Just when you need to and as always have a realistic backup plan, a way to recover when things go south.

3) Anti virus sometimes prevents the removal of short file names

Disable Anti-Virus, sometimes it holds a temporary entry in the registry for the file involved. At least that’s what I’ve seen as a transient issue in some of the large number of logs I gathered. Yeah, I ran a lot of fsutil against large NTFS volumes. What can I say. Due diligence pays off!

4) Run ChkDsk

Just make sure the volume is healthy and no repairs are needed. If you’re migrating from an older file server there might be outstanding issues, and a check disk on volumes with lots of files takes time. Some of the ones I’ve dealt with had more than 2 million files on a 2TB LUN and it can take 24 hours. Fun when you have 10 LUNs :-/
