Sunday, November 11, 2012

Jumbo Frames, the gotchas you need to know!

Hi Guys!

Today I am going to talk about a topic that confused me for quite a while: jumbo frames.

The actual concept of what a jumbo frame is never really fazed me: it's a frame that can carry up to 9000 bytes rather than the usual 1500 (the exact figure varies depending on whether you're counting the overhead or not, but for now let's go with 9000 :)).
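To put numbers on the "with or without overhead" remark, here is a small sketch. It assumes standard Ethernet II framing (14-byte header, 4-byte FCS, optional 4-byte 802.1Q tag); the function name is just something I made up for illustration.

```python
# Assumption: standard Ethernet II framing, preamble and inter-frame gap not counted.
ETH_HEADER = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
ETH_FCS = 4      # frame check sequence (CRC)
VLAN_TAG = 4     # optional 802.1Q tag

def frame_size(mtu, vlan_tagged=False):
    """Bytes in the frame (header through FCS) needed to carry an MTU-sized payload."""
    return mtu + ETH_HEADER + ETH_FCS + (VLAN_TAG if vlan_tagged else 0)

for mtu in (1500, 9000):
    print(mtu, frame_size(mtu), frame_size(mtu, vlan_tagged=True))
```

So "1500" and "1518" (or "9000" and "9018") can describe the same frame, depending on whether the speaker is counting the Ethernet header and FCS.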


Some of the questions I often had were:

  • Is it safe to just enable jumbo frames on a switch?
  • What happens when a jumbo frame enabled host and a non-jumbo frame enabled host try to communicate?
  • What happens when a jumbo frame enabled host and another jumbo frame enabled host try to communicate but the switch in between does not support it?
  • Can Jumbo frames be routed and if so what are the caveats there?

First of all, let's address the question of jumbo frames on switches. As far as I can tell, and I have investigated this quite a bit, you're perfectly safe enabling jumbo frames on a switch. The exact command varies depending on your switch model but is normally:

system mtu jumbo 9000

On some switching platforms this requires a reload. The config is very similar for platforms like the 6500; where things become a little more complicated is on the Nexus platforms.

Some quick background: the Cisco Nexus line of switches is all about converged networking, that is, FC and Ethernet over the same piece of wire. The problem Cisco had was making sure that FC traffic is still treated as "no drop" when sent over Ethernet, a protocol that is inherently "best effort". To do this they use DCB, or data centre bridging: a set of Ethernet extensions (with its own negotiation protocol, DCBX) that lets specific classes of traffic be treated as lossless.

On a Nexus, you can configure a policy like this:

class-map type network-qos class-gold-nq
  match qos-group 4

!


policy-map type network-qos global-netqos-map
  class type network-qos class-gold-nq
    mtu 9000
  class type network-qos class-fcoe
    pause no-drop
    mtu 2158
  class type network-qos class-default


As you can see there are three classes of traffic: class-default, class-gold-nq and class-fcoe. Behaviours are set on each of these classes to control things like what the MTU size should be and whether the class should be "no drop". You can see in the above example I have set the MTU of the gold class to 9000.


You then apply this policy globally:

system qos
  service-policy type network-qos global-netqos-map


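So now any traffic placed in qos-group 4 will be able to use an MTU of 9000. (Which traffic lands in qos-group 4 is decided by your classification policy; it is often matched on a CoS value, but qos-group 4 does not automatically mean CoS 4.)

Note the key words there: "be able to use".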


So let's go back to the 3750 and say I have just enabled jumbo frames. Will my whole network now come crumbling down because some hosts have jumbo MTU support and some do not? The answer is a resounding NO. All you are doing is telling the switch NOT to drop frames unless they are larger than 9000 bytes. Had you left the MTU at 1500, a 9000-byte frame sent through the switch would be dropped; with jumbo frames enabled, the switch will happily pass it on.

So the key words are "be able to use": when you enable jumbo MTUs on your switch, all you're saying is that the switch WON'T DROP frames larger than the usual 1500 bytes. It's NOT saying "I will GENERATE frames larger than 1500".
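If it helps to see that as a toy model, here is a rough sketch. It is a deliberate simplification (headers are ignored, and both function names are made up for illustration): the host's MTU decides how big the frames are, and the switch's MTU only decides whether a frame gets through.

```python
def frames_sent(data_bytes, host_mtu):
    """Payload sizes the HOST puts on the wire; its own MTU sets the frame size."""
    full, rest = divmod(data_bytes, host_mtu)
    return [host_mtu] * full + ([rest] if rest else [])

def switch_forwards(payload_bytes, switch_mtu):
    """A switch never creates or resizes frames; it only forwards or drops them."""
    return payload_bytes <= switch_mtu

# A standard 1500-MTU host behind a jumbo-enabled switch: nothing changes for it.
print([switch_forwards(f, 9000) for f in frames_sent(9000, 1500)])

# A jumbo host behind a switch left at 1500: the single big frame is dropped.
print([switch_forwards(f, 1500) for f in frames_sent(9000, 9000)])
```

The first case is why enabling jumbo frames on the switch is harmless for hosts that still use the default MTU.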

We have our first answer, let's keep going.

So in what scenarios can MTU cause you problems?

It is highly recommended to AVOID any situation where a host with jumbo frames set is talking to a host without jumbo frames. This can cause problems: the host with jumbo frames will try to send large frames, they will get through the switch (since you've enabled jumbo frames on the switch!), and when they reach the other host, that host will reject them as too big.

So the first golden rule for jumbo frames:

Avoid situations where jumbo frame enabled host NICs are talking to non-jumbo frame enabled host NICs.

Ask yourself: why are you enabling jumbo frames? Chances are it's something to do with NFS or iSCSI traffic: you're probably trying to increase the speed of storage. OK, no problem. The trick is to ensure that your NFS or iSCSI traffic is sent via a dedicated NIC, and your normal host traffic is sent via a non-jumbo-MTU enabled interface. This is fairly trivial to do in VMware ESXi and I am sure there are methods to do this with Hyper-V.


So what would happen if you enabled two hosts for jumbo frames but forgot to enable the intermediary switches? You will have difficulty with certain types of network traffic; a perfect example is NFS. (I ran up the below in the lab and can verify this is the behaviour you will see.)

In my example, my ESXi host is enabled for MTU 9000 on the interface that will generate NFS traffic (just like we discussed).



My host for the storage (172.25.0.5) is also enabled for jumbo MTU, but I have committed the cardinal sin of NOT setting my vSwitch to have an MTU above 1500:




Let's see what happens when we load something up to the datastore:




The file will look like it has finished copying, but then just hang.



Eventually this error message will pop up:






The reason the file copy looks like it's going to work is that when you're browsing the datastore you first upload to vCenter; vCenter then tries to copy the file via the VMkernel on the host to the storage itself.

Once we set the MTU correctly on the vSwitch, however, all is right with the world and you will be able to copy the file.

So our answer to this question is: bad things. Be sure to use ping commands with the Don't Fragment bit set to verify that your hosts which are configured for jumbo frames are able to successfully communicate with each other via jumbo frames!
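A Don't Fragment ping is really just asking "is there any hop on this path whose MTU is smaller than my packet?". Here is a minimal sketch of that check using the lab scenario above; the hop list and the function name are my own illustration, not output from any real tool.

```python
def first_blocking_hop(packet_bytes, path):
    """Return the name of the first hop whose MTU is too small, or None if all pass."""
    for name, mtu in path:
        if packet_bytes > mtu:
            return name
    return None

# The lab from this post: both ends set to 9000, the vSwitch forgotten at 1500.
lab_path = [
    ("ESXi VMkernel NIC", 9000),
    ("vSwitch", 1500),
    ("storage host 172.25.0.5", 9000),
]

print(first_blocking_hop(9000, lab_path))  # a jumbo packet gets stuck at the vSwitch
print(first_blocking_hop(1500, lab_path))  # normal-sized traffic still gets through
```

That second line is exactly why this kind of fault is sneaky: small packets work fine, so basic connectivity tests pass while the large NFS transfers hang.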

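The following ping commands should be helpful (substitute your own destination for 172.25.0.5). Note the payload size of 8972: the 20-byte IP header and 8-byte ICMP header count towards the 9000-byte MTU, so a 9000-byte payload would fail even on a correctly configured path.

Windows:
ping -f -l 8972 172.25.0.5

Linux:
ping -M do -s 8972 172.25.0.5


ESX:
ping -d -s 8900 172.25.0.5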
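If your MTU is something other than 9000, the largest ping payload that fits is the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header. Here is a small helper that works that out and builds the matching command lines. It assumes IPv4 with no IP options; the function names are made up, and `vmkping` is used for the ESXi case.

```python
IP_HEADER = 20   # IPv4 header without options
ICMP_HEADER = 8  # ICMP echo header

def ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER

def df_ping_commands(mtu, target):
    """Don't Fragment ping command lines for each platform (built, not executed)."""
    size = ping_payload(mtu)
    return {
        "windows": f"ping -f -l {size} {target}",
        "linux": f"ping -M do -s {size} {target}",
        "esxi": f"vmkping -d -s {size} {target}",
    }

for platform, command in df_ping_commands(9000, "172.25.0.5").items():
    print(f"{platform}: {command}")
```

For a 9000-byte MTU this gives a payload of 8972 bytes; for the standard 1500 it gives the familiar 1472.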
 
 

The final question is: what about routed subnets?

If for some reason you need to route jumbo frames, you certainly can, as long as you enable jumbo frames on the SVI or on the actual router interface itself; the commands for this vary by platform. The only caveat is that if you send jumbo frame traffic from a jumbo-frame enabled interface to a non-jumbo frame enabled interface, the router will fragment the traffic (or, if the Don't Fragment bit is set, drop it and send back an ICMP "fragmentation needed" message). Fragmentation can cause a massive CPU spike, so it is not recommended. Again we go back to the old rule: make sure your jumbo frame traffic is only ever talking to another jumbo MTU enabled destination.
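To get a feel for how much extra work that is for the router, here is a rough sketch of the IPv4 fragmentation arithmetic. It assumes a 20-byte IP header with no options, and the function name is hypothetical; it only counts bytes and does not build real packets.

```python
def ipv4_fragment_sizes(packet_bytes, egress_mtu, ip_header=20):
    """Total size (header included) of each fragment the router would emit."""
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (egress_mtu - ip_header) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("egress MTU too small to carry any payload")
    payload = packet_bytes - ip_header
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + ip_header)
        payload -= chunk
    return sizes

fragments = ipv4_fragment_sizes(9000, 1500)
print(len(fragments), fragments)
```

Under these assumptions a single 9000-byte packet leaving a 1500-byte interface becomes seven packets (six full 1500-byte fragments plus a small one), each with its own header that the router has to build.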







5 comments:

  1. On resultant fragmentation from jumbo frame routing through non jumbo-frame enabled interfaces... what type of CPU hit are we talking about?

    You say "massive" but router based fragmentation is inherent within the IPv4 specifications (I ack that it's excluded in IPv6) so I'm curious to know what you're seeing in your testing.

  2. Interesting thought... some customers are deploying iSCSI connectivity within a UCS environment, which by default means that your iSCSI VLANs are thrown into the default Ethernet CoS 0 traffic line. CoS 0 by default has an MTU of 1500 and not 9k. This creates A LOT of issues.

    If you want to utilise jumbo frames on UCS, you'd have to create another traffic class altogether (i.e. CoS 5), and set the MTU for that specific traffic class.

    The question then becomes "How does UCS identify the iSCSI VLAN as CoS 5?" The Fabric Interconnect does not give you the ability to class-map a specific VLAN. A way to do that would be to run a Nexus 1000V on your ESX/i and configure MQC on that Nexus 1000V in order to tag CoS 5 onto that VLAN, unless the application does its own tagging, which in many cases does not happen.

  3. Thank you for this write up!! Quick and easy answers to simple questions I had about enabling jumbo frames globally.

  4. Thanks for sharing. Just one quick thought: qos-group 4 does not necessarily mean that CoS value of 4 was put into qos-group 4.
