Monday, October 7, 2013

CCIE DC: Definitive Jumbo frames

Hi Guys!

This blog post is attempting to be the DEFINITIVE guide on Jumbo MTU. It's a topic that DOES MY HEAD IN!

There are SO many possible permutations. There's:

MTU with Nexus 5k
MTU with Catalyst Switches
MTU with MDS
MTU with Nexus 7k
MTU on SVI Interfaces
MTU on physical interfaces
MTU on UCS (Both on the FI itself and the vNIC'S)
MTU on C-Series
MTU on Nexus 1000v

What interaction does MTU have on VPC, Fabric Path, Port Channels, OTV? Routing Protocols?

What interactions does MTU have with FCoE? FC? SAN-Port-Channels?

MTU on FC?
Why 9216 vs 9000?


MTU on Nexus 7000

So this is the most complicated part of the jumbo MTU discussion, for me at least. Let's start at the beginning:

By default, a Nexus 7000 has the following command configured:
system jumbomtu 9216


Under a VDC, I could not remove this command:

N7K5(config)# show run all | inc jumbo
system jumbomtu 9216
N7K5(config)# no system jumbomtu 9216
N7K5(config)# show run all | inc jumbo
system jumbomtu 9216
N7K5(config)# show run all | inc jumbo
system jumbomtu 9216
N7K5(config)#


I am not sure if this is a limitation of VDCs or simply something I am doing wrong, but regardless I was unable to turn off this command. The most likely explanation is that the "no" form just puts the command back to its default, and the default is 9216.

This fits with the story I have always heard: that a Nexus 7000 is enabled for jumbo frames BY DEFAULT and that you do not have to do ANYTHING. Of course, it's a little bit more complicated than that.


If you look at the interfaces you have with a show run interface command, you will see an interesting default:

#show run all | beg 1/17

interface Ethernet1/17
 (Output omitted)
  mtu 1500
  snmp trap link-status
  logging event port link-status default
  logging event port trunk-status default
  bandwidth 10000000

 (Output omitted)
!

The MTU command is on ALL interfaces when you do a show run all. What the hell is it? Does it override my system jumbo MTU? Do I have to set it too?

If you look at an interface with show int, you annoyingly see exactly the same thing:


N7K5(config-if)# show int eth1/17 | inc MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec
N7K5(config-if)#


Yet certain blog posts would have us believe that because we have the system jumbomtu command, we don't have to do anything.

Let's try not changing the value, doing a ping, and seeing what happens:


N7K6# ping 169.254.1.1 df-bit packet-size 8000
PING 169.254.1.1 (169.254.1.1): 8000 data bytes
Request 0 timed out
^C
--- 169.254.1.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.00% packet loss
N7K6# ping 169.254.1.1 df-bit packet-size 1500
PING 169.254.1.1 (169.254.1.1): 1500 data bytes
Request 0 timed out
^C
--- 169.254.1.1 ping statistics ---
2 packets transmitted, 0 packets received, 100.00% packet loss
N7K6# ping 169.254.1.1 df-bit packet-size 1472
PING 169.254.1.1 (169.254.1.1): 1472 data bytes
1480 bytes from 169.254.1.1: icmp_seq=0 ttl=254 time=1.273 ms
1480 bytes from 169.254.1.1: icmp_seq=1 ttl=254 time=0.881 ms
1480 bytes from 169.254.1.1: icmp_seq=2 ttl=254 time=1.182 ms
1480 bytes from 169.254.1.1: icmp_seq=3 ttl=254 time=1.179 ms
1480 bytes from 169.254.1.1: icmp_seq=4 ttl=254 time=1.185 ms

--- 169.254.1.1 ping statistics ---
5 packets transmitted, 5 packets received, 0.00% packet loss
round-trip min/avg/max = 0.881/1.139/1.273 ms


No jumbo frames for us! Maximum we could do is 1472.
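
The 1472 figure is plain arithmetic: the ping packet-size is only the ICMP payload, and the 1500-byte MTU also has to hold the 8-byte ICMP header (you can see it in the "1480 bytes from" replies) plus a 20-byte IPv4 header. Here is a minimal Python sketch of that sum, assuming IPv4 with no IP options. The function names are made up for illustration.

```python
# Header sizes for a plain IPv4 ping (assumption: no IP options).
IPV4_HEADER = 20
ICMP_HEADER = 8


def max_ping_payload(ip_mtu: int) -> int:
    """Largest ping packet-size that fits in one unfragmented IP packet."""
    return ip_mtu - IPV4_HEADER - ICMP_HEADER


def fits_with_df_bit(packet_size: int, ip_mtu: int) -> bool:
    """True if a ping of this size can be sent with the DF bit set."""
    return packet_size <= max_ping_payload(ip_mtu)


print(max_ping_payload(1500))        # 1472
print(max_ping_payload(9000))        # 8972
print(fits_with_df_bit(1500, 1500))  # False
```

This is also why 8972 is the size to test with when you want to prove that a 9000-byte IP packet gets through.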

Let's show that we just have the default config:



N7K6# show run int eth1/17

!Command: show running-config interface Ethernet1/17
!Time: Sun Jun 30 05:50:14 2013

version 6.0(2)

interface Ethernet1/17
  switchport
  switchport mode trunk
  no shutdown

N7K6# show int e1/17 | inc MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec




So! Right now we can see clear as day: if we don't change the MTU value, it does not appear to work!



So let's try changing this value and see what happens

N7K5(config-if)# int eth1/17
N7K5(config-if)# mtu 9216
 

Let's see whether the output has changed now that we have set it to 9216:

N7K5(config-if)# show int eth1/17 | inc MTU
  MTU 9216 bytes, BW 10000000 Kbit, DLY 10 usec

 

The output has changed. Do our pings work now?


N7K6(config-if)# exit
N7K6(config)# exit
N7K6# ping 169.254.1.1 df-bit packet-size 1472
PING 169.254.1.1 (169.254.1.1): 1472 data bytes
1480 bytes from 169.254.1.1: icmp_seq=0 ttl=254 time=1.457 ms
1480 bytes from 169.254.1.1: icmp_seq=1 ttl=254 time=0.86 ms

--- 169.254.1.1 ping statistics ---
5 packets transmitted, 5 packets received, 0.00% packet loss
round-trip min/avg/max = 0.77/0.952/1.457 ms
N7K6# ping 169.254.1.1 df-bit packet-size 1500
PING 169.254.1.1 (169.254.1.1): 1500 data bytes
1508 bytes from 169.254.1.1: icmp_seq=0 ttl=254 time=1.341 ms
1508 bytes from 169.254.1.1: icmp_seq=1 ttl=254 time=1.067 ms

--- 169.254.1.1 ping statistics ---
5 packets transmitted, 5 packets received, 0.00% packet loss
round-trip min/avg/max = 1.067/1.235/1.343 ms
N7K6# ping 169.254.1.1 df-bit packet-size 8972
PING 169.254.1.1 (169.254.1.1): 8972 data bytes
8980 bytes from 169.254.1.1: icmp_seq=0 ttl=254 time=1.872 ms
8980 bytes from 169.254.1.1: icmp_seq=1 ttl=254 time=6.154 ms
Request 2 timed out



So the RESULTS ARE IN! Despite all the "Experts" on the internet claiming otherwise (who led me astray too), the Nexus 7000 system jumbomtu command is NOT enough. On M line cards, you _MUST_ set the MTU on the interface, regardless of whether it is an L2 or an L3 interface.

 
Here is absolute proof:

http://www.cisco.com/en/US/docs/switches/datacenter/sw/5_x/nx-os/interfaces/configuration/guide/if_basic.html#wp1105874

I tested it myself, see below

N7K5(config-if)# mtu 1600
ERROR: Ethernet1/17: MTU on L2 interfaces can only be set to default or system-jumboMTU



On L2 interfaces, you can only set the MTU to either your configured system jumbo MTU size or the default MTU size. On L3 interfaces, you can configure whatever you want.


N7K8(config-if)# no switchport
N7K8(config-if)# mtu 4444

N7K8(config-if)# 
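
To pin the rule down, here is a minimal Python sketch of the behaviour seen in the two tests above. It is only a model of those results, NOT the real NX-OS validation logic. The function name is made up for illustration, and it deliberately ignores whatever upper and lower limits the platform puts on L3 MTU values.

```python
DEFAULT_MTU = 1500


def mtu_accepted(requested: int, is_l3: bool, system_jumbomtu: int = 9216) -> bool:
    """Model of the observed behaviour, not the actual NX-OS check."""
    if is_l3:
        # The L3 port accepted an arbitrary value (4444) in the test above.
        # Platform range limits are not modelled here.
        return True
    # L2 ports only accept the default or the system jumbomtu value.
    return requested in (DEFAULT_MTU, system_jumbomtu)


print(mtu_accepted(1600, is_l3=False))  # False, matches the ERROR above
print(mtu_accepted(9216, is_l3=False))  # True
print(mtu_accepted(4444, is_l3=True))   # True
```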

 Note you can change the MTU individually on both the M line cards OR F line cards, but with F line cards you're better off using the system QoS, as we will see below.


  What about a system-QOS Class?

Guess what, all the above changes when it comes to F-Based Linecards. Although you can manually set the F line card MTU just like you can on the M line cards and it will work, you can change it globally using network-qos:


http://www.cisco.com/en/US/docs/switches/datacenter/sw/6_x/nx-os/qos/configuration/guide/nt_qos.html


F-based line cards let you change the system QoS class just like on a 5k, but unlike a 5k, on a 7k F1 line card the new value shows up under the interface:




SW1-1(config)# show int e4/1
Ethernet4/1 is up
  Dedicated Interface
  Hardware: 1000/10000 Ethernet, address: c464.1348.b2d8 (bia c464.1348.b2d8)
  MTU bytes (CoS values):  MTU  9216(0-2,4-7) bytes  MTU  2112(3) bytes
  BW 10000000 Kbit, DLY 10 usec, reliability 255/255, txload 1/255, rxload 1/255
!




Here is the config that was applied to make this happen:




policy-map type network-qos default-nq-7e-CCIE
  class type network-qos c-nq-7e-drop
    congestion-control tail-drop
    mtu 9216
  class type network-qos c-nq-7e-ndrop-fcoe
    pause
    mtu 2112


 

 So you can see that it matches up. So it makes it quite a bit easier to see what is going on! Great!
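
If you want to check that "MTU bytes (CoS values)" line in a script rather than by eye, here is a rough Python sketch that turns it into a per-CoS table. The regex is an assumption based only on the single line of output shown above, so treat it as a starting point. The function name is made up for illustration.

```python
import re


def cos_to_mtu(line: str) -> dict:
    """Map each CoS value to its MTU from a 'MTU bytes (CoS values)' line."""
    result = {}
    # Matches e.g. "MTU  9216(0-2,4-7)" -> ("9216", "0-2,4-7")
    for mtu, cos_spec in re.findall(r"MTU\s+(\d+)\(([\d,\-]+)\)", line):
        for part in cos_spec.split(","):
            low, _, high = part.partition("-")
            # A single value like "3" has no upper bound, so reuse the lower one.
            for cos in range(int(low), int(high or low) + 1):
                result[cos] = int(mtu)
    return result


line = "  MTU bytes (CoS values):  MTU  9216(0-2,4-7) bytes  MTU  2112(3) bytes"
table = cos_to_mtu(line)
for cos in sorted(table):
    print(cos, table[cos])
```

For the output above this gives 9216 for every CoS except 3, which gets 2112, exactly what the network-qos policy says.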

Here is a handy command on the Nexus 7k to see what is going on too. In the example below, my N7k does NOT have any QoS config, because I was using a rack where you only have access to a single VDC.

N7K8# show system internal qos network-qos hw-config module 2 | inc MTU
MTU         = 1500 [FCoE: No] - This line shows the MTU Value for CoS 0 on this hardware
MTU         = 1500 [FCoE: No] - This is MTU value for Cos 1
MTU         = 1500 [FCoE: No] - Cos 2
MTU         = 1500 [FCoE: No] - Cos 3
MTU         = 1500 [FCoE: No] - Cos 4
MTU         = 1500 [FCoE: No] - Cos 5
MTU         = 1500 [FCoE: No] - Cos 6
MTU         = 1500 [FCoE: No] - Cos 7
Interface    Config  Oper(VLs) MTU (value)





MTU on Nexus 5000

The Nexus 5000 is probably the simplest of all in terms of what we can and can't do with Jumbo MTU's.

The Nexus 5k shares a similar architecture with the Nexus 7000 F1 line cards, so it does not surprise me that the only thing you have to do to enable jumbo frames on a Nexus 5k is change the system QoS class:

policy-map type network-qos jumboMTU5k
  class type network-qos class-fcoe
    pause no-drop
    mtu 2158
  class type network-qos class-default
    mtu 9216



Just like on the Nexus 7000, if you're working with an L3 interface, you must set the MTU manually.


The annoying thing about the configuration on the Nexus 5000 is that the interface will still show the MTU as 1500:

N5K5(config-if)# show int eth1/6 | inc MTU
  MTU 1500 bytes, BW 10000000 Kbit, DLY 10 usec



This is a known bug:

CSCsl21529, Symptom: An incorrect MTU value is displayed in the show interface command output. The Cisco Nexus 5000 Series switch only supports class-based MTU. Per-interface level MTU configuration is not supported. The switch supports jumbo frames by default. However, the show interface command output currently displays an incorrect MTU value of 1500 bytes. 
But if you check the queuing on the interface, which is the QoS Applied as part of the global policy, it will show the correct MTU:



N5K5(config-if)# show queuing interface eth1/6 | inc MTU
    q-size: 243200, HW MTU: 9280 (9216 configured)


So let's do a ping between our two Nexus 5k's enabled for this.

N5K8# ping 169.254.1.1 df-bit packet-size 9000
PING 169.254.1.1 (169.254.1.1): 9000 data bytes
Request 0 timed out
Request 1 timed out
Request 2 timed out
Request 3 timed out
Request 4 timed out

--- 169.254.1.1 ping statistics ---
5 packets transmitted, 0 packets received, 100.00% packet loss
N5K8# ping 169.254.1.1 df-bit packet-size 2000
PING 169.254.1.1 (169.254.1.1): 2000 data bytes
2008 bytes from 169.254.1.1: icmp_seq=0 ttl=254 time=1.523 ms



Hmmm... I can ping up to a certain size but not beyond it. This could simply be some sort of control-plane policing on the Nexus 5k, because if I ping THROUGH these devices from my Nexus 7k's, I can do up to 9000 bytes.


So long story short, Nexus 5k: change the system class, you're done. Change the MTU on any L3 interfaces you want to use. Done.


MTU on SVI

There is no trick to this one, it is as simple as this: M1, SVI, Nexus 5k, Nexus 7k, none of that matters. You must set an MTU on layer 3 interfaces in order for those interfaces to support jumbo frames. Think of it like setting your operating system MTU.

-- Subtopic: MTU interaction with routing protocols

I am not going to spend too much time on this. If you're doing CCIE DC, you should know that many routing protocols require the MTU to match on both sides.



MTU on Port-Channels

On the Nexus 7k, as you would expect, if you add ports to a port-channel, the port-channel inherits the MTU of the physical ports, or you can just change the MTU on the port-channel itself. Either way, you must configure the MTU on the port-channel, whether it is layer 2 or layer 3.

On Nexus 5k, as we expect, the only thing that matters is System Class. As long as that allows jumbo MTU you are laughing.



MTU and VPC


With vPC on the 7k, the peer-link is always set to a jumbo MTU. You cannot change this:

N7K7(config-if)# mtu 1500
ERROR: port-channel10: Cannot configure port MTU on Peer-Link.

There is even a bug:
CSCtf05232
From 4.2 NX-OS release onwards, VPC peer-link MTU is set to 9216 + padding.
ISSU SW upgrade from 4.1 to 4.2 will keep VPC peer-link MTU settings as they were prior to the 4.2 change, and shut/no shut is needed for changes to take effect.



If you have a port-channel configured and are using it with vPC, on the Nexus 7k you must set a jumbo MTU under that port-channel on both vPC peers.

To test this, I configured a back to back vPC between a Nexus 5k set and a Nexus 7k set.

Here is the config:


N5k1:

 N5K7# show run | sect vpc|feature|port
feature lacp
feature vpc
feature lldp
vpc domain 2
  peer-keepalive destination 192.168.0.58
interface port-channel10
  switchport mode trunk
  spanning-tree port type network
  vpc peer-link
interface port-channel11
  speed 10000
  vpc 11




N5k2:

 N5K8# show run | sect vpc|feature|port
feature lacp
feature vpc
feature lldp
vpc domain 2
  peer-keepalive destination 192.168.0.57

interface port-channel10
  switchport mode trunk
  spanning-tree port type network
  vpc peer-link
interface port-channel11
  speed 10000
  vpc 11



N7k1:
N7K7# show run | sect vpc|feature|port-
feature interface-vlan
feature lacp
feature vpc
vpc domain 1
  peer-keepalive destination 169.254.99.8 source 169.254.99.7 vrf default
interface port-channel10
  switchport mode trunk
  spanning-tree port type network
  vpc peer-link
interface port-channel11
  switchport
  vpc 11



N7k2:

N7K8# show run | sect vpc|feature|port-
feature interface-vlan
feature lacp
feature vpc
vpc domain 1
  peer-keepalive destination 169.254.99.8 source 169.254.99.7 vrf default
interface port-channel10
  switchport mode trunk
  spanning-tree port type network
  vpc peer-link
interface port-channel11
  switchport
  vpc 11

!

In the above example, the two vPC peers could ping each other with jumbo frames (i.e. N7k1 could ping N7k2 and N5k1 could ping N5k2), BUT the 7k could not ping a 5k with jumbo frames. This is because port-channel 11 (the one that connects the 5k and the 7k) was not configured for jumbo frames on the N7k side. It IS configured for jumbo on the 5k, because the 5k only uses the system class, remember? So we need to set the 7k port-channel 11 to have a jumbo MTU:

N7K8(config)# int po11
N7K8(config-if)# mtu 9216


(Done on BOTH N7K's)

Now let's check it out


N5K7# ping 169.254.1.3 df-bit packet-size 2000
PING 169.254.1.3 (169.254.1.3): 2000 data bytes
2008 bytes from 169.254.1.3: icmp_seq=0 ttl=254 time=1.418 ms
2008 bytes from 169.254.1.3: icmp_seq=1 ttl=254 time=1.081 ms
2008 bytes from 169.254.1.3: icmp_seq=2 ttl=254 time=1.086 ms
2008 bytes from 169.254.1.3: icmp_seq=3 ttl=254 time=0.971 ms
2008 bytes from 169.254.1.3: icmp_seq=4 ttl=254 time=1.09 ms

Done, all very logical. It makes sense that a vPC port-channel would act just like any other port-channel. The only real take-away is that the peer-link always has its MTU set to jumbo and you can't change it.
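
The general lesson from this test is that a path only carries jumbo frames if EVERY hop along it allows them. Here is a minimal Python sketch of that idea. The hop names and the SVI values are illustrative assumptions rather than output from the lab, and it glosses over the difference between L2 and L3 MTU.

```python
def path_mtu(hops: dict) -> int:
    """The usable MTU of a path is that of its most restrictive hop."""
    return min(hops.values())


def blockers(hops: dict, wanted: int) -> list:
    """Names of the hops that are too small for the wanted MTU."""
    return [name for name, mtu in hops.items() if mtu < wanted]


# Before: the 5k side is jumbo via the system class, po11 on the 7k is default.
before = {
    "N5K network-qos class-default": 9216,
    "N5K SVI": 9216,               # assumed value
    "N7K port-channel11": 1500,
    "N7K SVI": 9216,               # assumed value
}
# After: "mtu 9216" applied to po11 on the 7k.
after = dict(before, **{"N7K port-channel11": 9216})

print(path_mtu(before), blockers(before, 9000))  # 1500 ['N7K port-channel11']
print(path_mtu(after), blockers(after, 9000))    # 9216 []
```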

MTU and FabricPath

I have covered FabricPath MTU in a previous post but will cover it here for the sake of completeness. FabricPath adds an outer header, which makes the frame slightly bigger than a normal Ethernet frame. However, because it is actually NOT an Ethernet frame but a FabricPath frame with the Ethernet frame encapsulated inside, this overhead does not need to be taken into account if your hardware (F line cards and Nexus 5000s) supports FabricPath.

However, if you have a VLAN that you want to run Jumbo frames on and this VLAN is also fabricpath enabled, you need to specify the interfaces as jumbo which we will cover here.

Here is our basic FabricPath Config, super simple and easy:


install feature-set fabricpath
feature-set fabricpath
vlan 10
  mode fabricpath
int eth1/9 - 10
  switchport mode fabricpath
fabricpath domain default


We then configure vlan 10 interface so we can ping between the hosts.

Look at our first bit of output that is relevant:

N5K-p1-2# show fabricpath isis interface  brief
Fabricpath IS-IS domain: default
Interface    Type  Idx State        Circuit   MTU  Metric  Priority  Adjs/AdjsUp
--------------------------------------------------------------------------------
Ethernet1/9  P2P   1     Up/Ready   0x01/L1   1500 40      64          1/1
Ethernet1/10 P2P   2     Up/Ready   0x01/L1   1500 40      64          1/1


So right now our fabric path is only between two 5k's and as you can see the MTU is set to 1500.

Let's test with a ping.

N5K-p1-2(config-if)# show run int vlan 10
interface Vlan10
  no shutdown
  mtu 9216
  ip address 169.254.1.2/24


N5K-p1-1# ping 169.254.1.2 df-bit packet-size 1500
PING 169.254.1.2 (169.254.1.2): 1500 data bytes
Request 0 timed out
Request 1 timed out

No dice, let's try when we modify the default QoS Policy on the Nexus 5000.

N5K-p1-1(config-sys-qos)# show run | sect policy-map
policy-map type network-qos JUMBO
  class type network-qos class-default
    mtu 9216

N5K-p1-2(config-pmap-nq-c)# system qos
N5K-p1-2(config-sys-qos)# service-policy type network-qos JUMBO


Let's try a ping again:

N5K-p1-1# ping 169.254.1.2 df-bit packet-size 8972
PING 169.254.1.2 (169.254.1.2): 8972 data bytes
8980 bytes from 169.254.1.2: icmp_seq=0 ttl=254 time=3.328 ms
8980 bytes from 169.254.1.2: icmp_seq=1 ttl=254 time=4.898 ms
8980 bytes from 169.254.1.2: icmp_seq=2 ttl=254 time=4.958 ms
8980 bytes from 169.254.1.2: icmp_seq=3 ttl=254 time=19.405 ms
8980 bytes from 169.254.1.2: icmp_seq=4 ttl=254 time=3.161 ms

Jackpot, but note the show fabricpath isis interface output does not change:

N5K-p1-1# show fabricpath isis inter brief
Fabricpath IS-IS domain: default
Interface    Type  Idx State        Circuit   MTU  Metric  Priority  Adjs/AdjsUp
--------------------------------------------------------------------------------
Ethernet1/9  P2P   1     Up/Ready   0x01/L1   1500 40      64          1/1
Ethernet1/10 P2P   2     Up/Ready   0x01/L1   1500 40      64          1/1


We have kind of come to expect that though on the 5K so no worries.

Let's involve the 7k.

N7K-1-2# ping 169.254.1.1 df-bit packet-size 1500
PING 169.254.1.1 (169.254.1.1): 1500 data bytes
Request 0 timed out
Request 1 timed out
Request 2 timed out


--- 169.254.1.1 ping statistics ---
4 packets transmitted, 0 packets received, 100.00% packet loss
N7K-1-2# show fabricpath isis interf brief
Fabricpath IS-IS domain: default
Interface    Type  Idx State        Circuit   MTU  Metric  Priority  Adjs/AdjsUp
--------------------------------------------------------------------------------
Ethernet1/1  P2P   1     Up/Ready   0x01/L1   1500 40      64          1/1
Ethernet1/2  P2P   2     Up/Ready   0x01/L1   1500 40      64          1/1
Ethernet1/3  P2P   3     Up/Ready   0x01/L1   1500 40      64          1/1
Ethernet1/4  P2P   4     Up/Ready   0x01/L1   1500 40      64          1/1


No dice until we up the interface MTU manually (or change the system QOS globally)

N7K-1-2(config)# int eth1/1 - 8
N7K-1-2(config-if-range)# mtu 9216
N7K-1-2(config-if-range)# exit
N7K-1-2(config)# exit
N7K-1-2# ping 169.254.1.1 df-bit packet-size 1500
PING 169.254.1.1 (169.254.1.1): 1500 data bytes
1508 bytes from 169.254.1.1: icmp_seq=0 ttl=254 time=10.147 ms
1508 bytes from 169.254.1.1: icmp_seq=1 ttl=254 time=3.067 ms
1508 bytes from 169.254.1.1: icmp_seq=2 ttl=254 time=0.804 ms
1508 bytes from 169.254.1.1: icmp_seq=3 ttl=254 time=0.833 ms
1508 bytes from 169.254.1.1: icmp_seq=4 ttl=254 time=2.296 ms

--- 169.254.1.1 ping statistics ---
5 packets transmitted, 5 packets received, 0.00% packet loss
round-trip min/avg/max = 0.804/3.429/10.147 ms
N7K-1-2# ping 169.254.1.1 df-bit packet-size 8972
PING 169.254.1.1 (169.254.1.1): 8972 data bytes
8980 bytes from 169.254.1.1: icmp_seq=0 ttl=254 time=1.652 ms
8980 bytes from 169.254.1.1: icmp_seq=1 ttl=254 time=6.884 ms
8980 bytes from 169.254.1.1: icmp_seq=2 ttl=254 time=6.992 ms
8980 bytes from 169.254.1.1: icmp_seq=3 ttl=254 time=7.053 ms
8980 bytes from 169.254.1.1: icmp_seq=4 ttl=254 time=15.796 ms

Done and done.



MTU on OTV

To test this one we used the most simple of configs to show you how it is done.
N7k1:

interface Overlay1
  otv join-interface Ethernet1/25
  otv extend-vlan 10
  otv use-adjacency-server 169.254.2.1 unicast-only
  no shutdown

!



N7k2:
interface Overlay1
  otv join-interface Ethernet1/25
  otv extend-vlan 10

  otv adjacency-server
  otv use-adjacency-server 169.254.2.2 unicast-only
  no shutdown

!



There is not much special about the config. With everything at defaults, the largest packet you can send is shown below:


N5K7# ping 169.254.1.2 df-bit packet-size 1430
PING 169.254.1.2 (169.254.1.2): 1430 data bytes
1438 bytes from 169.254.1.2: icmp_seq=0 ttl=254 time=1.542 ms
1438 bytes from 169.254.1.2: icmp_seq=1 ttl=254 time=1.441 ms
1438 bytes from 169.254.1.2: icmp_seq=2 ttl=254 time=1.187 ms
1438 bytes from 169.254.1.2: icmp_seq=3 ttl=254 time=1.168 ms
1438 bytes from 169.254.1.2: icmp_seq=4 ttl=254 time=1.157 ms


The maximum you can do is 1430. This is because OTV adds 42 bytes of overhead to a typical IP frame, so if 1472 is your normal maximum, take away 42 and that gives you 1430.
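
Here is the same sum as a small Python sketch, in case you want to plug in other transport MTU values. It uses the 42-byte OTV overhead quoted above and assumes a plain IPv4 ping (20-byte IP header, 8-byte ICMP header). The function name is made up for illustration.

```python
IPV4_HEADER = 20
ICMP_HEADER = 8
OTV_OVERHEAD = 42  # the figure quoted above


def max_ping_payload_over_otv(transport_mtu: int) -> int:
    """Largest ping packet-size that survives OTV encapsulation with DF set."""
    return transport_mtu - OTV_OVERHEAD - IPV4_HEADER - ICMP_HEADER


print(max_ping_payload_over_otv(1500))  # 1430
```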

Let's try and configure some jumbo MTU's and see how much bigger we can get them to go.

First thing would be to enable Jumbo frames on the SVI's, Duh!


N5K7(config)# int vlan 10
N5K7(config-if)# mtu 9216


Still no dice, next we enable system qos class on the 5k's:

N5K7(config)# policy-map type network-qos JUMBO
N5K7(config-pmap-nq)# class type network-qos class-default
N5K7(config-pmap-nq-c)# mtu 9216
N5K7(config-pmap-nq-c)# exit
N5K7(config-pmap-nq)# system qos
N5K7(config-sys-qos)# service-policy type network-qos JUMBO



Still no dice.


However, if we set a jumbo MTU on both the 7k join interface and the 7k M-card interface facing the 5k:

int eth2/27
(Interface towards 5k)
mtu 9216
int eth1/25
mtu 9216
!

Everything works as expected and we can send jumbo MTU.



N5K7# ping 169.254.1.2 df-bit packet-size 2173
PING 169.254.1.2 (169.254.1.2): 2173 data bytes
2181 bytes from 169.254.1.2: icmp_seq=0 ttl=254 time=1.825 ms
2181 bytes from 169.254.1.2: icmp_seq=1 ttl=254 time=1.389 ms
2181 bytes from 169.254.1.2: icmp_seq=2 ttl=254 time=1.213 ms
2181 bytes from 169.254.1.2: icmp_seq=3 ttl=254 time=1.344 ms
2181 bytes from 169.254.1.2: icmp_seq=4 ttl=254 time=1.358 ms




MTU on FC

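Tony Burke over at Data Center Overlords has a post on jumbo FC frames:
http://datacenteroverlords.com/2013/04/01/jumbo-fc-frames/

Check the date on that post though: it is an April Fool's joke. You can NOT modify the FC MTU on a per-VSAN basis. The FC frame size (the receive data field size, which tops out at 2112 bytes) is negotiated between the devices during PLOGI.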


MTU and FCIP


http://www.cisco.com/en/US/docs/switches/datacenter/mds9000/sw/5_0/configuration/guides/ipsvc/fm/cfcip.html





MTU on UCS

So, I can tell you this definitively, as I have tested it again and again:

The most important thing is the System QoS.

Again, this does not surprise me: the Fabric Interconnect is based on the Nexus 5000, so it makes sense that it works in a similar manner.

The System QoS Class is the most important value to set. If you do not set it to support your jumbo frames, it won't work, no matter what you set in the OS or on the vNIC (in fact, you won't be able to set ANYTHING on the vNIC that is higher than the system class).

That is found here:



You MUST change these values to 9216 or whatever jumbo MTU you want, this is key, without this you will get NOWHERE!


OK now we have that cleared up, and we have changed the value:



We need to know what this setting, under the vNIC, does here:

 

 (see highlighted section)

As per Jeff Said So's blog post on the topic: http://jeffsaidso.com/2012/04/cisco-ucs-mtu-sizing-with-vic/

This value is STRICTLY used to INFORM the OS that the network card supports this particular MTU value. If your OS supports auto detection of the network card MTU, you do not have to set the MTU manually in the operating system, which obviously saves time.

Now, _IF_ you set the MTU value manually in the OS, as I will show you below:





 but you leave this value at 1500, as I have done in the below screenshot:





 Guess what? You can get jumbo frames:




So!!! This value in the vNIC is used for one reason and one reason only: to inform the OS that THIS is the available MTU. _IF_ the OS supports it, it will set the adapter to the appropriate MTU. If you forget to set this value, OR your OS does not support it, you will still get jumbo frames by changing the value in the OS MANUALLY, as long as the system class is configured.

My final advice? In the real world, set the jumbo MTU on the adapter. It can't hurt if your OS supports the auto detection, and if your OS does not, well, you just need to set the MTU manually.

This explains why the MTU you set on a vNIC does not come up under the actual vethernet config under UCS: It's just a recommendation to the operating system. I would be interested to see what changing this value does on NON Cisco VIC cards, and in actual fact can you even change this value? Any thoughts on this please make a comment.

So, this value is PURELY a suggestion to the OS, it is NOT enforced, but you might as well set it anyway.
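
To make the relationship between the three knobs concrete, here is a minimal Python sketch of the behaviour described above. It is just a model of my test results, not anything UCS actually runs. The function names are made up, and the 1500 fallback for an OS that neither auto-detects nor has a manual MTU is an assumption.

```python
from typing import Optional


def effective_os_mtu(vnic_mtu: int, os_autodetects: bool,
                     manual_os_mtu: Optional[int] = None) -> int:
    """MTU the OS ends up using; the vNIC value is only a hint."""
    if manual_os_mtu is not None:
        return manual_os_mtu      # a manual OS setting wins
    if os_autodetects:
        return vnic_mtu           # the OS picks up the vNIC hint
    return 1500                   # assumed OS default


def frame_passes(ip_packet_size: int, system_class_mtu: int, os_mtu: int) -> bool:
    """Jumbo frames need BOTH the system class and the OS to allow them."""
    return ip_packet_size <= min(system_class_mtu, os_mtu)


# vNIC left at 1500, OS set manually to 9000, system class at 9216: works.
os_mtu = effective_os_mtu(vnic_mtu=1500, os_autodetects=False, manual_os_mtu=9000)
print(frame_passes(9000, system_class_mtu=9216, os_mtu=os_mtu))  # True

# Same OS setting, but system class left at 1500: no jumbo frames.
print(frame_passes(9000, system_class_mtu=1500, os_mtu=os_mtu))  # False
```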

The reason this only has a maximum value of 9000, even though you configured 9216 in the system class, is that most operating systems only understand a value of 9000.


MTU on Nexus 1000v


Luckily this has become extremely simple since 4.0. There IS a system jumbo MTU command, but it does nothing: it cannot be changed and cannot be removed:
 


 If you try and change it:





Only valid options are to set it to 9000.



Alright, so how do I set my jumbo MTU?


Well, you ONLY have to set it on your uplink interfaces. It works BY DEFAULT on all your vethernet interfaces.

See below:



So you don't have to set anything, if you try and configure an MTU on a vethernet, the option will simply not be there:





You can only set it on the uplink interface, and this is the only place you need to set it:










25 comments:

  1. superb article Peter ty

  2. perfect!!thanks mate, its clear my head for ccie exam

    BR,
    Imad

  3. Peter, this is excellent work. It seems like jumbo frames are never set up correctly and I think your post highlights exactly why... because every switch seems to be different!!!! :) Thanks for taking the time to do this.

  4. Thanks man!!!! Everything I have found on the internet has been confusing me about the "Global Setting".

    As you I did the ping tests myself to prove the MTU 9216 setting on the interface IS required.

  5. Comprehensively done, very informative.

  6. Here you said " Note you can change the MTU individually on both the M line cards OR F Line cards, but your better off with M line cards using the system-QOS, as we will see below'

    Did you mean to say F line cards or can you use the system QOS on the M cards as well. Also does this change from the M1/F1 to the M2/F2 cards.

  7. Could you please add Nexus 4000 in, if possible?

  8. Thanks for sharing the results of your tests! They certainly clarified some stuff for me that Cisco's documentation and other material just made confusing.

    You might want to remove your reference to that "MTU on FC" though. It appears someone took the time to elaborate all that for an April Fool's joke.

    Regards,

    Alfonso

  9. Yep, it turns out that brilliant post that you make reference to was but an April Fool´s prank. The frame size for FC is negotiated during PLOGI, not globally by VSAN.
    You might want to correct your post, which is otherwise full of correct and useful info about MTU and Jumbo Frames.

  10. Nice posting, just what I was looking for. Thank you.

    Kevin Dorrell
    CCIE #20765

  11. Very good Article Peter. Thanks a lot

  12. Good stuff, sir!

    Derek Fink
    CCIE# 38270

  13. Excellent, thank you!

  14. Thanks for your blog. It has been very useful for me.

    I have problems with OTV and Nexus 7706 with module F3 and Nexus 2248TP-E.
    Is it possible to change MTU to 9216 on Nexus 2248 interfaces ?
    There is communication between VDCs using OTV, but there is not communication, when I connect a PC in Nexus 2248 TP-E in a Site A to do connection with other server in the site B. It is not possible to change MTU in this server.
    Peter thank you in advance.

  15. There are more inconsistencies with FEXs also. I found that on a 2248TP-E, you can set the MTU on the interface, but you also have to account for the queues. By default they are set to 2048. And they sure will drop the packets greater than 2048.

    # show queuing interface e170/1/1

    slot 1
    =======

    Ethernet170/1/1 queuing information:
    ...
    ...
    Queueing:
    queue    qos-group  cos          priority  bandwidth  mtu
    --------+----------+------------+---------+----------+-----
    ctrl-hi  n/a        7            PRI       0          2048
    ctrl-lo  n/a        7            PRI       0          2048
    2        0          0 1 2 3 4    WRR       80         2048
    4        2          5 6          WRR       20         2048
    ...
    ...

    Here's the fun part. To up those queues, you have to adjust the system queues from the admin vdc.

    Example:

    class-map type network-qos match-any c-nq-8e-4q8q-custom
      match cos 0-7
    policy-map type network-qos c-nq-8e-4q8q-custom
      class type network-qos c-nq-8e-4q8q-custom
        congestion-control tail-drop
        mtu 9216
    system qos
      service-policy type network-qos c-nq-8e-4q8q-custom

    This document helps explain it a bit further(it explains more than just 9000). MTU on the nexus products is indeed frustrating. Not all FEX models function like this. Oh, and this change affects all VDCs, so yeah.. that's fun.

    http://www.cisco.com/c/en/us/support/docs/switches/nexus-9000-series-switches/118994-config-nexus-00.html

  18. Awesome post, thank you very much for sharing. Any difference between the 7k and 9k NX-OS switches?
