Saturday, May 18, 2013

CCIE DC: The vPC Topology you never want to implement, the "Pink Slip" Topology

Hi Guys

One of my most popular blog posts of all time is my "vPC - The gotcha's you need to know", with a whole.. 10.. maybe even 15!!! unique visitors. No, I kid, it's a few more than that, and it's easily the most popular post on the site.


Let's take a fresh approach on the blog and write about a vPC topology that you would NEVER want to implement. It will be fun to identify the mistakes made in this topology, plus we can learn all about vPC failure scenarios too!

Just a few quick shout-outs: a certain someone (who will post a reply if they want to take credit) is responsible for getting me access to a lab so I can actually show you all this. So cheers, Anonymous Person!

The INE CCIE DC videos were invaluable in the creation of this blog post; the advanced vPC videos are exceptionally good.

The blog post contains "ALWAYS" tips that you can use to make sure you're implementing a secure, robust vPC topology first time, every time.

It's worth noting that almost every step of the way, protection mechanisms in NX-OS and vPC tried to stop me from configuring it the wrong way. For example, I was unable to configure the peer link before I had configured the keepalive.

Anyway, now that's out of the way, let's look at our topology.









(As you can clearly see, I was actually meant to be a graphic artist but got lumped into networking ;))

OK, so now we have our simple diagram showing our very simple topology. Our hero network engineer, let's call him leethax0r, decides to implement a vPC topology. "How hard can it be?" he says to himself. "After all, I have seen the Cisco Certified guys do it tons of times, it's just like port channels with a stacked switch, right?" Of course leethax0r is too proud to learn how to do it properly; if it was up to him, they'd be implementing an ABC (Anything But Cisco) network anyway.

Through a series of serious mistakes our hero leethax0r (or hax0r for short) implements a terrible configuration, leading to a terrible vPC implementation (which, invariably, he will blame on "cisco bugs" rather than his own misunderstanding and misconfiguration).


The first thing hax0r decides to do is implement a peer keepalive; he knows this is the first step in implementing vPC.


(Config shown in red so that hopefully people don't copy paste it and try and implement it!)

feature interface-vlan
feature lacp
feature vpc

int vlan 10
 no shut
 ip add 10.10.10.2/24
!





hax0r does the same thing on each side and changes the IP address. So far he hasn't really done anything wrong: it's perfectly valid to use an SVI for your peer keepalives, although it is strongly recommended to place it in its own VRF so it is not affected by the global IP routing table. It's also recommended to create a dedicated link between the two switches that carries only the VLAN for this SVI. But this is where our hero hax0r goes horribly wrong...

Hax0r decides that he wants to implement OSPF and rely on the network upstream from his 5K to provide alternative paths for the peer keepalives to reach each other. "What could possibly go wrong?" he says to himself.

ALWAYS: Implement a totally separate VRF for the vPC peer keepalive and (where possible) use a back-to-back link between the vPC peers: a directly connected interface carrying only the peer keepalive. The peer-keepalive link is what prevents dual-active, the most catastrophic situation that can occur with vPC!
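
As a rough sketch of that tip, the keepalive could live on its own routed, back-to-back interface in a dedicated VRF. The VRF name, interface number and addresses below are made up for illustration, and a routed port assumes your platform has L3 capability (on a 5K without it, mgmt0 in the management VRF is the usual alternative).

```text
! Hypothetical VRF name, port and addressing; mirror it on the peer
vrf context vpc-keepalive
!
interface Ethernet1/32
  description vPC peer-keepalive (back-to-back to peer)
  no switchport
  vrf member vpc-keepalive
  ip address 192.168.255.1/30
  no shutdown
!
vpc domain 1
  peer-keepalive destination 192.168.255.2 source 192.168.255.1 vrf vpc-keepalive
```

Because nothing else lives in that VRF, no routing change in the default table can ever pull the keepalive onto another path.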

This is the config that hax0r implements:

Switch 2

interface Vlan10
  no shutdown
  ip address 10.10.10.2/24
  ip router ospf 1 area 0.0.0.0

interface Vlan20
  no shutdown
  ip address 10.20.20.1/24
  ip router ospf 1 area 0.0.0.0

 

Switch 1:
interface Vlan10
  no shutdown
  ip address 10.10.10.1/24
  ip router ospf 1 area 0.0.0.0

interface Vlan20

interface Vlan30
  no shutdown
  ip address 10.30.30.1/24
  ip router ospf 1 area 0.0.0.0



He then checks that Switch 2 can reach Switch 1's separate VLAN interface (VLAN 30), which will be routed via VLAN 10 (the shared VLAN):

SIWTCH2# ping 10.30.30.1
PING 10.30.30.1 (10.30.30.1): 56 data bytes
64 bytes from 10.30.30.1: icmp_seq=0 ttl=254 time=0.823 ms
64 bytes from 10.30.30.1: icmp_seq=1 ttl=254 time=0.619 ms
64 bytes from 10.30.30.1: icmp_seq=2 ttl=254 time=0.61 ms
64 bytes from 10.30.30.1: icmp_seq=3 ttl=254 time=7.755 ms
64 bytes from 10.30.30.1: icmp_seq=4 ttl=254 time=9.645 ms

L33thax0r pats himself on the back for his correct implementation of OSPF.

Next, hax0r implements the vPC Peer Keepalive:

vpc domain 1
  peer-keepalive destination 10.20.20.1 source 10.30.30.1 vrf default

!

hax0r checks show vpc to see if the peer is up


SIWTCH2# show vpc
Legend:
                (*) - local vPC is down, forwarding via vPC peer-link

vPC domain id                     : 1
Peer status                       : peer link not configured
vPC keep-alive status             : peer is alive
Configuration consistency status  : failed
Per-vlan consistency status       : failed
Configuration inconsistency reason: vPC peer-link does not exist
Type-2 consistency status         : failed
Type-2 inconsistency reason       : vPC peer-link does not exist
vPC role                          : none established
Number of vPCs configured         : 0
Peer Gateway                      : Disabled
Dual-active excluded VLANs        : -
Graceful Consistency Check        : Disabled (due to peer configuration)
Auto-recovery status              : Disabled


Hax0r is now happy that his vPC peer is showing as alive. Out of his depth, he notices something about a "peer link". Some quick googling leads him to some sample configuration, so he implements the config he found on a popular ABC podcast:


interface Ethernet1/20
  switchport mode trunk
  channel-group 1 mode active
!

SIWTCH2(config)# int po1
SIWTCH2(config-if)# vpc peer-link


This is dead wrong: hax0r didn't use more than one link for the peer-link, and he didn't set the spanning-tree port type to network (thankfully, the switch does this last one for him automatically).

ALWAYS: Implement multiple links, bundled together, for your vPC peer-link. If you're using a Nexus 7000, these should be spread across multiple linecards (and if you don't have multiple linecards in your Nexus 7000, GET THEM!)

ALWAYS: Use spanning-tree port type network on the vPC peer-link (the switch will do this automatically for you).
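
A minimal sketch of a peer-link that follows both tips, assuming a second member port (Ethernet1/21, hypothetical) is cabled between the peers:

```text
! Two member links instead of one; Ethernet1/21 is an assumed extra port
interface Ethernet1/20-21
  switchport mode trunk
  channel-group 1 mode active
!
interface port-channel1
  switchport mode trunk
  spanning-tree port type network
  vpc peer-link
```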

Hax0r now checks the vPC Output:

SIWTCH2# show vpc
Legend:
                (*) - local vPC is down, forwarding via vPC peer-link

vPC domain id                     : 1
Peer status                       : peer adjacency formed ok
vPC keep-alive status             : peer is alive
Configuration consistency status  : success
Per-vlan consistency status       : success
Type-2 consistency status         : success
vPC role                          : primary
Number of vPCs configured         : 0
Peer Gateway                      : Disabled
Dual-active excluded VLANs        : -
Graceful Consistency Check        : Enabled
Auto-recovery status              : Disabled

vPC Peer-link status
---------------------------------------------------------------------
id   Port   Status Active vlans
--   ----   ------ --------------------------------------------------
1    Po1    up     1,10


Hax0r has done it! The vPC is up! Truly his l33tness should be known everywhere. Maybe he'll even be able to convince the bosses to take out this evil Cisco network and put ABC vendor in, now he's shown his l33t skills.

However, unbeknownst to hax0r, his shoddy vPC config is a time bomb waiting to go off.

L33t Hax0r calls his co-worker and system administrator, LulzAdmin (or Lulz for short), to come and plug his servers in: the vPC is ready.

LulzAdmin (Who is almost as good as l33t hax0r at his job) has his server ready. The server is MISSION Critical, the entire business depends on this server.

l33t hax0r and LulzAdmin can't work out how to dual attach their server: they can't seem to get the teaming working across two ports, and after blaming "cisco bugs" they decide that the server will be fine single attached.

ALWAYS: Dual Attach your servers to both vPC Peers, dual attach EVERYTHING, you never want single-attached devices.
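
For comparison, a dual-attached server would look roughly like this on BOTH vPC peers, with the server teaming its two NICs using LACP. The port and vPC numbers are only examples; the vpc number has to match on the two peers.

```text
! Same config on each vPC peer; one server NIC goes to each switch
interface Ethernet1/1
  switchport mode access
  switchport access vlan 50
  channel-group 10 mode active
!
interface port-channel10
  switchport mode access
  switchport access vlan 50
  vpc 10
```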
 
This mission critical server is about to have some major problems...

L33t hax0r plugs the server into one of the ports on the switch:

int eth1/1
 channel-group 10 mode active

!

int po10
 switchport mode access
 switchport access vlan 50
!


Examine the output below:


SWITCH1# show vpc orphan-ports
Note:
--------::Going through port database. Please be patient.::--------

VLAN           Orphan Ports
-------        -------------------------
1              Eth1/10, Eth1/15, Eth2/1, Eth2/2
10             Eth1/10
50             Eth1/1, Eth1/10


Here is our first problem: as mentioned, we have some single-attached (orphan) ports, which in this case includes the server port, Eth1/1 in VLAN 50.



SIWTCH2# show mac address-table vlan 10
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY   Ports/SWID.SSID.LID
---------+-----------------+--------+---------+------+----+------------------
* 10       547f.eeaf.1cbc    static    0          F    F  Router
* 10       547f.eeaf.3a3c    static    0          F    F  Po1

 



What is wrong with this output? Let's see: the second VLAN 10 entry points at Po1, so the OSPF neighbor relationship between the two switches runs over the port channel that is the vPC peer-link! That means if the vPC peer-link goes down, the keepalives will go down as well, which means the vPC secondary will suspend its vPC member ports because of the risk of a dual-active situation!
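
If you want to check for this on your own switches, these standard show commands reveal which path the keepalive actually takes. The destination address is the one from hax0r's keepalive config above; substitute your own.

```text
! Keepalive source, destination, VRF and current status
show vpc peer-keepalive
! Which route (and therefore which interface) reaches the keepalive destination
show ip route 10.20.20.1
! Is the routing adjacency formed over a VLAN carried on the peer-link?
show ip ospf neighbors
```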


After configuring this mess, hax0r and lulz go for a "hard-earned" drink at the local bar. Meanwhile, the data centre hosting the Nexus switches has a scheduled power outage on UPS feed B. If your equipment has two power supplies on two different feeds you should be fine, but take a guess as to whether our hero hax0r did such a thing ;).

The Secondary nexus, connected to UPS Feed B, powers off.

Lulz and Hax0r receive a phone call. It's BigBossMan: the webserver has gone offline! What have you done?


Lulz and Hax0r rush to the data centre to investigate the problem. Upon logging into the Nexus that is still powered up, they see the following:


SWITCH1# show vpc
Legend:
                (*) - local vPC is down, forwarding via vPC peer-link

vPC domain id                     : 1
Peer status                       : peer link is down
vPC keep-alive status             : Suspended (Destination IP not reachable)
Configuration consistency status  : success
Per-vlan consistency status       : success
Type-2 consistency status         : success
vPC role                          : secondary, operational primary
Number of vPCs configured         : 1
Peer Gateway                      : Disabled
Dual-active excluded VLANs        : -
Graceful Consistency Check        : Enabled
Auto-recovery status              : Disabled

vPC Peer-link status
---------------------------------------------------------------------
id   Port   Status Active vlans
--   ----   ------ --------------------------------------------------
1    Po1    down   -

vPC status
----------------------------------------------------------------------------
id     Port        Status Consistency Reason                     Active vlans
------ ----------- ------ ----------- -------------------------- -----------
10     Po10        down   failed      Peer-link is down          -
SWITCH1#

The port channel to the server, Po10, is down! Why? What is this "peer link down" all about? Just because the other switch died, why should this one fail?

The problem is that the vPC peer-link can't come up until the peer keepalive is up, and the peer keepalive can't come up until the vPC peer-link comes up... Even though VLAN 10 is configured to go upstream to the rest of the network (as per hax0r's plan to use the rest of the global routing table to allow the keepalives to function), when a vPC peer-link goes down, any SVIs for VLANs carried on that peer-link are suspended.

ALWAYS: Make sure the L3 link which the two vPC peers use for their routing protocol adjacency does NOT travel over the vPC peer-link, because if the vPC peer-link dies, these SVIs will be suspended!
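
One way to follow this tip is a dedicated routed point-to-point link between the peers for the routing adjacency. This is only a sketch: the interface and subnet are hypothetical, and it assumes routed ports are available on your platform.

```text
! Hypothetical dedicated L3 link for the OSPF adjacency (not on the peer-link)
interface Ethernet1/31
  description L3 link to vPC peer for OSPF
  no switchport
  ip address 10.99.99.1/30
  ip ospf network point-to-point
  ip router ospf 1 area 0.0.0.0
  no shutdown
```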




Lulz and hax0r do some quick googling and decide they need a command called "auto-recovery". As quickly as they can, they configure auto-recovery on the switches and reload them...

NOW they truly are hosed. Auto-recovery tells a switch to go active after a reload if it can't see its peer, and the peer-link can't come up because the keepalives can't get through, so they have now caused a dual-active scenario.

The dual-active scenario brings the entire network down. Cisco Certified engineers come in and inspect what happened, proving the misconfiguration on hax0r's part. Hax0r is summarily shown his pink slip, still cursing "cisco bugs" for his downfall.


These are just some of the mistakes that can be made when you're configuring vPC that can cause major problems. Now that we know what we shouldn't do, let's examine the different commands in vPC and what they protect us from.

First of all, I explain the peer-gateway command in detail in my "vPC - The gotcha's you need to know" article. You should always implement peer-gateway, especially if you have NetApp or F5 devices.

What about auto-recovery, what exactly does that do?

As we saw in the previous example, auto-recovery is useful in a situation where both switches power off but only one switch turns back on (maybe the other one was hit by an electrical surge or something else). In this scenario, if auto-recovery was not turned on, the vPC would never establish: the switch that is now ON would never bring up its vPC member ports, because the vPC peer-link would never have come up.

Auto-recovery resolves this: after a certain period (the default is 240 seconds), the switch will assume the peer has died and bring up the ports.

The most important thing when enabling auto-recovery is to be damn sure that if both switches reset, they will always be able to get the vPC peer-link and/or the vPC peer keepalive up, so that they can detect a dual-active scenario.

Therefore, it is recommended to turn on auto-recovery, and as long as you can satisfy the above criteria, you are safe to turn this on.

NOTE: You can turn on auto-recovery retroactively. If you ever walk into a situation where this is occurring and auto-recovery hadn't been turned on previously, turn it on, and 240 seconds later the vPC will become active.

Auto-recovery also assists in another situation. Say a vPC peer-link goes down:

SWITCH2(config)# int po1
SWITCH2(config-if)# shut
2013 May 18 15:23:35 SWITCH2 %$ VDC-1 %$ %VPC-2-VPC_SUSP_ALL_VPC: Peer-link going down, suspending all vPCs on secondary


The secondary will suspend all vPC member ports, as it should. This is the behavior vPC uses to prevent dual-active scenarios and to ensure correct forwarding, because the peer-link is quite important.

So what happens if the primary now dies in this situation? The secondary will STILL leave the ports down:


vPC Peer-link status
---------------------------------------------------------------------
id   Port   Status Active vlans
--   ----   ------ --------------------------------------------------
1    Po1    down   -

vPC status
----------------------------------------------------------------------------
id     Port        Status Consistency Reason                     Active vlans
------ ----------- ------ ----------- -------------------------- -----------
10     Po10        down   failed      Peer-link is down          -


With auto-recovery, the secondary switch can realise that the primary is not coming back and enable the vPC member ports.

Again, however, it's exceptionally important that you're positive the vPC peer keepalive will always work reliably, so as to avoid dual-active scenarios. This is the reason auto-recovery is not enabled by default: it could potentially lead to dual-active.


The following output is typically what you will see when a vPC PO has been enabled due to auto-recovery:

vPC status
----------------------------------------------------------------------------
id     Port        Status Consistency Reason                     Active vlans
------ ----------- ------ ----------- -------------------------- -----------
10     Po10        up     success     Type checks were bypassed  50
                                      for the vPC



Let's look at two other commands, graceful consistency check and Peer-switch.

Graceful consistency check helps in the following situation. Let's pretend our friend hax0r has been rehired at another company; much to his chagrin, they have vPC. Hax0r is asked to reconfigure a vPC port channel from an access port to a trunk port.

Whoops! The ports have just gone down because the parameters no longer match between the peers. Our friend hax0r is not having much luck!

With the graceful consistency-check command configured, only the vPC member ports on the secondary peer are suspended while the primary keeps forwarding, giving you a chance to match the parameters on both sides. You don't have to bring the link down to make changes, so this is highly recommended.
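
When a mismatch like this does bite, these show commands list the local and peer values side by side so you can spot the offending parameter. Port channel 10 is the vPC from the example earlier; use your own number.

```text
! Compare local and peer values to find the mismatched parameter
show vpc consistency-parameters global
show vpc consistency-parameters interface port-channel 10
```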


With peer-switch, unfortunately I do not have enough switches in my topology to show you the particular scenario where it is helpful. To cut a long story short: say you have a switch connected via a vPC port channel to your two vPC peers, and one of these peers is the root of the spanning tree. Peer-switch ensures that if this root bridge fails and subsequently recovers, there is no delay in forwarding while waiting for spanning-tree to reconverge, because both vPC peers appear as one giant bridge from the perspective of spanning-tree (they share the same bridge ID). I recommend you turn this command on; I cannot think of any reason not to.


So in conclusion, I personally recommend the following as the default config for vPC:

Graceful Consistency Check
Auto Recovery (Just be sure your peer-keepalive or peer-link will always work after a reboot)
Peer-Gateway
Peer-Switch
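
Put together, that list would look something like this under the vPC domain. Treat it as a sketch rather than a template: domain ID 1 is from the example above, and peer-switch also expects the spanning-tree priorities to be configured identically on both peers.

```text
vpc domain 1
  graceful consistency-check
  auto-recovery
  peer-gateway
  peer-switch
```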
 

Finally, there are a few interesting options for your vPC that could allow you to do some naughty configuration if you wanted to.


The dual-active exclude command under the vPC domain config allows you to specify that a particular SVI will NOT go down if the peer-link fails, but is instead excluded from suspension. This could have been useful to our friend hax0r for VLAN 10, but it's still not recommended to configure it this way. You should not need this command in most situations, except in certain orphan port situations where you want to ensure that, if the peer-link dies, an orphan port on the secondary can still get to its SVI.


I hope you enjoyed this blog post, I spent a while on it. I hope I covered off your questions :)
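
For reference, the command looks like this. VLAN 10 is used purely because it is the VLAN from hax0r's story, not because this is a recommended design.

```text
vpc domain 1
  ! Keep the VLAN 10 SVI up on the secondary even if the peer-link fails
  dual-active exclude interface-vlan 10
```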


















Sunday, May 12, 2013

CCIE DC: First Official Rack Rental! Spanning-tree Bridge Assurance and LACP suspend-individual

Hi Guys!

Today marked an important occasion as I had my first "official" rack rental (I have had others thanks to some generous people, but this was my first with the "big two" training vendors)

I concentrated on:

- CFS
- Port Channels (LACP)
- Spanning-tree Bridge Assurance
- FEX stuff


Here are the first few useful things I found.


I am sure we have all done this before:


You configure a few member interfaces for your etherchannel:


SW3(config)# int eth1/4, eth1/2
SW3(config-if-range)# channel-group 2 mode active



You configure a few options for the Port channel:

interface port-channel2
  description ### I am L33t ###
  switchport mode trunk
  switchport trunk allowed vlan 1
  spanning-tree port type network
  speed 10000



These then apply on your member ports:

interface Ethernet1/4
  switchport mode trunk
  switchport trunk allowed vlan 1
  channel-group 2 mode active



But whoops, you forgot a port: you meant to add Eth1/1 too!



SW3(config)# int eth1/1
SW3(config-if)# channel-group 2 mode active
command failed: port not compatible [Ethernet Layer]



Damn, what a pain in the ass! Now I have to go and add all the options to the port, like the spanning-tree mode etc... or do I?

 SW3(config-if)# channel-group 2 force mode active


Let's take a look at the config now:


 SW3(config-if)# show run int eth1/1
 

interface Ethernet1/1
  switchport mode trunk
  switchport trunk allowed vlan 1
  channel-group 2 mode active



Awesome! All the appropriate config has applied without me having to put it all in manually. A bit of a time saver in the lab, when every second will count!

Let's talk more about LACP. There are two LACP commands on Nexus that you won't find on non-Nexus platforms. They are enabled by default, and they can actually be quite a pain:
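
If you'd rather see why the port was rejected before reaching for force, NX-OS can list the parameters that members of a port channel have to agree on:

```text
! Lists the attributes that must match across port channel members
show port-channel compatibility-parameters
```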

lacp suspend-individual

and

lacp graceful-convergence


Let's talk about suspend-individual.

The idea behind lacp suspend-individual is this: if a member port of a port channel does not receive any LACP PDUs, it would normally be placed into the "Individual" (I) state:


SW1# show port-channel sum
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        S - Switched    R - Routed
        U - Up (port-channel)
        M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
1     Po1(SD)     Eth      LACP      Eth3/1(I)    Eth3/3(I)   


 

This means the switch will treat these as two independent links, not as an etherchannel. But let's say the other end of the link is misconfigured, with the port channel in ON mode:


SW2# show run int eth1/1

interface Ethernet1/1
  switchport mode trunk
  channel-group 10



Suddenly you have a potential loop in the network, and you will see some very strange spanning-tree behavior:

SW2# show spanning-tree vlan 30

VLAN0030
  Spanning tree enabled protocol rstp
  Root ID    Priority    4126
             Address     547f.eec2.7d01
             This bridge is the root
             Hello Time  2  sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    4126   (priority 4096 sys-id-ext 30)
             Address     547f.eec2.7d01
             Hello Time  2  sec  Max Age 20 sec  Forward Delay 15 sec

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po10             Desg FWD 1         128.4105 P2p
SW2# show spanning-tree vlan 30

VLAN0030
  Spanning tree enabled protocol rstp
  Root ID    Priority    4126
             Address     547f.eec2.7d01
             This bridge is the root
             Hello Time  2  sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    4126   (priority 4096 sys-id-ext 30)
             Address     547f.eec2.7d01
             Hello Time  2  sec  Max Age 20 sec  Forward Delay 15 sec

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po10             Desg BLK 1         128.4105 Dispute P2p




The port will block/unblock as it keeps seeing "Designated" BPDU's.

We can resolve this by telling the upstream switch to suspend ports that are part of an etherchannel when we are expecting to receive LACP PDUs on them but don't:


SW1(config)# int po1
SW1(config-if)# lacp suspend-individual
ERROR: Cannot set/reset lacp suspend-individual for port-channel1 that is admin up
SW1(config-if)# shut
SW1(config-if)# lacp suspend-individual
SW1(config-if)# no shut




Now the ports will show as suspended:

SW1(config-if)# end
SW1# show port-channel sum
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        S - Switched    R - Routed
        U - Up (port-channel)
        M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
1     Po1(SD)     Eth      LACP      Eth3/1(s)    Eth3/3(s)   



As soon as we reconfigure our etherchannel correctly on the other side:


SW2(config)# int eth1/1, eth1/3
SW2(config-if-range)# channel-group 10 mode active

The ports come out of the suspended state and traffic will flow:

SW1# show port-channel sum
Flags:  D - Down        P - Up in port-channel (members)
        I - Individual  H - Hot-standby (LACP only)
        s - Suspended   r - Module-removed
        S - Switched    R - Routed
        U - Up (port-channel)
        M - Not in use. Min-links not met
--------------------------------------------------------------------------------
Group Port-       Type     Protocol  Member Ports
      Channel
--------------------------------------------------------------------------------
1     Po1(SU)     Eth      LACP      Eth3/1(P)    Eth3/3(P)   


So as you can see, you didn't need to no shut the ports or anything on the SW2 side, you just had to get it to start advertising LACP PDUs.

The PROBLEM is that some Linux hosts, and some other hosts, will not bring up LACP until they receive LACP PDUs first. The switch is expecting LACP PDUs but the host never sends them, so the switch can leave the ports in the suspended state indefinitely and the port channel remains down.

So for Linux hosts and other devices that may not send PDUs straight away, turn off lacp suspend-individual. For all other ports you're perfectly safe having it enabled.
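
On the switch side that looks roughly like the following. Port channel 20 is a made-up host-facing bundle; as the error message earlier showed, the port channel has to be shut before the setting can be changed.

```text
! Hypothetical host-facing port channel
interface port-channel20
  shutdown
  no lacp suspend-individual
  no shutdown
```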


Let's talk about bridge assurance.


Bridge assurance is a feature on the Nexus platforms that uses BPDUs to protect against unidirectional links and, as a side effect, to perform "pruning" of unwanted VLANs (although this is more of an unforeseen benefit of the design).

The way bridge assurance works is this: if you specify a port as spanning-tree port type network (which is NOT set by default, by the way, except on vPC peer-links), bridge assurance forces both ends of the link to constantly send BPDUs in both directions, as a sort of keepalive. If bridge assurance notices that these BPDUs go missing on either end, it knows that there is a unidirectional fault (or another fault) on the link and immediately blocks the port via spanning-tree so that an alternative path can be taken.

The added advantage of this technology is that when using rapid spanning-tree, each VLAN has its own BPDUs, right? Let's say we have a config like this:



Switch 1 has VLAN 1, 10, and 30

Switch 3 has VLAN 1 and 10


On the switches' port channels to each other, we specify that these are "network" ports:



SW3# show run int po2

interface port-channel2
  description ### I am L33t ###
  switchport mode trunk
  switchport trunk allowed vlan 1
  spanning-tree port type network
  speed 10000

!

SW1# show run int po2
interface port-channel2
  switchport
  switchport mode trunk
  spanning-tree port type network


As you can see here, we have port type network on both switches. Let's see what spanning-tree on SW1 has to say about this:

SW1# show spanning-tree int po2

Vlan             Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
VLAN0001         Desg FWD 1         128.4097 Network P2p

VLAN0010         Desg BKN*1         128.4097 Network P2p *BA_Inc

VLAN0030         Desg BKN*1         128.4097 Network P2p *BA_Inc


As you can see, bridge assurance is blocking VLANs 10 and 30 on this link: on SW3 we have said switchport trunk allowed vlan 1, so the upstream switch (Switch 1) is not receiving any BPDUs for those VLANs, and as far as he is concerned there's no good reason to send the traffic down.

If we add VLAN 10 to the Switch 3 trunk interface:



SW3(config)# int po2
SW3(config-if)# switchport trunk allowed vlan add 10



As soon as the BPDUs for VLAN 10 are advertised, this instantly unblocks the port on the upstream switch:


SW1# 2013 May 12 10:02:47 SW1 %$ VDC-1 %$ %STP-2-BRIDGE_ASSURANCE_UNBLOCK: Bridge Assurance unblocking port port-channel2 VLAN0010.

Now we have seen how spanning-tree bridge assurance works, let's see what can happen if we misconfigure it.


In this example, we have a trunk between SW1 and SW2:

SW1# show run int po1


interface port-channel1
  switchport
  switchport mode trunk
  spanning-tree port type network

On Switch2:

interface port-channel10
  switchport mode trunk
  speed 10000


Just to make this example a bit easier to follow, we have made Switch 2 the root of the spanning tree for VLAN 30.

Let's take a look at the show spanning-tree on Switch 1:

SW1# show spanning-tree interf po1

Vlan             Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
VLAN0001         Desg BKN*1         128.4096 Network P2p *BA_Inc

VLAN0010         Desg BKN*1         128.4096 Network P2p *BA_Inc

VLAN0030         Root FWD 1         128.4096 Network P2p 



As you can see from the above output, SW1 has placed the port into a blocking state for VLANs 1 and 10 based on bridge assurance (BA_Inc).

It has however kept vlan 30 unblocked, why?


SW2# show spanning-tree inter po10

Vlan             Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
VLAN0001         Root FWD 1         128.4105 P2p

VLAN0030         Desg FWD 1         128.4105 P2p




Because for VLAN 30, SW2 is the root switch. That means the port facing SW1 is a designated port from SW2's perspective, and BPDUs are always sent out of designated ports, so SW1 is receiving BPDUs from SW2:


SW2# debug spanning-tree bpdu_tx

2013 May 12 06:59:11.492578 stp: RSTP(30): transmitting RSTP BPDU on port-channel10
2013 May 12 06:59:11.492608 stp: vb_vlan_shim_send_bpdu(1977): VDC 1 Vlan 30 port port-channel10 enc_type 1 len 42
2013 May 12 06:59:13.492584 stp: RSTP(30): transmitting RSTP BPDU on port-channel10
2013 May 12 06:59:13.492615 stp: vb_vlan_shim_send_bpdu(1977): VDC 1 Vlan 30 port port-channel10 enc_type 1 len 42
2013 May 12 06:59:15.492581 stp: RSTP(30): transmitting RSTP BPDU on port-channel10
2013 May 12 06:59:15.492610 stp: vb_vlan_shim_send_bpdu(1977): VDC 1 Vlan 30 port port-channel10 enc_type 1 len 42
2013 May 12 06:59:17.490094 stp: RSTP(30): transmitting RSTP BPDU on port-channel10
2013 May 12 06:59:17.490124 stp: vb_vlan_shim_send_bpdu(1977): VDC 1 Vlan 30 port port-channel10 enc_type 1 len 42
2013 May 12 06:59:19.490097 stp: RSTP(30): transmitting RSTP BPDU on port-channel10
2013 May 12 06:59:19.490127 stp: vb_vlan_shim_send_bpdu(1977): VDC 1 Vlan 30 port port-channel10 enc_type 1 len 42




As you can see from the above debug output, SW2 is sending BPDU's out po10 on vlan 30, since SW1 is receiving BPDU's for this VLAN, the bridge assurance feature says well im receiving BPDU's, so we are good to go here, lets unblock this port.


What is missing from this output though is SW2 sending BPDU's for VLAN 1 and 10, it will NOT send these, why? because for VLAN 1 and 10 Port10 is SW2's root port (the port where it can find the root bridge) and spanning-tree does not transmit BPDU's up the root port, therefore spanning-tree bridge assurance is not receiving any BPDU's for these VLAN's and is therefore blocking the port.


This shows that if you are going to use Bridge Assurance, you need to set spanning-tree port type network on BOTH ends of the link. If you're connecting to a device that isn't running Bridge Assurance, a 6500 for example, you want to turn it off on any ports facing that device (or just don't configure spanning-tree port type network there, because Bridge Assurance only runs on ports configured as spanning-tree port type network).


Let's fix up the spanning-tree port type network on switch 2:

SW2(config)# int po10
SW2(config-if)# spanning-tree port type network


As soon as we do this, SW1 unblocks:


SW1# 2013 May 12 10:14:05 SW1 %$ VDC-1 %$ %STP-2-BRIDGE_ASSURANCE_UNBLOCK: Bridge Assurance unblocking port port-channel1 VLAN0001.


Sunday, April 21, 2013

CCIE DC: FCoE NPV

Hi Guys!

Final blog post for tonight! I have spent lots of time this weekend working super hard on my study.

So let's check out FCoE NPV. First, a quick word of advice: when switching fibre channel modes (NPV vs FC switching), make sure you completely write erase before you switch modes.

Without further ado, let's look at the config!

First things first, let's turn on the FCoE NPV feature on the switch that will actually be performing NPV:

feature fcoe-npv

On the FC switch that will act as our core and WILL participate in the FC fabric as an FC Forwarder, we need to turn on NPIV:

feature npiv


As usual with FCoE, let's create our VSANs and our VLANs:

vsan database
  vsan 10

!
vlan 10
  fcoe vsan 10

!

Next, we configure the Ethernet interface that faces up towards the core. We can use the same config on both our core switch and our NPV switch:



interface Ethernet1/10
  switchport mode trunk
  switchport trunk allowed vlan 10,20
!


Next, let's configure our VFC Interface:


interface vfc1
  bind interface Ethernet1/10
  switchport mode NP
  switchport trunk mode on
  no shutdown



This is now ready on our NPV side. On our core switch the config is only slightly different:


interface vfc1
  bind interface Ethernet1/10
  switchport trunk allowed vsan 10
  no shutdown

As you can see, on the upstream side our core switch just has this configured as an F port (no switchport mode command is needed, since a vfc comes up with an admin port mode of F).


Now finally, we need to make each of them realise that the port is a member of the VSAN:

vsan database
  vsan 10 interface vfc1


Now we can check to see if the vfc interface is up on our NPV Switch:

switch# show int vfc1
vfc1 is trunking (Not all VSANs UP on the trunk)
    Bound interface is Ethernet1/10
    Hardware is Ethernet
    Port WWN is 20:00:54:7f:ee:af:1c:bf
    Admin port mode is NP, trunk mode is on
    snmp link state traps are enabled
    Port mode is TNP
    Port vsan is 10
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up)                       (10)



Success! The next step is to get our server-facing interface going. For the sake of brevity I will show all the configuration for this at once:

switch# show run int vfc2

interface vfc2
  bind interface Ethernet1/1
  switchport trunk mode on
  no shutdown

!

interface Ethernet1/1
  switchport mode trunk
  switchport trunk allowed vlan 10,20
  spanning-tree port type edge trunk

!

vsan database
  vsan 10 interface vfc2

Let's check the server facing interface:

switch# show int vfc2
vfc2 is trunking (Not all VSANs UP on the trunk)
    Bound interface is Ethernet1/1
    Hardware is Ethernet
    Port WWN is 20:01:54:7f:ee:af:1c:bf
    Admin port mode is F, trunk mode is on
    snmp link state traps are enabled
    Port mode is TF
    Port vsan is 10
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up)                       (10)

!

Success! Let's check the NPV flogi table which is very useful.


switch# show npv flogi-table
--------------------------------------------------------------------------------
SERVER                                                                  EXTERNAL
INTERFACE VSAN FCID             PORT NAME               NODE NAME       INTERFACE
--------------------------------------------------------------------------------
vfc2      10   0xd80002 20:00:a4:4c:11:13:8c:d1 10:00:a4:4c:11:13:8c:d1 vfc1


Very useful output, showing us that the port vfc2 is logged in and is using vfc1 as its external interface.
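
If you ever want to check this mapping from a script rather than by eye, something like the following rough Python sketch would do it. It assumes you have already captured the command output as text (how you collect it is up to you), and the `FlogiEntry` type and `parse_flogi_table` function are names I made up for this example.

```python
from typing import NamedTuple


class FlogiEntry(NamedTuple):
    server_interface: str
    vsan: int
    fcid: str
    port_name: str
    node_name: str
    external_interface: str


def parse_flogi_table(text):
    """Pull the data rows out of 'show npv flogi-table' style output."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly six columns and an FCID that starts with 0x;
        # header and separator lines fail this check and are skipped.
        if len(fields) == 6 and fields[2].startswith("0x"):
            entries.append(FlogiEntry(fields[0], int(fields[1]), *fields[2:]))
    return entries


sample = """\
SERVER                                                                  EXTERNAL
INTERFACE VSAN FCID             PORT NAME               NODE NAME       INTERFACE
--------------------------------------------------------------------------------
vfc2      10   0xd80002 20:00:a4:4c:11:13:8c:d1 10:00:a4:4c:11:13:8c:d1 vfc1
"""

for entry in parse_flogi_table(sample):
    print(entry.server_interface, "->", entry.external_interface, "VSAN", entry.vsan)
```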


The show npv command is actually quite powerful:


switch# show npv ?
  external-interface-usage  Show external interface usage by server interfaces
  flogi-table               Show information about FLOGI sessions
  internal                  Show internal NPV information
  status                    Show NPV status
  traffic-map               Show information about Traffic Map



In particular, I found the following command VERY helpful:


switch# show npv internal events

1) Event:E_DEBUG, length:94, at 270505 usecs after Sun Apr 21 10:39:22 2013
    [538976288] E(10,vfc1) Received GMAL Response from core switch with Core Switch Inet Addr: 10.1.1.15



2) Event:E_DEBUG, length:252, at 270497 usecs after Sun Apr 21 10:39:22 2013
    [538976288] E(10,vfc1) npivp_get_ext_intf_fsm(1466): Is Core VF Capable: TRUE, Is Phy Login Done: TRUE, Is Port Channel: FALSE RID: { Type: Ext-If-VSAN(8), IfIndex: vfc1(0x1e000000), UCD: 0, VSAN: 10, PWWN: 00:00:00:00:00:00:00:00 }, State NPIVP_EXT_IF_ST_UP

This, along with show npv internal errors is really helpful in troubleshooting upstream issues.


If for example the upstream switch was not enabled for NPIV, I would see the following output in this command:




11) Event:E_DEBUG, length:98, at 464269 usecs after Sun Apr 21 10:41:22 2013
    [112] E(10,vfc1) FC Upstream switch is not NPIV enabled. Bringing down the external interface: vfc1



Saturday, April 20, 2013

CCIE DC: Multihop FCoE on Nexus 5k

Hi Guys.

There is a lot of confusion in my own mind about FCoE multihop. I remember back in the day that the argument was "well, you can do FCoE with the Nexus 2k's, which is not true multihop but it kind of is!" I also remember that this blog post: http://brasstacksblog.typepad.com/brass-tacks/2011/06/fcfcoe-connectivity-options-as-of-6272011.html by Erik was one of the best posts on the topic. I have asked Erik via Twitter to update this diagram, as a lot of it has changed now in regards to Cisco UCS.

In this blog post I will show one of the examples given by Erik, a multihop FCoE configuration on the Nexus 5k's, in the hope that it will help someone out there.

Author's note: I think that the order of operation is very important when configuring multihop FCoE, as I had major issues getting this going the first time I attempted it. My advice is to get the absolute basic FCoE connectivity going first, then add complexity once you're comfortable that it's all configured correctly.

Steps:
After enabling FCoE (with feature fcoe, of course) on your Nexus 5k's, the first thing to do on both of them is configure the appropriate VLANs and VSANs.

vsan database
 vsan 10
!
 
vlan 10
  fcoe vsan 10

vlan 20
  name DataVLAN


In this example I have configured VSAN 10, and bound it to VLAN 10.

Next, I need to configure the ethernet interface between my two switches appropriately:

interface Ethernet1/10
  switchport mode trunk
  switchport trunk allowed vlan 10

!

At this point I personally still don't "no shut" the interfaces and I wait until I have configured everything.

The next step is to configure the VFC Interface:


interface vfc1
  bind interface Ethernet1/10
  switchport mode E
  switchport trunk allowed vsan 10
  no shutdown

!

Now this is done, I no shut the ethernet interface on each end:

int eth1/10
no shut
!
 
Now let's look at our VFC1 interface:


switch# show int vfc1
vfc1 is trunking
    Bound interface is Ethernet1/10
    Hardware is Ethernet
    Port WWN is 20:00:54:7f:ee:af:1c:bf
    Admin port mode is E, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 1
    Trunk vsans (admin allowed and active) (10)
    Trunk vsans (up)                       (10)
    Trunk vsans (isolated)                 ()
    Trunk vsans (initializing)             ()
    1 minute input rate 216 bits/sec, 27 bytes/sec, 0 frames/sec
    1 minute output rate 200 bits/sec, 25 bytes/sec, 0 frames/sec
      3667 frames input, 440724 bytes
        0 discards, 0 errors
      3790 frames output, 521116 bytes
        0 discards, 0 errors
    last clearing of "show interface" counters Sun Apr 21 04:54:33 2013

    Interface last changed at Sun Apr 21 05:34:21 2013



Success!  Our VSAN is trunking across the link, a show fcns database helps us verify this:


switch# show fcns database

VSAN 10:
--------------------------------------------------------------------------
FCID        TYPE  PWWN                    (VENDOR)        FC4-TYPE:FEATURE
--------------------------------------------------------------------------
0x0f08d1    NL    21:00:00:0a:60:55:17:69 (Seagate)       scsi-fcp:target
                  [disk]
0x0f08d2    NL    21:00:00:11:d6:3e:ee:2e                 scsi-fcp:target
                  [disk1]




Finally an FCPING helps us verify 100 percent.


switch# fcping pwwn  22:00:00:04:cf:21:a5:2e vsan 10
28 bytes from 22:00:00:04:cf:21:a5:2e time = 1759 usec
28 bytes from 22:00:00:04:cf:21:a5:2e time = 287 usec
28 bytes from 22:00:00:04:cf:21:a5:2e time = 222 usec
28 bytes from 22:00:00:04:cf:21:a5:2e time = 304 usec
28 bytes from 22:00:00:04:cf:21:a5:2e time = 270 usec


At this point we have an FCoE Multihop topology.


If all you came for is how to configure an FCoE trunk between the two 5k's, you can stop reading now, as from here on we are getting into a bit more detail.


So the first thing I asked myself is, does this VLAN run spanning tree?


ToSanSWITCH# show spanning-tree vlan 10
Spanning tree instance(s) for vlan does not exist.


Alright, that answers that question, but what happens if I assign a normal Ethernet port to this FCoE VLAN?

ToSanSWITCH(config)# int eth1/1
ToSanSWITCH(config-if)# switchport access vlan 10
ToSanSWITCH(config-if)# end


What does my show spanning-tree say now?

ToSanSWITCH# show spanning-tree vlan 10

VLAN0010
  Spanning tree enabled protocol rstp
  Root ID    Priority    32778
             Address     547f.eeaf.3a3c
             This bridge is the root
             Hello Time  2  sec  Max Age 20 sec  Forward Delay 15 sec

  Bridge ID  Priority    32778  (priority 32768 sys-id-ext 10)
             Address     547f.eeaf.3a3c
             Hello Time  2  sec  Max Age 20 sec  Forward Delay 15 sec

Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Eth1/1           Desg FWD 2         128.129  P2p



So according to this, the VLAN is forwarding on Eth1/1. At first I was quite confused and wondered what kind of effect this would have on the FCoE traffic, but it occurred to me that the ethertype for FCoE traffic is different. What I imagine would happen is that Eth1/1 would receive copies of the FCoE frames, but if the device plugged in at Eth1/1 has no idea what those frames are (i.e. doesn't recognise or want to use the ethertype), chances are it will just ignore them.
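
Here is a small Python sketch of that idea, purely as an illustration of my guess above rather than something I captured in the lab. FCoE frames use ethertype 0x8906 and FIP uses 0x8914; the "plain host" that only understands IPv4 and ARP, and the helper names, are hypothetical.

```python
import struct

ETHERTYPE_FCOE = 0x8906  # FCoE data frames
ETHERTYPE_FIP = 0x8914   # FCoE Initialization Protocol
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806
TPID_DOT1Q = 0x8100


def ethertype_of(frame: bytes) -> int:
    """Return the EtherType, skipping a single 802.1Q tag if one is present."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == TPID_DOT1Q:  # tagged: the real EtherType sits 4 bytes later
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    return ethertype


def plain_host_accepts(frame: bytes) -> bool:
    """Hypothetical host with no FCoE stack: it only handles IPv4 and ARP."""
    return ethertype_of(frame) in (ETHERTYPE_IPV4, ETHERTYPE_ARP)


def make_frame(ethertype: int, vlan=None) -> bytes:
    """Build a dummy frame: zeroed MACs, optional VLAN tag, zeroed payload."""
    header = bytes(6) + bytes(6)
    if vlan is not None:
        header += struct.pack("!HH", TPID_DOT1Q, vlan & 0x0FFF)
    return header + struct.pack("!H", ethertype) + bytes(46)


print(plain_host_accepts(make_frame(ETHERTYPE_IPV4)))           # True
print(plain_host_accepts(make_frame(ETHERTYPE_FCOE, vlan=10)))  # False
```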


Regardless, this configuration had no ill effect on my VFC Interface as it was still showing as up:

ToSanSWITCH# show int vfc1
vfc1 is trunking
    Bound interface is Ethernet1/10
    Hardware is Ethernet
    Port WWN is 20:00:54:7f:ee:af:3a:3f
    Admin port mode is E, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 1
    Trunk vsans (admin allowed and active) (10)
    Trunk vsans (up)                       (10)
    Trunk vsans (isolated)                 ()
    Trunk vsans (initializing)             ()



I decided to see if I could convince it to "break". The first thing I tried was adding a VSAN to this interface that was not configured with an appropriate FCoE VLAN, which in this case was VSAN 1.

I configured this on both ends:

 
switch(config)# int vfc1
switch(config-if)# switchport trunk allowed vsan add 1


The worst that happens is that VSAN 1 stays in isolation mode; VSAN 10 keeps trunking.


 Next, I configured on one of the switches a port facing a server:

interface Ethernet1/1
  switchport mode trunk

  switchport trunk native vlan 20
  spanning-tree port type edge trunk


!

interface vfc10
  bind interface Ethernet1/1
  switchport trunk allowed vsan 10
  no shutdown

!

vsan database
  vsan 10 interface vfc10

!


The server port comes up fine:

switch# show int vfc10
vfc10 is trunking
    Bound interface is Ethernet1/1
    Hardware is Ethernet
    Port WWN is 20:09:54:7f:ee:af:1c:bf
    Admin port mode is F, trunk mode is on
    snmp link state traps are enabled
    Port mode is TF
    Port vsan is 10
    Trunk vsans (admin allowed and active) (10)
    Trunk vsans (up)                       (10)



At this point I am determined to make it my mission to break this FCoE link. When I first tried to set this up I had major problems, so I want to see what configuration, or what order of configuration, it takes for it NOT to work.

Some of you may have noticed that, despite what the Cisco configuration guide mentions, I did NOT have to add these trunk interfaces to my vsan database, i.e. I did NOT need the following configuration:

vsan database
  vsan 10 interface vfc1




I decide to add this config to see if this kills the VFC, it does not:



switch# show int vfc1
vfc1 is trunking (Not all VSANs UP on the trunk)
    Bound interface is Ethernet1/10
    Hardware is Ethernet
    Port WWN is 20:00:54:7f:ee:af:1c:bf
    Admin port mode is E, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 10
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up)                       (10)



So at this point I have not been able to stop the VFC from working for VSAN 10. It does not work for VSAN 1, but that is because I do not have an equivalent FCoE VLAN for that VSAN, which I can totally understand.
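
A quick way to spot this kind of thing is to compare the "admin allowed and active" list against the "up" list in the show interface output. Here is a rough Python sketch of that check, assuming you already have the output as text. The function names are my own, and the range handling (e.g. 1-3) is an assumption about how longer VSAN lists might be printed.

```python
import re

# Matches lines such as: "Trunk vsans (up)                       (10)"
TRUNK_LINE = re.compile(r"Trunk vsans \(([^)]*)\)\s+\(([^)]*)\)")


def expand(vsan_list):
    """Turn '1,10' (or, by assumption, '1-3,10') into a set of ints."""
    vsans = set()
    for part in filter(None, (p.strip() for p in vsan_list.split(","))):
        first, _, last = part.partition("-")
        vsans.update(range(int(first), int(last or first) + 1))
    return vsans


def trunk_vsan_states(show_output):
    """Map each state label (e.g. 'up') to the set of VSANs listed for it."""
    return {state: expand(vsans) for state, vsans in TRUNK_LINE.findall(show_output)}


def vsans_not_up(show_output):
    """VSANs that are allowed on the trunk but have not come up."""
    states = trunk_vsan_states(show_output)
    allowed = states.get("admin allowed and active", set())
    return sorted(allowed - states.get("up", set()))


sample = """\
vfc1 is trunking (Not all VSANs UP on the trunk)
    Port vsan is 10
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up)                       (10)
"""

print(vsans_not_up(sample))  # [1]
```

Any VSAN this reports is worth checking for a missing fcoe vsan mapping under a VLAN.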

 Next I try making the Ethernet interface carry both LAN and SAN Traffic:


interface Ethernet1/10
  switchport mode trunk
  switchport trunk allowed vlan 10,20

switch# show spanning-tree int eth1/10

Vlan             Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
VLAN0020         Desg FWD 2         128.138  P2p


This does not faze the device and the VFC interface remains up:


switch# show int vfc1
vfc1 is trunking (Not all VSANs UP on the trunk)
    Bound interface is Ethernet1/10
    Hardware is Ethernet
    Port WWN is 20:00:54:7f:ee:af:1c:bf
    Admin port mode is E, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 10
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up)                       (10)


At this point I decide to up the ante by adding another interface:


interface Ethernet1/20
  switchport mode trunk
  switchport trunk allowed vlan 10,20

!
 interface vfc2
  bind interface Ethernet1/20
  switchport mode E
  no shutdown

!


This new interface comes up just fine:

switch# show int vfc2
vfc2 is trunking (Not all VSANs UP on the trunk)
    Bound interface is Ethernet1/20
    Hardware is Ethernet
    Port WWN is 20:01:54:7f:ee:af:1c:bf
    Admin port mode is E, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 1
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up)                       (10)
    Trunk vsans (isolated)                 ()
    Trunk vsans (initializing)             (1)



I even get nice load balancing across the links:

switch# show fspf database vsan 10

FSPF Link State Database for VSAN 10 Domain 0xd8(216)
LSR Type                = 1
Advertising domain ID   = 0xd8(216)
LSR Age                 = 341
LSR Incarnation number  = 0x80000013
LSR Checksum            = 0x79e0
Number of links         = 3
 NbrDomainId      IfIndex   NbrIfIndex    Link Type         Cost
-----------------------------------------------------------------------------
    0x0f(15) 0x00040000     0x00040000               1          125
   0x9a(154) 0x001e0000     0x001e0000               1          100
   0x9a(154) 0x001e0001     0x001e0001               1          100
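
The "load balancing" here just comes from there being two links of equal cost towards the same neighbour domain. As a small sketch of how you might pick that out of the table programmatically (the function name is mine, and it only looks at the link rows of this one LSR, nothing more):

```python
from collections import defaultdict


def equal_cost_links(lsr_rows):
    """Group link rows by neighbour domain and keep only the lowest-cost links."""
    by_neighbor = defaultdict(list)
    for line in lsr_rows.splitlines():
        fields = line.split()
        # Link rows: NbrDomainId, IfIndex, NbrIfIndex, Link Type, Cost
        if len(fields) == 5 and fields[0].startswith("0x"):
            neighbor, if_index, _nbr_if_index, _link_type, cost = fields
            by_neighbor[neighbor].append((int(cost), if_index))
    best_links = {}
    for neighbor, links in by_neighbor.items():
        best = min(cost for cost, _ in links)
        best_links[neighbor] = [if_index for cost, if_index in links if cost == best]
    return best_links


sample = """\
 NbrDomainId      IfIndex   NbrIfIndex    Link Type         Cost
-----------------------------------------------------------------------------
    0x0f(15) 0x00040000     0x00040000               1          125
   0x9a(154) 0x001e0000     0x001e0000               1          100
   0x9a(154) 0x001e0001     0x001e0001               1          100
"""

for neighbor, links in equal_cost_links(sample).items():
    print(neighbor, links)
```

Domain 0x9a comes back with two interface indexes at cost 100, which is the pair of links the traffic is shared across.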

So at this point, everything is working pretty well. I decide to try making the link between the two switches a port-channel:





feature lacp

 
interface Ethernet1/10
  switchport mode trunk
  switchport trunk allowed vlan 10,20
  channel-group 20 mode active

 !
interface Ethernet1/20
  switchport mode trunk
  switchport trunk allowed vlan 10,20
  channel-group 20 mode active

!
interface vfc1
  bind interface port-channel20
  switchport mode E

  switchport trunk allowed vsan  10
  no shutdown

!

The VFC Int is still up and carrying traffic:


switch# show int vfc1
vfc1 is trunking (Not all VSANs UP on the trunk)
    Bound interface is port-channel10
    Hardware is Ethernet
    Port WWN is 20:00:54:7f:ee:af:1c:bf
    Admin port mode is E, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 10
    Trunk vsans (admin allowed and active) (1,10)
    Trunk vsans (up)                       (10)
    Trunk vsans (isolated)                 ()


At this point I don't understand why I had trouble getting the FCoE VE interface to come up the first time, but I am willing to accept that the feature works perfectly and that it must have been some sort of order of operation issue.










CCIE DC: Roll your own OTV

Hi Guys!

So today I played with OTV and managed to get it going on the new Cisco Cloud Services Router (CSR 1000V)

In this blog post I am going to show you the recipe I used to get this going :).

You will need:

  • An ESX Server to run virtual images on
  • Two guest VMs on that ESX server running Windows or Linux (doesn't really matter)
  • Two Cisco Cloud Services Router VM Guests
  • Nexus 1000V and a VMWare DvSwitch, if no Nexus 1000V is available you can get away with just two DV-Switches :)

Basically, you're going to have a topology that looks a little something like this:

Now if you haven't got the Cisco Cloud Services Router, or you haven't set it up yet (which is really the first step), you should check out the blog post from INE that explains how to configure it in VMware here.

OK, now that's set up. Let's quickly look at our topology.


So basically, as you can see from the topology diagram, our goal is to get the server 172.28.0.11 talking to 172.28.0.20, even though they are not in the same Layer 2 domain, by using the CSR routers to perform OTV for us :).

Let's quickly look at the initial configuration on the CSR Routers:

CSR 1:
interface GigabitEthernet1
 ip address 172.27.0.1 255.255.255.252
end


CSR 2:

interface GigabitEthernet1
 ip address 172.27.0.2 255.255.255.252
end



Can we ping across these hosts?

CSR1#ping 172.27.0.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.27.0.2, timeout is 2 seconds:
!!!!!


Yes we can, so now we get onto the next stage.


The next stage involves configuring our dvSwitch and our Nexus 1000V. On the Nexus 1000V, I basically create a port profile that trunks the VLAN up to my CSR:

port-profile type vethernet EXTEND_SITE_TRUNK
  vmware port-group
  switchport mode trunk
  switchport trunk allowed vlan 499-501
  no shutdown
  state enabled



VLAN 499 is my VLAN that the server will sit on, so I have created a port profile for that:


port-profile type vethernet EXTEND_VLAN_SERVER_1
  vmware port-group
  switchport access vlan 499
  switchport mode access
  no shutdown
  state enabled


Now my server and my CSR #1 Router can see each other once I assign the port profiles to the appropriate VM's.


OK, now we have to duplicate the same sort of thing on the dvSwitch, because in this example we are trying to put the two hosts on totally separate switches that are not connected in any way. To do this I created a dvSwitch with no uplinks and assigned the port groups from there:












As you can see I set the server to use the untagged port group and the Router to use the Trunk Port Group, the reasoning for this will become clear later.


Ok now I have all the basic infrastructure setup, let's get onto the OTV Config.

The first thing I need to do on both CSRs is specify a site identifier and a site VLAN (or rather, since configuring OTV on the CSR is not like configuring it on a Nexus 7000, a site bridge-domain):

CSR1(config)#otv site bridge-domain 1
CSR1(config-otv-site)#exit

CSR1(config)#otv site-identifier 0x2

And then the same on CSR2, but with a different site identifier:


CSR2(config)#otv site bridge-domain 1
CSR2(config-otv-site)#exit

CSR2(config)#otv site-identifier 0x1


Now we need to configure the actual OTV interfaces. The first thing to do is go to our join interfaces (which are the 172.27.0.0/30 links we set up previously) and specify IGMP version 3 to support SSM multicast:

CSR1(config)#interface GigabitEthernet1
CSR1(config-if)# ip igmp version 3

!

Now we repeat the above on CSR2 and then we are ready to configure the OTV Interface:

CSR 1:
interface Overlay1
 no ip address
 otv join-interface GigabitEthernet1
 otv use-adjacency-server 172.27.0.1 unicast-only
 otv adjacency-server unicast-only

!

CSR 2:


interface Overlay1
 no ip address
 otv join-interface GigabitEthernet1
 otv use-adjacency-server 172.27.0.1 unicast-only
!


In the above example I have used an adjacency server that lives on CSR1. By default OTV wants to use multicast, but I had trouble getting this going with multicast. I believe the issue is related to the way I have the CSRs' join interfaces connected to each other, so I did this to make life easier for myself for now.

As you can see, one router (CSR1) is the adjacency server and I have configured CSR2 to point to it. You can specify multiple adjacency servers for redundancy if you like.

OK, now we need to configure the "extend" interface on the inside of each CSR, which is the interface on VLAN 499 that heads down towards our end host servers.

Here is the configuration there:

interface GigabitEthernet2
 no ip address
 negotiation auto

 service instance 2 ethernet
  encapsulation dot1q 499
  bridge-domain 200
 !
end



I believe that the CSR has the same restriction as a Nexus 7000, in that you can't have IP addresses assigned to interfaces that you're trying to extend over the OTV. If anyone knows for sure whether this is possible or not, please post in the comments section :).

The most important part of this configuration is where we specify a service instance, as this is how we map our OTV VLAN. As you can see, the VLAN number is set here, as well as a bridge domain.

If we now go back to our OTV interface:

interface Overlay1
 no ip address
 otv join-interface GigabitEthernet1
 otv use-adjacency-server 172.27.0.1 unicast-only
 service instance 2 ethernet
  encapsulation dot1q 499
  bridge-domain 200
  no shutdown

!
 
end


Here we are actually mapping that service instance we created. Now when we perform a no shut we should see the OTV tunnel come up. Here are some useful troubleshooting commands to verify:


CSR1#show otv detail
Overlay Interface Overlay1
 VPN name                 : None
 VPN ID                   : 1
 State                    : UP
 AED Capable              : Yes
 Join interface(s)        : GigabitEthernet1
 Join IPv4 address        : 172.27.0.1
 Tunnel interface(s)      : Tunnel0
 Encapsulation format     : GRE/IPv4
 Site Bridge-Domain       : 1
 Capability               : Unicast-only
 Is Adjacency Server      : Yes
 Adj Server Configured    : Yes
 Prim/Sec Adj Svr(s)      : 172.27.0.1
 OTV instance(s)          : 0
 FHRP Filtering Enabled   : Yes
 ARP Suppression Enabled  : Yes
 ARP Cache Timeout        : 600 seconds



This first command shows us that we are AED capable, that our join interface is Gi1 and that our site bridge-domain is 1. It also shows that we are using an adjacency server and that we ourselves are configured as one. If you have errors in here, revisit your config and make sure you're not missing any of the critical commands, such as the site bridge-domain.

The next useful troubleshooting command establishes whether an ISIS adjacency exists between the two routers:


CSR1#show otv isis neighbors

Tag Overlay1:
System Id      Type Interface   IP Address      State Holdtime Circuit Id
CSR2           L1   Ov1         172.27.0.2      UP    25       CSR1.01 


If OTV is working, the only state you should see here is UP. You will also notice that you can see the hostname of your peer.

CSR1#show otv adjacency
Overlay 1 Adjacency Database
Hostname                       System-ID      Dest Addr       Up Time   State
CSR2                           001e.4925.fc00 172.27.0.2      00:20:11  UP 

!

This is another helpful command that simply verifies the adjacency and shows how long it has been up.

So far, it looks like everything is up and we should be good to go, let's do a ping from our host!








Success! Our OTV Tunnel is carrying the traffic, we can do some verification here too (this will also be helpful to you when troubleshooting if it does not work)




CSR1#show otv route

Codes: BD - Bridge-Domain, AD - Admin-Distance,
       SI - Service Instance, * - Backup Route

OTV Unicast MAC Routing Table for Overlay1

 Inst VLAN BD     MAC Address    AD    Owner  Next Hops(s)
----------------------------------------------------------
 0    499  200    0050.5601.face 50    ISIS   CSR2
 0    499  200    0050.56a0.1aad 40    BD Eng Gi2:SI2


As you can see, this shows some very useful information: which bridge-domain and VLAN each MAC is attached to, and who owns the route. The owner helps you know whether a MAC is local to you (BD Eng) or was learnt over the OTV tunnel (ISIS).
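
If you wanted to script that local-versus-remote check, a rough Python sketch might look like this. It assumes the output has already been captured as text, and it treats an owner of ISIS as "learnt over the overlay", which is my reading of the table above rather than anything documented here. The function name is made up.

```python
import re

# MAC addresses in Cisco dotted notation, e.g. 0050.5601.face
MAC = re.compile(r"^[0-9a-f]{4}\.[0-9a-f]{4}\.[0-9a-f]{4}$")


def parse_otv_routes(text):
    """Extract the MAC rows from 'show otv route' style output."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows: Inst, VLAN, BD, MAC, AD, Owner (may contain a space), Next Hop
        if len(fields) >= 7 and MAC.match(fields[3]):
            routes.append({
                "vlan": int(fields[1]),
                "bridge_domain": int(fields[2]),
                "mac": fields[3],
                "owner": " ".join(fields[5:-1]),
                "next_hop": fields[-1],
                "remote": fields[5] == "ISIS",  # assumption: ISIS-owned means remote
            })
    return routes


sample = """\
 Inst VLAN BD     MAC Address    AD    Owner  Next Hops(s)
----------------------------------------------------------
 0    499  200    0050.5601.face 50    ISIS   CSR2
 0    499  200    0050.56a0.1aad 40    BD Eng Gi2:SI2
"""

for route in parse_otv_routes(sample):
    where = "remote via" if route["remote"] else "local on"
    print(route["mac"], where, route["next_hop"])
```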

You can also see that it knows that one of the Mac's is actually pointed out interface Gi2:SI2 (Gigabit 2, Service Instance 2)


So this is quite a helpful command. Another useful command is:

CSR1#show otv arp-nd-cache
Overlay1 ARP/ND L3->L2 Address Mapping Cache
BD     MAC            Layer-3 Address  Age (HH:MM:SS) Local/Remote
200    0050.5601.face 172.28.0.11      00:02:46       Remote


This will show you the ARP cache that the router will respond with whenever it sees an ARP request. This is part of the proxy-arp function that OTV performs to ensure that ARP flooding does not occur too often. It would be a good place to look if you had recently changed an IP address or changed a host on the OTV and were wondering why you couldn't see it, as it could be that the ARP cache is out of date.


Finally, if you're using the Nexus 1000V you can also use the following helpful troubleshooting command:

DCNexus1000V# show mac address-table vlan 499
VLAN      MAC Address       Type    Age       Port                           Mod
---------+-----------------+-------+---------+------------------------------+---
499       0050.56a0.115e    static  0         Veth11                         3 
499       0050.56a0.1aad    static  0         Veth2                          3 
499       001e.bd93.82bc    dynamic 15        Veth11                         3 
499       0050.5601.face    dynamic 248       Veth11                         3




Veth11 is the interface on my Nexus 1000V that goes towards my CSR router, so the fact that I have learnt the MAC 0050.5601.face via this interface shows that the traffic is coming over the OTV.


I hope this helps someone out there, especially my fellow CCIE DC candidates.