Saturday, February 2, 2013

CCIE DC Multicast Part 1.

Hi Guys!

As we all wait anxiously for the training vendors to release Rack Rentals (Come on guys! At least give us a FIRM date so we can plan appropriately!) I think it's a good idea to look at things you can study on the cheap that are going to end up in the exam.

One of those things is multicast. The blueprint specifically mentions multicast troubleshooting, and I would bet my bottom dollar it relates to OTV's use of multicast, which means Nexus 7000, which means PIM. We can test PIM in GNS3 and on our own home gear. OK, the syntax is not always going to match between the platforms, but the CONCEPTS will, and CONCEPTS are what pass exams and lead to you being an expert!

The very best multicast book I have read thus far, and where I got most of the information for this blog post, can be found at the end of the blog post :).



In a first for this blog, I am going to provide the GNS3 .net file I used to write these tutorials so you can follow along at home or try your own experiments :). The GNS3 files can be found
Here. GNS3 itself can be found at www.gns3.net.

Note that I emulated 7200 routers because in GNS3 you can run quite recent IOS images on them. No, I will not provide any IOS images, so please don't ask; get yourself a 7200 IOS image OR modify the template to use different routers :).


The topology is shown below





Before we go any further, let's take a super quick look at multicast and the different options we are talking about.


A quick multicast review is in order:

Multicast Basics:
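One packet, copied to multiple receivers from a single source (one to many). In practice it is almost always UDP, because a connection-oriented protocol like TCP can't work one to many: receivers never reply to the group address itself, so any response has to be a unicast packet back to the source.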

224.0.0.0 to 239.255.255.255 is set aside for multicast.

Anything in the 224.0.0.X range is reserved for link-local use, and no matter what, these packets are NEVER FORWARDED outside the local subnet; they are strictly internal.

224.0.1.X is also a reserved range but does get forwarded.

Two very good examples from this range (224.0.1.X) that we will talk about later are 224.0.1.39 and 224.0.1.40, which are used by Cisco Auto-RP (Announce and Discovery respectively).

The 239.0.0.0/8 range has been set aside for us engineers to use for our own multicast applications.
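If it helps to see the ranges above side by side, here is a minimal Python sketch that sorts an address into one of them. The function name classify_group and the label strings are my own for illustration; only the ranges themselves come from the list above.

```python
import ipaddress

# Ranges taken from the review above; the labels are illustrative only.
ALL_MULTICAST = ipaddress.ip_network("224.0.0.0/4")
LINK_LOCAL = ipaddress.ip_network("224.0.0.0/24")
RESERVED_FORWARDED = ipaddress.ip_network("224.0.1.0/24")
PRIVATE_USE = ipaddress.ip_network("239.0.0.0/8")


def classify_group(address: str) -> str:
    """Return a rough description of where a group address sits."""
    ip = ipaddress.ip_address(address)
    if ip not in ALL_MULTICAST:
        return "not multicast"
    if ip in LINK_LOCAL:
        return "link-local (never forwarded)"
    if ip in RESERVED_FORWARDED:
        return "reserved (forwarded)"
    if ip in PRIVATE_USE:
        return "private use"
    return "other multicast"


if __name__ == "__main__":
    for group in ("224.0.0.5", "224.0.1.40", "239.1.1.1", "10.0.0.1"):
        print(group, "->", classify_group(group))
```

The group we join later in this post, 239.1.1.1, lands in the private use range.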


Multicast Routing:

Multicast routing works almost in reverse to traditional routing, whereby the SOURCE of the traffic is most important. There are a few multicast routing protocols around, but for our CCIE DC we use PIM (Protocol Independent Multicast).

PIM relies on the unicast routing table already on the router to evaluate its Reverse Path Forwarding (RPF) checks. Hence the term protocol-independent: your unicast routing protocol does not matter, PIM just uses the information in the unicast routing table to determine RPF.
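To make the RPF idea a bit more concrete, here is a rough Python sketch of the check: look up the source in the unicast table (longest match wins) and only accept the packet if it arrived on the interface that lookup points to. The ROUTES table is an assumption for illustration (the 1.1.1.0/24 prefix in particular is made up), and this is a simplified model, not how IOS implements it.

```python
import ipaddress

# Hypothetical unicast table for a router like PIM2: prefix -> outgoing interface.
ROUTES = {
    "1.1.1.0/24": "GigabitEthernet3/0",
    "3.3.3.3/32": "GigabitEthernet1/0",
}


def rpf_interface(routes, source):
    """Longest-prefix match for the source; returns the interface or None."""
    src = ipaddress.ip_address(source)
    best = None
    for prefix, interface in routes.items():
        net = ipaddress.ip_network(prefix)
        if src in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface)
    return best[1] if best else None


def rpf_check(routes, source, arrival_interface):
    """Pass only if the packet arrived on the interface we would use to reach the source."""
    expected = rpf_interface(routes, source)
    return expected is not None and expected == arrival_interface


if __name__ == "__main__":
    print(rpf_check(ROUTES, "1.1.1.1", "GigabitEthernet3/0"))  # arrives on the RPF interface
    print(rpf_check(ROUTES, "1.1.1.1", "GigabitEthernet1/0"))  # arrives on the wrong interface
```

Notice that nothing in the sketch cares which routing protocol filled in the table, which is the whole point of "protocol independent".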

As traffic travels from a source of multicast traffic (like a video application or music on hold service), the stream travels down the PIM domain from the very top (the source) to all the receivers. Due to the fact that loops are avoided, the resultant path the multicast traffic takes from source to receivers resembles a tree, with the root being the source of the traffic.

PIM:
There are two modes of PIM, Sparse Mode and Dense Mode. The major difference is in the behavior when forwarding multicast traffic: PIM Dense mode assumes that EVERYONE in the network (or rather, the PIM domain) wants to hear from a multicast source unless specifically told otherwise, while PIM Sparse mode assumes you DON'T want to forward multicast unless you specifically know that you have receivers listening for it.

PIM Dense mode is quite a simple and straightforward protocol, but it leads to quite a bit of waste, and unfortunately, from an exam-taking point of view, the Nexus platform only supports PIM Sparse Mode. So that's what we will brush up on.


PIM Sparse Mode

PIM Sparse mode is rooted in the concept of a Rendezvous Point (RP). Earlier in the multicast routing section we talked about the concept of a "tree", with its root at the "source" of the multicast traffic. But how do you get the routers in a PIM domain to forward the multicast traffic IF by default they are configured not to forward it? A receiver might say "I want to join the multicast group 239.1.1.1", but then how does the router closest to him (which we will call the "last hop router") know where to send his PIM JOIN msgs (which tell his upstream routers he wants to start receiving multicast for this group)? And where do the next hop routers send the PIM Join msg? The Rendezvous Point solves this problem. All routers signal their desire to join a particular multicast stream towards the Rendezvous Point, and all sources of traffic first somehow deliver traffic to the RP to have the RP forward the traffic for them. This is known as a SHARED Tree.

Key Concept: Shared Tree Vs Shortest Path Tree
 

However things get a little bit more complicated...

 (Refer to diagram)


In our example, let's say that Source1 is sending multicast that Receiver1 wants to listen to. Receiver1 (via its last hop router) would send a msg towards the RP saying "I want to listen to traffic to 239.1.1.1", and Source1 would have its mcast packets delivered to the RP via some method (more on this later). If you look at the diagram, you can see that this means the traffic must flow from the source up to the RP and then back down to the receiver.
As you can see, this is a very inefficient path. Why can't our multicast stream just travel straight from Source1Receiver2, through PIM1, to PIM2 and then to our receiver?

The Answer is: It can, once all the devices in the path know that there is a Source and Receiver, they will switch to a shortest path tree (SPT)
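To put some numbers on "inefficient", here is a small Python sketch comparing the two paths. The LINKS map is my reading of the topology from the text and router outputs in this post (PIM1 and PIM2 each connect to the RP and to each other), so treat it as an assumption rather than a copy of the diagram.

```python
from collections import deque

# Assumed adjacency, reconstructed from the text of this post.
LINKS = {
    "Source1Receiver2": ["PIM1"],
    "PIM1": ["Source1Receiver2", "RP", "PIM2"],
    "RP": ["PIM1", "PIM2"],
    "PIM2": ["PIM1", "RP", "Source2Receiver1"],
    "Source2Receiver1": ["PIM2"],
}


def shortest_path(links, start, goal):
    """Plain breadth-first search; returns the list of hops or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in links[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


def shared_tree_path(links, source, rp, receiver):
    """Shared tree delivery: source to the RP first, then RP down to the receiver."""
    to_rp = shortest_path(links, source, rp)
    from_rp = shortest_path(links, rp, receiver)
    return to_rp + from_rp[1:]


if __name__ == "__main__":
    print("Shared tree:", shared_tree_path(LINKS, "Source1Receiver2", "RP", "Source2Receiver1"))
    print("SPT:        ", shortest_path(LINKS, "Source1Receiver2", "Source2Receiver1"))
```

In this little model the shared tree path detours through the RP, while the shortest path goes straight from PIM1 to PIM2.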


Now that PIM2 knows there is a multicast source out there, he can send a JOIN msg up the PIM domain towards the source, with routers along the way letting him join.

Let's watch this in action.
The starting configuration is that we don't have ANY RPs specified on any of the routers. All routers have full reachability via OSPF and are neighbors via PIM (except for the two edge routers, Source1Receiver2 and Source2Receiver1, because they are just our multicast sources/destinations :)).

Let's see what happens when we join a group

Source2Receiver1(config)#int gi1/0
Source2Receiver1(config-if)#ip igmp join-group 239.1.1.1


We then run a debug ip pim on PIM2:

*Feb  2 18:29:28.539: IGMP(0): Received v2 Report on GigabitEthernet2/0 from 2.2.2.1 for 239.1.1.1
*Feb  2 18:29:28.543: IGMP(0): Received Group record for group 239.1.1.1, mode 2 from 2.2.2.1 for 0 sources
*Feb  2 18:29:28.543: IGMP(0): WAVL Insert group: 239.1.1.1 interface: GigabitEthernet2/0Successful
*Feb  2 18:29:28.547: IGMP(0): Switching to EXCLUDE mode for 239.1.1.1 on GigabitEthernet2/0
*Feb  2 18:29:28.547: IGMP(0): Updating EXCLUDE group timer for 239.1.1.1
*Feb  2 18:29:28.547: IGMP(0): MRT Add/Update GigabitEthernet2/0 for (*,239.1.1.1) by 0
*Feb  2 18:29:28.555: PIM(0): Building Triggered (*,G) Join / (S,G,RP-bit) Prune message for 239.1.1.1


What we can tell from the above is that PIM2 has received the IGMP join for the particular group, but this hasn't generated a message to the RP because there is no RP


Let's check out the ip mroute table from PIM2:

PIM2#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.1.1.1), 00:02:14/00:02:47, RP 0.0.0.0, flags: SJC
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:02:14/00:02:47



From the above you can see that PIM2 will forward all traffic for the group 239.1.1.1 to the outgoing interface Gi2/0, but it has not seen any incoming traffic for this group so far, so the incoming interface is Null. The flags SJC are also worth understanding. The S simply stands for Sparse and indicates that the group is a sparse mode group (we kinda already knew that ;)). The C means that there is a receiver connected to one of our interfaces, so we know when looking at this that there is a receiver attached directly to us. The J flag I left until last because it's the most interesting bit:

The J flag says that, as soon as this router sees traffic come in from a source for this particular group, it will switch to an SPT by sending a join message up the PIM domain towards that particular source. This is all based on a threshold called the SPT-Threshold, which dictates the traffic rate (in kbps) a group must exceed before the router switches from the shared tree to a source-based tree. The default is 0, which means that as soon as a single packet is received for that mcast group, the tree is changed to a source-based tree straight away.
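If you find yourself staring at flag strings like SJC a lot, a tiny lookup helper can save some scrolling back to the legend. This is just a sketch: FLAG_MEANINGS only covers a handful of the single-letter flags from the show ip mroute legend above, and decode_flags is a name I made up.

```python
# A subset of the flag legend printed by "show ip mroute" (see the output above).
FLAG_MEANINGS = {
    "D": "Dense",
    "S": "Sparse",
    "C": "Connected",
    "L": "Local",
    "P": "Pruned",
    "R": "RP-bit set",
    "F": "Register flag",
    "T": "SPT-bit set",
    "J": "Join SPT",
}


def decode_flags(flags: str) -> list:
    """Translate a flag string such as 'SJC' into readable names, one per letter."""
    return [FLAG_MEANINGS.get(flag, "unknown flag") for flag in flags]


if __name__ == "__main__":
    print(decode_flags("SJC"))
```

Flags outside the subset simply come back as "unknown flag", so extend the dictionary if you need the rest of the legend.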

Let's see what happens if we were to try and generate traffic to 239.1.1.1:

Source1Receiver2#ping 239.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:
.

No Dice: the traffic won't reach our receivers, let's go to PIM1 and have a look at what he thinks of all this:

PIM1#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.0.1.40), 00:01:46/00:02:39, RP 0.0.0.0, flags: DCL
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet1/0, Forward/Sparse, 00:01:46/00:02:39














The only entry our friend PIM1 has is for the 224.0.1.40 (Auto-RP Discovery) group. He doesn't even have an entry for the traffic sourced to 239.1.1.1. This is because he has no RP to send the traffic to, so he refuses to pass it on.

Let's delve further. First, we stop Receiver1 listening to the multicast group, then we will assign an RP on PIM2 (the router closest to the receiver):

Source2Receiver1(config)#int gi1/0
Source2Receiver1(config-if)#no ip igmp join-group 239.1.1.1


Clear the mroute on PIM2:

PIM2#clear ip mroute *
PIM2#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.0.1.40), 00:00:01/00:02:58, RP 0.0.0.0, flags: DPL
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list: Null







Specify a loopback on the RP Router to act as our RP:

RP(config)#int lo1
RP(config-if)#ip add 3.3.3.3 255.255.255.255

Specify this as our RP on our PIM2 Router:

 PIM2(config)#ip pim rp-address 3.3.3.3
PIM2(config)#
*Feb  2 18:58:56.447: PIM(0): Initiating register encapsulation tunnel creation for RP 3.3.3.3
*Feb  2 18:58:56.455: PIM(0): Initial register tunnel creation succeeded for RP 3.3.3.3
*Feb  2 18:58:56.459: PIM(0): Check RP 3.3.3.3 into the (*, 224.0.1.40) entry
*Feb  2 18:58:56.583: PIM(0): Building Triggered (*,G) Join / (S,G,RP-bit) Prune message for 224.0.1.40
*Feb  2 18:58:57.487: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel0, changed state to up


Some of you may be asking: What the heck just happened? Why have I now got a tunnel to the RP? What does this do? The reason is simple: remember that our sparse mode routers will NOT forward multicast traffic unless they know for sure someone has asked to join it. So if you are the router that's directly connected to the source of multicast traffic, how do you get that multicast traffic to the Rendezvous Point? You encapsulate it and send it inside a tunnel that you have established to the RP!


Let's see what happens when we tell the receiver to join the group again:

Source2Receiver1(config)#int gi1/0
Source2Receiver1(config-if)#ip igmp join-group 239.1.1.1


Here we go!
PIM2:

*Feb  2 19:01:33.755: PIM(0): Insert (*,239.1.1.1) join in nbr 10.2.0.2's queue
*Feb  2 19:01:33.767: PIM(0): Building Join/Prune packet for nbr 10.2.0.2
*Feb  2 19:01:33.771: PIM(0):  Adding v2 (3.3.3.3/32, 239.1.1.1), WC-bit, RPT-bit, S-bit Join
*Feb  2 19:01:33.771: PIM(0): Send v2 join/prune to 10.2.0.2 (GigabitEthernet1/0)

We have just sent a join message to the RP! We now know where the shared tree is, so we have now sent a message towards the RP saying hey, I have a receiver for this group 239.1.1.1, so as soon as you get traffic for it, send it to me.

Let's Examine the Entry:

PIM2:
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.1.1.1), 00:01:14/00:02:26, RP 3.3.3.3, flags: SJC
  Incoming interface: GigabitEthernet1/0, RPF nbr 10.2.0.2
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:01:14/00:02:26

Wow I have highlighted a lot in the above, that's because there is plenty going on.

First of all, let's start with this:
(*, 239.1.1.1)

When looking at multicast routing, all "routes" will have an entry that looks like this. The * indicates that the source could be anything, because this is a SHARED TREE with the root being the rendezvous point. A Shortest Path Tree will have the IP address of the actual source of multicast traffic here, but we have the * because ours is a shared tree.

Key Concept: Shared Trees have the notation (*,G) (where G is the group), Shortest Path Trees have the notation (S,G) (where S is the source)
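The (*,G) versus (S,G) notation is easy to pick out mechanically. Here is a hedged Python sketch that reads the first line of an mroute entry and tells you which kind of tree it is. The regular expression and the tree_type name are mine, and it only understands the header-line layout shown in the outputs in this post.

```python
import re

# Matches the start of an entry header such as "(*, 239.1.1.1), ..." or "(1.1.1.1, 239.1.1.1), ...".
ENTRY = re.compile(r"^\((?P<source>\*|[\d.]+),\s*(?P<group>[\d.]+)\)")


def tree_type(line: str):
    """Return (kind, source, group) for an mroute header line, or None for any other line."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    source, group = match.group("source"), match.group("group")
    kind = "shared tree" if source == "*" else "shortest path tree"
    return kind, source, group


if __name__ == "__main__":
    print(tree_type("(*, 239.1.1.1), 00:01:14/00:02:26, RP 3.3.3.3, flags: SJC"))
    print(tree_type("(1.1.1.1, 239.1.1.1), 00:03:59/00:02:03, flags: T"))
```

Lines that are not entry headers (the incoming interface line, the outgoing interface list and so on) just return None.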


The next highlighted section shows us the RP's address. In a shared tree this part is particularly important, as it shows us what the router considers to be the root of the tree. The next highlighted section, the "RPF nbr", shows us the next hop that PIM has determined from the unicast routing table to reach that RP!

Let's check it ourselves:

PIM2#show ip route 3.3.3.3
Routing entry for 3.3.3.3/32
  Known via "ospf 1", distance 110, metric 2, type intra area
  Last update from 10.2.0.2 on GigabitEthernet1/0, 00:10:34 ago
  Routing Descriptor Blocks:
  * 10.2.0.2, from 10.1.0.2, 00:10:34 ago, via GigabitEthernet1/0
      Route metric is 2, traffic share count is 1

So, the RP is located off interface Gi1/0 on this router, since this is a shared tree, the traffic MUST come from the RP, therefore the incoming interface (even though we haven't even received any multicast traffic yet from the source) MUST be Gi1/0.

Still with me? The final entry is the outgoing interface list, so when we DO receive some traffic for this multicast group, this shows where we will forward it.




 Let's see what our RP Thinks:



RP(config-if)#
*Feb  2 19:01:33.923: %PIM-6-INVALID_RP_JOIN: Received (*, 239.1.1.1) Join from 10.2.0.1 for invalid RP 3.3.3.3
*Feb  2 19:03:31.319: %PIM-6-INVALID_RP_JOIN: Received (*, 239.1.1.1) Join from 10.2.0.1 for invalid RP 3.3.3.3

Oh Dear, our RP is not particularly happy, this is because it has received a PIM message telling it that it is the RP, but it itself does not know it's the RP, let's help it out (You must tell the router itself that it's an RP!)

RP(config)#ip pim rp-address 3.3.3.3

Things start to look a little more interesting:


RP#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.1.1.1), 00:00:00/00:03:29, RP 3.3.3.3, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:00:00/00:03:29

So, the RP still has not actually received any multicast traffic from a source for the group 239.1.1.1, so our RP stays in slumber mode, waiting to receive some multicast traffic.


For the sake of making you understand more about shared trees, we are going to turn off the feature in the routers that makes them switch from a shared tree to a source based Tree. This can be accomplished by adjusting the SPT Threshold we mentioned earlier, let's do that on our key routers:

PIM2(config)#ip pim spt-threshold infinity
PIM2(config)#end
PIM2#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.1.1.1), 00:21:50/00:02:47, RP 3.3.3.3, flags: SC
  Incoming interface: GigabitEthernet1/0, RPF nbr 10.2.0.2
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:21:50/00:02:47




For good measure, we also need to tell PIM1 where to find the Rendezvous Point. All routers in the PIM domain path must know how to get to the Rendezvous Point.


As you can see from the above, the J Flag is now no longer present, let's generate some multicast traffic and watch what happens!


Here comes the ping:

Source1Receiver2#ping 239.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 2.2.2.1, 188 ms

We have a response from our listener! Let's see what happened


First, on PIM1 (the router closest to the source of traffic)


*Feb  2 19:31:17.835: PIM(0): Received v2 Join/Prune on GigabitEthernet2/0 from 10.1.0.2, to us
*Feb  2 19:31:17.843: PIM(0): Join-list: (1.1.1.1/32, 239.1.1.1), S-bit set
*Feb  2 19:31:17.847: PIM(0): Check RP 3.3.3.3 into the (*, 239.1.1.1) entry
*Feb  2 19:31:17.855: PIM(0): Building Triggered (*,G) Join / (S,G,RP-bit) Prune message for 239.1.1.1
*Feb  2 19:31:17.867: PIM(0): Adding register encap tunnel (Tunnel0) as forwarding interface of (1.1.1.1, 239.1.1.1).
*Feb  2 19:31:17.875: PIM(0): Add GigabitEthernet2/0/10.1.0.2 to (1.1.1.1, 239.1.1.1), Forward state, by PIM SG Join


You can see that the router is saying it will forward the multicast traffic via the tunnel it has connected to the RP.


On the RP, you have the following show ip mroute output:

RP#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.1.1.1), 00:03:59/00:02:43, RP 3.3.3.3, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:03:43/00:02:43

(1.1.1.1, 239.1.1.1), 00:03:59/00:02:03, flags: T
  Incoming interface: GigabitEthernet1/0, RPF nbr 10.1.0.1
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:03:43/00:02:43


So this is a bit confusing: you can see we have two trees, one of which is a Shortest Path Tree for (1.1.1.1, 239.1.1.1), and one which is the shared tree (*, 239.1.1.1).

The Diagram below might help explain:



Once the first packet is received over the unicast tunnel from the router closest to the source (PIM1) to the RP, the RP would rather this traffic be delivered via multicast, so the RP sends a join message towards the source and gets its very own little Shortest Path Tree (SPT) back to the source. That's what this entry is:

(1.1.1.1, 239.1.1.1), 00:03:59/00:02:03, flags: T
  Incoming interface: GigabitEthernet1/0, RPF nbr 10.1.0.1
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:03:43/00:02:43



Then, a shared tree exists to the receiver PIM2:

(*, 239.1.1.1), 00:03:59/00:02:43, RP 3.3.3.3, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:03:43/00:02:43



So in the diagram above, the red line is the shortest path tree (SPT), the blue line is the shared tree.

During all of this though, I noticed something strange....

When pinging the first time, all was well and I received a reply, but when pinging a second time I received multiple responses:

Source1Receiver2#ping 239.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 2.2.2.1, 172 ms
Source1Receiver2#ping 239.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 2.2.2.1, 100 ms
Reply to request 0 from 2.2.2.1, 120 ms


Why was I receiving two responses? I recalled from the excellent book (links to purchase it from Amazon are at the bottom of my post) that multicast has some general rules. For us, here is the relevant one:

"When a new (S,G) entry is created, its outgoing interface list is initially populated with a copy of the outgoing interface list from its parent (*,G) entry"

A Ha! Let's look at the RP router's multicast table:

RP#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.1.1.1), 00:31:07/00:03:05, RP 3.3.3.3, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:30:52/00:03:05

(1.1.1.1, 239.1.1.1), 00:03:00/00:00:11, flags: T
  Incoming interface: GigabitEthernet1/0, RPF nbr 10.1.0.1
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:03:00/00:03:05



So even though the Shortest Path Tree (SPT) is meant to end at the RP, it does not because the outgoing interface list for the (S,G) Entry (1.1.1.1,239.1.1.1) is copied from the (*,G) Entry, which includes interface Gi2/0, hence why we receive two replies to our ping! Both Multicast entries are being used to route the packet, so a copy is being received twice.
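The outgoing interface inheritance rule quoted earlier is simple enough to model in a few lines of Python. This is only a toy model of that one rule (a dictionary keyed by (source, group), with the helper name create_sg_entry invented for the example); it says nothing about how a real router stores its mroute table.

```python
def create_sg_entry(mroute_table, source, group):
    """Create an (S,G) entry whose outgoing list starts as a copy of the parent (*,G) list."""
    parent_oil = mroute_table.get(("*", group), [])
    mroute_table[(source, group)] = list(parent_oil)  # a copy, not a shared reference
    return mroute_table[(source, group)]


# The RP's table before any traffic arrives, as seen in the output above.
table = {("*", "239.1.1.1"): ["GigabitEthernet2/0"]}
oil = create_sg_entry(table, "1.1.1.1", "239.1.1.1")

if __name__ == "__main__":
    print(table)
```

After the call, the new (1.1.1.1, 239.1.1.1) entry carries GigabitEthernet2/0 in its outgoing list, matching what the RP shows in its mroute table.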

This is obviously not usual behavior and is simply a consequence of the fact that we have set the spt-threshold to infinity so that we don't use a proper shortest path tree. With that in mind, let's turn the spt-threshold back to the default.


RP(config)#no ip pim spt-threshold infinity
 

(Repeat on all routers)

Let's take a look what happens now when we generate some traffic to that multicast group.

Source1Receiver2#ping 239.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 2.2.2.1, 60 ms
Source1Receiver2#ping 239.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 2.2.2.1, 64 ms
Source1Receiver2#ping 239.1.1.1
Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 239.1.1.1, timeout is 2 seconds:

Reply to request 0 from 2.2.2.1, 60 ms


So we can see now.. one ping, one response, just like it should be, let's take a look at the multicast routing tables:



PIM2#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.1.1.1), 00:01:11/stopped, RP 3.3.3.3, flags: SJC
  Incoming interface: GigabitEthernet1/0, RPF nbr 10.2.0.2
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:01:11/00:02:51

(1.1.1.1, 239.1.1.1), 00:01:03/00:01:56, flags: JT
  Incoming interface: GigabitEthernet3/0, RPF nbr 10.0.0.1
  Outgoing interface list:
    GigabitEthernet2/0, Forward/Sparse, 00:01:03/00:02:51


Check this out! You can see now that there is an (S,G) entry (remember, shortest path tree) for (1.1.1.1, 239.1.1.1). The incoming interface is Gi3/0, which faces towards the PIM1 router! So now our multicast traffic is NOT arriving via the RP; rather, we received a single packet via the RP, realised what the source is, and therefore built a more efficient tree.

This can be confirmed on router PIM1:


PIM1#show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry, E - Extranet,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group,
       V - RD & Vector, v - Vector
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 239.1.1.1), 00:04:20/stopped, RP 3.3.3.3, flags: SPF
  Incoming interface: GigabitEthernet2/0, RPF nbr 10.1.0.2
  Outgoing interface list: Null

(1.1.1.1, 239.1.1.1), 00:01:15/00:02:14, flags: FT
  Incoming interface: GigabitEthernet1/0, RPF nbr 0.0.0.0, Registering
  Outgoing interface list:
    GigabitEthernet3/0, Forward/Sparse, 00:01:15/00:02:14

As you can see from the above output, we now have an SPT for the (S,G) entry (1.1.1.1, 239.1.1.1). Its incoming interface is as we expect, and the outgoing interface is now Gi3/0, so if we check out our diagram...



The yellow line shows our efficient multicast delivery. We only use the shared tree to learn the source of the multicast; once we know the source, we create an SPT back to it. This is known as the SPT switchover.

Before we go any further you must be 100 percent confident with the above concepts. If you think it was complicated before, you ain't seen nothing yet. Next we will look at Bidir PIM and source-specific multicast, but it's crucially important you understand the way the multicast traffic travels before we can explain the rest.


As I mentioned at the start of this little tutorial, I learnt all about multicast during my routing and switching CCIE. The most useful books for me during that period were the CCIE Routing and Switching Exam Certification Guide and, specifically for multicast, Developing IP Multicast Networks, shown below. Although it's a little bit of an older book, the guy who wrote it peppers it with humor and wit, and the content is explained exceptionally well. If I learnt to understand multicast with it, anyone should be able to. It's a great book. If you enjoyed this blog post and found it useful, please consider purchasing it from one of the links below :)

Multicast Book:
Hard Copy:




Kindle Version:




Routing and Switching Book:

Kindle Version:


Hard Copy:

 











