As I've written about previously (The Importance of BGP NEXT_HOP in L3VPNs), the BGP NEXT_HOP attribute is key to ensuring end-to-end connectivity in an MPLS L3VPN. In that article, I examine how the network's forwarding behavior differs depending on which of the egress PE's IP addresses is used as the NEXT_HOP. In this article I'll look at the subnet mask that's associated with the NEXT_HOP and the differences in forwarding behavior when the mask is configured to different values.

There is a lot of (mis-)information on the web stating that the PE's loopback address (which, as I explain in the previous article, should always be used as the NEXT_HOP) must have a /32 mask. This is not exactly true. I think this is an example of some information that has been passed around incorrectly, and without proper context, and is now taken as a rule. I'll explain more about this further on in the article.

Example Network

Here's the example network:

Examining the Network

The first thing to test is whether R50 can reach R8 at 192.168.100.8.

R50# ping 192.168.100.8 timeout 1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.100.8, timeout is 1 seconds:
.....
Success rate is 0 percent (0/5)

R50# show ip route 192.168.100.8
Routing entry for 192.168.100.0/24
  Known via "bgp 65535", distance 20, metric 0
  Tag 200, type external
  Last update from 192.168.19.2 00:08:38 ago
  Routing Descriptor Blocks:
  * 192.168.19.2, from 192.168.19.2, 00:08:38 ago
      Route metric is 0, traffic share count is 1
      AS Hops 1
      Route tag 200
      MPLS label: none

The ping fails because end-to-end connectivity is broken. However, the routing table has a good route, so the issue appears to be somewhere in the data plane.

Inspecting the Routing Information Base (RIB) and Label Forwarding Information Base (LFIB) on R2 reveals something interesting:

R2# show bgp vpnv4 unicast vrf BRANCHES 192.168.100.8
<...>
  Local
    10.1.7.7 (metric 31) from 10.1.7.7 (10.1.7.7)
      Origin incomplete, metric 0, localpref 100, valid, internal, best
      Extended Community: RT:200:1
      mpls labels in/out nolabel/26

The RIB has the proper next-hop of 10.1.7.7 (R7's loopback0) so that all looks fine.

R2# show mpls forwarding-table 10.1.7.7
Local      Outgoing   Prefix         Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id   Switched      interface
28         23         10.1.7.7/32    0             Et0/0      10.2.23.3

R2 has a good LFIB entry for R7's loopback, but there's something odd: the mask in the LFIB entry is a /32 instead of the /24 that is configured on R7. Let's look at R6, the "penultimate router" (i.e., the second-to-last router on the Label Switched Path (LSP)), to see what its tables look like.

R6# show mpls forwarding-table 10.1.7.7
Local      Outgoing   Prefix         Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id   Switched      interface
22         No Label   10.1.7.7/32    1813          Et0/1      10.2.67.7

There's a problem: "no label" is not something you want to see on the penultimate router.

R6 is removing the entire label stack, including the VPN label, and sending R50's packets on towards R7 as straight IP packets. Without a VPN label to tie the packets to the VRF, R7 attempts to route them in the global table, which fails.

Analysis

First things first: why doesn't R6 have a label for R7's loopback? Did R7 not send one?

R6# show mpls ldp bindings 10.1.7.7 32
  lib entry: 10.1.7.7/32, rev 24
        local binding:  label: 22
        remote binding: lsr: 10.1.3.3:0, label: 23

It does indeed appear that R7 did not send R6 a label for its loopback. If it did, there would be a remote binding from 10.1.7.7 with a label value in the above output.

Let's think carefully about this though. R7's loopback is 10.1.7.7/24 and we're looking at the bindings for 10.1.7.7/32. Is there a binding for the /24?

R6# show mpls ldp bindings 10.1.7.0 24
  lib entry: 10.1.7.0/24, rev 25(no route)
        remote binding: lsr: 10.1.7.7:0, label: imp-null

Yes! That's the output we want to see. But there's another clue here: "no route".

R6# show ip route 10.1.7.0 255.255.255.0 longer-prefixes
<...>
      10.0.0.0/8 is variably subnetted, 14 subnets, 2 masks
O        10.1.7.7/32 [110/11] via 10.2.67.7, 00:45:58, Ethernet0/1

R6 has a route for the /32 but not for the /24 (this helps explain why R2's LFIB entry was also for a /32).

This leads to the discovery that R7 is doing two things:

  1. R7 is advertising an LDP label for the prefix 10.1.7.0/24
  2. R7 is advertising a route via OSPF for 10.1.7.7/32

This has created a mismatch between the two control plane protocols.

If R7's loopback0 is configured as a /24, why is it advertising it as a /32? This directly relates to what I wrote earlier about OSPF being significant as the choice for the IGP. By default, OSPF treats loopback interfaces as type "LOOPBACK" and advertises the directly connected subnet with a /32 mask, regardless of how the interface is configured.

R7# show ip ospf interface loopback0
Loopback0 is up, line protocol is up
  Internet Address 10.1.7.7/24, Area 0, Attached via Network Statement
  Process ID 1, Router ID 10.1.7.7, Network Type LOOPBACK, Cost: 1

Conclusion

Based on this analysis, the correct mask for the PE's loopback interface is: the same mask that is advertised by the IGP.

There are two ways to accomplish this in the example network:

  1. Modify OSPF to advertise the true mask for R7's loopback
  2. Modify the configured mask on R7 to a /32

If OSPF is modified on R7 to advertise a /24 mask, then the following happens:

  • R6 no longer receives a /32 route and the /24 route becomes the best path
  • R7 continues to advertise a label binding for the /24 to R6
  • R6 takes the /24 label binding and the /24 OSPF route and pushes them both into the LFIB
  • R6 now sends labeled packets to R7 and end-to-end connectivity is restored

R6# show ip route 10.1.7.0 255.255.255.0 longer-prefixes
<...>
      10.0.0.0/8 is variably subnetted, 14 subnets, 2 masks
O        10.1.7.0/24 [110/11] via 10.2.67.7, 00:01:02, Ethernet0/1

R6# show mpls ldp bindings 10.1.7.0 24
  lib entry: 10.1.7.0/24, rev 26
        local binding:  label: 17
        remote binding: lsr: 10.1.7.7:0, label: imp-null
        remote binding: lsr: 10.1.3.3:0, label: 22
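
For reference, here is a minimal sketch of how the first option might be configured on R7. The outputs above don't show the exact configuration that was used, so treat this as an assumption rather than the definitive change: on Cisco IOS, setting the loopback's OSPF network type to point-to-point is a common way to make OSPF advertise the interface's configured mask instead of a /32 host route.

R7# configure terminal
R7(config)# interface Loopback0
R7(config-if)# ip ospf network point-to-point
R7(config-if)# end

After a change like this, "show ip ospf interface loopback0" on R7 should no longer report Network Type LOOPBACK, and the route and binding checks on R6 shown above can be repeated to confirm that both protocols now agree on the /24.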

If the mask configured on R7 is modified to a /32, then the following happens:

  • R7 stops advertising a label binding for the /24 and starts advertising one for the /32
  • R6 takes the existing /32 OSPF route and the newly received /32 label binding and pushes them both into the LFIB
  • R6 now sends labeled packets to R7 and end-to-end connectivity is restored

R6# show mpls ldp bindings 10.1.7.7 32
  lib entry: 10.1.7.7/32, rev 24
        local binding:  label: 22
        remote binding: lsr: 10.1.3.3:0, label: 23
        remote binding: lsr: 10.1.7.7:0, label: imp-null
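
The second option is a simple address change on R7. A minimal sketch, assuming Loopback0 keeps the address 10.1.7.7 and only the mask changes:

R7# configure terminal
R7(config)# interface Loopback0
R7(config-if)# ip address 10.1.7.7 255.255.255.255
R7(config-if)# end

This leaves OSPF's default loopback behavior alone and instead brings LDP in line with it. Keep in mind that re-addressing the loopback may briefly disturb any sessions sourced from it, so in a production network it's a change best made in a maintenance window.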

Bonus Points

You can see why the "rule" stating "MPLS PEs must have a /32 loopback" was passed around. Since loopbacks are very rarely (are they ever?) configured with a prefix length other than /32, following this "rule" would generally keep you out of trouble. However, it's not a hard-and-fast rule; it's general guidance that happens to avoid the mismatch described above.

This whole situation is only likely to be an issue when using OSPF as the IGP in the MPLS core due to its described behavior of advertising loopbacks with a /32 by default. This is the only case where an IGP, by default, will not advertise a connected subnet with its true mask.


Disclaimer: The opinions and information expressed in this blog article are my own and not necessarily those of Cisco Systems.