2011
Sep 20

Virtualizing the OpenBSD Routing Table

Introduction

The OpenBSD routing table can be carved into multiple virtual routing tables, allowing complete logical separation of attached networks. This article gives a brief overview of rtables and explains how to successfully leak traffic between virtual routing domains.

The ability to virtualize the routing table in OpenBSD first appeared in version 4.6. Since then the functionality has matured nicely, with support for virtual routing tables now present in userland tools such as dhclient(8) and dhcpd(8) and in the routing protocol daemons ripd(8), ospfd(8), and bgpd(8). On the kernel side, pf(4) has been extended to filter packets based on the routing table they came in on and to move packets between routing tables. This article will concentrate on the latter, with examples of how to set up separate routing tables and leak traffic between them successfully.

Using separate routing tables is similar to using VRFs in Cisco IOS or routing instances in Juniper’s JUNOS. Multiple routing tables are created, each of which contains its own forwarding and ARP information. In OpenBSD, each routing table is called an “rtable”. Network interfaces can be bound to an rtable, which causes traffic going through the interface to be forwarded based on the information present in that rtable. When one or more interfaces are bound to an rtable, the rtable and all of the interfaces bound to it are called a routing domain, or “rdomain”.

Basic Configuration

Creating an rtable is done using route(8) with the -T argument.

# route -T 1 add 0.0.0.0/0 192.168.1.1

This creates rtable 1 if it doesn’t already exist and adds a default route to it.

Interfaces are bound to an rtable using ifconfig(8) with the “rdomain” keyword.

# ifconfig vic1 rdomain 1

This binds the vic1 interface to rtable 1.

To execute a command within a non-default rtable, use the route(8) command with the exec keyword.

$ route -T 1 exec telnet 10.5.3.29

This executes the telnet command within rtable 1. Certain commands such as ping(8) and arp(8) have their own command line arguments that will place them into an rtable (the -V argument in this case).
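
The steps in this section can be pulled together in a small script. This is only a sketch: rdomain_cmds is a made-up helper name, the address 192.168.1.100/24 is an example value, and the script prints the commands rather than running them so they can be reviewed before being applied as root on the router. It also assumes the interface should be bound to its rdomain before it is given an address.

```sh
#!/bin/sh
# Sketch only: rdomain_cmds is a hypothetical helper, not an OpenBSD tool.
# It prints the commands instead of running them so they can be reviewed
# first; pipe the output to sh as root on the router to apply it.
rdomain_cmds() {
    ifname=$1; id=$2; addr=$3; mask=$4; gw=$5
    # Bind the interface before addressing it, on the assumption that
    # changing the rdomain of an interface drops its existing addresses.
    echo "ifconfig $ifname rdomain $id"
    echo "ifconfig $ifname inet $addr netmask $mask"
    echo "route -T $id add default $gw"
    # Read-only checks: show the rtable contents, then ping inside the rdomain.
    echo "route -T $id -n show -inet"
    echo "ping -V $id -c 3 $gw"
}

# Example values based on this article's sample network.
rdomain_cmds vic1 1 192.168.1.100 255.255.255.0 192.168.1.1
```

Once the printed commands look right, the same call can be piped to sh to apply them.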

Setting up rdomains

By default, all interfaces on an OpenBSD host belong to rdomain 0. Traffic can flow freely between all interfaces (assuming the pf(4) ruleset allows it) without any special handling. Similarly, traffic can flow between all interfaces in the same non-default routing domain without any special handling (again, as long as the pf(4) ruleset passes this traffic).

[Figure: Two OpenBSD Routing Domains]

In this network, Host 1 and Host 2 both belong to rdomain 1. Routing domain 1 has routes to the 192.168.1/24 and 172.16.0/24 networks because they are directly attached, so traffic between the two is forwarded without any special consideration. Hosts 1 and 2 cannot talk to Host 0 because Host 0 is connected to a separate routing domain.

As shown in the picture, pf(4) is used to connect routing domains. This is really powerful because pf(4) allows for very fine-grained packet matching which means you can be as specific or broad as you want when it comes to what traffic you want to pass between rdomains. Sending traffic between rdomains is done by using the rtable keyword in pf.conf.

pass in on vic1 to 172.16.2.0/24 rtable 0
pass out on vic0

This is the basic ruleset needed to allow Host 1 to initiate a connection to Host 0.

The rtable must be specified on the rule that matches traffic inbound to the OpenBSD router. As stated in the pf.conf(5) man page, the resulting route lookup will only work correctly if the rtable is specified on the inbound rule. This ruleset is not enough for traffic to flow bidirectionally. We also have to look at the routing entries within the source and destination routing domains.

The source routing domain, in this case rdomain 1, is easy. pf(4) will magically handle taking the packets out of rdomain 1 and sending them to rdomain 0, so we do not need a route for 172.16.2/24 in rtable 1. Reverse traffic is different. Routing domain 0 requires a route be present for 192.168.1/24. The next-hop for this route isn’t really important; what’s important is that it’s present in the rtable. If a route isn’t present, then the route lookup will fail before pf(4) has a chance to move the packet into rdomain 1 and the return traffic will be dropped. Note that the route doesn’t have to be exactly 192.168.1/24. It could be 192.168/16 or even 0.0.0.0/0. The important part is that there is some kind of route in rtable 0 that will match the network in rdomain 1.

# route -T 0 add 192.168.1/24 -iface 172.16.2.137

This is kind of a cheat. It creates a route for 192.168.1/24 as a connected route on the rdomain 0 interface. Obviously this isn’t correct, but it doesn’t really matter. It achieves the goal of getting a route into rtable 0. Host 1 can now successfully talk to Host 0.

An alternative to creating a “connected” route is to set the next-hop of the 192.168.1/24 route to the loopback IP.

# route -T 0 add 192.168.1/24 127.0.0.1

The loopback interface provides a really convenient place to point your reverse path routes.

The caveat with this is that pf(4) must be active on the loopback interface the route points at. The default pf.conf ruleset contains “set skip on lo” which disables pf(4) on every loopback interface and will result in return traffic being dropped. Be sure that your loopback isn’t being “skipped”.
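
One way to handle this in pf.conf is sketched below. It assumes the reverse path route points at lo0. The pass rule is my own suggestion for keeping ordinary loopback traffic working once pf(4) filters that interface, and it may need adjusting to fit the rest of your ruleset.

```
# pf.conf fragment (sketch, untested against any particular ruleset)

# Default line that hides loopback from pf(4). Remove it or comment it out:
# set skip on lo

# Assumed replacement: explicitly allow local traffic on the loopback.
pass on lo0
```

If the reverse path route points at a loopback in another rdomain, that interface needs the same treatment.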

The same idea works between two non-default routing domains.

[Figure: Three OpenBSD Routing Domains]

Creating a loopback interface in rdomain 2 so that Host 1 can talk to Host 2 would look like:

# ifconfig lo2 rdomain 2 127.0.0.1
# route -T 2 add 192.168.1/24 127.0.0.1

Since lo2 is created inside rdomain 2, the IP address assigned to it doesn’t conflict with lo0 in rdomain 0.

Another caveat with the pf(4) ruleset is that the states that get created by the rule that specifies the rtable must be “floating”.

If you’ve changed the “state-policy” option in your pf.conf from the default of “floating” then you must use the “floating” keyword in your inbound rule.

set state-policy if-bound
pass in on vic1 to 172.16.2.0/24 rtable 0 keep state (floating)
pass out on vic0

All of the above guidance also applies if you’re doing NAT on the outbound interface.

pass in on vic1 to 172.16.2.0/24 rtable 0
pass out on vic0 nat-to vic0

This ruleset would hide the 192.168.1/24 network from hosts in rdomain 0 by translating the source 192.168.1.x IP to the IP address on the vic0 interface. This might be necessary if there’s already a 192.168.1 network in rdomain 0. Even though you’re doing NAT, you still need a route in rdomain 0 that points back to the real source network (192.168.1/24) in rdomain 1.

Sample Use Cases

Routing domains can be used to isolate a test/dev network from production.

[Figure: Two OpenBSD Routing Domains]

In the sample network from earlier, rdomain 0 could be the production network with production servers and the users connected to it. Routing domain 1 could be a test network where applications and systems are put through testing before being moved into rdomain 0. In order to prevent the test systems from possibly affecting the production systems, they could be isolated in their own routing domain, ensuring that test traffic cannot get into the production network. In fact, the test network could even use the same IP addresses as the production network without them stepping on each other. A pf(4) ruleset could be written that lets management/administrative traffic from the production network into test. A ruleset could also be written that allows the test systems to talk to a specific management or file server in the production network. If overlapping IP space is used, traffic between the rdomains must be NAT’d as outlined above.

Routing domains can also be used to connect to multiple ISPs. Since userland tools such as dhclient(8) work properly within routing domains, each ISP interface could be put into its own routing domain without the risk of conflicting default routes.

[Figure: Three OpenBSD Routing Domains]

Here if vic1 is connected to ISP#1 and vic2 is connected to ISP#2, the pf(4) ruleset would control which ISP connection to use when users in rdomain 0 connect to the Internet. This provides a much more elegant solution than the outbound load balancing example I wrote about in the PF User’s Guide.

The only shared component of a multiple-dhclient(8) setup is the resolv.conf(5) file. Each copy of dhclient(8) will update the file as it renews its lease.
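
As a rough, untested sketch of what the pf(4) side of such a setup might look like: assume vic0 is the LAN interface in rdomain 0, vic1 (ISP#1) is in rdomain 1, and vic2 (ISP#2) is in rdomain 2. Each ISP interface would get its address and default route from dhclient(8), for example with an “rdomain 1” line followed by a “dhcp” line in /etc/hostname.vic1. The split of web traffic to ISP#2 is only an example policy.

```
# pf.conf fragment (sketch). Interface roles and the traffic split are assumptions.
# The last matching rule wins, so the general rule comes first.

# Everything leaving the LAN goes out via ISP#1 by default.
pass in on vic0 to ! vic0:network rtable 1

# Example policy: send web traffic out via ISP#2 instead.
pass in on vic0 proto tcp to ! vic0:network port { 80 443 } rtable 2

# Translate to whatever address dhclient(8) has put on each ISP interface.
pass out on vic1 nat-to (vic1)
pass out on vic2 nat-to (vic2)
```

The default route that dhclient(8) installs in each ISP rdomain should be enough to match the return traffic, since any route covering the LAN addresses will do for the reverse lookup.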

Conclusion

By virtualizing the OpenBSD routing table you can create virtual routers and/or firewalls within the same physical OpenBSD machine. Networks can be safely isolated from each other without having to worry about traffic crossing network boundaries or IP addresses overlapping. Routing domains can be created by binding one or more interfaces to a routing table so that all traffic crossing those interfaces is automatically forwarded based on the routes present in the virtualized routing table. Traffic can be leaked between routing domains by using the granular pf(4) packet matching syntax to allow policy-based communication between routing domains.

40 Comments

  1. By Denis on Sep 27, 2011 at 1:45pm MDT |

    Thank you for that insight.

    In your opinion, are routing domains suitable for isolating gif(4) tunnels? For me, if I put gif(4) interfaces in another routing domain, the tunnel won’t come up.

    • By Joel Knight on Sep 27, 2011 at 9:54pm MDT |

      Hey Denis,

      I haven’t actually tested tunnel interfaces in an rdomain. I would assume that if you’re doing something like

      ifconfig gif0 rdomain 5
      ifconfig gif0 tunnel 1.1.1.1 2.2.2.2

      that your local 1.1.1.1 interface would also need to be in rdomain 5 and that you’d need a route to 2.2.2.2 in rdomain 5. Is that how you’re doing it?

      Have you seen the tunneldomain ifconfig(8) option? It’ll let you place the inner tunnel traffic into an rdomain.

      I’m curious now if the gif interface is put into an rdomain and a tunneldomain is not configured whether the inner tunnel traffic would be routed in rdomain 0 or in the gif’s rdomain. That’d be interesting to test.

    • By Denis on Nov 13, 2011 at 2:12pm MDT |

      Hello,

      I did some experiments and it is really easy in fact. “tunneldomain” was the directive I was looking for, thank you for pointing it out!

      # ifconfig gif0 rdomain 1
      # ifconfig gif0 tunnel 192.0.2.1 198.51.100.1 tunneldomain 0

      Rtable 0 is the “regular” routing table and rtable 1 is my isolated network reachable over gif0 :)

  2. By Bink on Sep 27, 2011 at 3:29pm MDT |

    Thank you for taking the time to do this write-up—particularly because there is little literature on the subject!

  3. By joe on Sep 27, 2011 at 3:36pm MDT |

    Very interesting and in-depth article about an OpenBSD feature that was unknown to me.

    Thanks.

  5. By Claer on Sep 28, 2011 at 4:44am MDT |

    I used gif and rdomains

    ifconfig gif0 rdomain 5
    ifconfig gif0 tunnel 1.1.1.1 2.2.2.2

    In your example, 1.1.1.1 and 2.2.2.2 are in rdomain 0 by default. This was wonderful for using several rdomains with ipsec+gif before reyk explained how to use several enc(4) interfaces.
    The tunneldomain option is used to place the 2 IP addresses above in the rdomain you want.

    • By Joel Knight on Sep 28, 2011 at 7:10am MDT |

      Did I have it backwards then? So tunneldomain is for the outer traffic and rdomain for the inner?

  6. By Denis on Sep 28, 2011 at 1:13pm MDT |

    Thank you very much for the pointers :)

  7. By chris on Sep 28, 2011 at 3:13pm MDT |

    Thank you very much! A very interesting article for me.

  8. By johnw on Sep 29, 2011 at 6:25am MDT |

    It looks very powerful and interesting.

  9. By dave on Sep 29, 2011 at 8:00am MDT |

    Thank you. Great Job!

  10. By Joakim on Sep 30, 2011 at 11:43am MDT |

    Really nice, I had totally missed this functionality. Thanks!

  11. By Web Hosting on Oct 3, 2011 at 8:47am MDT |

    Nice one, again another reason to dump out those expensive Cisco routers and have a smart OpenBSD box instead!

  12. By Omer on Oct 6, 2011 at 12:49pm MDT |

    Thanks! Great howto and that’s exactly what i was looking for.

    One small question: how do I make this configuration persistent? I mean, how can I make it survive a reboot?

    • By Joel Knight on Oct 6, 2011 at 1:43pm MDT |

      Hey Omer,

      Except for the pf policy, everything should be doable in the hostname.if(5) files. Going back to the example network in the article, I would create /etc/hostname.vic1 like this:

      rdomain 1
      inet 192.168.1.100 255.255.255.0
      descr "Connection to 192.x network in rdomain 1"

      If you needed to add a route to rdomain 0 for a network inside rdomain 1, you could add your route statement to the file as well:

      !route -T 0 add 192.168.1/24 -iface 172.16.2.137

      If you do it this way just make sure the interface that 172.16.2.137 is assigned to is configured first during boot.

  13. By Russell Garrison on Nov 10, 2011 at 10:19am MDT |

    This is a great resource for using rdomains and really helps over using just the man pages. I hope that sometime this will get picked up in the FAQ. I have two questions though:

    1) You mention needing to put a route into the domain for return traffic even if using NAT, but I haven’t run into that using 4.9. Was there a change? My test setup used rdomain 0 for the internal IF and rdomain 1 for the outside. The /etc/mygate is set to the ip of the internal IF for rdomain 0 and a default route pointing to the ISP router is added to rdomain 1 via /etc/hostname !directive. Does this work on accident, or should I add a lo2 and a route to the internal (rdomain 0) net in rdomain 1?

    2) I have also noticed that any time I create an alias in any rdomain the entry is added to rdomain 0 route table and not the one I specify even with a complete line “family alias address netmask broadcast rdomain n” in my hostname file. Is that normal, or am I missing something?

    Sorry for the kind of basic questions, but your post is the first I have run into that does a good job explaining. I am using rdomains to handle a dual-ISP setup, but it is a constant learning process for me.

    • By Joel Knight on Nov 10, 2011 at 1:05pm MDT |

      Hi Russell,

      Thanks a lot for your comments and questions.

      A1. Your setup is actually working exactly as I described it should. The default route you have in rdomain 1 is enough to match the destination of the return traffic. So for a packet on the return path, it’ll be run through NAT so the packet has the internal, rdomain 0 IP as its destination address. That destination is looked up in rtable 1 and matches the default route there. Since the lookup was successful the process continues and pf moves the packet into rdomain 0 where it is sent to the end host.

      A2. I didn’t actually test aliases. That’s a good one. To me, that looks like a bug. Possibly ifconfig is not passing the rdomain to the kernel or the kernel is not adding the host route in the proper rdomain.

      • By Joel Knight on Nov 10, 2011 at 1:24pm MDT |

        Just did some quick testing with alias. I can’t reproduce what you’ve described. As long as my interface is already in an rdomain, all aliases are added correctly and routes are put into the proper rtable. This is on 4.9 and 5.0.

  14. By Ilya Shipitsin on Dec 19, 2011 at 11:25am MDT |

    I’m getting “tcp rst” in dual-ISP rdomains mode.
    can you please post detailed working config for dual-ISP with rdomains ?

    • By Joel Knight on Dec 19, 2011 at 12:23pm MDT |

      Hi Ilya,

      If you provide more information about your configuration, I can try and help you debug it.

    • By Joel Knight on Dec 20, 2011 at 2:04pm MDT |

      After talking out of band with Ilya, this turned out to be a case where sshd was running on the OpenBSD router where rdomains were configured and when trying to connect from a non-default rdomain to that sshd instance, the router was responding with a TCP RST.

      One thing the article above doesn’t talk about is sockets. Sockets actually belong to an rdomain. If sshd is started like normal (ie, “/usr/sbin/sshd”), then it opens a listening socket in rdomain 0. As far as rdomain 1 is concerned, port 22 is not listening and the kernel will respond with a TCP RST to any inbound connection attempts to port 22. The solution is to launch an sshd instance for rdomain 1 (ie, “route -T1 exec /usr/sbin/sshd”). This sshd instance will open a listening socket, and because it’s been exec’d in rdomain 1, that socket will be associated with that rdomain.

  15. By Ilya Shipitsin on Dec 19, 2011 at 6:24pm MDT |

    I can send config privately, what is your email?

  16. By Tero Marttila on Dec 29, 2011 at 9:30am MDT |

    Awesome, using routing domains enabled me to build a working proxy-ARP setup for my OpenBSD 5.0 router (ISP gw with x.y.z.1/24 directly on-link, no route from ISP to our own box, and ISP gw itself proxy-arp’s the entire internet to itself).

    Separate routing domain for em0 (x.y.z.2/24) to ISP, populate it with `arp -s x.y.z.4-254 pub permanent`, and then set up pf `match in on … to … rtable …` rules on em0 and em1 (x.y.z.254/24 + pub arps for x.y.z.1-2).

    It works perfectly for forwarded traffic – but fails for all locally originated traffic on the default routing domain :(

    In this setup, rdomain 2 (em0 / ISP link) has the real default route (x.y.z.1), whereas rdomain 0 (em1 / LAN) has a fake default route (also for x.y.z.1, which is either alias’d or arp-proxied back to em1). Seemingly, locally generated traffic never ingresses on any interface, and `rtable …` rules only work on ingress :(

    How should the default routing domain and its default route be set up to best keep outgoing traffic from the host itself working, in addition to forwarded traffic?

    • By Joel Knight on Dec 29, 2011 at 10:41am MDT |

      Hi Tero, thanks for explaining your unique setup. You are right, traffic on the router will not pass through the inbound pf rules and won’t be moved into a different rdomain. The software/daemons on the box need to specify which rdomain their socket should use when they create it otherwise they are stuck in rdomain 0. Lots of OpenBSD daemons have command line or conf file options to set their rdomain. For everything else you’ll have to use “route -Tx exec”.

      For simplicity and to avoid the issue you’re having, it’s probably best practice for everyone to keep the main ISP or the one that doesn’t do anything oddball in rdomain 0. It should be thought of as the default ISP and traffic is only leaked to other ISPs/rdomains as needed. This keeps the rule set simple and allows traffic off the router without issue.

      • By Tero Marttila on Dec 29, 2011 at 11:05am MDT |

        Moving the WAN to rdomain0 doesn’t help, because then the routes/hosts on your LAN/rdomainX become unreachable, in the same way… And besides, in this case, the WAN rdomain is full of /32 proxy-arp’d routes for all the LAN hosts, and those are all “fake”..

        Well, this is starting to feel a little scary even for me, but I actually managed to hack this into a working state by applying a little NAT:

        pass out on $lan_if from self to route lan-egress rtable $wan_domain nat-to $wan_if

        $lan_if being the rdomain0/default route destination, and the default route having a ‘lan-egress’ -label attached – very neat way to do the basic `match out on $lan_if to route rtable $wan_domain` btw.

        Remaining problem I still have is that TCP RST’s generated by `block out on $lan_if …` (i.e. after a `match in on $wan_if … rtable 0` change) are sent out on $lan_if… and the sender behind $wan_if ends up getting an `icmp host … unreachable` reply for blocked ports :/

        • By Joel Knight on Dec 29, 2011 at 12:09pm MDT |

          Nice trick with the route label.

          Curious what kind of traffic the router is initiating that is causing you this problem?

          • By Tero Marttila on Jan 3, 2012 at 11:22am MDT |

            1) TCP SYN comes in on em0, rdomain 2
            2) `match in on em0 … rtable 0`
            3) TCP SYN packet routes out on em1 in rdomain 0
            4) `block out on em1 …`
            5) pf generates a TCP RST… and sends it out on the rdomain 0 fake default route intended for leaking traffic to rdomain 2 (i.e. out on lo1) – which doesn’t go anywhere, since it never matches the `match in on em1 … rtable 2` -rule used for routing forwarded traffic.

            All in all, it seems that routing traffic between domains doesn’t work all that well :/

            Is there any way to work around this? The generated TCP RST in rdomain 0 doesn’t seem to go through pf at all.

            P.S.: this comment box is getting ridiculously small

            • By Joel Knight on Jan 3, 2012 at 3:15pm MDT |

              pf doesn’t keep track of which rdomain a packet came from when it moves a packet into a new rdomain. This is why pf isn’t able to automatically route the RST back using the original rdomain. To avoid this you must filter your traffic on the input interface, before the incoming packet is marked with the new rdomain. And this would go for anybody using rdomain leaking, not just you.

              And you’re right, the comment box was getting awfully tight. I changed some CSS around so things look much better now. Thanks.

  17. By TANABE Ken-ichi (@nabeken) on Jan 4, 2012 at 2:20am MDT |

    http://t.co/75GZauf5 Virtualizing the OpenBSD Routing Table

  18. By Antonio Feitosa (@antonio_cfc) on Feb 7, 2012 at 6:21pm MDT |

    RT @knight_joel: @phessler You bet! Virtualizing the #OpenBSD Routing Table http://t.co/M00Db6ei

  19. By jofcho on Mar 25, 2012 at 11:38pm MDT |

    Hi, nice article, thanks.
    I tried to set up a dual-ISP connection with rdomains, but failed to get the pf lines right.
    The example at http://www.openbsd.org/faq/pf/pools.html#outgoing uses route-to and round-robin to achieve load balancing. What about rtables? Is there a working example?

    • By Joel Knight on Mar 26, 2012 at 12:14pm MDT |

      Hi jofcho,

      All of the components you need to make a dual ISP setup work are talked about in the article. What does your config look like so far? What works and what doesn’t?

      • By jofcho on Mar 27, 2012 at 2:18am MDT |

        Well, I made all the settings, and I’m able to transmit and receive data between domains. But how do I do something like:
        “pass in on $int_if from $lan_net \
        route-to { ($ext_if1 $ext_gw1), ($ext_if2 $ext_gw2) } \
        round-robin”

        with rtables?

        pass in on vic1 to 0.0.0.0/0 rtable 0
        pass out on vic0 nat-to vic0
        pass in on vic1 to 0.0.0.0/0 rtable 1
        pass out on vic2 nat-to vic2
        How do I combine these two rules?

        • By Joel Knight on Mar 27, 2012 at 2:24pm MDT |

          I can’t think of a straightforward way of doing that.

          Keep in mind that rdomains aren’t designed to do what you’re doing. They’re meant to provide isolation at Layer 3 (and below). You’re trying to do round robin routing to balance Internet use. Stick with the ‘route-to’ method.

  20. By jofcho on Mar 28, 2012 at 4:30am MDT |

    Actually, I’m trying to load balance outbound connections across two ISPs.
    Maybe this quote misled me:
    “This provides a much more elegant solution than the outbound load balancing example I wrote about in the PF User’s Guide.”

    • By Joel Knight on Mar 28, 2012 at 9:01pm MDT |

      Ah, I can see why that would be misleading. I meant more in the sense that the pf ruleset is a bit cleaner and that you can do the dual-ISP setup using dynamically assigned Internet IPs with rdomains whereas with route-to, you’re pretty well stuck needing static IPs (because you have to specify the gateway IPs in the ruleset).

  21. By Danie on Mar 29, 2012 at 3:54am MDT |

    Do you perhaps have a example of overlapping subnets for rdomain 1 and 2?

    in my case, I’ve setup vlan10 in rdomain 10 and vlan11 in rdomain 11 both have the same network 10.0.0.0/18 and same gateway ip of 10.0.0.1. I can ping both using the ping -V10/11 10.0.0.1 and also the hosts located on the 2 subnets. (vlan10/host = 10.0.0.4 and vlan11/host = 10.0.0.2)

    in pf.conf – I’ve added the following,

    pass in on vlan10 to 172.29.0.0/16 rtable 0
    pass in on vlan11 to 172.29.0.0/16 rtable 0

    This allows ping from both 10.0.0.2 and .4 hosts to my external interface em0 (172.29.43.239)

    But now I would like to access the host 10.0.0.2 from a host 172.29.43.20 by accessing a natted IP of say 172.29.43.240->10.0.0.2 (rdomain 10).

    • By Joel Knight on Apr 2, 2012 at 5:42pm MDT |

      Hi Danie,

      That’s a good scenario. I haven’t tested this, but here’s what I’ve got off the top of my head.

      # push tcp/80 traffic into rdomain 10 and do dest addr translation
      pass in on em0 proto tcp to 172.29.43.240 port 80 rtable 10 rdr-to 10.0.0.2

      # get traffic from rdomain 10 destined to 172-net back into rdomain 0
      pass in on vlan10 to 172.29.0.0/16 rtable 0

      # setup reverse route in rdomain 10
      ifconfig lo10 rdomain 10 127.0.0.1
      route -T 10 add 172.29.0.0/16 127.0.0.1

      # (no route needed in rdomain 0)

      Let me know what you think. I’m interested to know if this works or not. Sounds like you might have all this already except for the first pf rule.

      • By Danie on Apr 3, 2012 at 7:04am MDT |

        Hi Joel,
        thanks for the reply.

        The push traffic to rdomain and local loopback for each rdomain did the trick.

        I’m still getting used to the idea of rdomains and pf.

        Thanks!