A Use Case for Policy Routing with KVM and Open vSwitch
Published on 30 May 2013 · Filed in Explanation · 777 words (estimated 4 minutes to read)

In an earlier post, I provided an introduction to policy routing as implemented in recent versions of Ubuntu Linux (and possibly other distributions as well), and I promised that in a future post I would provide a practical application of its usage. This post looks at that practical application: how—and why—you would use Linux policy routing in an environment running OVS and a Linux hypervisor (I'll assume KVM for the purposes of this post).
Before I get into the “how,” let’s first discuss the “why.” Let’s assume that you have a KVM+OVS environment and are leveraging tunnels (GRE or other) for some guest domain traffic. Recall from my post on traffic patterns with Open vSwitch that tunnel traffic is generated by the OVS process itself, and therefore is controlled by the Linux host’s IP routing table with regard to which interfaces that tunnel traffic will use. But what if you need the tunnel traffic to be handled differently than the host’s management traffic? What if you need a default route for tunnel traffic that uses one interface, but a different default route for your separate management network that uses its own interface? This is why you would use policy routing in this configuration. Using source routing (i.e., policy routing based on the source of the traffic), you could easily define a table for tunnel traffic that has its own default route while still allowing management traffic to use the host’s default routing table.
Let’s take a look at how it’s done. In this example, I’ll make the following assumptions:
- I'll assume that you're running host management traffic through OVS, as I outlined here. I'll use the name `mgmt0` to refer to the management interface that's running through OVS for host management traffic. We'll use the IP address 192.168.100.10 for the `mgmt0` interface.
- I'll assume that you're running tunnel traffic through an OVS interface named `tep0`. (This helps provide some consistency with my walk-through on using GRE tunnels with OVS.) We'll use the IP address 192.168.200.10 for the `tep0` interface.
- I'll assume that the default gateway on each subnet uses the .1 address on that subnet.
With these assumptions out of the way, let’s look at how you would set this up.
First, you’ll create a custom policy routing table, as outlined here. I’ll use the name “tunnel” for my new table:
echo 200 tunnel >> /etc/iproute2/rt_tables
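After running that command, /etc/iproute2/rt_tables should contain the new entry alongside the reserved tables. The exact preexisting contents may vary slightly by distribution, but it would look something like this:

```
#
# reserved values
#
255	local
254	main
253	default
0	unspec
200	tunnel
```

The number 200 is an arbitrary (but unused) table ID; the name "tunnel" simply lets you refer to that table by name in `ip rule` and `ip route` commands.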
Next, you'll need to modify /etc/network/interfaces for the `tep0` interface so that a custom policy routing rule and custom route are installed whenever this interface is brought up. The new configuration stanza would look something like this:
auto tep0
iface tep0 inet static
address 192.168.200.10
netmask 255.255.255.0
network 192.168.200.0
broadcast 192.168.200.255
post-up ip rule add from 192.168.200.10 lookup tunnel
post-up ip route add default via 192.168.200.1 dev tep0 table tunnel
(Click here for the same information as a GitHub Gist.)
Finally, you'll want to ensure that `mgmt0` is properly configured in /etc/network/interfaces. No special configuration is required there, just the use of the `gateway` directive to install the default route. Ubuntu will install the default route into the main table automatically, making it a "system-wide" default route that will be used unless a policy routing rule dictates otherwise.
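For reference, a minimal `mgmt0` stanza might look like this (the addresses follow the assumptions above; nothing beyond the standard `gateway` directive is needed):

```
auto mgmt0
iface mgmt0 inet static
address 192.168.100.10
netmask 255.255.255.0
network 192.168.100.0
broadcast 192.168.100.255
gateway 192.168.100.1
```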
With this configuration in place, you now have a system that:
- Can communicate via `mgmt0` with other systems in other subnets via the default gateway of 192.168.100.1.
- Can communicate via `tep0` to establish tunnels with other hypervisors in other subnets via the 192.168.200.1 gateway.
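If you want to confirm which table the kernel will consult for a given flow, you can ask the routing subsystem directly with `ip route get`. The destination address below is a placeholder; the source addresses are the hypothetical ones from this example, and these commands would be run on the configured hypervisor:

```shell
# Traffic sourced from the tep0 address matches the policy rule and
# should resolve via the "tunnel" table (default via 192.168.200.1)
ip route get 192.0.2.50 from 192.168.200.10

# Traffic with no explicit source (e.g., management traffic) falls
# through to the main table and its default route via 192.168.100.1
ip route get 192.0.2.50
```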
This configuration requires only the initial configuration (which could, quite naturally, be automated via a tool like Puppet) and does not require using additional routes as the environment scales to include new subnets for other hypervisors (either for management or tunnel traffic). Thus, organizations can use recommended practices for building scalable L3 networks with reasonably-sized L2 domains without sacrificing connectivity to/from the hypervisors in the environment.
(By the way, this is something that is not easily accomplished in the vSphere world today. ESXi has only a single routing table for all VMkernel interfaces, which means that management traffic, vMotion traffic, VXLAN traffic, etc., are all bound by that single routing table. To achieve full L3 connectivity, you’d have to install specific routes into the VMkernel routing table on each ESXi host. When additional subnets are added for scale, each host would have to be touched to add the additional route.)
Hopefully this gives you an idea of how Linux policy routing could be effectively used in environments leveraging virtualization, OVS, and overlay protocols. Feel free to add your thoughts, ideas, corrections, or questions in the comments below. Courteous comments are always welcome! (Please disclose vendor affiliations where applicable.)