The Border Gateway Protocol (BGP) exchanges routing information between autonomous systems. Routers use it to decide locally, among a set of neighboring routers, which one should receive IP (and other) traffic for a given destination network prefix. In our BGP blog post, we describe how BGP chooses among routes using best path selection.
At Datapath.io, we built our own implementation of BGP. We needed one because we use the protocol in a non-standard way: we alter best path selection on a per-customer basis. This post describes how we accomplish that.
High Speed vs Open Source
The supported BGP features vary a little from one router vendor to the next. If your vendor's feature set does not meet the requirements of your use case, you have to write your own implementation of BGP or ask your vendor to add the features you need (do not waste time trying). For the sake of this post, we will stick to the first option.
There are many excellent open source implementations of the protocol that can be used as a starting point, including Quagga, BIRD, OpenBGPD, ExaBGP and XORP. They have reached a level of maturity that allows them to be used in production. But there is more to it.
BGP usually runs on routers that forward hundreds of gigabits of traffic per second. To achieve such speeds, router vendors use special ASICs (application-specific integrated circuits), which are built to match and change bits in protocol headers at a high rate. These ASICs are expensive for router vendors to produce. However, customers tend to buy on software features rather than looking solely at the specs of the ASICs. This is why router vendors bind their software to the ASICs. You are probably stuck with the features your vendor sells, because they will not let you change the software on your hardware. Does that mean there is no open source software that can run on routers?
The software-defined networking (SDN) principle promises to change this by decoupling the data plane (vendor silicon) from the control plane (custom software), with an SDN protocol such as OpenFlow in between. We use OpenFlow and we love it! It allows us to run our own implementation of BGP (and other protocols) on a dedicated server that programs the data plane within our switch. That puts us in a comfortable position: every time we change something in our software, we can roll it out to our servers without changing anything at the switch.
How do you actually run BGP on top of OpenFlow?
Take a look at this picture. On the left side there are the routers of our transit providers like Cogent, Level3, GTT and Hibernia. They all connect to ports on our HP 5406zl2 switch. On dedicated servers, we run multiple apps that add features to the OpenFlow switch.
First, there is the FIB Handler App. It takes care of the database that defines, per destination network prefix, which customer is routed over which link to the outside world. This app exposes two methods for doing this: installRoute() and removeRoute().
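To make the idea of a per-customer forwarding database more concrete, here is a minimal Python sketch. It is not our production code. The class name FibHandler, the snake_case method names (standing in for installRoute() and removeRoute()), the lookup() helper and the customer and link labels are all assumptions made for illustration.

```python
import ipaddress
from typing import Dict, Optional, Tuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class FibHandler:
    """Toy per-customer FIB: (customer, prefix) -> outgoing link (hypothetical)."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[str, Network], str] = {}

    def install_route(self, customer: str, prefix: str, link: str) -> None:
        """Add or replace the link used by one customer for one prefix."""
        self._routes[(customer, ipaddress.ip_network(prefix))] = link

    def remove_route(self, customer: str, prefix: str) -> bool:
        """Remove a route; return True if it existed."""
        key = (customer, ipaddress.ip_network(prefix))
        return self._routes.pop(key, None) is not None

    def lookup(self, customer: str, destination: str) -> Optional[str]:
        """Longest-prefix match restricted to one customer's routes."""
        address = ipaddress.ip_address(destination)
        best: Optional[Tuple[Network, str]] = None
        for (owner, network), link in self._routes.items():
            if owner != customer or address not in network:
                continue
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, link)
        return best[1] if best else None


fib = FibHandler()
fib.install_route("customer-a", "203.0.113.0/24", "cogent")
fib.install_route("customer-b", "203.0.113.0/24", "level3")
print(fib.lookup("customer-a", "203.0.113.7"))
print(fib.lookup("customer-b", "203.0.113.7"))
```

The point of the sketch is that the key includes the customer, so two customers can reach the same prefix over different transit links. A linear scan is fine for an illustration; a real table with a full Internet routing feed would need a trie or a similar structure.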
The BGP Router App hosts our BGP implementation. When it starts up, it installs a rule in our OpenFlow switch that says: “Whatever you see related to BGP, please hand it to me using the PACKET_IN/PACKET_OUT channel”. OpenFlow rules can match on header fields of Ethernet frames and IP packets, and the PACKET_IN/PACKET_OUT channel is the OpenFlow feature that passes matching frames between the switch and a controller application. The app then receives full Ethernet frames with BGP data units inside from the switch. With those data units, we can talk to the other routers from within our BGP Router App.
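What “related to BGP” means can be written down quite compactly, because BGP runs over TCP port 179. The Python sketch below is an illustration under stated assumptions, not our controller code: BGP_MATCHES is a plain dictionary description of the match (not the API of any OpenFlow library), and is_bgp_frame() and build_frame() are hypothetical helpers that apply the same test in software to an untagged Ethernet/IPv4 frame.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
IPPROTO_TCP = 6
BGP_PORT = 179  # well-known TCP port for BGP

# Illustrative match descriptions only; a real controller would express
# these through its own OpenFlow library.
BGP_MATCHES = [
    {"eth_type": ETHERTYPE_IPV4, "ip_proto": IPPROTO_TCP, "tcp_dst": BGP_PORT},
    {"eth_type": ETHERTYPE_IPV4, "ip_proto": IPPROTO_TCP, "tcp_src": BGP_PORT},
]


def is_bgp_frame(frame: bytes) -> bool:
    """Return True if an untagged Ethernet/IPv4/TCP frame uses port 179."""
    if len(frame) < 14 + 20:  # Ethernet header + minimal IPv4 header
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    version_ihl = frame[14]
    header_len = (version_ihl & 0x0F) * 4
    if version_ihl >> 4 != 4 or header_len < 20:
        return False
    if frame[14 + 9] != IPPROTO_TCP:
        return False
    # Non-first fragments carry no TCP header, so ports cannot be read.
    (flags_fragment,) = struct.unpack_from("!H", frame, 14 + 6)
    if flags_fragment & 0x1FFF:
        return False
    tcp_start = 14 + header_len
    if len(frame) < tcp_start + 4:
        return False
    src_port, dst_port = struct.unpack_from("!HH", frame, tcp_start)
    return BGP_PORT in (src_port, dst_port)


def build_frame(src_port: int, dst_port: int, protocol: int = IPPROTO_TCP) -> bytes:
    """Build a minimal frame for experiments (checksums left at zero)."""
    ethernet = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, protocol, 0,
                     bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2]))
    tcp = struct.pack("!HHIIHHHH", src_port, dst_port, 0, 0, 5 << 12, 0, 0, 0)
    return ethernet + ip + tcp


print(is_bgp_frame(build_frame(40000, 179)))
print(is_bgp_frame(build_frame(40000, 80)))
```

Two match entries are listed because a BGP session can be opened from either side, so port 179 may appear as the source or the destination port.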
The frames alone are not enough, though. IP packets can be fragmented, which means one packet is spread over multiple frames. Packets might also arrive out of order, and the TCP segments that carry BGP might be retransmitted. Because of this, we need a full-featured network stack. There are open source libraries that implement one, but our software already runs on an operating system with a full-featured network stack: Linux.
Let’s make a detour via the Operating System.
To employ the operating system’s network stack, we use a TAP interface. To the operating system, the TAP interface looks like a physical network card. Frames can be pushed to it and pulled from it using API calls from user space. The operating system then produces a reliable, ordered and de-duplicated stream of BGP data units and delivers it to our BGP Router App through the socket API. Inside our app, we then do our BGP magic. The results are pushed out to the FIB Handler App and installed on our OpenFlow switch.
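For readers who have not used a TAP interface before, the following Python sketch shows the basic user-space calls on Linux. It is a minimal illustration, not our implementation: the helper names build_ifreq(), open_tap(), inject_frame() and read_frame() and the interface name tap0 are assumptions, and opening the device needs root or the CAP_NET_ADMIN capability.

```python
import fcntl
import os
import struct

# Constants from <linux/if_tun.h>
TUNSETIFF = 0x400454CA  # ioctl request that attaches the fd to an interface
IFF_TAP = 0x0002        # layer 2 device: we exchange whole Ethernet frames
IFF_NO_PI = 0x1000      # do not prepend the extra packet-info header
IFNAMSIZ = 16


def build_ifreq(name: str) -> bytes:
    """Pack a struct ifreq (padded to 40 bytes) asking for a TAP device."""
    encoded = name.encode("ascii")
    if len(encoded) >= IFNAMSIZ:
        raise ValueError("interface name too long")
    return struct.pack("16sH22x", encoded, IFF_TAP | IFF_NO_PI)


def open_tap(name: str) -> int:
    """Create or attach to a TAP interface and return its file descriptor."""
    fd = os.open("/dev/net/tun", os.O_RDWR)
    try:
        fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name))
    except OSError:
        os.close(fd)
        raise
    return fd


def inject_frame(fd: int, frame: bytes) -> None:
    """Hand a frame to the kernel as if it had arrived on the interface."""
    os.write(fd, frame)


def read_frame(fd: int, max_len: int = 65535) -> bytes:
    """Fetch the next frame the kernel wants to send out of the interface."""
    return os.read(fd, max_len)
```

In a setup like ours, a frame received through PACKET_IN would go into inject_frame(), and whatever read_frame() returns would be sent back to the switch with PACKET_OUT. The interface still needs an IP address and has to be brought up before the kernel's TCP stack will answer on it.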
You have now seen a solution that enables us to run our own implementation of the Border Gateway Protocol on dedicated servers. To achieve this, we use the OpenFlow protocol to gain access to BGP data units on our HP 5406zl2 switch. The data units are delivered as frames, so we reassemble them with a detour via the operating system. The operating system provides a full network stack that does the heavy lifting for us. Using the socket API, we get the reassembled stream of BGP data units back into our system. Per-customer selected routes are translated to OpenFlow flows and placed inside our switch using the OpenFlow protocol.
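One detail worth spelling out is what arrives on the socket. TCP delivers a byte stream, not individual messages, so the receiving side has to cut the stream into BGP data units itself. Every BGP message starts with a 19-byte header: a 16-byte marker of all ones, a 2-byte length and a 1-byte type. The Python sketch below shows that framing step; the BgpFramer class is a hypothetical helper written for this post, not part of our code base.

```python
import struct
from typing import List, Tuple

BGP_HEADER_LEN = 19      # marker (16) + length (2) + type (1)
BGP_MAX_LEN = 4096       # maximum message size in the base specification
BGP_MARKER = b"\xff" * 16
BGP_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


class BgpFramer:
    """Split a TCP byte stream into (type name, body) BGP messages."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, data: bytes) -> List[Tuple[str, bytes]]:
        """Add received bytes and return every message that is now complete."""
        self._buffer += data
        messages: List[Tuple[str, bytes]] = []
        while len(self._buffer) >= BGP_HEADER_LEN:
            if bytes(self._buffer[:16]) != BGP_MARKER:
                raise ValueError("invalid BGP marker")
            length, msg_type = struct.unpack_from("!HB", self._buffer, 16)
            if not BGP_HEADER_LEN <= length <= BGP_MAX_LEN:
                raise ValueError("invalid BGP message length")
            if len(self._buffer) < length:
                break  # wait for the rest of this message
            body = bytes(self._buffer[BGP_HEADER_LEN:length])
            del self._buffer[:length]
            name = BGP_TYPES.get(msg_type, "UNKNOWN(%d)" % msg_type)
            messages.append((name, body))
        return messages


# A KEEPALIVE is just the header; feed it in two pieces to mimic TCP.
keepalive = BGP_MARKER + struct.pack("!HB", 19, 4)
framer = BgpFramer()
print(framer.feed(keepalive[:10]))
print(framer.feed(keepalive[10:]))
```

Bytes read from the socket with recv() can be passed straight to feed(), which copes with messages that are split across reads or that arrive several at a time.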