For those of you who weren’t aware, ResTek has a pair of machines acting as a border firewall. The two machines have the same hardware, and their configurations are close to identical as well. In this brief entry I will describe how they work, assuming no prior knowledge on the reader’s part.
First, a picture: http://www.flickr.com/photos/muskrat/2389614285/
A close look will reveal that the two 1U boxes sporting the OpenBSD and Puffy stickers are labeled “Firewall A” (top) and “Firewall B” (bottom).
Each of the firewalls has 4 Ethernet ports. Two of these are separate devices (fxp0 and em0 below), and the third and fourth are a dual-port PCI-X card. All of them are Intel-based cards. The interfaces associated with the devices are named as follows:
- fxp0 – 100 Mbps, connected directly to the other firewall with a crossover cable
- em0 – currently unused
- em1 – gigabit, “outside” (CARP)
- em2 – gigabit, “inside” (CARP)
Because these two firewalls are on a critical path (between students and their internet), it’s very important that they are redundant. If we only had one, and it failed, there would be no internet connection for students until someone manually reconfigured the routes on the main router – this would mean hours of downtime at the very least every time something happened (including upgrades when the system needs to be rebooted)!
Common Address Redundancy Protocol (CARP)
CARP is a protocol written by the OpenBSD team to address this issue. The firewalls run the OpenBSD operating system, partly because of this incredibly useful protocol. From the official documentation:
CARP works by allowing a group of hosts on the same network segment to share an IP address. This group of hosts is referred to as a “redundancy group”. The redundancy group is assigned an IP address that is shared amongst the group members. Within the group, one host is designated the “master” and the rest as “backups”. The master host is the one that currently “holds” the shared IP; it responds to any traffic or ARP requests directed towards it.
In a simple setup like ours, one of the firewalls will be processing traffic passing between students and their internet at any given moment; whichever one is doing that is referred to as the “master.” It announces itself as master by sending out advertisements, at a configurable interval, that the “backup” can see. If the backup ever stops seeing these advertisements, it assumes the master has failed or been shut down and “immediately” steps in to take over.
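To make the takeover idea concrete, here is a minimal Python sketch of a backup watching for advertisements. It is only a toy model, not how CARP is actually implemented: the class name `CarpBackup`, the one-second interval, and the "three missed advertisements" tolerance are all assumptions chosen for illustration, and it ignores details like preemption and multiple backups.

```python
from dataclasses import dataclass


@dataclass
class CarpBackup:
    """Hypothetical model of a backup host watching for master advertisements."""
    advert_interval: float = 1.0  # seconds between advertisements (assumed value)
    missed_allowed: int = 3       # assumed tolerance before taking over
    last_advert: float = 0.0      # time the last advertisement was seen
    role: str = "backup"

    def on_advertisement(self, now: float) -> None:
        # Record that the master was heard from at time `now`.
        self.last_advert = now

    def tick(self, now: float) -> str:
        # Promote ourselves if the master has been silent for too long.
        silence = now - self.last_advert
        if self.role == "backup" and silence > self.advert_interval * self.missed_allowed:
            self.role = "master"
        return self.role


backup = CarpBackup()
backup.on_advertisement(0.0)  # master is alive at t=0
print(backup.tick(2.0))       # backup (only 2 seconds of silence)
print(backup.tick(3.5))       # master (more than 3 seconds of silence)
```

The point of the sketch is that the backup never asks the master anything; it only notices when the advertisements stop.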
Hopefully you noticed above that there were two interfaces (em2 and em1) referred to as “inside” and “outside”, respectively. This is just a way to describe which part of the path that network interface is facing. Consider the diagram below:
student --- (inside) em2 [firewall] em1 (outside) --- internet
A simple CARP firewall configuration will allow both firewalls to share one address on the inside, and another on the outside. The main router (to which the residence halls and firewalls are connected) will route traffic to these shared (CARP) addresses, rather than the unique addresses of either firewall. This way, the routes don’t have to change in the event of failure: the shared address remains a valid route as long as at least one firewall is up and running.
Up until now in the description, the firewalls probably don’t seem much like firewalls…
PF - The OpenBSD Packet Filter
PF is the OpenBSD packet filter (arguably the best packet filter/firewall software available today). It is part of the operating system itself, and it is responsible for deciding how to handle packets that it is configured to process. Like most packet filters, PF is configured through a configuration file (named pf.conf) that defines certain rules for how to handle different types of traffic. The rules are collectively known as a “ruleset.”
When a packet enters (or exits) the firewall, PF processes it. First, PF checks the packet against a list of existing connections, referred to as “states”. This is called a state table lookup. A “state” consists of: source address/port, destination address/port, and direction. If the packet is found to match an existing connection (and it is valid in the context of that connection’s current state, hence the name) it is passed. If no state match is found, the packet must be checked against the ruleset.
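As a rough illustration of a state table lookup, here is a small Python sketch. It assumes a state is just the tuple described above and that a reply packet is the same tuple with the endpoints swapped and the direction flipped. The names `StateKey`, `state_table`, and `matches_state` are made up for this example, the addresses are documentation addresses, and real PF also checks things like TCP sequence numbers, which this skips entirely.

```python
from typing import NamedTuple


class StateKey(NamedTuple):
    """Simplified state entry: endpoints plus direction (assumed layout)."""
    src: str
    sport: int
    dst: str
    dport: int
    direction: str  # "in" or "out"


# A set gives constant-time membership checks, which is why the
# state lookup is cheap compared with walking a list of rules.
state_table = set()


def reverse_of(key: StateKey) -> StateKey:
    # A reply travels the other way: swap endpoints and flip direction.
    flipped = "in" if key.direction == "out" else "out"
    return StateKey(key.dst, key.dport, key.src, key.sport, flipped)


def matches_state(pkt: StateKey) -> bool:
    # True if the packet belongs to a connection we already know about.
    return pkt in state_table or reverse_of(pkt) in state_table


outbound = StateKey("10.0.0.5", 50000, "192.0.2.10", 80, "out")
state_table.add(outbound)

reply = StateKey("192.0.2.10", 80, "10.0.0.5", 50000, "in")
stranger = StateKey("198.51.100.7", 40000, "10.0.0.5", 22, "in")

print(matches_state(reply))     # True
print(matches_state(stranger))  # False
```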
Most rules are very easy to read. Here are a few examples:
pass out quick from <firewalls>
block quick from any to <firewalls>
pass in on $ext_if from any to $servers
The packet is compared to the list of rules in order, and the last rule that matches determines how PF handles the packet. The exception is the special “quick” keyword: if a matching rule is marked “quick”, evaluation stops there and that rule decides. If the deciding rule is “block”, the packet is blocked. If it is “pass”, the packet is passed. If no rule matches, the default action is to pass. The ruleset lookup is much slower than a state table lookup.
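The “last match wins, unless quick” behavior is easy to get backwards, so here is a minimal Python sketch of it. This is not PF’s real implementation (PF optimizes its ruleset evaluation); the `Rule` class, the `evaluate` function, and the example packets are hypothetical and only mimic the decision logic described above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical stand-in for one line of a ruleset."""
    action: str                      # "pass" or "block"
    matches: Callable[[Dict], bool]  # predicate over a packet dict
    quick: bool = False              # stop evaluating if this rule matches


def evaluate(rules: List[Rule], packet: Dict) -> str:
    decision = "pass"  # default when no rule matches
    for rule in rules:
        if rule.matches(packet):
            decision = rule.action  # later matches override earlier ones
            if rule.quick:
                break               # "quick" ends evaluation immediately
    return decision


rules = [
    Rule("block", lambda p: p["src"] == "203.0.113.9", quick=True),
    Rule("block", lambda p: True),            # block everything...
    Rule("pass", lambda p: p["dport"] == 80),  # ...except port 80
]

print(evaluate(rules, {"src": "198.51.100.7", "dport": 80}))  # pass
print(evaluate(rules, {"src": "198.51.100.7", "dport": 22}))  # block
print(evaluate(rules, {"src": "203.0.113.9", "dport": 80}))   # block
```

Note how the third packet is blocked even though the port 80 rule would have passed it: the quick rule matched first and ended the search.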
It is also important to note that, in our environment, traffic passing through the firewall is processed twice by PF (once on the “inside”, and again on the “outside”).
A primary way the firewalls are used at ResTek is to maintain a list of registered students on our network and pass their traffic, but not pass the traffic of a machine that has not been registered with us. Another is to minimize incoming spam destined for our mail server, using OpenBSD’s spamd.
Obviously I won’t attempt to describe everything about PF, but I encourage curious readers to visit the OpenBSD documentation and learn about it.
Failover
Because state lookups are much faster (and we process a huge number of packets per second), we like to make use of them where possible. It’s possible to never create states by specifying “no state” in rules, but then the ruleset has to be checked for every single packet, and that’s much slower. In our environment, a state table lookup occurs around 40,000 times per second on average.
However, you might notice there is a possible problem with states. If the master is maintaining a list of all current states as connections pass through it, and then it fails, the backup will have no idea which packets are part of states and which aren’t. Obviously this would lead to interrupted connections where stateful filtering is in place.
The OpenBSD developers created “pfsync” to address this. Pfsync runs on both firewalls, and allows them to update each other on current states. So the backup firewall, which would otherwise be blind to the connections going through the master, is sent a stream of state updates. In other words, their state tables are kept in sync. If the master fails, the backup knows exactly how to handle existing connections. This is where the “fxp0” interface comes in. As mentioned above, the two fxp0 interfaces are connected directly to each other using a crossover cable. Pfsync traffic is sent across this dedicated link.
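Here is a toy Python sketch of the idea behind state syncing. It is only an illustration under simple assumptions: the `Firewall` class and its methods are invented for this example, updates are delivered instantly by a direct method call rather than over a network link, and nothing here reflects the actual pfsync packet format.

```python
class Firewall:
    """Hypothetical firewall that mirrors state changes to a peer."""

    def __init__(self, name):
        self.name = name
        self.states = set()
        self.peer = None  # the other firewall, once linked

    def add_state(self, state):
        # Record a new connection locally, then tell the peer about it.
        self.states.add(state)
        if self.peer is not None:
            self.peer.receive_update("insert", state)

    def remove_state(self, state):
        # Forget a finished connection locally and on the peer.
        self.states.discard(state)
        if self.peer is not None:
            self.peer.receive_update("delete", state)

    def receive_update(self, op, state):
        # Apply a change announced by the peer to our own table.
        if op == "insert":
            self.states.add(state)
        elif op == "delete":
            self.states.discard(state)


master = Firewall("Firewall A")
backup = Firewall("Firewall B")
master.peer = backup
backup.peer = master

conn = ("10.0.0.5", 50000, "192.0.2.10", 80)
master.add_state(conn)

# The backup already knows about the connection, so it could take
# over without the connection being treated as new.
print(conn in backup.states)  # True
```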
Conclusions
The redundancy created by OpenBSD on the firewalls is extremely valuable and also extremely easy to configure. You can literally unplug the master and streaming audio/video won’t miss a beat as the backup takes over.
A few things to be aware of about our firewalls:
- As for our policy, ResTek does not block ports. We recently removed the ones that used to be blocked. The main router still blocks a few ports, but we are working with the department that manages it to get that cleaned up. We do not feel that blocking ports is a worthwhile security practice in the majority of cases, and our role as an Internet Service Provider is not to police traffic in such a useless/potentially damaging way.
- When traffic is denied by the firewall, it is not simply discarded. The firewall will send back the appropriate response. This is known as a “block policy” in PF. You can define “drop” (discard) or “return”. It is good practice to “return” (send back a response indicating that the connection was denied) to avoid confusion for people that are trying to troubleshoot. Our philosophy is that it is better to “return” than to “drop”.
- Inbound connections to resident machines are not blocked. Now that most operating systems (Windows included) ship with firewalls enabled by default, we don’t want to hinder your internet experience by trying to “protect” you. Some providers allow outbound connections, and inbound traffic associated with those connections, but drop new inbound connections. We find that this practice causes more problems than it solves, specifically for people trying to host games or whatever else.
- We do not use the firewalls to monitor what you do. We monitor how much traffic IP addresses are using to make sure nobody is taking up our limited bandwidth and making things slow for everyone else. We also keep an eye out for suspiciously high numbers of outbound e-mails (almost always an indication of a virus-infected machine spamming), but we do not monitor the e-mails themselves at all and look down on people that do.
Hope that clarifies some things!
Have a look at our graphs, specifically the “pf” layout, for some visuals.