Archive for February, 2007
net neutrality: grey area

I went to the presentation this evening by the American Civil Liberties Union (ACLU) of Washington State in CF 125 at 7:00pm. The topic of discussion was—you guessed it—Net Neutrality. The difficult thing about this issue is that it isn’t clear-cut. Like most issues, there is a lot of grey area that usually isn’t covered by competing parties because they want you to believe there is a right answer and that you should vote for it.

Specifically, the presentation stated verbatim that Net Neutrality legislation would prevent discrimination among legal content by providers. What about the times when the connection is a shared resource, as it is in the residence halls? Would it be illegal to discriminate against BitTorrent as we do, despite the effect that unthrottled BitTorrent traffic would have on the network? I would hope not, but this issue must not be ignored, lest poor wording in legislation be abused. Would this legislation prevent innovation in the area of QoS (e.g. MPLS)? Again, I don’t think that is the purpose of the legislation, but it is important to me that the law (the action of the government) is “narrowly tailored.” I don’t always trust the judicial system to correctly interpret law (especially when it is related to something as unprecedented as the internet) and I don’t like laws to be too broad in accomplishing their purpose.

Also in the presentation was a “game” about Fact vs. Fiction, and we were asked to analyze the claim “net neutrality is first-time regulation that would prevent innovation.” While I don’t think the claim is true, it is still an opinion rather than a fact. The person who answered, saying it was Fiction, said the reason it was fiction is “because it wouldn’t.”

Bottom line: there aren’t very many issues that are strictly black and white, so understand that both sides in this argument may be logical and acting in their own best interest. You just have to decide which is best for you, and there almost certainly is a decision that is best for you. For me, I think that none of this would be necessary if there were true competition in the market. However, the nature of the industry makes that impossible (extremely high fixed cost, low variable cost = high barrier to entry), so it is necessary for the government to regulate enough to keep the few companies that provide the infrastructure from becoming too powerful.

In other news, an upgrade to PHP 5.2.1 has led to extremely strange behavior with wget and Nagios’ check_http plugin. Specifically, both programs take ~15 seconds to fetch WordPress pages, even over the loopback interface. Why 15 seconds? The first few milliseconds of the connection are normal, but for some reason wget and check_http don’t recognize the end of the file, and they keep waiting until the server’s KeepAlive timeout (15 seconds) expires and the server closes the connection. Also, wget and check_http both show a trailing ‘0’ at the end of the file, which is not present when viewing source in Firefox or when using fetch, telnet, or nc.

I suspect that for some reason PHP/WordPress isn’t terminating the file in a way that these two programs expect, so they keep waiting, but I’ll have to look into it more tomorrow and perhaps dig around in the code to see what they’re waiting to see, and what changed about the way files are terminated (if that is the case).
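One way to narrow this down is to look at how the raw response says it will end. The sketch below rests on an assumption that the post does not confirm: a lone ‘0’ line at the end of a body is what the final chunk of an HTTP/1.1 chunked response looks like. The function name describe_framing and the sample responses are hypothetical, and the check is a heuristic that ignores chunk trailers.

```python
def describe_framing(raw: bytes) -> str:
    """Report how a raw HTTP response signals the end of its body."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        return "incomplete headers"

    # Collect headers into a dict with lowercased names and values.
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip().lower()

    if b"chunked" in headers.get(b"transfer-encoding", b""):
        # A chunked body ends with a zero-length chunk: "0\r\n\r\n".
        if body == b"0\r\n\r\n" or body.endswith(b"\r\n0\r\n\r\n"):
            return "chunked, terminated"
        return "chunked, not terminated"

    if b"content-length" in headers:
        expected = int(headers[b"content-length"])
        if len(body) >= expected:
            return "content-length, complete"
        return "content-length, short"

    return "ends when the server closes the connection"


# Hypothetical sample responses, not captured from the real server.
chunked = (b"HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n"
           b"5\r\nhello\r\n0\r\n\r\n")
print(describe_framing(chunked))
```

If the response turns out to be chunked, a client that does not decode chunked bodies would display the trailing 0 and might keep waiting for the connection to close, which would match the symptoms. That remains a guess until the raw headers are inspected.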

monitoring PF state tables with Nagios

A few months ago I decided to clean up the Nagios configuration as much as possible, and it has really paid off. Not only do we have cool eye candy now (Beastie on the FreeBSD servers, Puffy on the OpenBSD machines, that little chameleon on the HITS SLES machines, and network diagrams), but the configuration itself is really easy to maintain. Instead of defining a service multiple times, we define it once and use hostgroups to tell Nagios to check that service on all members of the group. That leads to much smaller and simpler configuration files.

Another issue was that Nagios could check actual “services” such as HTTP, SMTP, etc., but we did not monitor things such as partition usage, process counts, or any other information that requires local access. Enter NRPE. NRPE listens on each server and allows Nagios to execute local plugins and read the results. I set it up to monitor disk space, process counts, load averages, and so on, but this evening I decided to add a service for entries in the PF state table.

Adding a command for nrpe2 to run is very simple, and looks like this in the configuration file:

command[check_users]=/usr/local/libexec/nagios/check_users -w 5 -c 10
command[check_load]=/usr/local/libexec/nagios/check_load -w 15,10,5 -c 30,25,20
command[check_disk_root]=/usr/local/libexec/nagios/check_disk -w 20% -c 10% -p /
command[check_disk_var]=/usr/local/libexec/nagios/check_disk -w 20% -c 10% -p /var
command[check_pfstates]=/sbin/pfctl -qsi | /usr/bin/grep entries

The first few are examples of how to check logged-in users, load averages, and the usage of various partitions, along with the thresholds that count as WARNING or CRITICAL. The last is a simple command to pull the “current entries: XX” line out of the output of pfctl.
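The grep above returns the whole line as text. For anything that needs the number itself, a minimal sketch of the parsing step might look like the following. The SAMPLE text is a hypothetical excerpt of pfctl -qsi output (real output varies by release), and parse_current_entries is an illustrative name, not part of any Nagios or PF tool.

```python
import re

# Hypothetical excerpt of `pfctl -qsi` output; real output varies by release.
SAMPLE = """\
State Table                          Total             Rate
  current entries                       42
  searches                           34892           22.1/s
"""


def parse_current_entries(pfctl_output: str) -> int:
    """Return the 'current entries' count from pfctl -si style text."""
    # Accept both "current entries   42" and "current entries: 42".
    match = re.search(r"current entries\s*:?\s*(\d+)", pfctl_output)
    if match is None:
        raise ValueError("no 'current entries' line found")
    return int(match.group(1))


print(parse_current_entries(SAMPLE))
```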

Because nrpe2 isn’t run as root (I wouldn’t use it if it needed to be), it doesn’t have permission to read from the pf(4) pseudo-device (root:wheel mode 0600 by default). We need it to be able to read statistics so we simply need to change the ownership and permissions of the device so that the right user can read from it.

crw-r-----  1 root  nagios    0,  78 Feb 16 15:11 /dev/pf

The permissions will revert after a reboot when the devices are re-created, so we add the following lines to /etc/devfs.conf:

own     pf    root:nagios
perm    pf    0640
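To confirm after a reboot that the device came back with the intended mode, a small check along these lines could be used. This is only a sketch: group_can_read is a hypothetical helper, and the mode below is constructed by hand to mirror the listing above rather than read from a real /dev/pf.

```python
import stat


def group_can_read(mode: int) -> bool:
    """True if the group-read bit is set in a st_mode value."""
    return bool(mode & stat.S_IRGRP)


# Character device with mode 0640, mirroring the ls output shown earlier.
# On a real host this value would come from os.stat("/dev/pf").st_mode.
wanted = stat.S_IFCHR | 0o640
default = stat.S_IFCHR | 0o600  # the root-only default

print(stat.filemode(wanted), group_can_read(wanted))
print(stat.filemode(default), group_can_read(default))
```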

We already have Nagios set up to check NRPE2 services, so we just define a new service in the Nagios configuration file.

define service {
    use                 nrpe2-service
    hostgroup_name      main-servers
    service_description PF State Table
    check_command       check_nrpe2!check_pfstates
}

Once everything is restarted for the changes to take effect, a glance at Nagios will show the current state table counts for each of the servers. When I have some more time, I’ll probably create a script to accept high-low values to trigger warnings when the state table count is outside of a given range.
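A rough sketch of what that range check might look like is below. It is not the script itself, just an outline under stated assumptions: the function name evaluate, the threshold parameters, and the message wording are all hypothetical. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin return values.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(count, warn_low, warn_high, crit_low, crit_high):
    """Map a state table count to a Nagios status code and message.

    The count is OK inside [warn_low, warn_high], WARNING outside that
    range but inside [crit_low, crit_high], and CRITICAL otherwise.
    """
    if count < crit_low or count > crit_high:
        return CRITICAL, f"PF STATES CRITICAL - {count} entries"
    if count < warn_low or count > warn_high:
        return WARNING, f"PF STATES WARNING - {count} entries"
    return OK, f"PF STATES OK - {count} entries"


# Hypothetical thresholds chosen only for illustration.
code, message = evaluate(500, warn_low=10, warn_high=1000,
                         crit_low=1, crit_high=5000)
print(message)
```

A real plugin would print the message and then call sys.exit(code) so that NRPE passes the status back to Nagios.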

nrpe2 pf states