
Although I've been a fan of IPv6 for a while, only in the past few weeks have I been using it full-time on my home network and my web sites. Trivial setup on Fedora Core made roll-out of IPv6 on my tiny network a breeze. This was Linux's big step, in my own mind, into IPv6 production readiness.

Thanks to projects such as KAME and USAGI, most of the standard Internet service daemons have long since been updated to support IPv6. The last pieces were solid Linux kernel IPv6 support, and easy integration of IPv6 into Linux distributions.

Given that the OS software makes it possible to deploy IPv6 in production, it made me wonder how the Internet-wide rollout of IPv6 would occur. Currently it is happening slowly, with a few ISPs rolling out "native" IPv6 services, and the rest requiring the use of 6to4 or a tunnel broker. There is no question that, outside North America, IPv6 is being examined and adopted with some interest. In particular, Asia has been experiencing rapid adoption of IPv6.

However, it seems there is a catch-22: the most popular web sites on the Internet are (to my naive American reckoning, anyway) in North America. These web sites have no incentive to switch to IPv6 until a large portion of their user base is on IPv6, and their user base has little incentive to switch to IPv6 until many of the popular Internet destinations support it.

A quick check reveals that none of the top web sites publish AAAA records, the DNS records that would indicate an associated IPv6 address. I didn't test SMTP servers from the big ISPs, but I'm willing to bet I'd find the same result.
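A check like this can be scripted. The following is only a minimal sketch, assuming the `dig` utility from BIND is installed; the function names are made up for illustration, and the host names are whatever sites you want to survey.

```shell
#!/bin/sh
# Sketch: report whether host names publish an AAAA (IPv6 address) record.
# Assumes `dig` is installed. Function names are illustrative only.

# Succeeds if the text on stdin contains at least one IPv6 address.
# `dig +short` may also print CNAME targets, which contain no colon.
has_ipv6_address() {
        grep -q ':'
}

# Print one result line per host name given as an argument.
check_aaaa() {
        for host in "$@"
        do
                if dig +short "$host" AAAA | has_ipv6_address
                then
                        echo "$host: has AAAA"
                else
                        echo "$host: no AAAA"
                fi
        done
}
```

After sourcing the file, something like `check_aaaa www.kernel.org www.example.com` prints one line per site. The colon test is deliberately crude: it only distinguishes IPv6 addresses from empty output and CNAME names.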

Proxy it

My proposed solution is simple. Integrating IPv6 into a large web site necessarily involves lots of planning, software testing, and sysadmin time. Shortcut that time by configuring a proxy server that serves IPv6 HTTP and FTP requests, passing those requests through to underlying IPv4-only servers that have not yet been transitioned to IPv6. As an added bonus, most proxy servers can also cache data, reducing traffic to the underlying IPv4 servers. This allows organizations to roll out IPv6 on a completely separate network, without having to change their primary web servers at all. Simply by "turning on" a new server, a web site automatically has IPv6 support.

Ideally, I would be able to use the Squid web proxy cache. However, Squid's IPv6 support is only available through somewhat stale developer patches. Luckily, there is already a production-quality, IPv6-friendly solution: Apache HTTP server, which includes modules for both proxying and caching.

This is certainly not a new suggestion, but I felt that events and software momentum have coalesced such that this solution should be reiterated.

Configure it

By far the most common configuration for Apache's mod_proxy is as a forward proxy. A forward proxy is typically an internal server that receives all HTTP requests in a single organization and forwards those requests to the Internet. We wish to do the opposite (receive HTTP requests from the Internet at large), so naturally this is called a reverse proxy.

Setting up a reverse proxy for an entire website is fairly straightforward. As an example, I present the configuration for a web server that will serve pages as http://www.us.kernel.org/, passing them through to the underlying www.kernel.org web server(s).

<VirtualHost *:80>
    ServerAdmin admin@example.com
    ServerName www.us.kernel.org
    ServerAlias *.kernel.org
    ErrorLog logs/kernel.org-error
    CustomLog logs/kernel.org common

    ProxyRequests Off

    <Proxy http://www.us.kernel.org/*>
        Order deny,allow
        Allow from all
    </Proxy>

    ProxyPass / http://www.kernel.org/
    ProxyPassReverse / http://www.kernel.org/
    ProxyPreserveHost On

</VirtualHost>
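The key parts are ProxyRequests Off, which disables forward proxying; ProxyPass and ProxyPassReverse, which map the entire site onto the underlying www.kernel.org server and fix up redirects coming back from it; and ProxyPreserveHost On, which passes the client's original Host header through to the underlying server.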

Note that this is not an "open proxy", because we restrict the reverse proxy to a single virtual host. The server serves files to the public, but only for the web sites listed.

Cache it

While providing IPv6 reverse proxy services for IPv4 websites is certainly nice, we should do more. If we are proxying, we might as well cache the popular data too.

Hunting through the Apache documentation for mod_cache, mod_disk_cache and mod_mem_cache yields the following configuration recipe (please see the documentation for the meaning of each directive):

MCacheSize 16000
MCacheMaxObjectCount 10000
MCacheMaxObjectSize 1000000
MCacheMinObjectSize 1

CacheDirLength 3
CacheDirLevels 3
CacheMaxFileSize 1000000
CacheMinFileSize 1
CacheRoot /var/www/cache
CacheSize 160000

CacheEnable mem /
CacheEnable disk /
The two CacheEnable lines at the end are the key ones: they are what actually enable caching for the server, in memory and on disk respectively.
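To see whether a response really came out of the cache, one option is to look for an Age header. The sketch below assumes the cache adds that header to responses it serves from its store, as HTTP/1.1 caches are expected to, and that cURL is installed. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: look for an Age header on a response fetched through the proxy.
# Assumption: the cache adds Age to responses it serves from its store.

# Print the response headers for a GET request.
# $1 = virtual host name to send in the Host header, $2 = URL to fetch.
fetch_headers() {
        curl -g -s -D - -o /dev/null --header "Host: $1" "$2"
}

# Read response headers on stdin and print the Age value, if there is one.
response_age() {
        tr -d '\r' | awk -F': *' 'tolower($1) == "age" { print $2; exit }'
}
```

Requesting the same URL twice, for example with `fetch_headers www.us.kernel.org "http://[2001:db8::1]/" | response_age` (the address is a placeholder), and getting no output the second time suggests that the response was fetched from the underlying server again rather than served from the cache.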

Caveat coder

While doing the testing and configuration related to this article, I noticed something peculiar: no matter how frequently I requested a file, it would never be cached on disk. Not even if I removed the CacheEnable mem directive.
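An observation like this can be checked by counting the files under CacheRoot before and after a batch of requests. This is a rough sketch with hypothetical function names; it assumes the disk cache stores its entries as ordinary files under the directory it is given.

```shell
#!/bin/sh
# Sketch: measure whether a command causes files to appear in the disk cache.
# Assumption: cache entries are ordinary files below the cache directory.

# Print the number of regular files below directory $1.
count_cache_files() {
        find "$1" -type f | wc -l | tr -d ' '
}

# Print how many files a command added to the cache directory.
# $1 = cache directory, remaining arguments = command that issues requests.
cache_growth() {
        dir=$1
        shift
        before=$(count_cache_files "$dir")
        "$@" > /dev/null 2>&1
        after=$(count_cache_files "$dir")
        echo $((after - before))
}
```

For example, `cache_growth /var/www/cache curl -s http://www.us.kernel.org/` prints 0 if the request left nothing on disk. Reading the cache directory may require root privileges.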

Doing some research on the Apache mailing lists, I found at least one developer commenting that mod_disk_cache has many problems in Apache 2.0.50 and earlier. I tested the suggested patches (now released as version 2.0.51), but was still unable to get mod_disk_cache to work with reverse proxying.

My current theory is that ProxyRequests Off causes the cache module to assume no data exists to cache, but that's just a wild guess.

Test it

Now it's time to test our reverse proxy, to make sure that everything is set up correctly and that requests are passing through to the underlying IPv4 site. I use cURL for testing, since it lets me specify the Host header manually, which makes it easy to test any virtual host on a server.

Here is the script I use, check-ip.sh:

#!/bin/sh
# usage:  ./check-ip.sh [hostname of virtual host] [IP address]

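SPOOFHOST=$1
IP=$2
OPTS="--connect-timeout 60 --max-time 120 -f"

# An IPv6 literal must be wrapped in brackets inside a URL.
case $IP in
        \[*) URLHOST=$IP ;;
        *:*) URLHOST="[$IP]" ;;
        *)   URLHOST=$IP ;;
esac

# -g stops curl from treating the brackets as a glob pattern.
if curl -g $OPTS --header "Host: $SPOOFHOST" "http://$URLHOST/" > /dev/null 2>&1
then
        echo "$IP succeeded."
        exit 0
fi

echo "$IP FAILED."
exit 1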

Finish it

The Apache configuration presented here facilitates the hosting of a large number of web sites (or free software mirrors), without having to actually store and synchronize the web site with a central server. The natural flow from proxy to core web server automatically keeps data synchronized. I think this is a highly optimal model for free software distribution, if enough servers participate in a group proxy cache as HTTP accelerators. (Plug: if you have some spare bandwidth and are willing to donate a UML or Xen virtual server to my netop.org project, let me know.)

We now have all the tools we need to provide IPv6 service for a web site, without a single modification to the existing IPv4 server. For system administrators, this facilitates easy (and perhaps even off-site) deployment of IPv6 web services without a costly and perhaps peril-fraught "flag day" transition of the border web servers to IPv6.

Additional resources

SixXS runs twin v6-to-v4 and v4-to-v6 HTTP gateways, which is similar to the proposal here. Visit http://ipv6gate.sixxs.net/ for more info.
