Although this post is about writing a redirect script for App Engine, none of the sites involved need to be hosted on App Engine, so it could be useful to you even if you're hosting .NET websites elsewhere and just need to handle redirecting old domains.

If you believe what Derek Says, I change my domain every 5 minutes. While this might be a slight exaggeration, I have moved many domains recently and needed to deal with the usual problems this brings: search engine rankings and existing inbound links.

301 vs 302 Redirects

The most common type of redirect you'll see on the web is a "302 Temporary Redirect". This is what most frameworks will output by default when you redirect (eg. Response.Redirect() in ASP/.NET or self.redirect() in App Engine). The "Temporary" part means the redirect is a one-off, and clients shouldn't assume it will always be served. This is handy, for example, for redirecting a user back to a page after logging in to your site. Since this page may be different each time, the redirect is not fixed.

The other type of redirect, 301, is a "Permanent Redirect". Most frameworks support these redirects without you having to output your own headers, eg. in App Engine you can use self.redirect(url, permanent=True). This type of redirect means the request will always be sent to the same URL, and that clients are free to bypass your script and assume it will always go to the same place. This is the redirect that we want to use when moving domains. It tells search engines that this page has moved permanently, and that they may associate any rankings for the old page with the new page.
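To make the difference concrete, here is a minimal sketch of what the two redirects look like at the HTTP level. It uses only the Python 3 standard library rather than App Engine, and build_redirect is a hypothetical helper made up for this example, not part of any framework.

```python
from http import HTTPStatus


def build_redirect(location, permanent=False):
    """Return the status line and headers for a redirect response.

    Hypothetical helper for illustration only; a framework's own
    redirect method normally does this for you.
    """
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    status_line = '%d %s' % (status.value, status.phrase)
    headers = [('Location', location)]
    return status_line, headers


if __name__ == '__main__':
    # Temporary redirect (the usual framework default)
    print(build_redirect('http://blog.dantup.com/'))
    # Permanent redirect (what we want when moving domains)
    print(build_redirect('http://blog.dantup.com/', permanent=True))
```

The only difference between the two responses is the status line. Note that the official reason phrase for a 302 is "Found", even though it's usually described as a temporary redirect.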

Generic App Engine Redirect Script

So, now we know what type of redirect to use, it's time to build a script to handle our redirects. If you have multiple domains like me, it makes sense to write a script that can handle them all in one go. I've decided to set up a new App Engine app with the sole purpose of redirects for all my domains from this point forward.

Since I recently decided to move everything from dantup.me.uk to dantup.com, I had a few different domains to redirect. blog.dantup.me.uk needs to map to blog.dantup.com, wavenotifier.dantup.me.uk needs to map to wavenotifier.dantup.com and all of the unused domains (eg. dantup.com, www.dantup.com, tuppeny.com, etc.) need to map to the root of my blog.

App.yaml Setup

If we're going to be redirecting all incoming requests, we need to route all requests through our script. This is why it's important to use a new App Engine app rather than piggy-back onto an existing one. We'll set up our app.yaml file to route all requests into a script called main.py.

application: myapp-redir
version: 1
runtime: python
api_version: 1

handlers:

- url: /.*
  script: main.py

Next we need to define a way to hold all of the data we'll need to perform our redirects. As well as the old domain and the new domain, we need to know whether to map URLs from the request onto the new domain, or just redirect to the root. For example, when I moved my blog from blog.dantup.me.uk, I wanted blog.dantup.me.uk/mypost to redirect to blog.dantup.com/mypost. However, I want tuppeny.com/anything to just redirect to the root, blog.dantup.com.

A dictionary seems to be a good way to store this data because we can perform lookups on the domain quickly, and we can store the new domain and a boolean for the url mapping as a tuple.

# Old Domain: New Domain, Map urls (else redirects to root)
urls = {
	'www.dantup.com': ('blog.dantup.com', False),
	'www.dantup.me.uk': ('blog.dantup.com', False),
	'www.tuppeny.com': ('blog.dantup.com', False),
	'dantup-redir.appspot.com': ('blog.dantup.com', False),
	'blog.dantup.me.uk': ('blog.dantup.com', True),
	'feeds.dantup.me.uk': ('feeds.dantup.com', True),
	'wavenotifier.dantup.me.uk': ('wavenotifier.dantup.com', True),
	'wavenotifier.tuppeny.com': ('wavenotifier.dantup.com', True),
	'go.dantup.me.uk': ('go.dantup.com', True),
	'go.tuppeny.com': ('go.dantup.com', True),
}

In addition to this mapping, we should declare a default domain, so if any requests make it to our script that don't have a mapping, we can redirect there. We'll use a 302 and also log and email this, since it's probably a mistake.

DEFAULT_URL = 'http://blog.dantup.com/'

This is all looking a little complicated, so it makes sense to build in a way to test our mappings without having to set up lots of entries in the hosts file. I've decided to declare a boolean that enables/disables testing. When testing is enabled, if you navigate to /test then it will output a bunch of URLs and the locations they'll redirect to. We'll keep a list of URLs to test in the code:

ALLOW_TEST = True

test_urls = [
	'http://www.dantup.me.uk',
	'http://www.dantup.me.uk/',
	'http://www.dantup.me.uk/blah',
	'http://www.dantup.com',
	'http://www.dantup.com/',
	'http://www.dantup.com/blah',
	'http://www.tuppeny.com',
	'http://www.tuppeny.com/',
	'http://www.tuppeny.com/blah',
	'http://blog.dantup.me.uk',
	'http://blog.dantup.me.uk/',
	'http://blog.dantup.me.uk/2010/mytest',
	'http://feeds.dantup.me.uk',
	'http://feeds.dantup.me.uk/',
	'http://feeds.dantup.me.uk/2010/mytest',
	'http://wavenotifier.dantup.me.uk',
	'http://wavenotifier.dantup.me.uk/',
	'http://wavenotifier.dantup.me.uk/2010/mytest',
	'http://wavenotifier.tuppeny.com',
	'http://wavenotifier.tuppeny.com/',
	'http://wavenotifier.tuppeny.com/2010/mytest',
	'http://go.dantup.me.uk',
	'http://go.dantup.me.uk/',
	'http://go.dantup.me.uk/mytest',
	'http://go.tuppeny.com',
	'http://go.tuppeny.com/',
	'http://go.tuppeny.com/mytest',
]
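The list above follows a regular pattern: each old domain appears with no path, with just '/', and with a sample path. As an alternative to maintaining it by hand, it could be generated from the mapping dictionary. This is only a rough sketch: build_test_urls and the sample path are names invented for the example, and the dictionary is cut down to two entries to keep the block self-contained.

```python
# A cut-down copy of the mapping dictionary, just for this example
urls = {
    'www.dantup.me.uk': ('blog.dantup.com', False),
    'blog.dantup.me.uk': ('blog.dantup.com', True),
}


def build_test_urls(mappings, sample_path='/2010/mytest'):
    """Build three test URLs per old domain: bare, root and a sample path."""
    result = []
    for domain in sorted(mappings):
        base = 'http://' + domain
        result.extend([base, base + '/', base + sample_path])
    return result


test_urls = build_test_urls(urls)
for test_url in test_urls:
    print(test_url)
```

The trade-off is that a generated list can't include one-off cases (such as a URL for a domain you deliberately haven't mapped), so a hand-written list may still be worth keeping for those.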

Now we've set the data up, it's time to write the code to handle the redirects. To allow for easy testing, we'll first create a method that takes a URL and returns where it should map to. This will be called by both the tests and the real redirects.

import urlparse

def get_redirect_url(url):
	scheme, netloc, path, query, fragment = urlparse.urlsplit(url)

	# Discard any port number from the hostname
	netloc = netloc.split(':', 1)[0]

	# Fix empty paths to be just '/' for consistency
	if path == '':
		path = '/'

	# Check if we have a mapping for this domain
	if netloc in urls:
		# Grab the redirect info tuple
		redirect_info = urls[netloc]
		# Root redirects
		if not redirect_info[1]:
			return 'http://' + redirect_info[0] + '/'
		# Redirects with paths
		else:
			return urlparse.urlunsplit(['http', redirect_info[0], path, query, fragment])
	# No mapping, so return None
	else:
		return None
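The listing above targets the Python 2 runtime, where the URL module is called urlparse. If you want to experiment with the function outside App Engine on Python 3, the same functions live in urllib.parse. The following is a sketch of how such a port might look, with a cut-down mapping dictionary. It also swaps the manual port-stripping for the hostname attribute, which drops the port and lowercases the host.

```python
from urllib.parse import urlsplit, urlunsplit

# Cut-down mapping for the example: old domain -> (new domain, map paths?)
urls = {
    'www.tuppeny.com': ('blog.dantup.com', False),
    'blog.dantup.me.uk': ('blog.dantup.com', True),
}


def get_redirect_url(url):
    """Return the URL to redirect to, or None if the domain is unknown."""
    parts = urlsplit(url)
    # .hostname excludes any port number and is lowercased
    host = parts.hostname
    if host not in urls:
        return None
    new_domain, map_paths = urls[host]
    if not map_paths:
        # Root redirect: ignore whatever path was requested
        return 'http://' + new_domain + '/'
    # Treat an empty path as '/' for consistency
    path = parts.path or '/'
    return urlunsplit(('http', new_domain, path, parts.query, parts.fragment))


print(get_redirect_url('http://www.tuppeny.com/anything'))
print(get_redirect_url('http://blog.dantup.me.uk:8080/mypost?page=2'))
print(get_redirect_url('http://unknown.example.com/'))
```

The first call gives the root of the blog, the second keeps the path and query string on the new domain, and the third returns None because there's no mapping for that host.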

This code is fairly straightforward. It uses our mappings dictionary to look up the domain to redirect to, and whether to include the path information. Next we need to write the code that actually handles incoming requests. This will check whether test mode is enabled, and if the request is '/test'. If so, it will output a table using our list of test URLs above. Otherwise it will call the same method, but actually perform a redirect. If we couldn't match a domain, we'll use a 302 redirect to the default URL, and send an email/log.

# Goes inside your request handler class; needs 'import logging' and
# 'from google.appengine.api import mail' at the top of main.py
def get(self):
	# If we're allowed to test (eg. local), and requested /test, then output the test
	if ALLOW_TEST and self.request.path == '/test':
		self.response.out.write('<h1>testing</h1>')
		self.response.out.write('<table>')
		for test_url in test_urls:
			# str() so an unmapped test URL shows as "None" instead of raising
			self.response.out.write('<tr><td>' + test_url + '</td><td>&nbsp;</td><td>' + str(get_redirect_url(test_url)) + '</td></tr>')
		# Close the table once, after all rows have been written
		self.response.out.write('</table>')

	# Otherwise, just go ahead and redirect
	else:
		# Perform redirect
		url = get_redirect_url(self.request.url)

		if url:
			logging.info('Redirecting ' + self.request.url + ' to ' + url)
			self.redirect(url, permanent=True)

		else:
			# Log that we didn't know what this was, and redirect to a good default
			logging.error('Unable to redirect this url: ' + self.request.url)
			mail.send_mail_to_admins(
				sender='"DanTup Redirect" <myemail@mydomain.com>',
				subject='Redirect Script Error',
				body='Unable to redirect this url: ' + self.request.url
			)

			# Don't do permanent (301), since we don't know what this is.
			# Move it into the dictionary above if needed
			self.redirect(DEFAULT_URL)

There's a lot of code there, but it should be fairly simple to understand. We handle the test mode by just spitting out a table of our test URLs and the redirects. We can then look over this manually to make sure everything looks correct before going live. Otherwise we work out the redirect for the current request and redirect. If no URL was found, we log and email the attempt, and redirect to the default URL. When the email comes through, we can then add the domain we missed to the mappings dictionary and specify how it should be handled.
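If you'd like to try the same flow without the App Engine SDK, the handler logic can be sketched as a plain WSGI application using only the Python 3 standard library. Everything specific here is an assumption for illustration: the application function, the one-entry mapping, and leaving out the test page and the email step are simplifications for the example, not how the App Engine version is wired up.

```python
import logging
from urllib.parse import urlsplit, urlunsplit
from wsgiref.util import request_uri

DEFAULT_URL = 'http://blog.dantup.com/'

# One-entry mapping for the example: old domain -> (new domain, map paths?)
urls = {'blog.dantup.me.uk': ('blog.dantup.com', True)}


def get_redirect_url(url):
    """Return the URL to redirect to, or None if the domain is unknown."""
    parts = urlsplit(url)
    if parts.hostname not in urls:
        return None
    new_domain, map_paths = urls[parts.hostname]
    if not map_paths:
        return 'http://' + new_domain + '/'
    path = parts.path or '/'
    return urlunsplit(('http', new_domain, path, parts.query, parts.fragment))


def application(environ, start_response):
    """Minimal WSGI app: 301 for known domains, 302 to the default otherwise."""
    requested = request_uri(environ)
    target = get_redirect_url(requested)
    if target:
        logging.info('Redirecting %s to %s', requested, target)
        status = '301 Moved Permanently'
    else:
        # Unknown domain: log it and fall back to a temporary redirect
        logging.error('Unable to redirect this url: %s', requested)
        target, status = DEFAULT_URL, '302 Found'
    start_response(status, [('Location', target), ('Content-Length', '0')])
    return [b'']
```

To poke at it locally you could serve application with wsgiref.simple_server.make_server and send requests with different Host headers, which avoids editing the hosts file.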

Naked Domains on App Engine

You'll notice that "naked" versions of my domains are missing from the script. This is because App Engine doesn't support naked domains, so these are all set up as redirects in my registrar's control panel. The registrar supports 301 redirects with the same URL mapping options (eg. redirect all to root, or copy the path).

Conclusion

It didn't take much to write a simple generic redirect script, and now we can handle redirects for all domains in the future. This simply needs setting up on App Engine and any number of domains pointing at it. It's worth noting that you can point multiple domains from different Google Apps accounts at the same App Engine app. There is no requirement to use App Engine for hosting your sites in order for this script to be used. The fact that blog.dantup.com is hosted on App Engine doesn't change anything. You could redirect to an Azure site if you wished! Though you probably wouldn't want to ;-)