Archive for April 2009

17 April 2009

ASP.NET MVC HandleError Attribute, Custom Error Pages and Logging Exceptions

I'm sure I don't need to tell you how bad serving a Yellow Screen of Death to your users is. Nonetheless, it seems to be pretty common practice across the web. One of the first things I do when setting up a new ASP.NET project is set up custom error pages and ensure all exceptions are logged (who wants to find out about their errors from their visitors?). Since things work a little differently in ASP.NET MVC, I thought I'd dig in and find the best way to do the same sort of thing.

The HandleError Attribute

The HandleError attribute (which appears on the default controllers in an MVC project) tells the framework that if an unhandled exception occurs in your controller, it should serve up a view called Error rather than showing the default Yellow Screen of Death. The controller-specific View folder is checked first (e.g. Views/Home/Error.aspx) and, if the view isn't found there, the Shared folder (Views/Shared/Error.aspx) is used.

But How Do I Log Exceptions?

You might've spotted the problem with HandleError. It just outputs a view, and doesn't let you run any code. This might be fine if you don't want users to see errors but don't really care about fixing them. Hopefully you think this isn't acceptable and you want to investigate all exceptions!

The OnException Method

The System.Web.Mvc.Controller class contains a method called OnException which is called whenever an exception occurs within an action. This does not rely on the HandleError attribute being set. If you're being a good coder and have your own base Controller class, you can override this method in one place to handle/log all errors for your site. You might choose to send emails and/or detect duplicate exceptions and discard them. For now, I'm just going to write them all to a text file in my App_Data folder.

protected override void OnException(ExceptionContext filterContext)
{
	WriteLog(Settings.LogErrorFile, filterContext.Exception.ToString());
}

/// <summary>
/// Logs a message to the given log file
/// </summary>
/// <param name="logFile">The filename to log to</param>
/// <param name="text">The message to log</param>
static void WriteLog(string logFile, string text)
{
	//TODO: Format nicer
	StringBuilder message = new StringBuilder();
	message.AppendLine(DateTime.Now.ToString());
	message.AppendLine(text);
	message.AppendLine("=========================================");

	System.IO.File.AppendAllText(logFile, message.ToString());
}

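This works great, but it still shows our user an unhandled exception message, even if we use the HandleError attribute. (HandleError only renders its view when custom errors are enabled, so with the default settings you'll still get the raw error when browsing locally.) This makes the HandleError attribute look rather useless, so I've removed it. We can easily show the friendly error ourselves with the following code: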

filterContext.ExceptionHandled = true;
this.View("Error").ExecuteResult(this.ControllerContext);

It's important to set ExceptionHandled to true, otherwise you'll still see the default unhandled exception message. The OnException method returns void, so we must execute the view ourselves, passing in the ControllerContext.

How Do I see my own Errors During Development?

It's a little inconvenient to open log files or keep commenting out your error handling code while developing to see exceptions and stack traces. You might remember ASP.NET has a nice web.config setting that configures custom errors. This property is exposed via MVC, so we can set up our config to show friendly errors to remote users only:

<customErrors mode="RemoteOnly" />

Then all we need to do in our OnException method is check this value and serve up the custom error view only if it returns true.

protected override void OnException(ExceptionContext filterContext)
{
	WriteLog(Settings.LogErrorFile, filterContext.Exception.ToString());

	// Output a nice error page
	if (filterContext.HttpContext.IsCustomErrorEnabled)
	{
		filterContext.ExceptionHandled = true;
		this.View("Error").ExecuteResult(this.ControllerContext);
	}
}

It's worth noting that IsCustomErrorEnabled will resolve the RemoteOnly option for you, so you don't need to check where the user is coming from. Now our site serves up friendly errors to users and logs all exceptions without us losing the ability to see stack traces during development.
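Earlier I mentioned detecting duplicate exceptions and discarding them. Here's a minimal sketch of how that might look. DuplicateExceptionFilter and ShouldLog are names I've made up for illustration (they're not part of ASP.NET MVC), and treating "same type, message and stack trace" as a duplicate is just one possible assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of ASP.NET MVC): remembers which
// exceptions have already been logged so repeats can be skipped.
public class DuplicateExceptionFilter
{
	private readonly HashSet<string> seen = new HashSet<string>();
	private readonly object sync = new object();

	// Returns true the first time a given exception "fingerprint" is seen,
	// and false for every repeat.
	public bool ShouldLog(Exception ex)
	{
		// Assumption: type + message + stack trace identifies "the same" failure.
		string fingerprint = ex.GetType().FullName + "|" + ex.Message + "|" + ex.StackTrace;

		// Requests run on multiple threads, so guard the shared set.
		lock (sync)
		{
			return seen.Add(fingerprint);
		}
	}
}
```

In OnException you could keep a single static instance of this class and only call WriteLog when ShouldLog returns true. Bear in mind the set grows for as long as the application runs, so a real version would probably want to expire old entries.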

14 April 2009

Using OpenID in your ASP.NET MVC Application/Blog

Over the last few days I've been rewriting this blog in ASP.NET MVC. As it gets closer to a state where I can upload it, I found myself needing to implement security for the administration section (adding, editing posts, etc.). I don't want yet another username/password to remember, and I don't want to IP-restrict it because that's not very flexible (and I don't know how static my IP is!), so what are my options?

OpenID

OpenID is nothing new, it's been around since late 2005. I've been aware of what it did and how it worked, but never really played with it. I did, however, get the impression it might solve my problem. Especially having seen that you can use your Google account as an OpenID!

What's OpenID? Why is it Cool?

OpenID is a standard for authentication, allowing you to use the same identity/login for multiple services. It is not the same as using the same username/password at multiple websites (that's a very bad idea). Let's see an example.

I want to be able to login to my blog to edit posts. I don't want another username/password. As Google now works as an identity provider, my blog can redirect me to Google and let them authenticate me. Google will then return me to my blog saying "Yes, this is definitely Danny Tuppeny". This means I don't need any user tables, login forms, or anything else on my blog!

This might sound complicated, but as with most things, there's a nice .NET library called dotnetopenid to hide the complexity. Let's see some code!

On the first request, dotnetopenid will return a null response. After logging in at the identity provider's website, the user will be redirected back (to the same page by default, but this can be changed) with a token on the query string. This will cause dotnetopenid to return a response. The basic code looks like this:

var openId = new OpenIdRelyingParty();

if (openId.Response == null)
{
	// No response means this is the first page load
}
else
{
	// This means we've been redirected back after authentication
	if (openId.Response.Status == AuthenticationStatus.Authenticated)
	{
		// User was logged in (as someone!)
	}
}

On the first page load, we would usually ask the user for their OpenID Identifier/URL, however since in my case it's always going to be Google, I'm going to hard-code this as a single value.

dotnetopenid supports adding claim requests so that you can request (or even demand) specific pieces of information. In my case I only care about authenticating me, I don't need to request my name or email address. As such, I'm just going to fire a simple request off without any claim requests.

openId.CreateRequest("https://www.google.com/accounts/o8/id").RedirectToProvider();

In the else block we need to check the response. We want to make sure that the status is Authenticated and the ClaimedIdentifier matches the known identifier for my own login.

// We got a response - check it's valid
if (openId.Response.Status == AuthenticationStatus.Authenticated
	&& openId.Response.ClaimedIdentifier.ToString() == "http://google.com/blah/blah/blah")
{
	Session["Admin"] = true;
	return Redirect("/posts/edit");
}
else
	return Content("Go away, you're not me.");

The ClaimedIdentifier will be unique to each Google account. You can run the code once and examine the returned value to find out your own, and then you can check against it.

If we put all this together into a controller action, it'll look something like this:

public ActionResult Login()
{
	var openId = new OpenIdRelyingParty();

	// If we have no response, start
	if (openId.Response == null)
	{
		// Create a request and redirect the user
		openId.CreateRequest(Settings.AdminOpenIDIdentifier).RedirectToProvider();

		return null;
	}
	else
	{
		// We got a response - check it's valid and that it's me
		if (openId.Response.Status == AuthenticationStatus.Authenticated
			&& openId.Response.ClaimedIdentifier.ToString() == Settings.AdminClaimedIdentifier)
		{
			Session["Admin"] = true;
			return Redirect("/posts/edit");
		}
		else
			return Content("Go away, you're not me.");
	}
}

That's really all there is to it. Now when I hit the Login action I'll be redirected to Google's login page. After logging in, I end up back at /posts/edit on my blog with the correct session variable set. Of course, you could instead call the built-in ASP.NET authentication methods, or look up a user from your database based on their ClaimedIdentifier. There are a lot of ways you can extend this, and I'll cover using OpenID for blog comments in a future article!

14 April 2009

IE8: Hanging with "Connecting..." when opening tabs, unable to hide Favourites bar and other bugs

I'm not the only person having these problems, so I thought I'd post the solution here for all...

After hearing that IE8 will be offered via Windows Update next week, I decided to install it on my home PC running Windows Vista. I've been using it at work since it RTM'd, with some major stability issues, but I put them down to my machine rather than IE. Oh, how wrong I was!

After the usual install and reboot cycle, I opened IE8. I turned off the usual trash (Accelerators, Web Slices, Compatibility View for Intranet sites, etc.) and went to hide the nasty favourites bar. Only, I couldn't. Right-clicking on the favourites bar didn't give a context menu. So I tried View » Toolbars » Favourites - the option was disabled!

I proceeded to Google, doing the usual middle-click on results to open them in new tabs. Every new tab I opened just sat with "Connecting..." in the tab title. The content was blank.

This isn't looking good...

I swiftly disabled all the non-MS addons (and Fiddler, thinking that could potentially break connections). No change.

I fired up Event Viewer and found an "Internet Explorer" event log. This would be an interesting find if the entire log wasn't blank. Great!

I opened up My Computer and navigated to my system drive. WTF - it opened in a new window. I checked folder options - it's still set to "open each folder in the same window". I don't like that :(

Rolling back to IE7 is now a serious option. This isn't what I've come to expect from Microsoft!

After a little more Googling (using Chrome, of course) I found I wasn't the only one having these issues. Fixes varied from broken addons to GoogleUpdater. Nothing seemed to apply to my problem.

Then I found a magic post. Someone had run IE as Administrator and discovered his problems disappeared. Even better, when he ran as a normal user afterwards, things still worked! Worth a shot, eh?

Well, it worked. Running as Administrator worked fine. When I ran it again as a normal user, everything continued to work. Specifically, I could now:

  • Hide the IE8 favourites toolbar
  • Open new tabs without the "Connecting..." message/hanging
  • Open Windows Explorer folders in the same window
  • Access toolbar context-menus

An interesting bug! I can't explain how it could happen, or why any errors/failures aren't written to the event log. All I can say is I'm glad it's fixed and I don't need to roll back to IE7!

13 April 2009

Eager-Fetching of Relationships with LINQ to SQL

By default, LINQ to SQL lazy-loads its relationships. This means it won't go fetching entire trees of objects when you're only using the top ones. However, this might not always be desirable. Imagine if every time you output a post for your blog, you include tags. You might write something like this:

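foreach (var post in db.Posts)
{
	// Response.Write takes a single string, so format it first
	Response.Write(string.Format("<h1>{0}</h1>", Html.Encode(post.Title)));
	foreach (var tag in post.Tags)
	{
		Response.Write(string.Format("<a href=\"{0}\">{1}</a>",
			Html.Encode(tag.FullUrl),
			Html.Encode(tag.Name)
		));
	}
}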

This code will work fine, but if you were to examine the SQL being generated, you'd see n+1 SELECT queries: one for fetching the posts, and one for fetching the tags for each post individually.

What can we do about it?

LINQ to SQL allows us to specify a set of DataLoadOptions that dictate this behaviour. One of the methods of the DataLoadOptions is LoadWith<T> which allows us to say "whenever you load x, always include y". E.g.

// Create DataLoadOptions
DataLoadOptions dlo = new DataLoadOptions();

// Always fetch tags when we get posts
dlo.LoadWith<Post>(p => p.Tags);

// Set these options on the DataContext
// (this must be done before the DataContext runs any queries)
db.LoadOptions = dlo;

If we re-run the original query, we'll now find a JOIN to the Tags table and just a single query to fetch all the data we require.

Simple! However, bear in mind that LINQ to SQL might not always generate joins. I've got a few cases where the LoadWith seems to be ignored. As soon as I figure out why, I'll be sure to update this post!
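To make the n+1 arithmetic concrete, here's a small simulation. It doesn't touch LINQ to SQL at all: FakeBlogDb is a made-up in-memory class that simply counts each "query" it's asked to run, so the numbers only illustrate the pattern rather than what LINQ to SQL actually sends to the server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database: every method call counts as one query.
public class FakeBlogDb
{
	public int QueryCount { get; private set; }

	private readonly Dictionary<int, string[]> tagsByPost = new Dictionary<int, string[]>
	{
		{ 1, new[] { "mvc", "asp.net" } },
		{ 2, new[] { "linq" } },
		{ 3, new[] { "openid", "security" } }
	};

	// One query: fetch the posts on their own.
	public List<int> GetPostIds()
	{
		QueryCount++;
		return tagsByPost.Keys.ToList();
	}

	// One query per call: fetch the tags for a single post.
	public string[] GetTagsForPost(int postId)
	{
		QueryCount++;
		return tagsByPost[postId];
	}

	// One query: fetch posts and their tags together (the "JOIN" case).
	public Dictionary<int, string[]> GetPostsWithTags()
	{
		QueryCount++;
		return new Dictionary<int, string[]>(tagsByPost);
	}
}

public static class NPlusOneDemo
{
	// Lazy loading: 1 query for the posts + 1 query per post for its tags.
	public static int LazyQueryCount()
	{
		var db = new FakeBlogDb();
		foreach (int id in db.GetPostIds())
			db.GetTagsForPost(id);
		return db.QueryCount;
	}

	// Eager loading: everything comes back in a single query.
	public static int EagerQueryCount()
	{
		var db = new FakeBlogDb();
		db.GetPostsWithTags();
		return db.QueryCount;
	}
}
```

With three posts, the lazy version counts four queries and the eager version counts one. With a few hundred posts on a page, that gap is what makes eager-fetching worth the trouble.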

09 April 2009

Reducing Duplicate Content with ASP.NET MVC

As you're all no doubt aware, ASP.NET MVC recently went RTM. This brings the MVC style of coding, made very popular by Ruby on Rails, to the ASP.NET world. I've been eager to start using MVC for months, but I've been holding off until I knew the API was locked down so I don't have to change anything.

Unfortunately, like WebForms, MVC has some "issues" with regards to duplicate content, making it not all that SEO-friendly.

What do you mean, Duplicate Content?

Duplicate content is just that - the same content repeated on multiple pages/sites. This might not sound like a big deal, but it's not something search engines like. They don't want the search results to show the same content multiple times across different websites so they often penalise or hide duplicate content. Additionally, if you have two pages with the same content, your inbound links might become split between the two - reducing the PageRank passed to either.

What's this got to do with ASP.NET MVC?

Unfortunately ASP.NET MVC makes it easy to have the same content indexed multiple times. I've listed the main problems below.

Case-Sensitivity. In ASP.NET (or rather IIS and Windows), URLs are not case sensitive. That means you can write Default.asp, default.asp or even DeFaUlT.aSp and still get the same page. While you'll probably stick to the same case within your website, it wouldn't be hard for someone to create links to your site with different casing (e.g. they might have CAPS LOCK turned on).

Default Documents. Most websites have a default document set up to serve when a filename is not provided in the request. E.g. http://mydomain.com/ might actually serve up http://mydomain.com/default.asp, but it won't tell the browser that's what it did. It will serve it up as if the two are different URLs.

Trailing Slashes. While the above problems are general ASP.NET/IIS issues, trailing slashes are something that only really becomes a problem with MVC or other URL rewriting/routing. In ASP.NET if you requested http://mydomain.com/files and you had a folder named files, IIS would issue a redirect to mydomain.com/files/. However, in ASP.NET MVC the URL routing will treat trailing slashes the same as requests without. So http://mydomain.com/controller/action is exactly the same as http://mydomain.com/controller/action/ and therefore results in duplicate content.

Query Strings. Query strings can be a big problem for duplicate content. Imagine if you can add ?sort=field to the end of your page to have a table re-ordered. To a search engine this looks like another page, but the content is mostly the same. Fortunately, ASP.NET MVC doesn't really use query strings thanks to the excellent URL routing.

So, what can we do?

Lowercase URLs. We can force all requests to our application to be lowercase by catching them in BeginRequest in Global.asax and redirecting to the lowercase version if they contain any uppercase characters.

protected void Application_BeginRequest(Object sender, EventArgs e)
{
	// Get the requested URL so we can do some validation on it.
	// We exclude the query string, and add that later, so it's not included
	// in the validation
	string url = (Request.Url.Scheme + "://" + HttpContext.Current.Request.Url.Authority + HttpContext.Current.Request.Url.AbsolutePath);

	// If we've got uppercase characters, fix
	if (Regex.IsMatch(url, @"[A-Z]"))
		PermanentRedirect(url.ToLower() + HttpContext.Current.Request.Url.Query);
}

/// <summary>
/// Redirects with a 301 header to pass along any incoming
/// PageRank/link value.
/// </summary>
/// <param name="url">The URL to redirect to</param>
private void PermanentRedirect(string url)
{
	Response.Clear();
	Response.Status = "301 Moved Permanently";
	Response.AddHeader("Location", url);
	Response.End();
}

Now if anyone requests a URL with uppercase characters, they'll be redirected with a 301 redirect. This works great, but we have a problem. All URLs generated internally by MVC will continue to use Action and Controller names in Pascal case (assuming that's how your classes are named). This means every link within our site will cause two requests (the first being a redirect).

To fix this, we can override the default behaviour for creating URLs. We'll create a new extension method for the RouteCollection class called MapRouteLowercase which, instead of creating a Route, will create an instance of a new class called LowercaseRoute. This class will override the GetVirtualPath method to lowercase the URL before passing it back. I can't take credit for this code; I pretty much just copied it from Graham O'Neale's blog.

public class LowercaseRoute : System.Web.Routing.Route
{
	public LowercaseRoute(string url, IRouteHandler routeHandler)
		: base(url, routeHandler) { }
	public LowercaseRoute(string url, RouteValueDictionary defaults, IRouteHandler routeHandler)
		: base(url, defaults, routeHandler) { }
	public LowercaseRoute(string url, RouteValueDictionary defaults, RouteValueDictionary constraints, IRouteHandler routeHandler)
		: base(url, defaults, constraints, routeHandler) { }
	public LowercaseRoute(string url, RouteValueDictionary defaults, RouteValueDictionary constraints, RouteValueDictionary dataTokens, IRouteHandler routeHandler)
		: base(url, defaults, constraints, dataTokens, routeHandler) { }

	public override VirtualPathData GetVirtualPath(RequestContext requestContext, RouteValueDictionary values)
	{
		VirtualPathData path = base.GetVirtualPath(requestContext, values);

		if (path != null)
			path.VirtualPath = path.VirtualPath.ToLowerInvariant();

		return path;
	}
}

public static class RouteCollectionExtensions
{
	public static void MapRouteLowercase(this RouteCollection routes, string name, string url, object defaults)
	{
		routes.MapRouteLowercase(name, url, defaults, null);
	}

	public static void MapRouteLowercase(this RouteCollection routes, string name, string url, object defaults, object constraints)
	{
		if (routes == null)
			throw new ArgumentNullException("routes");

		if (url == null)
			throw new ArgumentNullException("url");

		var route = new LowercaseRoute(url, new MvcRouteHandler())
		{
			Defaults = new RouteValueDictionary(defaults),
			Constraints = new RouteValueDictionary(constraints)
		};

		if (String.IsNullOrEmpty(name))
			routes.Add(route);
		else
			routes.Add(name, route);
	}
}

You can put these classes anywhere. Because MapRouteLowercase is an extension method, you can just call it on the RouteCollection class in place of the existing MapRoute call in your Global.asax.

// Home stuff
routes.MapRouteLowercase(
	"Default",
	"{page}",
	new { controller = "Home", action = "Index", page = 1 },
	new { page = @"\d+" }
);

Default Documents. While this issue doesn't affect MVC in the same way, there's a very similar problem. In ASP.NET MVC the default routing is {controller}/{action} but it sets a default action of Index. That means on a newly-created project, both /Home/Index and /Home will serve up the same content.

To work around this, and provide some nicer URLs, I changed the routing a little so that my default actions were mapped to the root and a separate route dealt with the homepage (which accepts pages, to allow browsing to older posts).

public static void RegisterRoutes(RouteCollection routes)
{
	routes.IgnoreRoute("{resource}.axd/{*pathInfo}");

	// Posts
	routes.MapRouteLowercase(
		"Posts",
		"posts/{url}",
		new { controller = "Post", action = "Display" }
	);

	// Tags
	routes.MapRouteLowercase(
		"Tags",
		"tags/{url}/{page}",
		new { controller = "Tag", action = "Display", page = 1 },
		new { page = @"\d+" }
	);

	// Homepage (with an optional page number for browsing older posts)
	routes.MapRouteLowercase(
		"Default",
		"{page}",
		new { controller = "Home", action = "Index", page = 1 },
		new { page = @"\d+" }
	);

	// Other Home actions, mapped to the root
	routes.MapRouteLowercase(
		"Home",
		"{action}",
		new { controller = "Home", action = "" }
	);

	// Catch-all for any unmatched URL
	routes.MapRouteLowercase(
		"Error Catch-All",
		"{*path}",
		new { controller = "Home", action = "NotFound" } // NotFound doesn't exist, so HandleUnknownAction will be fired
	);
}

Trailing Slashes. To avoid trailing slashes and a few other minor issues (such as people adding /1 to a URL to get page 1, which is served up without the /1) I added some additional rules to my Global.asax as below.

protected void Application_BeginRequest(Object sender, EventArgs e)
{
	// Get the requested URL so we can do some validation on it.
	// We exclude the query string, and add that later, so it's not included
	// in the validation
	string url = (Request.Url.Scheme + "://" + HttpContext.Current.Request.Url.Authority + HttpContext.Current.Request.Url.AbsolutePath);

	// If we're not a request for the root, and end with a slash, strip it off
	if (HttpContext.Current.Request.Url.AbsolutePath != "/" && HttpContext.Current.Request.Url.AbsolutePath.EndsWith("/"))
		PermanentRedirect(url.Substring(0, url.Length - 1) + HttpContext.Current.Request.Url.Query);

	// If we end with /1 we're a page 1, and don't need (shouldn't have) the page number
	else if (HttpContext.Current.Request.Url.AbsolutePath.EndsWith("/1"))
		PermanentRedirect(url.Substring(0, url.Length - 2) + HttpContext.Current.Request.Url.Query);

	// If we have double-slashes, strip them out
	else if (HttpContext.Current.Request.Url.AbsolutePath.Contains("//"))
		PermanentRedirect(url.Replace("//", "/") + HttpContext.Current.Request.Url.Query);

	// If we've got uppercase characters, fix
	else if (Regex.IsMatch(url, @"[A-Z]"))
		PermanentRedirect(url.ToLower() + HttpContext.Current.Request.Url.Query);
}

This seems to stop many of the issues I came up with. However, the double slash seems to be passed through (in AbsolutePath) as a single slash here (Vista/IIS7), so that check never fires. I've left it in just in case this behaves differently on other web servers.
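One way to make rules like these easier to check is to pull them out into a function that works on plain strings, so it can be tested without a web server. The sketch below is only an illustration: UrlCanonicalizer and CanonicalPath are names I've invented, and it applies every rule in a single pass, so a request needing several fixes would get one redirect rather than a chain of them.

```csharp
using System;

// Hypothetical helper: applies the lowercase, double-slash, trailing-slash
// and "/1" rules to a path and returns the canonical form.
public static class UrlCanonicalizer
{
	public static string CanonicalPath(string path)
	{
		if (string.IsNullOrEmpty(path))
			return "/";

		string result = path.ToLowerInvariant();

		// Collapse any run of slashes into a single slash
		while (result.Contains("//"))
			result = result.Replace("//", "/");

		// Strip a trailing slash, but leave the root alone
		if (result.Length > 1 && result.EndsWith("/", StringComparison.Ordinal))
			result = result.Substring(0, result.Length - 1);

		// Page 1 is served without its page number
		if (result.EndsWith("/1", StringComparison.Ordinal))
			result = result.Substring(0, result.Length - 2);

		// Stripping "/1" from the root leaves an empty string
		return result.Length == 0 ? "/" : result;
	}
}
```

In BeginRequest you could then compare CanonicalPath(Request.Url.AbsolutePath) with the original path and issue the 301 only when the two differ.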

Is there anything else I should do?

As of February, Google, Yahoo, ASK and Microsoft Live Search support a new canonical tag (a link element with rel="canonical"). This allows you to specify on a page that this page is duplicate content and any incoming links should instead be attributed to another page. If your site has query strings or other potential for multiple requests to serve up the same content, I would recommend inserting this tag to make sure the search engines choose your preferred page.
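For reference, the canonical tag is a link element that goes in the head of the duplicate page and points at the URL you'd like indexed. The address below is made up for illustration; you'd substitute the preferred URL of your own page.

```html
<head>
	<!-- Hypothetical URL: the one version of this page you want search engines to index -->
	<link rel="canonical" href="http://mydomain.com/posts/my-post" />
</head>
```

So if the same post were also reachable at http://mydomain.com/posts/my-post?sort=date, both versions would carry this tag and the search engine should credit links to the clean URL.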
