Tuesday 1 July 2008

Practical ASP.NET Url Rewriting with UrlRewriting.Net

This post describes briefly how I implemented url rewriting with the UrlRewriting.Net package.

My objectives were simple, to:

  1. Use simple readable urls wherever possible e.g. http://domain.com/app
  2. Honour querystring parameters passed to pages e.g. http://domain.com/app?param=value
  3. Map several Urls to one implementation page e.g. http://domain.com/red => http://domain.com/colours.aspx?name=red
  4. Not have to create physical folders for all virtual folders e.g. /red does not actually exist on disk!
  5. Ensure all ASP.Net functionality remains intact e.g. postbacks, themes, authentication, AJAX
  6. Work for apps rooted at the domain e.g. http://domain.com/ and in subdirectories e.g. http://localhost/devapp/.

This was based on the following setup:

  • An IIS 5.1/6 web server
  • IIS wildcard (.*) request mapping
  • UrlRewriting.Net package
  • FormRewriter control adapter (see ScottGu's article here)
  • An existing large site with default pages named default.aspx

I chose UrlRewriting.Net as it seemed the most fully featured package out there, and it was the only one I tested that worked correctly with FormsAuthentication and ASP.Net AJAX.

With this running 'out of the box' I found a couple of problems.

Issue 1: Trailing slashes

Trailing slashes can cause problems with themes, and removing them is also required to get default pages working sensibly. The idea is to remove the trailing slash from every url except the bare authority (e.g. http://domain.com/).

For example, we want:

  • http://domain.com/ to map to http://domain.com/ i.e. no change
  • but http://domain.com/dir/ to map to http://domain.com/dir i.e. slash removed

My solution is based on the one by Fabrice here but is modified to exclude the authority part of the Url (e.g. http://domain.com/) so we don't end up in an endless loop of redirects. This rule has to be a redirect since UrlRewriting.Net will not process multiple rewrites. In any case you don't want users or search engines seeing http://domain.com/dir/ as different from http://domain.com/dir so a permanent redirect to the slashless url makes good sense here, giving a single identity to multiply addressable resources.

The rule looks like this:

<add name="RemoveTrailingSlash"
virtualUrl="^~/(.*)/(\?.*)?$"
destinationUrl="~/$1$2"
rewriteUrlParameter="ExcludeFromClientQueryString"
redirectMode="Permanent"
redirect="Application"
ignoreCase="true" />
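
As a rough sanity check, the pattern can be exercised outside the site. The sketch below uses Python's re module as a stand-in for the .NET regex engine (the two should agree for a pattern this simple, but that is an assumption), and it treats "~/" as a literal prefix rather than expanding it to the application root as the module would. The function name remove_trailing_slash is purely illustrative.

```python
import re

# Pattern copied from the RemoveTrailingSlash rule above. "~/" is kept as a
# literal prefix here; in the real module it stands for the application root.
TRAILING_SLASH = re.compile(r"^~/(.*)/(\?.*)?$", re.IGNORECASE)


def remove_trailing_slash(virtual_url):
    """Return the redirect target, or None when the rule does not match."""
    match = TRAILING_SLASH.match(virtual_url)
    if match is None:
        return None
    path = match.group(1)
    query = match.group(2) or ""  # the query string group is optional
    return "~/" + path + query    # mirrors destinationUrl="~/$1$2"


for url in ["~/", "~/dir/", "~/dir/sub/", "~/dir/?param=value", "~/dir"]:
    print(url, "->", remove_trailing_slash(url))
```

The root url "~/" and the already slashless "~/dir" do not match, so neither is redirected. That is why a redirected request cannot match the rule a second time, and why the root never triggers a redirect at all.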

Issue 2: The Default Page

The default page is set as an attribute in the UrlRewriting configuration and is a workaround for the default page being lost when the IIS wildcard mapping is added. However, UrlRewriting default pages do not work as in IIS - the default page is actually appended to all directory requests.

For example: http://mydomain.com becomes http://mydomain.com/default.aspx

In projects where branding and SEO are primary considerations this is unlikely to be acceptable.

Switching off the default page functionality (by removing the defaultPage configuration attribute) prevents this happening but then we have to add rules for every single instance in which we would like http://mydomain.com to map to the file http://mydomain.com/default.aspx.

The workaround here is to add the following rule last in the rewrite configuration:

<add name="Default"
virtualUrl="^([a-zA-Z0-9_\-/]*)(\?.*)?$"
rewriteUrlParameter="ExcludeFromClientQueryString"
destinationUrl="$1/Default.aspx$2"
ignoreCase="true"/>

This has the effect of re-instating default page functionality. Note the expression [a-zA-Z0-9_\-/] does not include the period (.) character (i.e. to skip mapping requests for actual pages, style sheets or other resources which usually contain a period) but should include any other characters that may appear in your url. This rule is designed to only map requests such as http://domain.com/test to http://domain.com/test/default.aspx and relies on the prior redirect to remove the trailing slash (except at the authority level).

NB 1. This rule is by no means exhaustive. Modify it to suit the naming structure of your site.

NB 2. This rule relies on the trailing slash rule being present first, otherwise it would map requests such as /test/ to /test//default.aspx.
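
The behaviour described in the two notes can be checked with a small sketch. It again uses Python's re module in place of the .NET regex engine, and it assumes the rule is matched against a path such as /test with an optional querystring; the helper name apply_default_rule and the sample paths are made up for illustration.

```python
import re

# Pattern copied from the Default rule. The character class has no period,
# so anything that looks like a file (page.aspx, site.css) is left alone.
DEFAULT_RULE = re.compile(r"^([a-zA-Z0-9_\-/]*)(\?.*)?$", re.IGNORECASE)


def apply_default_rule(path):
    """Return the rewritten path, or None when the rule does not match."""
    match = DEFAULT_RULE.match(path)
    if match is None:
        return None
    directory = match.group(1)
    query = match.group(2) or ""
    return directory + "/Default.aspx" + query  # destinationUrl="$1/Default.aspx$2"


samples = ["/test", "/test?param=value", "/styles/site.css", "/page.aspx?id=1", "/test/"]
for path in samples:
    print(path, "->", apply_default_rule(path))
```

The last sample shows why the trailing slash redirect has to run first: fed /test/ directly, this rule produces /test//Default.aspx.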

Summary

In summary I have two core rules, the trailing slash rule and the default rule. I also have a number of site specific rules in the middle to map top level directories to a single template implementation page. My config rules look something like this:

<add name="RemoveTrailingSlash"
virtualUrl="^~/(.*)/(\?.*)?$"
destinationUrl="~/$1$2"
rewriteUrlParameter="ExcludeFromClientQueryString"
redirectMode="Permanent"
redirect="Application"
ignoreCase="true" />

<!-- Begin app specific rules -->

<!-- Category mapping rule example - does not map querystring -->

<add name="CategoryHome"
virtualUrl="^~/(Travel|Entertainment|Family|Health|Lifestyle|Care|Food|Fashion|Home)(\?.*)?$"
destinationUrl="~/Categories/CategoryHome.aspx?cn=$1"
rewriteUrlParameter="ExcludeFromClientQueryString"
ignoreCase="true" />

<!-- End app specific rules -->

<add name="Default"
virtualUrl="^([a-zA-Z0-9_\-/]*)(\?.*)?$"
rewriteUrlParameter="ExcludeFromClientQueryString"
destinationUrl="$1/Default.aspx$2"
ignoreCase="true"/>
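
To see how the three rules interact, here is a minimal simulation of the rule list. It rests on several assumptions that are not confirmed by the UrlRewriting.Net documentation: rules are tried top to bottom and the first match wins, the app is rooted at the domain so "~/" is written as "/", and Python's re module behaves like the .NET engine for these patterns. The Rule tuple and resolve function are hypothetical names, not part of the package.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    pattern: str
    destination: str  # uses $1, $2 placeholders like the config
    redirect: bool    # True for a redirect, False for a rewrite


# Same order as the config above, with "~/" written as "/" (assumed root app).
RULES = [
    Rule("RemoveTrailingSlash", r"^/(.*)/(\?.*)?$", "/$1$2", True),
    Rule("CategoryHome",
         r"^/(Travel|Entertainment|Family|Health|Lifestyle|Care|Food|Fashion|Home)(\?.*)?$",
         "/Categories/CategoryHome.aspx?cn=$1", False),
    Rule("Default", r"^([a-zA-Z0-9_\-/]*)(\?.*)?$", "$1/Default.aspx$2", False),
]


def resolve(url):
    """Return (rule name, action, target) for the first matching rule, else None."""
    for rule in RULES:
        match = re.match(rule.pattern, url, re.IGNORECASE)
        if match is None:
            continue
        target = rule.destination
        # Substitute $1, $2; an unmatched optional group becomes an empty string.
        for index, value in enumerate(match.groups(), start=1):
            target = target.replace("$" + str(index), value or "")
        return rule.name, "redirect" if rule.redirect else "rewrite", target
    return None


for url in ["/Travel/", "/Travel", "/travel?page=2", "/about", "/site.css"]:
    print(url, "->", resolve(url))
```

Under these assumptions /Travel/ is first redirected to /Travel, which on the next request is rewritten to the category page; /travel?page=2 loses its querystring, as the comment on the category rule warns; /about falls through to the Default rule; and /site.css matches nothing and is served as is.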

Conclusion

Whilst UrlRewriting.Net has its limitations, it can be used to create a maintainable url rewritten site relatively quickly and cleanly.

Disclaimer: Incorrect use of Url rewriting can have catastrophic consequences for your site. One miswritten rule can bring your whole site down. Use with caution! :)
