I’ve talked before about .htaccess and the hotness that is mod_rewrite for various purposes:
- Permalink conversion from Movable Type to WordPress.
- Killing referral spam.
- Another adjustment to the referrer spam killing.
I ran across another one today that I hadn’t dealt with and figured it was worth discussing. I updated an old post with some new material. That in itself wasn’t overly odd. But I decided that the post was different enough and the demand still fresh enough — the damned thing accounts for 10% of the total traffic to my site, spam included — to warrant a new timestamp. So I bumped the timestamp to today. Lovely.
Almost lovely. So now the content has a different URL. I use dates in my URLs as Al Gore intended. What used to exist at
/archives/2005/10/26/xbox-360-high-definition-faq/
Now exists at
/archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/
That’s not necessarily a bad thing, just different. But guess what all the search engines have in them? Yes, the old one. Anyone who follows one of those links now gets a 404! That’s something I’ll have to address with an .htaccess redirect.
Redirecting from the old page to the new page
This is actually a fairly painless procedure in .htaccess. Let’s just take a look.
    <IfModule mod_rewrite.c>
        RewriteEngine On
        RewriteBase /
        RewriteRule ^archives/2005/10/26/xbox-360-high-definition-faq/?(.*) /archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/$1 [R=301,N]
    </IfModule>
Most of this is setup for the one line that counts. I’ve explained the other stuff previously, so let’s look at the line that does the actual work, starting with RewriteRule. Our RewriteRule does everything we need in one shot. It takes the first little clause, in this case everything from ^ through the closing ), and looks for a match in the URL. If it finds one, it “rewrites” the URL to match the second clause. In this case, we’re looking for:
- ^ - starts us at the beginning of the URL after the domain.
- archives/2005/10/26/xbox-360-high-definition-faq - should be relatively self-explanatory. It’s the “old” URL that we’re looking to change.
- /? - looks for 0 or 1 occurrence of a / character. Just in case someone forgets to include it.
- (.*) - grabs everything after the / and stores it off. It’s what’s termed a parenthetical expression that allows us to later create a back-reference to the text. For instance, I support paging of my comments. If someone links to a comment, the URL might look something like (I’m cutting off the beginning for word wrapping reasons) ...-high-definition-faq/comment-page-18/#comment-8789. In this case, I store comment-page-18/ in the back-reference. The #comment-8789 fragment never reaches the server; the browser holds on to it and reapplies it to the redirected URL.
- /archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/ - this is simply appended directly to the domain when Apache rewrites the URL. So, we’re applying the new beginning of the URL we want to redirect to.
- $1 - Remember the “back-reference” listed above? This $1 references that. It’s the first of the back-references we created (it’s the only parenthetical expression in the matching regular expression), so we use the number 1. We can have many parenthetical expressions in our matching regular expression and hence many back-references, but we just need one for this. Whatever was stored in our back-reference, like the comment-page-18 example above, will get plunked onto the back of our URL. If there was nothing in that back-reference, it doesn’t matter as we’ll just append “nothing” to the end of the URL.
- [R=301, - this tells Apache to return a 301 redirect status code back to the browser that’s requesting the page when this rule is executed and the matching expression matches. A 301 tells the requester that the content has moved permanently, giving spiders and hence search engines the opportunity to correct their links when they spider.
- N] - this tells Apache to start over from the beginning of the rewrite rules, but this time use the “new” URL. My thinking was that WordPress would then get its shot at the URL in order to construct the things it needs to construct. Strictly speaking, the browser makes a fresh request for the new URL after a 301, and WordPress handles that request on its own, so the more conventional flag to pair with R is L (last rule). The Apache documentation says you will almost always want [R,L].
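If the regular expression is the fuzzy part, it can help to poke at it outside of Apache. Here is a minimal Python sketch of the matching and back-reference steps described above. It is only an approximation of what mod_rewrite does: the function name redirect_target is made up for illustration, and it assumes the rule lives in an .htaccess file at the site root, where Apache strips the leading slash from the path before matching.

```python
import re

# Same pattern as the RewriteRule. In .htaccess context Apache matches against
# the path without its leading slash, so the test paths below have none either.
OLD = re.compile(r"^archives/2005/10/26/xbox-360-high-definition-faq/?(.*)")
NEW_PREFIX = "/archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/"


def redirect_target(path):
    """Return the new URL path for an old one, or None if the rule does not match.

    Hypothetical helper for illustration only; not part of Apache or WordPress.
    """
    match = OLD.match(path)
    if match is None:
        return None
    # group(1) plays the role of $1: whatever followed the old post's URL.
    return NEW_PREFIX + match.group(1)


if __name__ == "__main__":
    for path in (
        "archives/2005/10/26/xbox-360-high-definition-faq/",
        "archives/2005/10/26/xbox-360-high-definition-faq",
        "archives/2005/10/26/xbox-360-high-definition-faq/comment-page-18/",
        "archives/2005/10/26/some-other-post/",
    ):
        print(path, "->", redirect_target(path))
```

The first two paths show the optional trailing slash at work, the third shows the back-reference carrying comment-page-18/ over to the new URL, and the last one falls through untouched because the pattern does not match.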
At the end of all this meandering, the user is redirected to the new content and she typically doesn’t even have to know. What shows up in the location bar at the top of the screen changes to match the updated location. It doesn’t get much simpler.
Here’s what my access log shows for these cases.

You can see the initial request come in from Google, which gets the 301 redirect response along with the new URL. Then comes the second request for the new URL and we respond with the 200 OK response and the actual content. Huzzah!
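To watch that two-request sequence without waiting for a spider to show up, you can replay it by hand. The sketch below is a stand-in, not my actual setup: it starts a throwaway Python server on localhost that answers the old path with a 301 and the new path with a 200, then issues both requests without following redirects automatically. The handler class and the fetch and run_demo helpers are names I made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

OLD_PATH = "/archives/2005/10/26/xbox-360-high-definition-faq/"
NEW_PATH = "/archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/"


class RedirectDemoHandler(BaseHTTPRequestHandler):
    """Stand-in for the real server: 301 for the old path, 200 for the new one."""

    def do_GET(self):
        if self.path == OLD_PATH:
            self.send_response(301)
            self.send_header("Location", NEW_PATH)
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == NEW_PATH:
            body = b"the post\n"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(port, path):
    """Issue one GET without following redirects; return (status, Location)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def run_demo():
    """Replay the two-request sequence against a throwaway local server."""
    server = HTTPServer(("127.0.0.1", 0), RedirectDemoHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        first = fetch(port, OLD_PATH)    # what the spider asks for first
        second = fetch(port, first[1])   # then it follows the Location header
        return first, second
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

Against a real site, the same fetch helper pointed at your own host and port should show the 301 and its Location header for the old URL, which is a quick way to confirm the rule took effect.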
The spiders are happy. The users are happy. I’m happy. Let’s have pie.



