Archive for the Blogging Category

Htaccess redirection for timestamp changes

Friday, November 10th, 2006

I’ve talked before about .htaccess and the hotness that is mod_rewrite for various purposes.

I ran across another one today that I hadn’t dealt with and figured it was worth discussing. I updated an old post with some new material. That in itself wasn’t overly odd. But I decided that the post was different enough and the demand still fresh enough — the damned thing accounts for 10% of the total traffic to my site, spam included — to warrant a new timestamp. So I bumped the timestamp to today. Lovely.

Almost lovely. So now the content has a different URL. I use dates in my URLs as Al Gore intended. What used to exist at

/archives/2005/10/26/xbox-360-high-definition-faq/

Now exists at

/archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/

That’s not necessarily a bad thing, just different. But guess what all the search engines have in them. Yes, the old one. If they go there now they’ll get a 404! That’s something I’ll have to address with an .htaccess redirect.

Redirecting from the old page to the new page

This is actually a fairly painless procedure in .htaccess. Let’s just take a look.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

RewriteRule ^archives/2005/10/26/xbox-360-high-definition-faq/?(.*) /archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/$1 [R=301,N]
</IfModule>

Most of this is a bit of work just to get ready to do the one line that counts. I’ve explained the other stuff previously, so let’s look at the line that does the actual work, starting with RewriteRule. Our RewriteRule does everything we need in one shot. It tries to match the first little clause, in this case starting at ^ and ending in ), against the URL and, if it matches, “rewrites” the URL to the second clause. The third clause, in square brackets, holds flags that tell Apache what to do with the result. Piece by piece, we have:

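  • ^ - starts us at the beginning of the URL path after the domain. In an .htaccess file, Apache has already stripped the leading / by the time the rule sees the path, which is why the pattern begins with archives rather than /archives.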
  • archives/2005/10/26/xbox-360-high-definition-faq - should be relatively self-explanatory. It’s the “old” URL that we’re looking to change.
  • /? - that looks for 0 or 1 occurrence of a / character. Just in case someone forgets to include it.
  • (.*) - grabs everything after the / and stores it off. It’s what’s termed a parenthetical expression (a capture group) that allows us to later create a back-reference to the text. For instance, I support paging of my comments. If someone links to a comment, the URL might look something like (I’m cutting off the beginning for word wrapping reasons) ...-high-definition-faq/comment-page-18/#comment-8789. In this case, I store comment-page-18/ in the back-reference. The #comment-8789 part is a fragment, which the browser never sends to the server, so Apache never sees it; the browser holds onto it and typically reapplies it after the redirect.
  • /archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/ - this is simply appended directly to the domain when Apache rewrites the URL. So, we’re applying the new beginning of the URL we want to redirect to.
  • $1 - Remember the “back-reference” listed above? This $1 references that. It’s the first of the back-references we created — as it’s the only parenthetical expression in the matching regular expression — so we use the number 1. We can have many parenthetical expressions in our matching regular expression and hence many back-references but we just need one for this. Whatever was stored in our back-reference, like the comment-page-18 example above, will get plunked onto the back of our URL. If there was nothing in that back-reference, it doesn’t matter as we’ll just append “nothing” to the end of the URL.
  • [R=301, - this tells Apache to return a 301 redirect status code back to the browser that’s requesting the page when this rule is executed and the matching expression matches. A 301 tells the requester that the content has moved permanently, giving spiders and hence search engines the opportunity to correct their links when they spider.
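  • N] - this tells Apache to start over from the beginning of the rewrite rules, but this time use the “new” URL. With an external redirect like this one, though, it’s really the browser’s follow-up request for the new URL that gives WordPress its shot at constructing the things it needs to construct. The more conventional flag to pair with R=301 is L, which simply stops processing further rules for this request.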
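
If you want to see what the pattern and the back-reference actually do without touching a live server, here’s a rough sketch in Python. It only mimics the matching and the $1 substitution; it is not how Apache implements anything. The function name redirect_target is made up for the illustration, and the test paths leave off the leading slash on the assumption that the rule lives in an .htaccess file.

```python
import re

# Same regular expression as the RewriteRule pattern. The leading slash is
# omitted from the paths below because Apache strips it in .htaccess context.
PATTERN = re.compile(r"^archives/2005/10/26/xbox-360-high-definition-faq/?(.*)")

# The substitution part of the rule, without the trailing $1.
TARGET = "/archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/"


def redirect_target(path):
    """Return the new URL path for an old one, or None if the rule does not match.

    Hypothetical helper for illustration only.
    """
    match = PATTERN.match(path)
    if match is None:
        return None
    # match.group(1) plays the role of $1 in the RewriteRule.
    return TARGET + match.group(1)


for old in (
    "archives/2005/10/26/xbox-360-high-definition-faq/",
    "archives/2005/10/26/xbox-360-high-definition-faq",
    "archives/2005/10/26/xbox-360-high-definition-faq/comment-page-18/",
    "archives/2006/01/01/some-other-post/",
):
    print(old, "->", redirect_target(old))
```

The first three paths all land on the new post, with anything after the old slug carried along. The last one doesn’t match, so the rule leaves it alone.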

At the end of all this meandering, the user is redirected to the new content and she typically doesn’t even have to know. What shows up in the location bar at the top of the screen changes to match the updated location. It doesn’t get much simpler.

Here’s what my access log shows for these cases.

You can see the initial request come in from Google, which gets the 301 redirect response along with the new URL. Then comes the second request for the new URL and we respond with the 200 OK response and the actual content. Huzzah!
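
The same two-step exchange can be reproduced locally. What follows is a toy sketch, not my server setup: a throwaway Python web server stands in for the blog, answering 301 for the old path and 200 for the new one, and a client that deliberately does not follow redirects records each hop. The Handler class and the fetch and trace helpers are invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

OLD = "/archives/2005/10/26/xbox-360-high-definition-faq/"
NEW = "/archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/"


class Handler(BaseHTTPRequestHandler):
    """Toy stand-in for the blog: 301 for the old path, 200 for the new one."""

    def do_GET(self):
        if self.path.startswith(OLD):
            # Carry over whatever followed the old slug, like $1 does.
            self.send_response(301)
            self.send_header("Location", NEW + self.path[len(OLD):])
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path.startswith(NEW):
            body = b"the actual content"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(port, path):
    """Make one GET request without following redirects."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read()
    result = (resp.status, resp.getheader("Location"))
    conn.close()
    return result


def trace(path):
    """Return the (path, status) hops a browser or spider would go through."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    hops = []
    try:
        while path is not None and len(hops) < 5:
            status, location = fetch(server.server_address[1], path)
            hops.append((path, status))
            path = location if status in (301, 302) else None
    finally:
        server.shutdown()
        server.server_close()
    return hops


for hop_path, status in trace(OLD + "comment-page-18/"):
    print(status, hop_path)
```

Each printed line is one request: first the old URL with its 301, then the new URL with its 200.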

The spiders are happy. The users are happy. I’m happy. Let’s have pie.

Screw DreamHost

Friday, May 5th, 2006

Bye DreamHost. One too many “you’re using too many server resources, how about investigating before we shut off your account?” emails. Now I’m with A Small Orange, which supposedly is much better. We’ll see, but so far I’m impressed with their support. I asked for SSH access and got a response within about 5 minutes.

I’m sure there will be more later. At least now I have a place to put it!

Screw LightPress

Tuesday, April 11th, 2006

Screw LightPress. I could never get it looking the way I wanted (well, to be fair, I didn’t have the patience to get it to look the way I wanted), it didn’t work right with Markdown, and it didn’t fix my wonky DreamHost idiocy. So, I’m back to my friggin’ design. If DreamHost sends me a nastygram, I’ll move. We’ll see how that goes, eh?

Moving to LightPress

Monday, March 13th, 2006

I’ve moved this thing over to the LightPress front end. Why? I’ve mentioned before that I’ve gotten nastygrams from DreamHost for taking too many resources. I’ve tried many, many things: adjusting the content of the sidebar to no longer display “most popular” and other types of things, dropping my Image Headlines plugin, enabling caching, and using the Preformatted plugin. I never got the usage under control. I’m almost positive it’s a DreamHost inefficiency, but they won’t do anything about it, preferring instead to send me nastygrams and encourage me to upgrade to their dedicated hosting for a hundred bucks a month. Sure.

I average 6,000 hits a day. Does that seem excessive to you? Does that seem like enough traffic to choke a server? Me neither. So I installed LightPress, which is a supposedly more lightweight front end to WordPress. That forced me to downgrade to WordPress 1.5, but oh well.

So, what does this mean to you, dear reader? A couple of things.

  • Some things don’t work right now. Some people can’t find posts, the archives are for poop. Someday I’ll get around to them.
  • Commenting has been turned off. I had it on; LightPress supports commenting. Hell, they even provide an “anti-spam” plugin. Doesn’t appear to do much. So, I spent my time receiving emails from my installation with the 40 latest spam comments and I couldn’t be bothered anymore. When the signal-to-noise ratio drops below the audibility threshold, the terrorists win. Congrats spammers! You win!

If I sound bitter or uncaring, I am right now. I have a(nother) sinus infection that’s absolutely killing me. My daughter had one of the worst “colds” we’ve experienced with her for the past 8 days including a 2:00 am trip to the ER for a nasty croup. My wife is slowly coming down with something as well. And I have a work deadline hanging over my head like a guillotine so I can’t just go home and sleep all day. So, dealing with the latest phentermine spam? Off the chart of “sick and tired of screwing with.”

I know this is ugly

Thursday, February 2nd, 2006

Please bear with me, I’m trying to upgrade to WordPress 2.0 like all the cool kids, but I don’t have a lot of time to make it pretty. Or even to make it work correctly. So sorry.