Archive for the Hacks Category

Htaccess redirection for timestamp changes

Friday, November 10th, 2006

I’ve talked before about .htaccess and the hotness that is mod_rewrite for various purposes.

I ran across another one today that I hadn’t dealt with and figured it was worth discussing. I updated an old post with some new material. That in itself wasn’t overly odd. But I decided that the post was different enough and the demand still fresh enough — the damned thing accounts for 10% of the total traffic to my site, spam included — to warrant a new timestamp. So I bumped the timestamp to today. Lovely.

Almost lovely. So now the content has a different URL. I use dates in my URLs as Al Gore intended. What used to exist at

/archives/2005/10/26/xbox-360-high-definition-faq/

Now exists at

/archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/

That’s not necessarily a bad thing, just different. But guess what all the search engines have in them. Yes, the old one. If they go there now they’ll get a 404! That’s something I’ll have to address with an .htaccess redirect.

Redirecting from the old page to the new page

This is actually a fairly painless procedure in .htaccess. Let’s just take a look.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

RewriteRule ^archives/2005/10/26/xbox-360-high-definition-faq/?(.*) /archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/$1 [R=301,N]
</IfModule>

Most of this is a bit of work just to get ready to do the one line that counts. I’ve explained the other stuff previously, so let’s look at the line that does the actual work, starting with RewriteRule. Our RewriteRule does everything we need in one shot. It looks for matches to the first little clause, in this case starting at ^ and ending in ), in the URL and, if it matches, “rewrites” the URL to match the second clause. In this case, we’re looking for:

  • ^ - starts us at the beginning of the URL after the domain.
  • archives/2005/10/26/xbox-360-high-definition-faq - should be relatively self-explanatory. It’s the “old” URL that we’re looking to change.
  • /? - that looks for 0 or 1 occurrence of a / character. Just in case someone forgets to include it.
  • (.*) - grabs everything after the / and stores it off. It’s what’s termed a parenthetical expression that allows us to later create a back-reference to the text. For instance, I support paging of my comments. If someone links to a comment, the URL might look something like (I’m cutting off the beginning for word wrapping reasons) ...-high-definition-faq/comment-page-18/#comment-8789. In this case, I store comment-page-18/ in the back-reference. The #comment-8789 fragment never reaches the server; the browser holds on to it and typically reapplies it after the redirect.
  • /archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/ - this is simply appended directly to the domain when Apache rewrites the URL. So, we’re applying the new beginning of the URL we want to redirect to.
  • $1 - Remember the “back-reference” listed above? This $1 references that. It’s the first of the back-references we created — as it’s the only parenthetical expression in the matching regular expression — so we use the number 1. We can have many parenthetical expressions in our matching regular expression and hence many back-references but we just need one for this. Whatever was stored in our back-reference, like the comment-page-18 example above, will get plunked onto the back of our URL. If there was nothing in that back-reference, it doesn’t matter as we’ll just append “nothing” to the end of the URL.
  • [R=301, - this tells Apache to return a 301 redirect status code back to the browser that’s requesting the page when this rule is executed and the matching expression matches. A 301 tells the requester that the content has moved permanently, giving spiders and hence search engines the opportunity to correct their links when they spider.
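  • N] - this tells Apache to start over from the beginning of the rewrite rules, but this time use the “new” URL. Strictly speaking, the redirect itself is what gives WordPress its shot at the URL: the browser comes back with a fresh request for the new address, and that request runs through the WordPress rules as usual. The more common flag to pair with R=301 is L, which simply stops rule processing at this rule.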
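If you want to poke at the pattern and the back-reference without touching a live server, here’s a rough PHP sketch that mimics what the rule does to the path. It uses PHP’s preg_match rather than Apache itself, so treat it as an approximation only. The function name rewrite_old_url is made up for illustration.

```php
<?php
// Hypothetical helper (not part of Apache or WordPress): mimics the
// RewriteRule above using PHP's PCRE functions.
// In .htaccess context Apache matches against the path with the leading
// slash removed, so the test paths below have no leading slash either.
function rewrite_old_url(string $path): ?string
{
    $pattern = '#^archives/2005/10/26/xbox-360-high-definition-faq/?(.*)#';
    $target  = '/archives/2006/11/10/xbox-360-high-definition-and-hd-dvd-faq/';

    if (preg_match($pattern, $path, $m) !== 1) {
        return null; // no match: the rule would simply be skipped
    }
    return $target . $m[1]; // $m[1] plays the role of $1
}

// Old URL with a comment page on the end: the tail is carried over.
echo rewrite_old_url('archives/2005/10/26/xbox-360-high-definition-faq/comment-page-18/'), "\n";
// Old URL without the trailing slash: nothing is captured, nothing is appended.
echo rewrite_old_url('archives/2005/10/26/xbox-360-high-definition-faq'), "\n";
// Unrelated URL: no match.
var_dump(rewrite_old_url('archives/2006/01/01/something-else/'));
```

The first call shows the captured tail being plunked onto the new URL, the second shows the “nothing” case, and the third shows a path the rule leaves alone.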

At the end of all this meandering, the user is redirected to the new content and she typically doesn’t even have to know. What shows up in the location bar at the top of the screen changes to match the updated location. It doesn’t get much simpler.

Here’s what my access log shows for these cases.

You can see the initial request come in from Google, which gets the 301 redirect response along with the new URL. Then comes the second request for the new URL and we respond with the 200 OK response and the actual content. Huzzah!

The spiders are happy. The users are happy. I’m happy. Let’s have pie.

Permalink conversion from Movable Type to WordPress

Friday, June 17th, 2005

This site used to be a Movable Type site. The permalinks generated by Movable Type — well, the static directory structure in their case — looked something like this:

http://www.coldforged.org/archives/2005/06/15/100_greatest_americans.html

Whereas the permalink structure I set up in WordPress looks like so:

http://www.coldforged.org/archives/2005/06/15/100-greatest-americans/

This shouldn’t be an enormous problem and in general it isn’t. The search engines reindexed so I haven’t seen a referrer from the likes of Google to one of the old files for quite a while. The main problem is people that linked to those older entries, including my own self-referencing links from those days. All of those kinds of links will simply hit a 404 Not Found error and the reader will be left stumped and shaking their head at my idiocy.

Until last night. It finally occurred to me that I didn’t have to put up with that crap, not with any sort of rudimentary knowledge of PHP and .htaccess rewrites. So, I set out to redirect those old-style URLs to my new-style URLs.
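To give a feel for the shape of the problem, here is a minimal PHP sketch of the path conversion. It is based only on the one example pair shown above (drop the .html, swap underscores for hyphens, add a trailing slash), so it is an assumption about the mapping, not the actual solution from the full entry. The function name mt_to_wp_path is hypothetical.

```php
<?php
// Hypothetical helper, inferred from the single example pair above.
// Real Movable Type file names may not all map this cleanly.
function mt_to_wp_path(string $path): ?string
{
    // Expect /archives/YYYY/MM/DD/some_name.html
    if (preg_match('#^/archives/(\d{4}/\d{2}/\d{2})/([^/]+)\.html$#', $path, $m) !== 1) {
        return null; // not an old-style entry URL
    }
    $slug = str_replace('_', '-', $m[2]); // underscores become hyphens
    return "/archives/{$m[1]}/{$slug}/";
}

echo mt_to_wp_path('/archives/2005/06/15/100_greatest_americans.html'), "\n";
```

In a real setup the result would still have to be sent back to the browser as a 301, for example with PHP’s header() function, and any post whose WordPress slug doesn’t follow this simple pattern would need its own handling.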

Since the rest of this is even more boring and technical than the preceding gibberish and is likely only of interest to other WP users, I’ll spare the rest of you and make those of you who do care click on the little link below to get to the goods.

Read the rest of this entry »

RSS 2.0 pubDate and Feed On Feeds

Tuesday, June 7th, 2005

UPDATED 6/10/05: I modified the code below to account for a weakness of the initial method.

One little niggling thing with FEED ON FEEDS that’s bothered me is its lack of handling of RSS 2.0 pubDates. So instead of having everything nicely sorted by the actual publication date, you get a mish-mash of appropriately ordered RSS 1.0 feeds — since FEED ON FEEDS does appropriately handle the RSS 1.0 dc:date specification — and inappropriately ordered RSS 2.0 and Atom feeds.

Luckily there’s a very easy workaround. Open the init.php file in your FEED ON FEEDS installation. Go to line number 593 and change the line from this:

$dcdate = mysql_escape_string($item['dc']['date']);

to this:

if( !empty( $item['dc']['date'] ) )
    $dcdate = mysql_escape_string($item['dc']['date']);
else
    $dcdate = date( "Y-m-d\TH:i:sO", $item['date_timestamp'] );

From now on the rest of your posts will be ordered correctly. Minor little detail, yes. But I’m picky that way.
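Here’s a small standalone sketch of the same fallback, in case you want to see what the generated date string looks like before editing init.php. The function name item_dcdate and the sample items are made up for illustration, and the database escaping is left out since it isn’t the point here.

```php
<?php
// Hypothetical stand-in for the patched line: prefer dc:date when the feed
// supplies one, otherwise format the parsed timestamp the way the fix does.
function item_dcdate(array $item): string
{
    if (!empty($item['dc']['date'])) {
        return $item['dc']['date'];
    }
    // \T is a literal "T"; O is the UTC offset without a colon, e.g. +0000.
    return date("Y-m-d\TH:i:sO", $item['date_timestamp']);
}

date_default_timezone_set('UTC'); // fixed zone so the output is predictable

// An RSS 1.0 style item that already carries dc:date.
echo item_dcdate(['dc' => ['date' => '2005-06-07T12:00:00+00:00']]), "\n";
// An RSS 2.0 style item with only a parsed timestamp (noon UTC, June 7th, 2005).
echo item_dcdate(['date_timestamp' => 1118145600]), "\n";
```

The second call prints 2005-06-07T12:00:00+0000, which sorts alongside the dc:date values because both start with the year, month and day in the same order.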

Paged comment display

Friday, May 13th, 2005

I’ve gone ahead and implemented paging for my comment displays here on cf.org since I have several posts where there are a couple hundred comments. For those of you who haven’t seen it, this functionality is available in plugin form these days. It’s a nice plugin and provides the functionality as well as can be expected, but were I to do it from scratch I’d likely go about it slightly differently so that the theme’s comment template was leveraged somehow. As it stands I had to make considerable modifications in order to get my rather customized comment functionality worked in. But it’s there.

I think the ability to view all of the comments at once is useful for doing textual searches, so I’ve modified the original plugin to allow viewing of the complete list of comments if desired. I’ve ordered the views kind of differently as well depending on what’s being displayed. If you’re paging through the comments, you’ll see the most recent comments first, starting from the newest comment at the top of the first page to the oldest comment at the bottom of the last page. I figure if you’re paging you’re likely interested in the most recent content first. However, if you’re viewing all the comments on one page — or if the post has 10 comments or fewer such that it’s implicitly one page — the comments display from oldest to newest. We’ll see how this works out. I think it’s not overly confusing given the displayed comment index… you know that comment 99 is older than comment 100 regardless of any relative display ordering.
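The ordering rule boils down to something like the following PHP sketch. This is not the plugin’s code, just an illustration of the logic described above; the function name comments_for_display is hypothetical, and it assumes the comments arrive ordered oldest to newest.

```php
<?php
// Hypothetical helper: $comments is assumed to be ordered oldest to newest.
// $page is 1-based; pass null to ask for the complete list.
function comments_for_display(array $comments, ?int $page, int $perPage = 10): array
{
    // Everything on one page (by request, or because it all fits): oldest first.
    if ($page === null || count($comments) <= $perPage) {
        return $comments;
    }
    // Paging: newest first, so page 1 starts with the latest comment.
    $newestFirst = array_reverse($comments);
    return array_slice($newestFirst, ($page - 1) * $perPage, $perPage);
}

$comments = range(1, 25); // stand-ins for comment numbers 1 through 25

print_r(comments_for_display($comments, 1));    // 25 down to 16
print_r(comments_for_display($comments, 3));    // 5 down to 1
print_r(comments_for_display($comments, null)); // 1 up to 25
```

Because each comment keeps its own index, the reader can still tell that comment 16 is older than comment 25 even when the page lists them newest first.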

Have a look and tell me what you think.

Oh how I love thee, Lilina

Friday, April 29th, 2005

Lilina is (yet another) news aggregator. The thing I like about this one is that it’s a PHP-based solution that sits in your hosting space. So, now you don’t have to be hostage to some third-party website that may or may not be down and may or may not serve you ads with your RSS. It doesn’t hurt that it’s relatively attractive and simple.

One thing WordPress users might ask for is an easy “blog it” link. I aim to please. Put this in your Lilina installation’s index.php file right below the line that has “add to furl” in it. Note that you should fill in your WordPress installation URL where appropriate.

$out .= " &nbsp; <a href=\"javascript:void(window.open('http://<MY WORDPRESS DOMAIN>/wp-admin/bookmarklet-advanced.php?&popupurl=".addslashes($href)."&popuptitle=".addslashes($title)."','WordPress bookmarklet','scrollbars=yes,width=600,height=560,left=100,top=150,status=yes'))\">Blog it</a>";

Yes, it’s a long line.