Permalink conversion from Movable Type to WordPress

This site used to be a Movable Type site. The permalinks generated by Movable Type — well, the static directory structure in their case — looked something like this:

http://www.coldforged.org/archives/2005/06/15/100_greatest_americans.html

Whereas the permalink structure I set up in WordPress looks like so:

http://www.coldforged.org/archives/2005/06/15/100-greatest-americans/

This shouldn’t be an enormous problem and in general it isn’t. The search engines reindexed, so I haven’t seen a referrer from the likes of Google to one of the old files for quite a while. The main problem is people who linked to those older entries, including my own self-referencing links from those days. All of those links will simply hit a 404 Not Found error and the reader will be left stumped and shaking their head at my idiocy.

Until last night. It finally occurred to me that I didn’t have to put up with that crap, not with any sort of rudimentary knowledge of PHP and .htaccess rewrites. So, I set out to redirect those old-style URLs to my new-style URLs.

Since the rest of this is even more boring and technical than the preceding gibberish and is likely only of interest to other WP users, I’ll spare the rest of you and make those of you who do care click on the little link below to get to the goods.

The Code

Hey, thanks for sticking around. Here’s the code. Copy it into a file called convertperma.php in your site root and put the rewrite rules (listed in the comment block at the top of the file) somewhere in your .htaccess file, anywhere except between the # BEGIN WordPress and # END WordPress markers.

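<?php
// Convert MT-style permalinks of the form:
//
// 	http://www.coldforged.org/archives/2005/06/15/of_media_conglomerates.html
//
// to
//
// 	http://www.coldforged.org/archives/2005/06/15/of-media-conglomerates/
//
// ---------------------------------------------------------------------------
//
// In order to make this work we also need to add some .htaccess
// rewrite rules. Add these in, making sure to set the RewriteBase
// correctly for your installation:
//
// <IfModule mod_rewrite.c>
// RewriteEngine On
// RewriteBase /
// RewriteRule ^(.*)/([0-9]+)/(.*)/([a-zA-Z0-9_]+)\.html$ /convertperma.php?req=$4 [QSA,L]
// </IfModule>

// Grab the underscore-filled slug handed over by the rewrite rule.
$request = isset($_GET['req']) ? $_GET['req'] : '';

// Only accept what the rule is meant to pass along; anything else is not ours.
if (!preg_match('/^[a-zA-Z0-9_]+$/', $request)) {
    header('HTTP/1.0 404 Not Found');
    exit();
}

// Swap underscores for dashes and cap it off with a trailing slash.
$request = str_replace('_', '-', $request) . '/';

// Relative redirect: the browser resolves it against the old URL's directory.
header("Location: $request", true, 301);
exit();
?>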

The Explanation

Some people may ask why I didn’t just do it with .htaccess rewrite rules and not bother with the PHP at all. It’s a fair question but I couldn’t figure out a way to do it that didn’t involve a lot of lines to account for the number of words separated by underscores. If anyone can simplify it down further, I’m all ears.

Let’s look at the rewrite rule first. It essentially looks for a URL that contains at least one numeric directory (e.g. /2004/), so that other HTML files in your installation won’t necessarily be converted. When I first tried it I neglected to make that distinction and got burned by my spelling checker plugin, of all places. Suddenly it stopped working, because the /spellchecker.html file that was being included during processing got converted to /spellchecker/ by the rule. Couldn’t have that, so now we do a fairly naive check by looking for at least one numeric directory.

We then look for some combination of letters, numbers, and underscores (but no slashes, so that we don’t pick up preceding directories) followed by an .html string with nothing after it. If we find that, we convert the whole thing into a new request to our PHP script and set the req query string variable to the underscore-filled part of the URL (of_media_conglomerates in this case).
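If you want to poke at the pattern without touching a live .htaccess file, something like the following rough sketch may help. It runs a similar pattern through PHP's preg_match, which is close to, but not guaranteed identical to, what mod_rewrite does. The function name extract_mt_slug is made up for illustration, and the character class here is deliberately limited to letters, digits, and underscores, which is my reading of the rule's intent rather than a copy of it. Paths are written without a leading slash because that is how a RewriteRule in .htaccess sees them.

```php
<?php
// Hypothetical helper (not part of convertperma.php): approximates the
// rewrite rule's match so it can be tried from the command line.
function extract_mt_slug(string $path): ?string
{
    // some directory / numeric directory / more directories / slug.html
    $pattern = '#^(.*)/([0-9]+)/(.*)/([a-zA-Z0-9_]+)\.html$#';
    if (preg_match($pattern, $path, $m) === 1) {
        return $m[4]; // the part the rule would hand over as req=$4
    }
    return null; // the rule would leave this request alone
}

// An old MT permalink matches and yields the underscore-filled slug.
echo extract_mt_slug('archives/2005/06/15/of_media_conglomerates.html') ?? '(no match)', "\n";

// A stray HTML file with no numeric directory is not touched.
echo extract_mt_slug('spellchecker.html') ?? '(no match)', "\n";
```

The first line printed should be of_media_conglomerates and the second (no match), which is the spelling checker case that bit me.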

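Now let’s get into the actual PHP. First, we load up that query string variable mentioned above and put it in a variable $request. We then replace those nasty underscores with dashes and cap off the string with a closing slash. Finally, we send out a Location header with this new location. Because the rewrite to convertperma.php happens inside the server, the browser still thinks it asked for the old .html URL, so it resolves the relative location against that URL’s directory and requests the new-style address. Surprisingly enough, by the time it heads back into the .htaccess file for the second time everything is where it should be.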
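For anyone who wants to see where a visitor should end up, here is a small sketch that imitates the round trip in plain PHP. The function name resolve_redirect is invented for this example, and it assumes the old URL has no query string or fragment. It only mimics how a browser typically resolves a relative Location value; it does not send any headers itself.

```php
<?php
// Hypothetical helper: predicts the URL a browser would request after
// following the relative redirect sent by the script.
function resolve_redirect(string $oldUrl): string
{
    // The slug is the file name without its .html extension.
    $slug = basename(parse_url($oldUrl, PHP_URL_PATH), '.html');

    // This is the value the script puts in the Location header.
    $location = str_replace('_', '-', $slug) . '/';

    // A relative location replaces everything after the last slash
    // of the URL that was originally requested.
    $base = substr($oldUrl, 0, strrpos($oldUrl, '/') + 1);

    return $base . $location;
}

echo resolve_redirect('http://www.coldforged.org/archives/2005/06/15/of_media_conglomerates.html'), "\n";
```

Run against the example above, this should print http://www.coldforged.org/archives/2005/06/15/of-media-conglomerates/, the same shape as the WordPress permalink.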

Go ahead, give it a shot.

Personally I think it’s magic. And faeries.


3 Responses to “Permalink conversion from Movable Type to WordPress”

  1. 1

    Matt Says:

    The problem with this is it doesn’t work with the way MT truncates long titles. Here’s how my dream plugin would work, called MT-404, that I haven’t written yet. (You want to?)

    mt-404.php is set as the 404 handler for the site. If it gets something that looks like a permalink from the new MT system, it does a database query using all the date info and a LIKE for what it has of the title and then does a 301 permanent redirect to the new URI.

    If it sees an old style number MT URL it looks for MT tables in the same database and if it finds them does a select for that ID and then uses that information to redirect to the relevant WP entry after using get_permalink as above.

    I think this would cover just about everything. It could be a plugin or a standalone file.

  2. 2

    ColdForged Says:

    matt said: The problem with this is it doesn’t work with the way MT truncates long titles.

    Ah. Truncation. Joy.

    Apparently when I set up my MT blog I did it in such a way that the titles weren’t truncated, because digging through my database looking for old-style links resulted in complete titles for even my longest ones.

    As this works for my particular need, I’m less than amped to go after the “dream plugin” thing. Perhaps someone else wants to jump on.

    it does a database query using all the date info and a LIKE for what it has of the title

    Hmm… I wonder if you couldn’t get away with a LIKE by default. That might be the missing element, though I don’t know what the possible side-effects might be.

  3. 3

    cavemonkey50 Says:

    ColdForged, I ran into that same exact problem. I have used .htaccess to semi-fix the old permalinks, but mainly I just mention on my 404 page to use the search function on the site. Not sure how many people actually use it, but it’s at least worth mentioning to them.
