Converting from Drupal to WordPress

Well after looking at this site I modified the existing Drupal database to remove all references to the img_assist module. What a pain in the arse that module is, there is no way to easily migrate away from it so I wrote a Perl script to change the tags to <img> tags instead.

use DBI;
my $databasename = “drupal”;
my $servername = “192.168.1.1”;
my $port = “3306”;
my $dsn = “dbi:mysql:$databasename:$servername:$port”;

# set the user and password
my $user = ‘xxxxx’;
my $pass = ‘xxxxx’;

# now connect and get a database handle
my $dbh = DBI->connect($dsn, $user, $pass)
or die “Can’t connect to the DB: $DBI::errstr\n”;

my $query1=”Select * from node_revisions LEFT JOIN node ON (node_revisions.nid=node.nid) WHERE node.type=’blog'”;
my $action = $dbh->prepare($query1);
$action->execute();

while (@row = $action->fetchrow()) {
$body=$row[4];
while ($body=~m/\[img_assist\|nid=(\d+)\|.*\]/){
$vid=$row[1];
$width=450;
my $query2=”Select filepath FROM files LEFT JOIN image ON (files.fid=image.fid) WHERE nid=$1 AND image_size=’_original'”;
my $action2= $dbh->prepare($query2);
$action2->execute();
@row2 = $action2->fetchrow();
$body=~s/\[img_assist\|nid=(\d+)\|.*\]/<p style=”text-align: center;”><img src=”$row2[0]” width=”$width”><\/p>/;

($escapequotes=$body)=~s/\’/\’\’/g;
my $query3=”UPDATE node_revisions SET body=’$escapequotes’,teaser=’$escapequotes’ WHERE vid=$vid”;
print “$vid\n”;
my $action3 = $dbh->prepare($query3);
$action3->execute();
}
while ($body=~m/\<img.*src\=\”\?q\=image\/view\/(\d+).*\>/){

$vid=$row[1];
$nid=$1;
$width=450;
my $query2=”Select filepath FROM files LEFT JOIN image ON (files.fid=image.fid) WHERE nid=$nid AND image_size=’_original'”;
my $action2= $dbh->prepare($query2);
$action2->execute();
@row2 = $action2->fetchrow();
$body=~s/\<img.*src\=\”\?q\=image\/view\/$nid.*\>/<img src=”$row2[0]” width=”450″>/;

($escapequotes=$body)=~s/\’/\’\’/g;
my $query3=”UPDATE node_revisions SET body=’$escapequotes’,teaser=’$escapequotes’ WHERE vid=$vid”;
print “$vid\n”;
my $action3 = $dbh->prepare($query3);
$action3->execute();
}
}
print “Finished\n”;

It’s a bit ugly but works, I had to go through twice because some of the tags were using older versions of img_assist, again what a dumb idea. If only the entries had been migrated as the module was updated. Anyway that was done. Then I modified the import script linked at the start of this post to this:

# This set of instructions was updated by 9seeds.com based on
# a post by Mike Smullin at mikesmullin.com
#
# Mike’s original post can be found here:
# http://www.mikesmullin.com/development/migrate-convert-import-drupal-5-to-wordpress-27/
#
# 9seeds’ updated post can be found here:
# http://9seeds.com/news/drupal-to-wordpress-migration

# Clear all existing WordPress content
TRUNCATE TABLE wordpress.wp_comments;
TRUNCATE TABLE wordpress.wp_links;
TRUNCATE TABLE wordpress.wp_postmeta;
TRUNCATE TABLE wordpress.wp_posts;
TRUNCATE TABLE wordpress.wp_term_relationships;
TRUNCATE TABLE wordpress.wp_term_taxonomy;
TRUNCATE TABLE wordpress.wp_terms;

# Create Categories
INSERT INTO wordpress.wp_terms (term_id, `name`, slug, term_group)
SELECT
d.tid, d.name, REPLACE(LOWER(d.name), ‘ ‘, ‘_’), 0
FROM drupal.term_data d
INNER JOIN drupal.term_hierarchy h
USING(tid);

# Add Taxonomies
INSERT INTO wordpress.wp_term_taxonomy (term_id, taxonomy, description, parent)
SELECT
d.tid `term_id`,
‘category’ `taxonomy`,
d.description `description`,
h.parent `parent`
FROM drupal.term_data d
INNER JOIN drupal.term_hierarchy h
USING(tid);

# Import posts/pages
INSERT INTO wordpress.wp_posts (id, post_date, post_content, post_title, post_excerpt, post_name, post_modified, post_type, `post_status`)
SELECT DISTINCT
n.nid `id`,
FROM_UNIXTIME(n.created) `post_date`,
r.body `post_content`,
n.title `post_title`,
r.teaser `post_excerpt`,
IF(SUBSTR(a.dst, 11, 1) = ‘/’, SUBSTR(a.dst, 12), a.dst) `post_name`,
FROM_UNIXTIME(n.changed) `post_modified`,
n.type `post_type`,
IF(n.status = 1, ‘publish’, ‘private’) `post_status`
FROM drupal.node n
INNER JOIN drupal.node_revisions r
USING(vid)
LEFT OUTER JOIN drupal.url_alias a
ON a.src = CONCAT(‘node/’, n.nid)
WHERE n.type IN (‘blog’);

# Turn articles in to posts
update wordpress.wp_posts set post_type=’post’ where post_type=’blog’;

# Add post to category relationships
INSERT INTO wordpress.wp_term_relationships (object_id, term_taxonomy_id)
SELECT nid, tid FROM drupal.term_node;

# Update category count
UPDATE wordpress.wp_term_taxonomy tt
SET `count` = (
SELECT COUNT(tr.object_id)
FROM wordpress.wp_term_relationships tr
WHERE tr.term_taxonomy_id = tt.term_taxonomy_id);

# Import comments
INSERT INTO wordpress.wp_comments (comment_post_ID, comment_date, comment_content, comment_parent, comment_author, comment_author_email, comment_author_url, comment_approved)
SELECT nid, FROM_UNIXTIME(timestamp), comment, thread, name, mail, homepage, status FROM drupal.comments;

# Update comment counts
use wordpress;
UPDATE `wp_posts` SET `comment_count` = (SELECT COUNT(`comment_post_id`) FROM `wp_comments` WHERE `wp_posts`.`id` = `wp_comments`.`comment_post_id`);

# Fix breaks in post content
UPDATE wordpress.wp_posts SET post_content = REPLACE(post_content, ”, ”);

# fix images in post content
UPDATE wordpress.wp_posts SET post_content = REPLACE(post_content, ‘”files/’, ‘”wp-content/uploads/’);
UPDATE wordpress.wp_posts SET post_content = REPLACE(post_content, ‘”/files’, ‘”wp-content/uploads/’);
UPDATE wordpress.wp_posts SET post_content = REPLACE(post_content, ‘”images/’, ‘”wp-content/uploads/images/’);

# fix images in post_excerpt  content
UPDATE wordpress.wp_posts SET post_excerpt = REPLACE(post_excerpt, ‘”files/’, ‘”wp-content/uploads/’);
UPDATE wordpress.wp_posts SET post_excerpt = REPLACE(post_excerpt, ‘”/files’, ‘”wp-content/uploads/’);
UPDATE wordpress.wp_posts SET post_excerpt = REPLACE(post_excerpt, ‘”images/’, ‘”wp-content/uploads/images/’);

# fix links
UPDATE wordpress.wp_posts SET post_content = REPLACE(post_content, ‘href=”?q=node/’, ‘href=”?p=’);
UPDATE wordpress.wp_posts SET post_excerpt = REPLACE(post_excerpt, ‘href=”?q=node/’, ‘href=”?p=’);

#remove dud characters
UPDATE wordpress.wp_posts SET post_content = REPLACE(post_content, ‘Â’, ”);
UPDATE wordpress.wp_posts SET post_excerpt = REPLACE(post_excerpt, ‘Â’, ”);

I added some extra replacements which fixed any links to old nodes and I also had to include post type ‘blog’ so that all the entries got moved over. I don’t care about pages or stories.

This is by no means a completed product, it works just enough for me to have a WordPress blog running without too much effort. The migration went better on littlefishcreations.com this blog still has almost all the images failing. I might get around to fixing it one day.

This entry was posted in Computers and tagged , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *