Importing posts to Blogger

I said when I started this blog that I was going to use it to merge all my other blogs into one.

Well, I Googled a bit and searched the Blogger help, but there's very little information on actually doing this...other than retyping it all manually. I finally found someone referencing an ‘idea’ to use the Blogger email facility. That is, in your settings, under Email, you can set an email address that you send to, and whatever you send gets automatically blogged. You can even tell it not to publish until you go in and publish it yourself (i.e. it saves the post as a draft). Couple this with the fact that you can alter the date and timestamp on posts in Blogger (edit your post, click Post Options at the bottom of the post, and set whatever you want), and importing becomes possible.

Basically, we just need a way to auto-magically convert a bunch of posts to a standard email format so Blogger can read them and blog them properly. This is text manipulation. This is Perl work!

My really old “news” page (before blogs existed) was basically a slightly HTML-ised text file. I did this for portability and it finally paid off. If your blog posts are hosted somewhere or in another format, you basically want to aggregate them into one file so you can process that one file. I have multiple blog pages, so I’ll do them separately, with each blog file containing a hundred posts or so.

There are a couple of things to bear in mind. Blogger will date and timestamp each post using the time that it receives the email, so you’ll still need to change that manually. To make this easy for myself, I used a regex to pull the date of each post out of the file and used that date as the subject of the email, which becomes the post title. That way I know what I’m looking at when there are 100 draft posts in my Blogger!
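
The splitting and date-as-subject idea can be sketched roughly as follows. This is Python for illustration, not the Perl script described in this post. The file layout (each post starting with a line that holds only a YYYY-MM-DD date), the function names and both email addresses are assumptions made up for the example, so adapt them to your own file and to the address from your Blogger Email settings.

```python
import re
from email.message import EmailMessage

# Assumed layout: every post begins with a line containing only a date.
DATE_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2})[ \t]*$", re.MULTILINE)


def split_posts(text):
    """Return (date, body) pairs from one aggregated file of posts."""
    # Splitting on a pattern with one capture group yields
    # [preamble, date1, body1, date2, body2, ...].
    parts = DATE_LINE.split(text)
    return [(parts[i], parts[i + 1].strip())
            for i in range(1, len(parts) - 1, 2)]


def to_email(date, body, sender, blogger_address):
    """Build one email whose subject is the post's original date."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = blogger_address
    msg["Subject"] = date  # shows up as the draft's title in Blogger
    msg.set_content(body)
    return msg


# Hypothetical sample data and placeholder addresses.
sample = "2004-03-15\nFirst post.\n\n2004-03-20\nSecond post, with a link.\n"
posts = split_posts(sample)
messages = [to_email(d, b, "me@example.com", "posting-address@example.com")
            for d, b in posts]
```

Each message could then be handed to something like `smtplib.SMTP(...).send_message(msg)`, with the mail server details being yours to fill in. Building the messages separately from sending them makes it easy to inspect a few before flooding your drafts.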

Also, Blogger escapes any HTML in the email. I used a regex to strip a lot of the HTML out, but Blogger still broke some of the URLs. Oh well, they are four-year-old posts!
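
For anyone attempting the same clean-up, here is a minimal sketch of regex-based HTML stripping, again in Python for illustration rather than the Perl I used. Rewriting links as "label (url)" is an assumption of this sketch, one possible way to keep the URLs readable once the tags are gone. Regexes like these only cope with simple, hand-written markup.

```python
import html
import re

# Matches simple links with a double-quoted href attribute.
LINK = re.compile(r'<a\s+[^>]*href="([^"]*)"[^>]*>(.*?)</a>',
                  re.IGNORECASE | re.DOTALL)
BREAK = re.compile(r"<br\s*/?>", re.IGNORECASE)
TAG = re.compile(r"<[^>]+>")


def strip_html(text):
    """Reduce lightly HTML-ised text to plain text for emailing."""
    text = LINK.sub(r"\2 (\1)", text)  # keep the URL next to its label
    text = BREAK.sub("\n", text)       # line breaks become real newlines
    text = TAG.sub("", text)           # drop every remaining tag
    return html.unescape(text)         # turn &amp; and friends back into characters


plain = strip_html('See <a href="http://example.com/x">this page</a>.<br>Bye &amp; thanks')
```

The order matters here: links and line breaks are converted first, because the catch-all tag pattern would otherwise delete them along with everything else.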

I called the Perl script I used news2blogger.pl, and it’s in the resources section (not that the resources section is operational at the moment!). It was customized for my own file, so treat it as a proof of concept. If you want to use it, make sure to examine the way I’m parsing the file with regexes, and alter the email settings to match your own.
