As folks know, I've been working on transitioning to Disqus over the past week. I ran into multiple problems, and I made multiple mistakes, but today the process completed and I'm ready to share details about my BlogCFC export script as well as some tips for others who may be considering making the jump.
I had two main issues when I did my import. The first was that some comments on my earliest blog entry didn't show up. This issue went away. I'm not sure why it did - but it was a minor issue compared to the second issue so I'm not concerned about it.
The second issue was the big one. When I did my import I discovered that my comments were not in the right order. This was because I screwed up my call to ColdFusion's timeFormat function. Yes - timeFormat. I've been using ColdFusion for about fifteen years and I made a rookie mistake there. (Actually I screwed it up twice which is even worse - but let's just pretend that I didn't.) This is where I ran into the problem with Disqus. When I reran my import, the changes were not reflected. I even went through the step of deleting all 6000+ of my previously imported comments, 25 at a time, to remove them and do my import. That didn't help either.
Turns out there is no way to do replacements in Disqus. Period. They recommend you test your imports on a dev forum, but even that wouldn't be helpful if you screwed up. You would need to make multiple testing forums which is probably not desirable. Luckily the fix was easy enough. Disqus looks at the <wp:comment_id> tag value in your imported XML to determine the uniqueness of a comment. I was using the UUID from my database table. To get around the issue, I literally just prefixed "m1_" in front of the ID. (Why m1? I assumed I was going to screw up again.) I should note that some folks on Twitter also suggested this but I held off trying it until I got confirmation from Disqus that this would work.
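For anyone who wants to see the ID trick spelled out, here is a minimal sketch in Python. It is not the ColdFusion from my attached script, just an illustration of the idea. The function names and the `run_prefix` parameter are hypothetical. The only thing taken from the above is that the `<wp:comment_id>` value gets a prefix like "m1_" so Disqus treats the comment as new.

```python
from xml.sax.saxutils import escape


def prefixed_comment_id(db_id, run_prefix="m1_"):
    """Return the comment's database ID with a per-import prefix in front.

    Changing run_prefix (m1_, m2_, ...) on each re-import gives every
    comment a value Disqus has not seen before. (Hypothetical helper.)
    """
    return f"{run_prefix}{db_id}"


def comment_id_tag(db_id, run_prefix="m1_"):
    """Render the <wp:comment_id> element for one comment as a string."""
    value = escape(prefixed_comment_id(db_id, run_prefix))
    return f"<wp:comment_id>{value}</wp:comment_id>"


# Example with a made-up UUID: same comment, two different import runs.
print(comment_id_tag("3F2504E0-4F89-11D3-9A0C-0305E82C3301"))
print(comment_id_tag("3F2504E0-4F89-11D3-9A0C-0305E82C3301", run_prefix="m2_"))
```

Bumping the prefix on every retry is the whole point: the comment text is identical, but the ID Disqus uses for uniqueness is not.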
So... for the most part, that was the end of it. I ran my script about 4 times, generating "pages" of data over my BlogCFC entry list. Disqus recommends keeping each XML file under 50 megs. From what I could see I would have been a bit over that if I had done them all at once, but if your comment count is lower than mine you can probably generate a complete export in one go with my code. Another thing to watch out for is errors. I had about 20 comments in my database that had no body text at all. I don't know why. Disqus considered these errors (rightly so), and reported the import as an error... but only while it was processing. Here is a screen shot of what I'm talking about:
Do you see how it says it only imported 900 or so? And see the error? This worried me but then I realized that it was still processing. The status seemed to imply a finished state but it was actually still digging through stuff. As I reloaded the number went higher and higher. (For folks curious, it took maybe 5 minutes to import 20K+ comments.)
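Going back to the paging and the blank comments for a second, here is a rough Python sketch of both ideas. Again, this is not my ColdFusion script. The data layout (a list of entries, each with a list of comments that have a `body`) and the names `export_page` and `fits_in_one_file` are assumptions made up for the example, and I'm assuming "50 megs" means 50 * 1024 * 1024 bytes.

```python
MAX_BYTES = 50 * 1024 * 1024  # assumed reading of the "50 megs" recommendation


def export_page(entries, start, limit):
    """Return one "page" of entries (0-based start), dropping blank comments.

    Comments whose body is empty or whitespace-only are skipped, since
    Disqus reports those as errors during import.
    """
    page = entries[start:start + limit]
    cleaned = []
    for entry in page:
        kept = [c for c in entry["comments"] if (c.get("body") or "").strip()]
        cleaned.append({**entry, "comments": kept})
    return cleaned


def fits_in_one_file(xml_text):
    """Check whether generated XML stays under the recommended size."""
    return len(xml_text.encode("utf-8")) < MAX_BYTES


# Made-up sample data: three entries, two of the comments are blank.
entries = [
    {"title": "First post", "comments": [{"body": "Nice!"}, {"body": "   "}]},
    {"title": "Second post", "comments": [{"body": ""}]},
    {"title": "Third post", "comments": [{"body": "Thanks Ray."}]},
]

# Two entries per page; a huge limit would grab everything in one run.
for start in range(0, len(entries), 2):
    for entry in export_page(entries, start, 2):
        print(entry["title"], len(entry["comments"]))
```

With a small comment count you would just pass a very large limit and check the resulting file size once.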
If that UI in the screen shot doesn't match what you see in the Disqus import screen, that's because there are apparently two different places where you can check imports. I was shown this url: http://import.disqus.com/group/FORUMNAME. This site seemed to provide slightly clearer reports so you may want to check it if you do a big import.
I want to give huge thanks to Matt Robenolt of Disqus. As I said, I had trouble with the "main" Disqus support. They were somewhat slow. I found Matt via contacts on Twitter and he dug deep into the issue. He agreed that there probably needs to be a way to force a reimport so hopefully that will come in the future.
For those of you on ColdFusion and running BlogCFC, I've attached my script. It was written for ColdFusion 11 but you can backport it easily enough to earlier versions. If you use it and it works for you, please let me know in the comments below.
Archived Comments
Woot! Grats Ray.
I will run this against a dev forum just for fun. The enclosure download is throwing a 404.
Looks to be a security issue. I renamed it and it works now. Thanks.
By the way - for folks who download this script - I should have added a quick mod to the comment query to ignore comments that had no body. Again, they shouldn't exist in BlogCFC *anyway*, but somehow I had about... maybe 20 of them or so.
Is the start and limit because of their limitations? Did you have to upload multiple files? I only have 2,700 total comments so I am not sure I am in the same boat as you.
Yes. They want you to limit the file size to 50 megs. I also wanted a way to test w/ a small set first. If you only have 2700, just change the limit to 999999 and it will grab them all. Or 999998. ;)
Awesome. Appreciate it Ray. This and 1 tiny bug with my syntax highlighter change and I am ready to go. Almost there!
Yeah I need to divert attention back to Wordpress now. :)
The download link seems to be failing for me.
Try it now.
Happy New Year! I am trying to adapt your export script to handle one of my test blogs (mangoblog, railo4.2, CF9 etc.)
Before I get too deep into it: I got an alpha version exported. However, on import into WP I ran into some trouble related to the author.
It gave me the message that the author wasn't available but the posts would be attributed to an existing user, which would have been fabulous. But none of the posts were imported. I'll keep working at it, comparing the WP export with my import file, and other troubleshooting...
Just wanted to ask you if the above linked export script is actually your final version? Or if you kept tweaking it afterwards...
To be clear, you are talking about posts, not comments?
Yes, posts...
Not sure what to say. For me, I had one author, and the posts migrated just fine. I don't remember fixing any bugs there. Do you want a copy of my blogcfc export script mod (I mean the mod from the core repo I downloaded it from)?
Thanks for checking: Do you remember what you did with your categories? Never mind, I'll figure it out :-)
The script I used took care of it. Since it would throw an error on rerun, I used a SQL statement to remove categories quickly while I tested.
Ah, then, yes, I'd like to have the earlier mentioned copy:-)
Please email me and I'll send it to you - raymondcamden at gmail dot com.
The enclosure download is throwing a 404 again. Thanks for fixing!
Fixed.