August 13, 2007 (This post is more than 2 years old.)

Bug to watch out for with CFFEED

coldfusion

So I've blogged before about how xmlFormat() is a bit buggy. While it will handle most characters, including "high ascii" characters in the range of 128-255, it will gleefully ignore characters above that range, for example, character 8220, which is the curly opening double quote that Microsoft Word likes to insert. Unfortunately it looks like the same code used for xmlFormat() is used to escape text when you create feeds with CFFEED. Consider the following example:


<cfset getEntries = queryNew("publisheddate,content,title")>
<cfset queryAddRow(getEntries)>
<cfset querySetCell(getEntries,"title", "LAST ENTRY")>
<cfset querySetCell(getEntries,"content", "<b>Test</b>")>
<cfset querySetCell(getEntries,"publisheddate", now())>
<cfset queryAddRow(getEntries)>
<cfset querySetCell(getEntries,"title", "LAST ENTRY2")>
<cfset querySetCell(getEntries,"content", "#chr(8220)#Test#chr(8220)#")>
<cfset querySetCell(getEntries,"publisheddate", now())>
<cfset props = {version="rss_2.0",title="Test Feed",link="http://127.0.0.1",description="Test"}>
<cffeed action="create" properties="#props#" query="#getEntries#" xmlVar="result">

<cfcontent type="text/xml" reset="true"><cfoutput>#result#</cfoutput>

The first entry will correctly show up in Firefox, but the second will not, and if you view source, you see the B tags are properly escaped, but the funky MS Word character is not. Now obviously I can make sure to "clean" my data before it gets used in the feed, but I wasn't aware this was even an issue until a friend reported that the feed at ColdFusionBloggers suddenly turned up empty. For now I've switched to the solution below, which is not a good solution, but I needed a quick fix.


<!--- clean up bad stuff --->
<cfloop query="items">
	<cfset fixedcontent = replaceList(content, "#chr(25)#,#chr(212)#,#chr(248)#,#chr(937)#,#chr(8211)#", "")>
	<cfset fixedcontent = replaceList(fixedcontent,chr(8216) & "," & chr(8217) & "," & chr(8220) & "," & chr(8221) & "," & chr(8212) & "," & chr(8213) & "," & chr(8230),"',',"","",--,--,...")>	
	<cfset querySetCell(items, "content", fixedcontent, currentRow)>
</cfloop>

<cffeed action="create" properties="#props#" columnMap="#cmap#" query="#items#" xmlVar="result">
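The same clean-up could be folded into a small helper so it can be reused for other columns, such as the title. This is only a sketch: cleanFeedText is a hypothetical name (not a built-in function), the character lists are copied from the loop above, and the two-column query is a stand-in for the real items query.

```cfml
<!--- Hypothetical helper: same replacements as the loop above, wrapped in a UDF --->
<cffunction name="cleanFeedText" returntype="string" output="false">
	<cfargument name="text" type="string" required="true">
	<!--- characters that are simply dropped --->
	<cfset var dropList = "#chr(25)#,#chr(212)#,#chr(248)#,#chr(937)#,#chr(8211)#">
	<!--- Word-style single quotes, double quotes, dashes and ellipsis --->
	<cfset var wordChars = chr(8216) & "," & chr(8217) & "," & chr(8220) & "," & chr(8221) & "," & chr(8212) & "," & chr(8213) & "," & chr(8230)>
	<!--- plain ASCII stand-ins, in the same order as wordChars --->
	<cfset var plainChars = "',',"","",--,--,...">
	<cfset var fixed = replaceList(arguments.text, dropList, "")>
	<cfreturn replaceList(fixed, wordChars, plainChars)>
</cffunction>

<!--- Stand-in data; in the real code this would be the existing items query --->
<cfset items = queryNew("title,content")>
<cfset queryAddRow(items)>
<cfset querySetCell(items, "title", "LAST ENTRY2")>
<cfset querySetCell(items, "content", "#chr(8220)#Test#chr(8221)#")>

<!--- Clean every text column that ends up in the feed --->
<cfloop query="items">
	<cfset querySetCell(items, "title", cleanFeedText(items.title), currentRow)>
	<cfset querySetCell(items, "content", cleanFeedText(items.content), currentRow)>
</cfloop>
```

After the loop, the content cell holds the word Test wrapped in plain double quotes, which xmlFormat-style escaping handles without trouble.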

Archived Comments

Comment 1 by Rob Wilkerson posted on 8/13/2007 at 11:06 PM

I've never cared much for the XMLFormat() path and have instead chosen to use a less...engineered, maybe?...CDATA block. If that can be used, it's a bit of a cleaner solution, in my opinion.

Comment 2 by u posted on 8/13/2007 at 11:08 PM

That would work for xmlFormat - but not cffeed as I assume the < would be escaped automatically.

Comment 3 by Rob Wilkerson posted on 8/13/2007 at 11:50 PM

Good point. I assumed you had considered the possibility, but thought I'd throw it out there since you didn't mention it specifically. As I recall, it doesn't work for CFXML and the mechanics of that are probably fairly similar to those of CFFEED.

Comment 4 by Ben Garrett posted on 8/14/2007 at 5:51 AM

It's not just a few characters. There are around 25 that are not part of the ISO-8859-1 standard but are often found in documents and on web sites.

http://en.wikipedia.org/wik...

This was my fix for the problem.

<cffunction name="UnicodeWin1252" hint="Converts MS-Windows superset characters (Windows-1252) into their XML friendly unicode counterparts" returntype="string">
<cfargument name="value" type="string" required="yes">
<cfscript>
var string = value;
string = replaceNoCase(string,chr(8218),'&##8218;','all'); // ‚
string = replaceNoCase(string,chr(402),'&##402;','all'); // ƒ
string = replaceNoCase(string,chr(8222),'&##8222;','all'); // „
string = replaceNoCase(string,chr(8230),'&##8230;','all'); // …
string = replaceNoCase(string,chr(8224),'&##8224;','all'); // †
string = replaceNoCase(string,chr(8225),'&##8225;','all'); // ‡
string = replaceNoCase(string,chr(710),'&##710;','all'); // ˆ
string = replaceNoCase(string,chr(8240),'&##8240;','all'); // ‰
string = replaceNoCase(string,chr(352),'&##352;','all'); // Š
string = replaceNoCase(string,chr(8249),'&##8249;','all'); // ‹
string = replaceNoCase(string,chr(338),'&##338;','all'); // Œ
string = replaceNoCase(string,chr(8216),'&##8216;','all'); // ‘
string = replaceNoCase(string,chr(8217),'&##8217;','all'); // ’
string = replaceNoCase(string,chr(8220),'&##8220;','all'); // “
string = replaceNoCase(string,chr(8221),'&##8221;','all'); // ”
string = replaceNoCase(string,chr(8226),'&##8226;','all'); // •
string = replaceNoCase(string,chr(8211),'&##8211;','all'); // –
string = replaceNoCase(string,chr(8212),'&##8212;','all'); // —
string = replaceNoCase(string,chr(732),'&##732;','all'); // ˜
string = replaceNoCase(string,chr(8482),'&##8482;','all'); // ™
string = replaceNoCase(string,chr(353),'&##353;','all'); // š
string = replaceNoCase(string,chr(8250),'&##8250;','all'); // ›
string = replaceNoCase(string,chr(339),'&##339;','all'); // œ
string = replaceNoCase(string,chr(376),'&##376;','all'); // Ÿ
string = replaceNoCase(string,chr(8364),'&##8364;','all'); // €
</cfscript>
<cfreturn string>
</cffunction>
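
A more general variant of the same idea is to walk the string and turn every character above 127 into a numeric character reference, rather than listing each Windows-1252 character by hand. The sketch below is an assumption-laden illustration, not the code that went into toXML.cfc: escapeNonAscii is a hypothetical name, it only makes sense where the XML is assembled by hand (anything that escapes ampersands afterwards would mangle the references), and it does not handle characters outside the Basic Multilingual Plane.

```cfml
<!--- Hypothetical helper: replaces every character above 127 with &#NNNN; --->
<cffunction name="escapeNonAscii" returntype="string" output="false">
	<cfargument name="value" type="string" required="yes">
	<cfscript>
	var result = "";
	var i = 0;
	var ch = "";
	var code = 0;
	for (i = 1; i lte len(arguments.value); i = i + 1) {
		ch = mid(arguments.value, i, 1);
		code = asc(ch);
		if (code gt 127) {
			// "##" is a literal pound sign inside a ColdFusion string
			result = result & "&##" & code & ";";
		} else {
			result = result & ch;
		}
	}
	</cfscript>
	<cfreturn result>
</cffunction>

<!--- escaped now holds &#8220;Test&#8221; --->
<cfset escaped = escapeNonAscii(chr(8220) & "Test" & chr(8221))>
```

The trade-off against the explicit list above is that this version touches every non-ASCII character, including accented letters that were already valid in the target encoding.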

Comment 5 by Raymond Camden posted on 8/14/2007 at 5:59 AM

Do you mind if I include this in toXML.cfc?

Comment 6 by Ben Garrett posted on 8/14/2007 at 6:17 AM

Sure go for it

Comment 7 by Raymond Camden posted on 8/15/2007 at 12:44 AM

Thanks Ben. Updated:

http://www.coldfusionjedi.c...

This will probably roll into Paragator as well.

Comment 8 by Ben Garrett posted on 8/16/2007 at 1:33 PM

The file download for toxml seems to be the one released on 30/Apr even though the page says 14th Aug?

Comment 9 by Raymond Camden posted on 8/16/2007 at 2:33 PM

Sorry about that - try now please.

Comment 10 by Ben Garrett posted on 8/16/2007 at 3:48 PM

Thanks, works great now

Comment 11 by Michael Williams posted on 9/1/2007 at 6:52 AM

Doesn't the CFLIB tag XMLFormat2() handle high ascii?

Comment 12 by Raymond Camden posted on 9/7/2007 at 1:23 AM

It covers some - but not all.

Comment 13 by Jason posted on 9/19/2007 at 5:43 PM

Thanks guys. I've been banging my head on this problem for hours now!

Comment 14 by Joel posted on 12/11/2007 at 9:43 PM

This sounds great... I'm sort of new to CF. Where/how would I implement this so that it makes the corrections?

Comment 15 by Raymond Camden posted on 12/11/2007 at 9:59 PM

Joel - my blog entry ends with an example of how I change the data before I pass to cffeed.

Comment 16 by Grumpy CFer posted on 7/11/2011 at 5:09 PM

Old post, but a top result in Google so adding to the conversation...

I solved this with CDATA blocks inserted after creating the feed. So after the cffeed I run these two REReplace functions.

If you want to write the feed to disk then you'll need to put it into a variable, run the above and then cffile it to disk.
