Posted in ColdFusion | Posted on 08-13-2007 | 4,689 views
So I've blogged before about how xmlFormat() is a bit buggy. While it will escape most characters, including "high ASCII" characters in the range of 128-255, it will gleefully ignore characters above that range, for example character 8220, which is the funky Microsoft Word quote. Unfortunately, it looks like the same code used for xmlFormat() is used to escape text when you create feeds with CFFEED. Consider the following example:
<!--- Assumed setup: the first line of the original listing is missing, so this queryNew() call is a reconstruction --->
<cfset getEntries = queryNew("title,content,publisheddate")>

<!--- First entry: HTML markup, which gets escaped correctly --->
<cfset queryAddRow(getEntries)>
<cfset querySetCell(getEntries,"title", "LAST ENTRY")>
<cfset querySetCell(getEntries,"content", "<b>Test</b>")>
<cfset querySetCell(getEntries,"publisheddate", now())>

<!--- Second entry: Microsoft Word quotes (character 8220), which do not get escaped --->
<cfset queryAddRow(getEntries)>
<cfset querySetCell(getEntries,"title", "LAST ENTRY2")>
<cfset querySetCell(getEntries,"content", "#chr(8220)#Test#chr(8220)#")>
<cfset querySetCell(getEntries,"publisheddate", now())>

<cfset props = {version="rss_2.0",title="Test Feed",link="http://127.0.0.1",description="Test"}>

<cffeed action="create" properties="#props#" query="#getEntries#" xmlVar="result">

<cfcontent type="text/xml" reset="true"><cfoutput>#result#</cfoutput>
The first entry will show up correctly in Firefox, but the second will not. If you view source, you'll see the B tags are properly escaped, but the funky MS Word character is not. Now obviously I can make sure to "clean" my data before it gets used in the feed, but I wasn't aware this was even an issue until a friend reported that the feed at ColdFusionBloggers suddenly turned up empty. For now I've switched to the solution below - which is not a good solution, but I needed a quick fix.
<cfloop query="items">
	<cfset fixedcontent = replaceList(content, "#chr(25)#,#chr(212)#,#chr(248)#,#chr(937)#,#chr(8211)#", "")>
	<cfset fixedcontent = replaceList(fixedcontent,chr(8216) & "," & chr(8217) & "," & chr(8220) & "," & chr(8221) & "," & chr(8212) & "," & chr(8213) & "," & chr(8230),"',',"","",--,--,...")>
	<cfset querySetCell(items, "content", fixedcontent, currentRow)>
</cfloop>

<cffeed action="create" properties="#props#" columnMap="#cmap#" query="#items#" xmlVar="result">


http://en.wikipedia.org/wiki/Windows-1252
This was my fix for the problem.
<cffunction name="UnicodeWin1252" hint="Converts MS-Windows superset characters (Windows-1252) into their XML friendly unicode counterparts" returntype="string">
<cfargument name="value" type="string" required="yes">
<cfscript>
var string = arguments.value;
string = replaceNoCase(string,chr(8218),'&##8218;','all'); // ‚
string = replaceNoCase(string,chr(402),'&##402;','all'); // ƒ
string = replaceNoCase(string,chr(8222),'&##8222;','all'); // „
string = replaceNoCase(string,chr(8230),'&##8230;','all'); // …
string = replaceNoCase(string,chr(8224),'&##8224;','all'); // †
string = replaceNoCase(string,chr(8225),'&##8225;','all'); // ‡
string = replaceNoCase(string,chr(710),'&##710;','all'); // ˆ
string = replaceNoCase(string,chr(8240),'&##8240;','all'); // ‰
string = replaceNoCase(string,chr(352),'&##352;','all'); // Š
string = replaceNoCase(string,chr(8249),'&##8249;','all'); // ‹
string = replaceNoCase(string,chr(338),'&##338;','all'); // Œ
string = replaceNoCase(string,chr(8216),'&##8216;','all'); // ‘
string = replaceNoCase(string,chr(8217),'&##8217;','all'); // ’
string = replaceNoCase(string,chr(8220),'&##8220;','all'); // “
string = replaceNoCase(string,chr(8221),'&##8221;','all'); // ”
string = replaceNoCase(string,chr(8226),'&##8226;','all'); // •
string = replaceNoCase(string,chr(8211),'&##8211;','all'); // –
string = replaceNoCase(string,chr(8212),'&##8212;','all'); // —
string = replaceNoCase(string,chr(732),'&##732;','all'); // ˜
string = replaceNoCase(string,chr(8482),'&##8482;','all'); // ™
string = replaceNoCase(string,chr(353),'&##353;','all'); // š
string = replaceNoCase(string,chr(8250),'&##8250;','all'); // ›
string = replaceNoCase(string,chr(339),'&##339;','all'); // œ
string = replaceNoCase(string,chr(376),'&##376;','all'); // Ÿ
string = replaceNoCase(string,chr(8364),'&##8364;','all'); // €
</cfscript>
<cfreturn string>
</cffunction>
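A more general take on the function above, as a minimal sketch only: rather than listing each Windows-1252 character by hand, it replaces every character above code 127 with its numeric character reference. The name escapeNonAscii is made up for illustration and is not a built-in. The sketch assumes it is run on text that will not be escaped again afterwards (for example, on the XML string returned by cffeed), since a later escaping pass would turn the & into &amp;. It also assumes characters in the Basic Multilingual Plane.

```cfml
<cffunction name="escapeNonAscii" returntype="string" output="false" hint="Hypothetical helper: replaces every character above 127 with a numeric character reference">
	<cfargument name="value" type="string" required="yes">
	<cfscript>
	var out = "";
	var ch = "";
	var code = 0;
	var i = 0;
	// Walk the string one character at a time
	for (i = 1; i lte len(arguments.value); i = i + 1) {
		ch = mid(arguments.value, i, 1);
		code = asc(ch);
		if (code gt 127) {
			// Non-ASCII: emit a reference such as &#8220; (the pound sign is doubled to escape it in CFML)
			out = out & "&##" & code & ";";
		} else {
			// Plain ASCII passes through untouched
			out = out & ch;
		}
	}
	</cfscript>
	<cfreturn out>
</cffunction>

<!--- Possible usage, assuming "result" holds the XML produced by cffeed --->
<cfset result = escapeNonAscii(result)>
```

String concatenation in a loop is slow for very large feeds, so this is meant for feed-sized text rather than bulk processing.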
http://www.coldfusionjedi.com/projects/toxml/
This will probably roll into Paragator as well.
I solved this with CDATA blocks inserted after creating the feed. So after the cffeed I run these two REReplace functions.
<cfset FeedXML = REReplace(Variables.FeedXML, "(?m)^(\s*<title>)(.*?)(</title>\s*)$", "\1<![CDATA[\2]]>\3", "all")>
<cfset FeedXML = REReplace(Variables.FeedXML, "(?m)^(\s*<description>)(.*?)(</description>\s*)$", "\1<![CDATA[\2]]>\3", "all")>
If you want to write the feed to disk, then you'll need to put it into a variable, run the above, and then cffile it to disk.
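Roughly, the write-to-disk flow described above might look like the sketch below. It assumes props, cmap, and items are already defined as in the post, and the output path (feed.xml next to the current template) is just a placeholder.

```cfml
<!--- Build the feed into a variable instead of having cffeed write the file itself --->
<cffeed action="create" properties="#props#" columnMap="#cmap#" query="#items#" xmlVar="FeedXML">

<!--- Wrap the contents of each title and description element in a CDATA block --->
<cfset FeedXML = REReplace(FeedXML, "(?m)^(\s*<title>)(.*?)(</title>\s*)$", "\1<![CDATA[\2]]>\3", "all")>
<cfset FeedXML = REReplace(FeedXML, "(?m)^(\s*<description>)(.*?)(</description>\s*)$", "\1<![CDATA[\2]]>\3", "all")>

<!--- Write the adjusted XML to disk; the path here is an assumption --->
<cffile action="write" file="#expandPath('./feed.xml')#" output="#FeedXML#" charset="utf-8">
```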