getSafeHTML and ColdFusion 11

One of the cooler new features in the next version of ColdFusion is getSafeHTML. I had seen it mentioned a few times already, but it never really clicked in my brain what it was doing. getSafeHTML makes use of the AntiSamy project. It takes user-generated content and strips out unsafe HTML. What is safe and what isn’t? It is totally up to you. The functionality is driven by an XML policy file (a very complex XML file) that lets you get as granular as you want. Want to support the bold tag but not italics? Fine. Want to allow CSS colors, but only some of them? You can do that. Let’s look at a simple example, one that also happens to point out a little issue.

<cfsavecontent variable="test">
This is some <b>html</b>. Even <i>more</i> html!<br/>
<iframe src="http://www.cnn.com"></iframe>
</cfsavecontent>

<cfoutput>
<pre>
#getSafeHTML(test)#
</pre>
</cfoutput>

In my sample input, I’ve got a B tag, an I tag, and an iframe. getSafeHTML will strip out just the iframe, leaving the bold and italics in place. This is rather cool, I think. But in my test I discovered a little bug. The actual result of the above code is this:

This is some <b>html</b>
. Even <i>more</i>
 html!<br />

See the line break after the closing B tag? That moves the period to a new line, and the browser renders that line break as a space between “html” and the period. I did some research and discovered that there is a particular setting in AntiSamy that applies a bit of formatting to the result. In this case, the formatting breaks my HTML. So how do you fix it?

As I mentioned, AntiSamy is configured by an XML file. There is a default one for the server. You can override it at the Application.cfc level or in your call to getSafeHTML itself. I did some Googling, found a sample file, and then changed the setting in question:

<directive name="formatOutput" value="false"/>

This goes within the directives block. I’m going to file an enhancement request (ER) to add this to the default XML for ColdFusion 11.
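
To make the placement concrete, here is a rough sketch of where that directive sits in a policy file. It is only an excerpt, not a complete policy: a real AntiSamy file carries many more sections (regular expressions, tag rules, CSS rules and so on), which you would leave as they are in whatever sample file you start from.

```xml
<anti-samy-rules>
    <directives>
        <!-- keep the other directives from your sample file as they are -->
        <directive name="formatOutput" value="false"/>
    </directives>
    <!-- the rest of the policy (tag rules, CSS rules, etc.) is unchanged -->
</anti-samy-rules>
```

And here is how the two override options might look. Treat this as an assumption-laden sketch rather than a reference: the file name antisamy-custom.xml is made up, and the this.security.antisamypolicy setting and the optional policy file argument reflect my understanding of the ColdFusion 11 documentation, so double-check both against your build.

```cfml
<!--- Application.cfc: app-wide override (setting name as I understand it from the docs) --->
<cfcomponent>
    <cfset this.name = "safeHtmlDemo">
    <!--- antisamy-custom.xml is a hypothetical file sitting next to Application.cfc --->
    <cfset this.security.antisamypolicy = getDirectoryFromPath(getCurrentTemplatePath()) & "antisamy-custom.xml">
</cfcomponent>

<!--- test.cfm: per-call override, passing the policy file as the second argument --->
<cfset policyFile = expandPath("./antisamy-custom.xml")>
<cfoutput>
<pre>
#getSafeHTML(test, policyFile)#
</pre>
</cfoutput>
```

You would only need one of the two. The Application.cfc route makes sense if every call in the app should share the same rules, while the per-call argument is handy when only one template needs different behavior.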