Ask a Jedi: ColdFusion support for CALS Tables


Aaron asked:

I've been a follower of your blog forever and it's been an indispensable tool while I've been learning ColdFusion. I started on Allaire's 4.5 version and more recently got ahold of CF8. I'd consider myself a moderate level developer, with a good handle on building CFC's and manipulating XML documents.

I've run into a snag recently though, regarding table representation in XML and the process of converting CALS-based format into HTML and vice-versa. This issue came about when I exported pages out of Adobe InDesign to XML and saw CALS format for the first time. My attempt to use the same data from an exported catalog for use on a website has been thwarted by these demonic CALS tables.

Knowing that CF8 has awesome XML power, I thought there would be a custom tag or CFC for converting these tables into either format but my search has turned up nil. Before attempting to write my own, I was curious if you had any suggestions or knew of any resources I could tap to help with the process? If I did end up creating one, I'd love to pass it off to any who needed it. Maybe you could cover it in a blog entry?

I had to admit to Aaron - I had no idea what CALS was. A quick Google search didn't turn up much, but Wikipedia helped me out: CALS Table Model. From what I can gather, CALS is an XML format for describing tables for print. I could be a bit off on that, since some of the jargon was hard to grok, but that's what I took from it. Wikipedia then led me to a web page with a DTD description: CALS Table Model Document Type Definition. Be sure to check the date on that document - 2001. I had a lot fewer gray hairs in my beard back when this was published.

I asked Aaron for a sample of his XML to see if I could work with it. Here is a sample of the XML he had to work with:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<catalog>
	<category>
		<item>
			<itemsku>SEE BELOW</itemsku>
			<itemroottitle>Aimpoint CompM3 Weapon Sight </itemroottitle>
			<itemintroduction>When police and armed forces personnel are on duty, they need their equipment to handle just about any situation, anytime, anywhere. That's why Aimpoint created the CompM3, a revolutionary sight that offers the highest standard available for sight systems technology. The CompM3 features their latest technology and is even more rugged than Aimpoint's other sights. Featuring revolutionary Advanced Circuit Efficiency Technology that gives the sight unparalleled battery life and ease of use, the CompM3 works with any generation of night vision device (NVD). Built primarily for armed forces and police use to hold up under the roughest physical handling and most severe weather conditions. The CompM3 comes with a replaceable, outer black rubber cover, which protects it from scratching and adds an additional stealth factor. The outer cover is also available in Dark Earth Brown perfect camo for use in the desert and in the jungle.</itemintroduction>
			<itemfeatures>Compatible with every generation of NVD New technology called ACET allows 50,000 hours of operation on one single battery (on setting 7 out of 10) 500,000 hours of use on NVD setting Unequalled light transmission Available in 2 dot sizes (2 and 4 MOA) Submersible to 45 meters (135 feet) Comes with replaceable outer black rubber cover Outer rubber cover available in Dark Earth Brown</itemfeatures>
			<!-- CALS Table Format (tables is the defined tag in InDesign where the table is kept, the rest is InDesign created stuff) -->
			<tables>
				<table frame="none">
					<tgroup cols="2">
						<colspec colname="c1" colwidth="56.82176592382906pt"></colspec>
						<colspec colname="c2" colwidth="90.68922308716pt"></colspec>
						<tbody>
							<row>
								<entry namest="c1" nameend="c2" colsep="0" align="center" valign="top">Models</entry>
							</row>
							<row>
								<entry colsep="0" align="right" valign="top">AIM-11408</entry>
								<entry colsep="0" align="left" valign="top">With 2 MOA dot size</entry>
							</row>
							<row>
								<entry colsep="0" rowsep="0" align="right" valign="top">AIM-11403</entry>
								<entry colsep="0" rowsep="0" align="left" valign="top">With 4 MOA dot size</entry>
							</row>
						</tbody>
					</tgroup>
				</table>
			</tables>
			<!-- End of CALS Table Format -->
			<pageid>199</pageid>
		</item>
	</category>
</catalog>

Would you believe I just now noticed that this XML contains information about weapons and an "outer black rubber cover"? OK, that aside, the CALS-specific section should be obvious. For my testing I copied that portion into a CFXML variable:

<!--- CALS Data --->
<cfxml variable="colxml">
<tables>
	<table frame="none">
		<tgroup cols="2">
			<colspec colname="c1" colwidth="56.82176592382906pt"></colspec>
			<colspec colname="c2" colwidth="90.68922308716pt"></colspec>
			<tbody>
				<row>
					<entry namest="c1" nameend="c2" colsep="0" align="center" valign="top">Models</entry>
				</row>
				<row>
					<entry colsep="0" align="right" valign="top">AIM-11408</entry>
					<entry colsep="0" align="left" valign="top">With 2 MOA dot size</entry>
				</row>
				<row>
					<entry colsep="0" rowsep="0" align="right" valign="top">AIM-11403</entry>
					<entry colsep="0" rowsep="0" align="left" valign="top">With 4 MOA dot size</entry>
				</row>
			</tbody>
		</tgroup>
	</table>
</tables>
</cfxml>

From my reading of the spec, my understanding is that a table contains one or more tgroups. Each tgroup is really its own table, but they must all fit within one uber table frame. To simplify things, though, I decided to just work with the main tgroup.

<!--- get the tgroup --->
<cfset myTable = colxml.tables.table.tgroup>

This could be a bit more dynamic, but for a proof of concept I'll go with it. My next step was to parse the colspec tags.

<!--- parse the cols to get names and widths --->
<cfset cols = []>
<cfif structKeyExists(myTable, "colspec")>
	<cfloop index="x" from="1" to="#arrayLen(myTable.colspec)#">
		<cfset colspec = myTable.colspec[x]>
		<cfset col = {}>
		<cfif structKeyExists(colspec.xmlAttributes, "colname")>
			<cfset col.name = colspec.xmlAttributes.colname>
		</cfif>
		<cfif structKeyExists(colspec.xmlAttributes, "colwidth")>
			<cfset col.width = colspec.xmlAttributes.colwidth>
		</cfif>
		<cfset arrayAppend(cols, col)>
	</cfloop>
</cfif>

Basically I create an array of structs, where each struct contains information about the colspec tags. CALS may support more attributes, but I worked with what I saw in the sample XML.
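For anyone who wants to check the logic outside ColdFusion, here is the same colspec step as a small Python sketch. It is a port for illustration, not the code attached to this post. The function name parse_colspecs and the trimmed-down sample string are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down tgroup holding only the colspec tags from the sample above.
CALS_SAMPLE = """
<tgroup cols="2">
  <colspec colname="c1" colwidth="56.82176592382906pt"></colspec>
  <colspec colname="c2" colwidth="90.68922308716pt"></colspec>
</tgroup>
"""

def parse_colspecs(tgroup):
    """Return one dict per <colspec>, copying only the attributes that exist."""
    cols = []
    for colspec in tgroup.findall("colspec"):
        col = {}
        if "colname" in colspec.attrib:
            col["name"] = colspec.attrib["colname"]
        if "colwidth" in colspec.attrib:
            col["width"] = colspec.attrib["colwidth"]
        cols.append(col)
    return cols

cols = parse_colspecs(ET.fromstring(CALS_SAMPLE))
print(cols)
```

As in the CFML version, a colspec with no colname still gets an (empty) entry, so positions in the list keep matching column positions.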

Next was to parse the cells. For the most part this was simple, but notice how CALS handles colspans: an entry can carry namest and nameend attributes. These point to named columns and represent a start/end "span" for a cell. That's going to be a bit tricky, but before we get ahead of ourselves, let's begin the basic parsing:

<!--- now parse the rows --->
<cfset rows = []>
<cfloop index="x" from="1" to="#arrayLen(myTable.tbody.row)#">
	<cfset row = myTable.tbody.row[x]>
	<!--- each row has one or more entries (cells) --->
	<cfset cells = []>

	<cfloop index="y" from="1" to="#arrayLen(row.entry)#">
		<cfset entry = row.entry[y]>

I begin by creating an array for my rows. I then loop over the XML for each row. A row is an array of cells so I create another array as well. Lastly I loop over each entry.

		<cfset cell = {}>
		<!--- support colspan by looking for namest/nameend --->
		<!--- require both for now --->
		<cfif structKeyExists(entry.xmlAttributes, "namest") and structKeyExists(entry.xmlAttributes, "nameend")>
			<cfset colstart = entry.xmlAttributes.namest>
			<cfset colend = entry.xmlAttributes.nameend>
			<!---
			Ok, given that we know the name of our start and end col, we can get a colspan.
			Don't support not starting at 0, just support a count
			--->
			<cfset begin = 0>
			<cfset end = 0>
			<cfloop index="z" from="1" to="#arrayLen(cols)#">
				<cfif structKeyExists(cols[z], "name")>
					<cfif cols[z].name is colstart>
						<cfset begin = z>
					<cfelseif cols[z].name is colend>
						<cfset end = z>
					</cfif>
				</cfif>
			</cfloop>
			<cfif begin gt 0 and end gt 0>
				<cfset cell.colspan = end-begin+1>
			</cfif>
		</cfif>

The cell structure represents one table cell. My first task is to see if a colspan should be in effect. This becomes a bit tricky because it is possible I may not have the named columns. So I do a lot of checking to see if they actually exist, and finally, if they do, I set a colspan value equal to the "distance" between the two columns.
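The span calculation is the one piece of real logic here, so here it is isolated as a Python sketch. Again this is an illustrative port rather than the attached template, and colspan_for is a made-up name. Two small differences from the CFML are worth knowing: Python compares the column names case-sensitively (CFML's "is" does not), and an entry whose namest and nameend name the same column gets a colspan of 1 here, whereas the CFML loop leaves colspan unset in that case.

```python
def colspan_for(entry_attrs, cols):
    """Return the number of columns an entry spans, or None if it cannot be determined.

    entry_attrs: dict of the <entry> tag's attributes.
    cols: list of dicts built from the <colspec> tags, each with an optional "name".
    """
    namest = entry_attrs.get("namest")
    nameend = entry_attrs.get("nameend")
    # Require both attributes, as the CFML version does.
    if namest is None or nameend is None:
        return None
    names = [col.get("name") for col in cols]
    # If either name is not a known column, give up instead of guessing.
    if namest not in names or nameend not in names:
        return None
    return names.index(nameend) - names.index(namest) + 1

cols = [{"name": "c1"}, {"name": "c2"}]
print(colspan_for({"namest": "c1", "nameend": "c2"}, cols))
```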

		<cfif structKeyExists(entry.xmlAttributes,"valign")>
			<cfset cell.valign = entry.xmlAttributes.valign>
		</cfif>
		<cfif structKeyExists(entry.xmlAttributes,"align")>
			<cfset cell.align = entry.xmlAttributes.align>
		</cfif>

The rest of the cell creation is a bit simpler. If I have a valign or align attribute, I copy it over. Finally, I add the text, add the cell, add the row, and end the loops:

		<cfset cell.text = entry.xmlText>
		<cfset arrayAppend(cells, cell)>
	</cfloop>
	<cfset arrayAppend(rows, cells)>

</cfloop>

Woot. So at this point I've got an array of arrays. Let's see about rendering it:

<!--- try to render --->
<table border="1">

<cfloop index="row" array="#rows#">
	<tr>
	<cfloop index="cell" array="#row#">
		<cfoutput>
		<td
			<cfif structKeyExists(cell, "colspan")>
			colspan="#cell.colspan#"
			</cfif>
			<cfif structKeyExists(cell, "valign")>
			valign="#cell.valign#"
			</cfif>
			<cfif structKeyExists(cell, "align")>
			align="#cell.align#"
			</cfif>
		>#cell.text#</td>
		</cfoutput>
	</cfloop>
	</tr>
</cfloop>

</table>

I begin, and end, with a table tag. I added a border to my output to make it a bit clearer, but CALS describes borders with its own attributes (frame, colsep, and rowsep in the sample above), which this code ignores. For each row, and each cell, I check the attributes and output the relevant HTML for it. The result is a three-row HTML table in which the "Models" cell spans both columns.
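The rendering loop can be sketched in Python as well. This is a rough port under the same caveats as before: render_table and the hand-built rows list are assumptions for the example. It also does one thing the CFML above does not, which is HTML-escape the cell text and attribute values before writing them out.

```python
from html import escape

def render_table(rows):
    """Render a list of rows (each a list of cell dicts) as an HTML table string."""
    parts = ['<table border="1">']
    for row in rows:
        parts.append("<tr>")
        for cell in row:
            # Only emit the attributes that were actually found on the entry.
            attrs = "".join(
                f' {name}="{escape(str(cell[name]))}"'
                for name in ("colspan", "valign", "align")
                if name in cell
            )
            parts.append(f"<td{attrs}>{escape(cell['text'])}</td>")
        parts.append("</tr>")
    parts.append("</table>")
    return "\n".join(parts)

# The first two rows of the sample, in the shape the parsing steps produce.
rows = [
    [{"colspan": 2, "valign": "top", "align": "center", "text": "Models"}],
    [
        {"valign": "top", "align": "right", "text": "AIM-11408"},
        {"valign": "top", "align": "left", "text": "With 2 MOA dot size"},
    ],
]
html_out = render_table(rows)
print(html_out)
```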

I've attached the entire template to the blog entry. As I said above - this code isn't terribly flexible, but hopefully it can give people a head start if they need to work with CALS data.

Download attached file.


About Raymond Camden

Raymond is a senior developer evangelist for Adobe. He focuses on document services, JavaScript, and enterprise cat demos. If you like this article, please consider visiting my Amazon Wishlist or donating via PayPal to show your support. You can even buy me a coffee!

Lafayette, LA https://www.raymondcamden.com

Archived Comments

Comment 1 by Aaron S. posted on 10/1/2009 at 7:54 PM

Thanks Ray! This is a great first step towards making InDesign exported XML useable for the web. I'll be attempting to make a CFC that can convert HTML tables from/to CALS format. Anyone who uses InDesign and wants to export to the web would find it useful. ;-)

Comment 2 by Roland Collins posted on 10/1/2009 at 11:59 PM

A better way to do this is using XSLT. There are quite a few good XSLT templates for this already. The one located here works fantastically (if you remove the semi-colon before the first closing angle bracket): http://sources.redhat.com/m...

All you have to do is copy the xslt template into a file and then load that file into a variable (or use a cfsavecontent block). Then you simply use XmlTransform on the xml data. Something like this:

<cfsavecontent variable="xsl">
- cut and paste xslt template -
</cfsavecontent>

<cfsavecontent variable="xml">
- cut and paste your xml here -
</cfsavecontent>

<cfset html = XmlTransform(xml, xsl)>

<cfoutput>
#html#
</cfoutput>

I just tried this using the xslt at that link and the example xml, and it works fantastically. There are a number of other xslts you can use if this one doesn't suit your needs. Just google "XSLT CALS HTML conversion".

Hope that helps!

Comment 3 by Raymond Camden posted on 10/2/2009 at 12:00 AM

Doh, didn't even think about XSLT. Massively better there Roland. Good find.

Comment 4 by Aaron S. posted on 10/2/2009 at 12:43 AM

Nice! Didn't even think about XSLT. This is perfect for reading from the generated XML but what if I want to take an existing HTML table and convert to CALS? I guess I'm looking for a method to go both ways seamlessly by passing either a chunk of XML or a chunk of HTML and having it spit the conversion out.

Comment 5 by Roland Collins posted on 10/2/2009 at 12:54 AM

Here's XHTML tables to CALS (I haven't tested it). http://www.biglist.com/list... . There were various versions posted all over the place - I suggest googling to find the best fit!

If you're doing poorly-formatted (non-xhtml) html tables, then you've got a much bigger problem on your hands. Fortunately, Ben just posted a solution to make those html tables xhtml compliant the other day: http://www.bennadel.com/ind...