Yesterday in the IRC channel someone asked if there was a way to count the number of times each unique word appears in a string. While it was obvious that this could be done manually (see below), no one knew of a more elegant solution. Can anyone think of one? Here is the solution I used and it definitely falls into the "manual" (and probably slow) category.

First I made my string:

<cfsavecontent variable="string">
This is a paragraph with some text in it. Certain words will be repeated, and other words will not be repeated. The question is though, how much can I write before I begin to sound like a complete and utter idiot. Let's call that the "Paris Point". At the Paris Point, any further words sound like gibberish and are completely worthless.
</cfsavecontent>

I then used some regex to get an array of words:

<cfset words = reMatch("[[:word:]]+", string)>

Next I created a structure:

<cfset wordCount = structNew()>

And then looped over the array and inserted the words into the structure:

<cfloop index="word" array="#words#">
    <cfif structKeyExists(wordCount, word)>
        <cfset wordCount[word]++>
    <cfelse>
        <cfset wordCount[word] = 1>
    </cfif>
</cfloop>

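Note that this will be inherently case-insensitive, because ColdFusion structure keys are case-insensitive: "The" and "the" end up sharing one counter. I think that is a good thing. At this point we are done, but I added some display code as well: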

<cfset sorted = structSort(wordCount, "numeric", "desc")>

<table border="1" width="400">
    <tr>
        <th width="50%">Word</th>
        <th>Count</th>
    </tr>
    <cfloop index="word" array="#sorted#">
        <cfoutput>
        <tr>
            <td>#word#</td>
            <td>#wordCount[word]#</td>
        </tr>
        </cfoutput>
    </cfloop>
</table>
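
If you need this in more than one place, the steps above could be wrapped in a function. This is only a rough sketch and not something that came out of the IRC discussion: the name countWords and its text argument are made up for illustration, and it assumes ColdFusion 8 or later, since it relies on reMatch(), the array form of cfloop, and the ++ operator.

```cfml
<!--- Hypothetical helper: returns a struct mapping each word to its count. --->
<cffunction name="countWords" returntype="struct" output="false">
    <cfargument name="text" type="string" required="true">

    <!--- Local variables, so nothing leaks into the caller's scope. --->
    <cfset var counts = structNew()>
    <cfset var word = "">

    <!--- Same regex as above: runs of word characters. --->
    <cfloop index="word" array="#reMatch('[[:word:]]+', arguments.text)#">
        <cfif structKeyExists(counts, word)>
            <cfset counts[word]++>
        <cfelse>
            <cfset counts[word] = 1>
        </cfif>
    </cfloop>

    <cfreturn counts>
</cffunction>

<!--- Example call: "The" and "the" land on the same key. --->
<cfset counts = countWords("The cat saw the other cat")>
<cfoutput>the: #counts["the"]#, cat: #counts["cat"]#</cfoutput>
```

One thing to keep in mind with this regex: [[:word:]] does not match apostrophes, so a word like "Let's" is counted as "Let" and "s". If that matters for your text, the character class would need to be widened.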