A reader asked me how they could use regex to find all the link labels in a string. Not the links themselves, but the label for each link. It is relatively easy to grab all the matches for a regex in ColdFusion 8. Consider the following code block:

<cfsavecontent variable="s">
This is some text. It is true that <a href="http://www.cnn.com">Harry Potter</a> is a good magician, but the real <a href="http://www.raymondcamden.com">question</a> is how he would stand up against Godzilla. That is what I want to <a href="http://www.adobe.com">see</a> - a Harry Potter vs Godzilla grudge match. Harry has his wand, Godzilla has his <a href="http://www.cfsilence.com">breath</a>, it would be <i>so</i> cool.
</cfsavecontent>

<cfset matches = reMatch("<[aA].*?>.*?</[aA]>",s)>
<cfdump var="#matches#">

I create a string with a few links in it. I then use the new reMatch function to grab all the matches. My regex says: find all HTML links. It isn't perfect (for example, it won't match a closing A tag that has an extra space in it), but you get the picture. This results in an array containing all four links, tags included.
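If the extra-space case matters to you, here is a rough sketch of a slightly more forgiving pattern. It allows whitespace before the closing bracket of the end tag, and it requires a word boundary after the "a" so that tags such as abbr are not picked up by accident. The variable name strictMatches is just something I made up for illustration, and this is still a regex, not a real HTML parser.

```cfml
<!--- \b keeps tags like <abbr> out of the results; \s* tolerates a closing tag written as </a > --->
<cfset strictMatches = reMatch("<[aA]\b[^>]*>.*?</[aA]\s*>", s)>
<cfdump var="#strictMatches#">
```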

But you will notice that the HTML tags are still there. How can we get rid of them? I simply looped over the array and did a second pass:

<cfset links = arrayNew(1)>
<cfloop index="a" array="#matches#">
	<cfset arrayAppend(links, reReplace(a, "<.*?>","","all"))>
</cfloop>

<cfdump var="#links#">

This gives you an array containing just the four labels: Harry Potter, question, see, and breath.
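One thing to keep in mind is that the second pass strips every tag, so a label that itself contained markup (an italic tag, say) would lose it. If you would rather keep that inner markup, a variation along these lines might work. It is only a sketch: it assumes each item in matches is a single, complete link, and the names labels and link are my own.

```cfml
<cfset labels = arrayNew(1)>
<cfloop index="link" array="#matches#">
	<!--- \1 is whatever sits between the opening and closing A tags, inner markup included --->
	<cfset arrayAppend(labels, reReplace(link, "^<[aA][^>]*>(.*)</[aA]>$", "\1"))>
</cfloop>
<cfdump var="#labels#">
```

For the sample string above, none of the labels contain extra tags, so this should give the same four labels as the loop that strips everything.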

p.s. Running on ColdFusion 7? Try the reFindAll UDF as a replacement for reMatch.