Ask a Jedi: Getting all the link labels from a string in ColdFusion

A reader asked me how they could use regex to find all the link labels in a string. Not the links - but the labels for the links. It is relatively easy to grab all the matches for a regex in ColdFusion 8. Consider the following code block:

<cfsavecontent variable="s">
This is some text. It is true that <a href="">Harry Potter</a> is a good magician, but the real <a href="">question</a> is how he would stand up against Godzilla. That is what I want to <a href="">see</a> - a Harry Potter vs Godzilla grudge match. Harry has his wand, Godzilla has his <a href="">breath</a>, it would be <i>so</i> cool.
</cfsavecontent>

<cfset matches = reMatch("<[aA].*?>.*?</[aA]>",s)>
<cfdump var="#matches#">

I create a string with a few links in it. I then use the new reMatch function to grab all the matches. My regex says - find all HTML links. It isn't exactly perfect (it won't match a closing A tag that has an extra space in it, for example), but you get the picture. This results in an array of four matches, each one a complete link such as <a href="">Harry Potter</a>.

But you will notice that the HTML tags are still there. How can we get rid of them? I simply looped over the array and did a second pass:

<cfset links = arrayNew(1)>
<cfloop index="a" array="#matches#">
    <cfset arrayAppend(links, rereplace(a, "<.*?>","","all"))>
</cfloop>

<cfdump var="#links#">

This gives you an array of just the labels: Harry Potter, question, see, and breath.

p.s. Running on ColdFusion 7? Try the reFindAll UDF as a replacement for reMatch.
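
If you don't have that UDF handy, here is a rough sketch of the same idea built on reFind, which exists in ColdFusion 7. To be clear, findAllMatches is a hypothetical helper I made up for illustration, not the actual reFindAll UDF, so treat it as a starting point rather than a drop-in replacement. Note that the array attribute of cfloop is also new in ColdFusion 8, so the second pass loops by index instead.

```cfml
<cfscript>
// Hypothetical helper (not the reFindAll UDF): returns an array holding
// every match of regex in str, using only reFind().
function findAllMatches(regex, str) {
    var results = arrayNew(1);
    var start = 1;
    var hit = "";
    while (start lte len(str)) {
        // Passing true as the fourth argument makes reFind return a struct
        // with pos and len arrays instead of a single position.
        hit = reFind(regex, str, start, true);
        // pos[1] is 0 when nothing more matches.
        if (hit.pos[1] eq 0) {
            break;
        }
        // Skip zero-length matches so mid() is never asked for 0 characters.
        if (hit.len[1] gt 0) {
            arrayAppend(results, mid(str, hit.pos[1], hit.len[1]));
        }
        // Move past this match; max() keeps the loop advancing on empty matches.
        start = hit.pos[1] + max(hit.len[1], 1);
    }
    return results;
}
</cfscript>

<!--- Usage: assumes s was built with cfsavecontent as in the example above. --->
<cfset matches = findAllMatches("<[aA].*?>.*?</[aA]>", s)>
<cfset links = arrayNew(1)>
<cfloop index="i" from="1" to="#arrayLen(matches)#">
    <cfset arrayAppend(links, reReplace(matches[i], "<.*?>", "", "all"))>
</cfloop>
<cfdump var="#links#">
```

reFind is case sensitive, which is fine here because the pattern already spells out [aA]. For a pattern that doesn't, reFindNoCase takes the same arguments.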

About Raymond Camden

Raymond is a developer advocate. He focuses on JavaScript, serverless and enterprise cat demos.

Lafayette, LA