A while back I wrote a ColdFusion Sample that dealt with reading RSS feeds. In today's sample, I'm going to expand on that a bit and create an application that reads an RSS feed and searches for keywords. This also ties in nicely with my ColdFusion sample on scheduled tasks. The code I show here is meant to be used as a scheduled task that can run nightly or hourly, depending on how active the RSS feed is.

Let's begin our sample with a few simple variables:

<!--- Feed to scan --->
<cfset rssUrl = "http://rss.cnn.com/rss/cnn_topstories.rss">
<!--- Keywords --->
<cfset keywords = "obama,debt">
<!--- Person who gets the email --->
<cfset receiver = "raymondcamden@gmail.com">

I assume these variables are self-explanatory, but note that keywords would most likely be dynamic. I could see them being hard coded though if you are building something simple for a client. By the way, if you download this code, please change receiver. I get enough email. :) Moving on....

<cffeed action="read" source="#rssUrl#" query="entries">
<cfoutput>
The feed has #entries.recordcount# entries.<br/>
</cfoutput>

This code grabs the RSS feed and turns it into a query. I mentioned earlier that this script would most likely be a scheduled task. That being said, there's no reason why I can't include some text in the output. Don't forget ColdFusion allows you to save the result of a scheduled task so you can look at it later. Ok, now for the fun part - the actual processing...
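If you want to wire this up as a scheduled task from code rather than the ColdFusion Administrator, a rough sketch could look like the following. The task name, the URL of the script, and the output path and file name are all placeholders I made up for illustration, so swap in your own values.

```cfml
<!--- Sketch only: task name, URL, path, and file are hypothetical. --->
<!--- publish/path/file tell ColdFusion to save the output of each run. --->
<cfschedule action="update"
    task="rssKeywordScan"
    operation="HTTPRequest"
    url="http://localhost/rssscan.cfm"
    startDate="#dateFormat(now(), 'mm/dd/yyyy')#"
    startTime="2:00 AM"
    interval="daily"
    publish="true"
    path="#expandPath('./logs')#"
    file="rssscan_result.html">
```

For a busier feed, the interval can be given in seconds instead (for example, interval="3600" for hourly).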

<cfloop index="keyword" list="#keywords#">

    <cfquery name="getMatches" dbtype="query">
    select title, rsslink, publisheddate
    from entries
    where upper(title) like <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ucase(keyword)#%">
    or upper(content) like <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ucase(keyword)#%">
    </cfquery>

    <cfoutput>
    The keyword #keyword# matched #getMatches.recordCount# entries.<br/>
    </cfoutput>

    <cfif getMatches.recordCount>
        <cfmail to="#receiver#" from="#receiver#" subject="Keyword match in RSS Feed." type="html">
        <cfoutput>
        <h2>RSS Matches Found: #keyword#</h2>

        <p>
        Matches in the RSS feed for the keyword "#keyword#" have been found:
        </p>

        <ul>
        <cfloop query="getMatches">
            <li><a href="#rsslink#">#title#</a></li>
        </cfloop>
        </ul>
        </cfoutput>
        </cfmail>
    </cfif>

</cfloop>

We begin by looping over each keyword. For my report, I want one result per keyword. You could create one result instead. I just thought it would be nicer to have a separate set of results. I make use of Query of Queries to scan for my keyword. Notice the upper! LIKE matches in QoQ are case sensitive. By using upper in the SQL and uCase in CFML I can ensure a case-insensitive match. If we find a match, then I simply fire off an email. That's it. I ended my template with a quick "I'm done" message. Here is the entire template.
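For the "one result instead" option mentioned above, here is a minimal sketch of how a single combined email might be built. It assumes the rssUrl, keywords, and receiver variables and the entries query from earlier; the totalMatches and report variable names are my own for this example.

```cfml
<!--- Sketch: collect every keyword's matches into one report, then send one email. --->
<cfset totalMatches = 0>

<cfsavecontent variable="report">
<cfloop index="keyword" list="#keywords#">
    <cfquery name="getMatches" dbtype="query">
    select title, rsslink
    from entries
    where upper(title) like <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ucase(keyword)#%">
    or upper(content) like <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ucase(keyword)#%">
    </cfquery>

    <cfif getMatches.recordCount>
        <cfset totalMatches = totalMatches + getMatches.recordCount>
        <cfoutput>
        <h2>RSS Matches Found: #keyword#</h2>
        <ul>
        <cfloop query="getMatches">
            <li><a href="#rsslink#">#title#</a></li>
        </cfloop>
        </ul>
        </cfoutput>
    </cfif>
</cfloop>
</cfsavecontent>

<!--- Only send mail if at least one keyword matched something. --->
<cfif totalMatches>
    <cfmail to="#receiver#" from="#receiver#" subject="Keyword matches in RSS Feed." type="html">#report#</cfmail>
</cfif>
```

The trade-off is fewer emails in exchange for a longer one. If an entry matches more than one keyword, it will show up once under each heading.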

<!--- Feed to scan --->
<cfset rssUrl = "http://rss.cnn.com/rss/cnn_topstories.rss">
<!--- Keywords --->
<cfset keywords = "obama,debt">
<!--- Person who gets the email --->
<cfset receiver = "raymondcamden@gmail.com">

<cffeed action="read" source="#rssUrl#" query="entries">
<cfoutput>
The feed has #entries.recordcount# entries.<br/>
</cfoutput>

<cfloop index="keyword" list="#keywords#">

    <cfquery name="getMatches" dbtype="query">
    select title, rsslink, publisheddate
    from entries
    where upper(title) like <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ucase(keyword)#%">
    or upper(content) like <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ucase(keyword)#%">
    </cfquery>

    <cfoutput>
    The keyword #keyword# matched #getMatches.recordCount# entries.<br/>
    </cfoutput>

    <cfif getMatches.recordCount>
        <cfmail to="#receiver#" from="#receiver#" subject="Keyword match in RSS Feed." type="html">
        <cfoutput>
        <h2>RSS Matches Found: #keyword#</h2>

        <p>
        Matches in the RSS feed for the keyword "#keyword#" have been found:
        </p>

        <ul>
        <cfloop query="getMatches">
            <li><a href="#rsslink#">#title#</a></li>
        </cfloop>
        </ul>
        </cfoutput>
        </cfmail>
    </cfif>

</cfloop>

<cfoutput>
Done processing.<br/>
</cfoutput>

And here is a sample email:

Nice and simple, right? My goal for these "ColdFusion Sample" blog entries is to keep it that way. I want to provide samples in ColdFusion for common problems. That being said, you should stop reading now. What follows is superfluous, unnecessary, and just plain silly.

Here be dragons...

Folks know I have something of a geek crush on OpenAmplify. What if we were to use OpenAmplify to tell us what type of match was found, specifically, whether it was a positive or negative match? Consider this modified version:

<!--- Feed to scan --->
<cfset rssUrl = "http://rss.cnn.com/rss/cnn_topstories.rss">
<!--- Keywords --->
<cfset keywords = "obama,debt">
<!--- Person who gets the email --->
<cfset receiver = "raymondcamden@gmail.com">
<!--- OpenAmplify CFC --->
<cfset openAmp = new openamplify("my key is better than yours, like my milkshake")>

<cffeed action="read" source="#rssUrl#" query="entries">
<cfoutput>
The feed has #entries.recordcount# entries.<br/>
</cfoutput>

<cfloop index="keyword" list="#keywords#">

    <cfquery name="getMatches" dbtype="query">
    select title, rsslink, publisheddate, content
    from entries
    where upper(title) like <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ucase(keyword)#%">
    or upper(content) like <cfqueryparam cfsqltype="cf_sql_varchar" value="%#ucase(keyword)#%">
    </cfquery>

    <cfoutput>
    The keyword #keyword# matched #getMatches.recordCount# entries.<br/>
    </cfoutput>

    <cfif getMatches.recordCount>

        <!--- get OA values --->
        <cfset queryAddColumn(getMatches,"mood","cf_sql_varchar",[])>
        <cfset queryAddColumn(getMatches,"moodval","cf_sql_varchar",[])>
        <cfloop query="getMatches">
            <cfset oaResult = openAmp.parse(text=content,analysis="Styles")>
            <cfset moodLabel = oaResult.Styles.Polarity.Mean.Name>
            <cfset moodValue = oaResult.Styles.Polarity.Mean.Value>
            <cfset querySetCell(getMatches, "mood", moodlabel, currentRow)>
            <cfset querySetCell(getMatches, "moodval", moodValue, currentRow)>
        </cfloop>

        <cfmail to="#receiver#" from="#receiver#" subject="Keyword match in RSS Feed." type="html">
        <cfoutput>
        <h2>RSS Matches Found: #keyword#</h2>

        <p>
        Matches in the RSS feed for the keyword "#keyword#" have been found:
        </p>

        <ul>
        <cfloop query="getMatches">
            <li><a href="#rsslink#">#title#</a>
            <cfif moodVal lt 0><font color="red"><cfelseif moodVal gt 0><font color="green"><cfelse><font></cfif>
            #mood#
            </font>
            </li>
        </cfloop>
        </ul>
        </cfoutput>
        </cfmail>
    </cfif>

</cfloop>

<cfoutput>
Done processing.<br/>
</cfoutput>

I'll point out the differences here. First - note I make use of the OpenAmplify CFC. (Which has been updated - please grab the download zip!) Later on, if we have matches in the RSS feed, I do a "Styles" analysis of the content from the feed. This will most likely not be very deep. It depends on how much text was in the RSS feed. You could actually tell OpenAmplify to parse the URL. Their API supports that as well. Once I have the result I grab the mean label and numerical value and simply add it to the query of results. Now look in the email. I can check those values and dynamically color based on the mood. Negative? Red. Positive? Green. Forgive me for using the font tag but it works. Here's an example of the updated email.
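If the font tag bothers you, one possible alternative is to move the red/green decision into a small helper and use an inline style instead. This is just a sketch. The moodColor function name is something I invented for this example, and it assumes moodVal is numeric (or a numeric string) as in the code above.

```cfml
<cfscript>
// Hypothetical helper: map a polarity value to a CSS color name.
// Negative values come back red, positive green, zero gets no color.
function moodColor(moodVal) {
    if (arguments.moodVal lt 0) return "red";
    if (arguments.moodVal gt 0) return "green";
    return "";
}
</cfscript>
```

Inside the email loop, the list item could then use something like <span style="color: #moodColor(moodVal)#">#mood#</span> in place of the cfif block.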

I've included both test templates and the new version of openamplify.cfc at the end of this blog entry.

Download attached file.