Posted in ColdFusion | Posted on 09-14-2009 | 4,286 views
For a while now I've made use of a service called Twilert. The site has one simple purpose. It lets you create Twitter search profiles and have a report emailed to you daily (or weekly, etc). I thought it might be interesting to look at how difficult this would be to build in ColdFusion. Luckily Twitter provides an API that is both simple to use and very powerful. Here's what I came up with - and hopefully this can be useful to others.
First - let me define what I want to build. Like the Twilert service, I'll start with a set of search terms. I'll perform my search daily via a scheduled task that runs right past midnight and then delivers the report to me via email. The Twitter API is very nicely documented. In particular, the Search API is the one we care about. Also of note are the rate limits Twitter applies. While my code won't hit that limit, it is something to keep in mind. I'd suggest spending a few minutes scanning all of the previous links to get a feel for the Twitter API and what it supports. Now that you've done that (ok, be honest, if you are like me, you probably decided to skip it and read it later), let's start to build out our report generator.
First, the search term. This could be dynamic, perhaps based on the URL, which would then make it easy to set up a few scheduled tasks, each with different values. For now though I just hard coded it:
<cfset search = "coldfusion">
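If you did want the URL-driven version mentioned above, a minimal sketch might look like the following. The parameter name `search` and the file name `report.cfm` are my own assumptions for illustration, not part of the original code.

```cfml
<!--- Hypothetical: read the term from the query string, e.g. report.cfm?search=cfml --->
<!--- Falls back to "coldfusion" when no URL value is supplied. --->
<cfparam name="url.search" default="coldfusion">
<cfset search = trim(url.search)>
```

Each scheduled task could then point at the same template with a different `search` value.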
Twitter supports basic AND/OR style searches as well. But I'll keep it simple and just one word. Now, I mentioned the rate limits before. Another thing to note is that when you perform a search, you can only return 100 results at one time. Twitter supports a Page attribute, but they limit you to 15 pages. That's 1500 results which seems a bit much, especially for an email. I created a variable to represent the total number of network requests, or pages, of data to get:
<cfset maxRequests = 10>
For the most part, this is pretty arbitrary. If I got an email with 1000 results in it I doubt I'd read past the first twenty or so. Obviously this is something you can change to your liking, within the limits of Twitter's API.
<cfset page = 1>

<!--- max results per page is 100 --->
<cfset max = 100>
The page variable just tracks the current page and max will be sent to Twitter to request the maximum amount of results possible.
<cfset done = false>

<!--- A flag to see if something went wrong. --->
<cfset errorFlag = false>

<!--- A flag to determine if we maxed out our search --->
<cfset maxFlag = false>
These three variables are just flags. I'll be using the done variable in a loop coming up. The errorFlag will notice if something goes wrong with one of the HTTP calls. The maxFlag will be used if we hit the maximum number of requests.
<cfset yesterday = dateAdd("d", -1, now())>
<cfset searchURL = search & " since:#dateFormat(yesterday,'yyyy-mm-dd')#">
<!--- Encode the full query (term plus date filter), not just the raw search term --->
<cfset searchURL = urlEncodedFormat(searchURL)>
Next up we add the date filter to our search terms. Remember I'm running this every day so I want to limit the results to entries from yesterday. This is done with the since operator. Twitter also supports an until operator, but as I plan on running this report right past midnight, it won't matter. (You can see a good report of all the operators here.)
The last bit of code before we actually begin to search is to create the array that will store our results:
<cfset results = []>
Ok - so everything so far was setup - now let's look at the actual search:
<cfloop condition="not done">

	<cfhttp url="http://search.twitter.com/search.json?page=#page#&rpp=#max#&q=#searchURL#" result="result">

	<cfif result.responseheader.status_code is "200">
		<cfset content = result.fileContent.toString()>
		<cfset data = deserializeJSON(content)>

		<cfloop index="item" array="#data.results#">
			<cfset arrayAppend(results, item)>
		</cfloop>
		<cfif structKeyExists(data, "next_page")>
			<cfset page++>
			<cfif page gt maxRequests>
				<cfset maxFlag = true>
				<cfset done = true>
			</cfif>
		<cfelse>
			<cfset done = true>
		</cfif>
	<cfelse>
		<cfset errorFlag = true>
		<cfset done = true>
	</cfif>

</cfloop>
Ok, let me describe this line by line. The loop will continue until the done variable is true. In each iteration I use cfhttp to hit Twitter. Notice that I ask for JSON back, pass in both page and max, and pass in my search query.
If the result status is 200, it should be good. I get the content and deserialize the JSON. I loop through each result and simply append it to the global results array. If the result JSON contains a next_page value, then more data exists. I do a check first though to see that I've not made too many requests. Lastly, I've got an ELSE block for times when the status wasn't 200. I could add additional logging here, but for now I just use the simple error flag.
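For the "additional logging" idea, here is a rough sketch of what the ELSE block could do instead. The log file name `twitterreport` is something I made up for this example; the rest reuses variables already defined above.

```cfml
<!--- Sketch only: this would replace the body of the ELSE block in the search loop. --->
<!--- "twitterreport" is a hypothetical log file name. --->
<cflog file="twitterreport" type="error"
	text="Twitter search for '#search#' failed on page #page# with status #result.statusCode#">
<cfset errorFlag = true>
<cfset done = true>
```

That way the email can still carry the simple error note while the log keeps the details.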
Now that we have results, let's begin the display portion:
<cfsavecontent variable="report">
<cfoutput>
<style>
h2, p, .twit_date { font-family: Verdana, Geneva, Arial, Helvetica, sans-serif; }

.twit_date { font-size: 10px; }

.twit_odd {
	padding: 10px;
}
.twit_even {
	padding: 10px;
	background-color: ##f0f0f0;
}
</style>
I've begun my display with a cfsavecontent. The reason for this is that I considered generating a PDF report as well. I didn't end up doing it, but since I'll have my report in a nice variable, I'll be able to do just about anything with it. I then put on my designer hat (it has stars on it) and whipped up some simple CSS I'll use later. Please feel free to send suggestions on nicer CSS.
<p>
The following report was generated for the search term(s): #search#.<br/>
It contains matches found from <b>#dateFormat(yesterday,"mmmm dd, yyyy")#</b> to now.<br/>
A total of <b>#arrayLen(results)#</b> result(s) were found.<br/>
<cfif maxFlag><b>Note: The maximum number of results was found. More may be available.</b><br/></cfif>
<cfif errorFlag><b>Note: An error occurred during the report.</b><br/></cfif>
</p>
Next up is a simple header. I report on the search term, the date, number of results, and on my flags.
<cfloop index="x" from="1" to="#arrayLen(results)#">
	<cfset twit = results[x]>
	<cfif x mod 2 is 0>
		<cfset class = "twit_even">
	<cfelse>
		<cfset class = "twit_odd">
	</cfif>

	<!--- massage date a bit to remove +XXXX --->
	<cfset twitdate = twit.created_at>
	<cfset twitdate = listDeleteAt(twitdate, listLen(twitdate, " "), " ")>

	<p class="#class#">
	<img src="#twit.profile_image_url#" align="left">
	<a href="http://twitter.com/#twit.from_user#">#twit.from_user#</a> #twit.text#<br/>
	<span class="twit_date">#twitdate#</span>
	<br clear="left">
	</p>
</cfloop>
Now I loop over each Twit. Twitter reports a variety of fields for each result. I decided to only care about the time, the user (and his or her profile image), and the text. Please keep in mind though that there is even more information in the results. This is what I decided was important. The display is rather simple. Profile picture to the left, name and text on top, and the formatted date below it. (FYI: Notice the x mod 2 if clause there? I actually had the ColdFusion 9 ternary clause first and it was a lot slimmer. I know I could switch it to IIF but I hate that function.)
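For the curious, the slimmer version I mentioned would look roughly like this. It relies on the ternary operator, so treat it as a sketch that only works on ColdFusion 9 or later.

```cfml
<!--- ColdFusion 9+ only: pick the row class with the ternary operator --->
<!--- Assumes x is the loop index from the results loop. --->
<cfset class = (x mod 2 is 0) ? "twit_even" : "twit_odd">
```

It does the same job as the five-line cfif/cfelse block.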
</cfoutput>
</cfsavecontent>

<cfoutput>#report#</cfoutput>
The final bits simply close up our tags and then output to screen. So I did lie a bit - I don't actually email the report, but as you can imagine, that would take about two seconds. I'd just wrap the report result in cfmail tags. I've got a few ideas on how to make this report even slicker. That will be in the next entry. So - is this useful? I could imagine this being a great way for a business to automate monitoring of their name and products.
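Here is roughly what the cfmail version could look like. Both email addresses are placeholders I invented, and this assumes your ColdFusion server already has a mail server configured.

```cfml
<!--- Sketch only: both addresses are placeholders, swap in your own. --->
<!--- type="html" so the CSS and markup in the report are rendered. --->
<cfmail to="you@example.com" from="reports@example.com"
	subject="Twitter report for #search#" type="html">
#report#
</cfmail>
```

Since the report already lives in the `report` variable, the same value could be both emailed and written to the screen.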
You can download the full bits below.


Do you think it is worthwhile maybe to say, "hey, I failed, but let's try again a few times"? I could, on failure, NOT add to the HTTP request count, BUT keep a counter of errors, and stop at 3 or some such.
These are some suggestions I'd like to see in future reports if possible:
1. Keyword filtering: I was wondering if there is a way we can filter out inappropriate words within the users' tweets such as "sex", the F-word, etc. Basically, if a tweet has an inappropriate word, ignore it and do not display it.
2. A page navigation bar at the bottom of the page so we can page maybe 20 pages at a time.
3. Would the report refresh itself automatically? Or does the user have to refresh the page in order to see the most recent tweets?
4. Maybe a search field so a user will have the option to change the search criteria.
I think I am asking too much, let's hope :-)
Thank you. This is a very good start.
-AJ
1) is something that I'm going to support in part 2 of this article - kinda. You will see.
On a side note, I was wondering if the data we are retrieving from Twitter can be imported into a database such as MS Access. That would help me a great deal in formatting my own reports and tackling the questions in my prior comment. My reports would also keep working even if Twitter is down, since I'd be pulling data from my own database, which should also speed up retrieval.
Sorry for all these questions, I am just throwing out ideas. Hope you don't mind :)
-AJ
Thx.
-AJ
Name your class .twit_0 and .twit_1
class="twit_#x mod 2#"
<cfset yesterday = dateAdd("d", -1, now())>
<cfset searchURL = search & " since:#dateFormat(yesterday,'yyyy-mm-dd')#">
<cfset searchURL = urlEncodedFormat(searchURL)>
Specifically the 3rd line was urlEncoding search, not searchURL. This is NOT fixed in the zip, but my next blog entry will include both reports and will have fixed code in both.
I haven't seen any updates recently, just wondering if there are more features coming.. Thx.
-AJ
http://www.coldfusionjedi.com/index.cfm/2009/9/16/...
Part 2 involved robots, snakes, and giant armadillos. Really. Well, maybe.