Ok, so hopefully you've got a nice system for handling errors within your ColdFusion application. (And if not, don't worry, there's a guide for that.) And maybe you even have your error handler sending you a nice, informative email on every error. Great. Until one day something goes haywire and you end up with 1000 emails in your inbox. (Although that's never happened to me. Honest. Well, not this week. I mean today.) Wouldn't it be nice if you could send email - but perhaps tell ColdFusion to not send the same email within a timeframe? Here is my stab at building such a service.
I began by defining an API in my mind. I'd pass to my method an object containing all my mail properties. This would be used for the actual cfmail tag. Next we will provide a simple timeout value. We will use minutes so that a value of 5 means "If I try to send the same email within 5 minutes, don't send it again." And finally, let's get a bit complex. If your email is partially dynamic, perhaps it has a URL variable printed in it, we don't want each one to fire off a new message when the core reason for the email hasn't changed. So our final attribute will be a simple regex that will be run against the body. Here is the code I came up with:
<cfcomponent output="false">
<cfset variables.cache = {}>
<cffunction name="throttleSend" access="public" output="false" returnType="boolean">
<cfargument name="mail" type="struct" required="true" hint="Structure of args for the mail.">
<cfargument name="limit" type="numeric" required="true" hint="Number of minutes to wait before sending again.">
<cfargument name="regex" type="string" required="false" hint="This regex is performed on your mail body. Helps remove items that may be dynamic in the body but should not be considered for caching.">
<!--- used for required mail tags --->
<cfset var reqlist = "to,from,subject,body">
<cfset var l = "">
<cfset var body = "">
<cfset var cacheBody = "">
<cfset var hashKey = "">
<!--- quickly validate the mail object --->
<cfloop index="l" list="#reqlist#">
<cfif not structKeyExists(arguments.mail, l)>
<cfthrow message="mail object is missing required key #l#">
</cfif>
</cfloop>
<!--- Ok, first, create the hash --->
<cfset body = arguments.mail.body>
<cfif structKeyExists(arguments, "regex")>
<cfset cacheBody = rereplace(body,arguments.regex,"","all")>
<cfset hashKey = hash(arguments.mail.to & " " & arguments.mail.subject & " " & cacheBody)>
<cfelse>
<cfset hashKey = hash(arguments.mail.to & " " & arguments.mail.subject & " " & body)>
</cfif>
<!--- If we already sent it and it hasn't expired, don't do squat --->
<cfif structKeyExists(variables.cache, hashKey) and dateCompare(now(), variables.cache[hashKey]) is -1>
<cfreturn false>
</cfif>
<!--- Ok, so we need to mail --->
<cfmail attributecollection="#arguments.mail#">#body#</cfmail>
<cfset variables.cache[hashKey] = dateAdd("n", arguments.limit, now())>
<cfreturn true>
</cffunction>
</cfcomponent>
From top to bottom, the CFC begins by creating a Variables scoped cache. This will store our 'memory' in terms of what we have already emailed. The main function, throttleSend, begins by first doing quick validation on the keys of the mail object passed in. As far as I know, these are the only required arguments for cfmail (as long as you have a server defined elsewhere). Next we need to do some work on the mail body. We have an optional regex and, if supplied, we need to run it against the body of our mail. We create a hash value based on the to, subject, and body of the email. The regexed version hashes the cleaned body while the non-regexed version just uses the body as is. I chose these three values somewhat arbitrarily. You may want to only check the body, or just the subject (although that makes the regex feature pointless).
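To make the hashing step concrete, here is a minimal sketch (not part of the CFC itself) that builds the key by hand for two bodies that differ only in their dynamic part. The variable names keyPrefix, bodyA, bodyB, keyA, keyB, and sameKey are made up for illustration; the regex and mail values are borrowed from the test template further down.

```cfml
<!--- Illustration only: these variable names are hypothetical --->
<cfset regex = "Random: [0-9]{1,3}">
<cfset keyPrefix = "ray@camdenfamily.com" & " " & "Error about X!" & " ">
<cfset bodyA = "This is the body of the email. Random: 17">
<cfset bodyB = "This is the body of the email. Random: 82">
<!--- Same recipe as throttleSend: to, subject, and the regex-cleaned body --->
<cfset keyA = hash(keyPrefix & rereplace(bodyA, regex, "", "all"))>
<cfset keyB = hash(keyPrefix & rereplace(bodyB, regex, "", "all"))>
<!--- Compare the two keys and print the result --->
<cfset sameKey = (keyA eq keyB)>
<cfoutput>keys match: #sameKey#</cfoutput>
```

Because the regex strips the "Random: NN" portion before hashing, both keys should come out identical, so the second email would be treated as a repeat of the first.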
Once we have the hashed value, we can check for it within our cache variable. If it exists and hasn't expired, we skip the send and return false. Otherwise we actually send the mail, then store the hash along with an expiry date, and we're done. I made the method return false if the message wasn't sent because I thought there was a chance you may want to know this. Now here is a small template that shows how I tested it:
<cfapplication name="mtdemo">
<cfif not structKeyExists(application, "throttler") or structKeyExists(url,"init")>
<cfset application.throttler = new mailthrottle()>
</cfif>
<cfset mailOb = {
to="ray@camdenfamily.com",
from="ray@camdenfamily.com",
subject="Error about X!",
body="This is the body of the email."
}>
<cfset res = application.throttler.throttleSend (mailOb,2)>
<cfoutput>result was #res#</cfoutput>
<p/>
<cfset mailOb = {
to="ray@camdenfamily.com",
from="ray@camdenfamily.com",
subject="Error about X!",
body="This is the body of the email. Random: #randRange(1,100)#"
}>
<cfset res = application.throttler.throttleSend (mailOb,2,"Random: [0-9]{1,3}")>
<cfoutput>result was #res#</cfoutput>
In the first example I've got a static email. In the second one I've got a bit of dynamicness to it, but I can use the optional third regex value to remove it. What's cool though is that it only removes it from the check. When the email goes out, it includes the value.
Archived Comments
Why not just use a cache name (i.e. key) for the e-mail instead of using the body text and regex for caching--which can get ugly.
So, if no key is used, use the body text for caching--which basically assumes it's static text. Otherwise, provide a cache name like "general-error", and all e-mails with the same cache name will be treated like they're the same e-mail content. Yes, you'd have to use unique cache names, but I think that's easier in the long run.
The idea was to minimize the work necessary to use it. I didn't want you to have to worry about naming the cache.
It just seems like having to write a regex is going to be way more difficult than just a static cache name for most users. That's why I also suggested using the body text as the default cache name, that way you only have to supply a cache name if the message might be altered slightly in each message.
For example, this would be very difficult to use as-is if your error handler dumped the user's session information into an e-mail, but the root cause of the error was the same. If a user's session ends up containing any kind of "last updated" time stamp, it's going to be virtually useless as a new e-mail will always be sent.
I just know that most of my error pages end up including form post data, session data, etc and that stuff would be extremely difficult to parse out in a regex (if not impossible.)
It just seemed like this code was designed to stop sending tons of e-mail when you have a global problem (like the database server being down,) but my error messages always contain stuff that makes the message highly dynamic.
Just some food for thought...
Hmm. Good point. It's not like it would be that difficult to add a 4th optional argument to support that. If used, regex is ignored and the key isn't the hash but what you specify. I'll write a quick update and post it as a new blog entry (since this one seems to have been ignored by everyone but you ;).