CFTHREAD - When to join?


Earlier today I wrote a quick blog entry (CFTHREAD, Names, and Commas) about a bug I had with cfthread names. I mentioned that commas were not allowed in the thread name and that - probably - this was due to the use of the JOIN action allowing for a list of thread names to join together. Tony asked me why someone would use the JOIN action at all.

First off, what happens when you create a thread and don't do anything else?

<cfthread name="find more cowbell">
    <cfset sleep(10000)>
    <cflog file="tdemo" text="All done, baby.">
</cfthread>

<cfdump var="#cfthread#">

In this demo I create a thread. It sleeps for 10 seconds and then writes to a log file. Outside of the thread I dump the cfthread scope. (Everyone knows that exists, right? It gives you metadata about threads created during the request.) Running this, we can see that even though the page ended, the thread is still running.

If you check your log files a bit later, you will see that the thread eventually did end and write to the log. In essence, this process is a "Fire and Forget" thread. You start the slow process and don't need to worry about waiting for it to end. A real world example could be starting a slow running stored procedure that performs a database backup.

But what about cases where you do need to wait for a result? Imagine an RSS aggregator. You want to hit N RSS feeds and take each result and add it to a large query. (Oh, I've got a CFC for that if you want something like that.) In this case, you want each thread to handle doing the slow process, but you want to wait for them all to finish before proceeding.

Consider this modified example:

<cfset threadlist = "">
<cfloop index="x" from="1" to="10">
    <cfset name = "find more cowbell #x#">
    <cfset threadlist = listAppend(threadlist, name)>
    <cfthread name="#name#">
        <cfset sleep(10000)>
        <cflog file="tdemo" text="All done with #thread.name#, baby.">
    </cfthread>
</cfloop>

<cfthread action="join" name="#threadlist#" />
<cfdump var="#cfthread#">

In this example, I've added a loop so that I can create 10 threads. Notice that I store each name in a list. This lets me run the join action at the end. Now if you run the file you will notice it takes about 10 seconds to run. This represents the JOIN line waiting for the threads to end. Because they run in parallel, I don't have to wait 100 seconds, but just 10. Obviously most real world applications won't have a nice precise timeframe like that. Now if you look at the dump, you can see that all ten threads completed.

So the short answer is simply - it depends. If you need to work with the result in the same request, then use the join. If not, don't worry about it.
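
To make "work with the result in the same request" a bit more concrete, here is a minimal sketch. It is not taken from the aggregator mentioned above; the thread names (square_1, square_2, ...), the custom attribute n, and the key thread.result are all made up for illustration, and the sleep() call just stands in for real slow work. It relies on two things: extra attributes on cfthread are copied into the thread's Attributes scope, and anything a thread stores in its Thread scope can be read by the page through the cfthread scope once the join has returned.

```cfml
<cfset threadlist = "">
<cfloop index="x" from="1" to="5">
    <!--- Hypothetical thread names, chosen only for this sketch. --->
    <cfset name = "square_#x#">
    <cfset threadlist = listAppend(threadlist, name)>
    <!--- Pass the loop value in as an attribute so each thread gets its own copy. --->
    <cfthread name="#name#" n="#x#">
        <!--- Stand-in for the slow part (an HTTP call, a query, etc.). --->
        <cfset sleep(2000)>
        <!--- Values placed in the Thread scope are visible to the page after the join. --->
        <cfset thread.result = attributes.n * attributes.n>
    </cfthread>
</cfloop>

<!--- Wait for all threads, but give up after 10 seconds (timeout is in milliseconds). --->
<cfthread action="join" name="#threadlist#" timeout="10000" />

<cfloop list="#threadlist#" index="name">
    <cfif cfthread[name].status eq "COMPLETED">
        <cfoutput>#name# returned #cfthread[name].result#<br></cfoutput>
    <cfelse>
        <!--- Still running after the timeout, or terminated by an error. --->
        <cfoutput>#name# did not finish (status: #cfthread[name].status#)<br></cfoutput>
    </cfif>
</cfloop>
```

Checking the status before reading the result matters once you add a timeout to the join, since a thread that is still running (or that threw an error) will not have set its result yet.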


About Raymond Camden

Raymond is a senior developer evangelist for Adobe. He focuses on document services, JavaScript, and enterprise cat demos. If you like this article, please consider visiting my Amazon Wishlist or donating via PayPal to show your support. You can even buy me a coffee!

Lafayette, LA https://www.raymondcamden.com

Archived Comments

Comment 1 by tony weeg posted on 5/18/2009 at 10:50 PM

makes sense... i guess i just havent had a real world need for it yet.... i'd be interested to see if anyone else can think of a good real world need for something like this?

i use cfthread for a lot of things now, just hadnt used the join parameter yet. thanks!

Comment 2 by Raymond Camden posted on 5/18/2009 at 10:51 PM

So far, my only real world need was for ColdFusionBloggers.org. It would be a good side question - who is using cfthread in production? (joined or not) I may blog that tonight though to keep things here on topic.

Comment 3 by todd sharp posted on 5/18/2009 at 10:57 PM

I used to rely on cfthread heavily in production with SlideSix. I'd do the PowerPoint conversion 'behind the scenes' without forcing the user to wait. I pulled it out at some point though, can't remember why.

Comment 4 by Raymond Camden posted on 5/18/2009 at 10:59 PM

And what did you replace it with?

<cf_juvenile>
"pulled it out" - thats what she said
</cf_juvenile>

Comment 5 by tony weeg posted on 5/18/2009 at 11:02 PM

i use it in production now. cant get into too much detail but it's made a process that USED TO time out more than it didn't... with the help of cfthread, no more query timeouts, no more cfoutput time outs...

ok, some other things changed to bring that down from 4hrs to 30 minutes but this feature helped them most!

Comment 6 by todd sharp posted on 5/18/2009 at 11:07 PM

brilliant!!! man, i walked right into that one!

Comment 7 by Andy Sandefer posted on 5/18/2009 at 11:07 PM

@Ray and @Tony
The need for join seems very logical to me. Imagine that we kick off a process that is going to execute several different things server side (think transactional database). We would probably use this in combination with cftransaction but basically you could use the join action when you have cascading activities happening as in my application needs to do things in the database within a certain order or I'll violate the referential integrity of the system if certain records are not created in the proper chain of events (sequencing). If the user clicks a button that kicks off this chain of events that occurs on the database server then we probably don't want to make them sit around and wait until the whole mess is finished processing until we give them back the browser - so enter in cfthread with join power.

Comment 8 by Eric posted on 5/18/2009 at 11:29 PM

I've been using cfthread for a car/truck/cycle classifieds aggregator I built using Ray's paragator cfc. Thanks Ray... Amazon WL already made a while back!

Anyway, shameless plug here:
http://www.ownster.com

Works perfectly!

Comment 9 by Raymond Camden posted on 5/19/2009 at 12:13 AM

@Eric - if I didn't send a thank you, please accept it now. Sometimes I get items w/ no marker of who sent it.

Comment 10 by Ryan posted on 5/19/2009 at 12:19 AM

I was just researching zip code proximity searches for a friend and was looking up Troy Pullis' CFDJ article from a few years ago.

Building a Zip Code Proximity Search with ColdFusion

http://coldfusion.sys-con.c...

Using CFThread would probably be handy when implementing a lookup like this, particularly if there was some obscure reason you wanted to solve for distance using all 4 methods described in his article. Computing the distance between a given zip code and a large number of locations (even as few as 100) would normally fall under the 'serial' way of guessing how long it would take. Using threads would make it parallel (though possibly stressing out your server if you spawned too many threads at once).

I assume you would want to join your threads at the end to show your results in a table or show the 1 location with the least distance to your primary zip code.

Comment 11 by Andy Sandefer posted on 5/19/2009 at 12:21 AM

The wish list has brought big conflict into our home. Unfortunately I let it slip that I had bought Ray the Snuggie (may your kids never be cold again whilst changing channels) in front of my wife - apparently she really wanted one too (who knew, I thought she was joking).

@Ray - You need to move to a strictly cash only policy, please for the sake of all that is good! Then I can always get by on the "It's only money" phrase that makes people feel materialistic if they don't instantly agree with you.

Comment 12 by Raymond Camden posted on 5/19/2009 at 12:24 AM

I will happily accept PayPal donations at any time. ;)

Comment 13 by Andy Sandefer posted on 5/19/2009 at 12:25 AM

No you Diiiiiiin't

That may be a good article but I can't click on your link because the SysCon will try to poison my mind with loud auto-playing commercials about Windows and Visual Studio and if that doesn't kill me then that Turkish guy from Ulitzerkaselzer probably will.

Comment 14 by jonathan posted on 5/19/2009 at 6:18 PM

I am developing an app right now with cfthread/join. I have to run about 4000 queries, alter some fields in each row and insert them as new records but the user needs to work with the results, so it makes sense to have them just wait a minute or so.

Question... I would be curious as to people's server performance issues with cfthread. Right now our box (multi cf instances on vm) only has one cpu and it's killing it. The process runs slower than on my local laptop (which is dual core) and freezes up every jrun instance. I even set the number of threads down to 5 and set the cfthread to low priority. That just makes it freeze up longer.

Turned off server monitor as well. We are pushing to get another cpu which may solve the problem. On my laptop i can run between 35-50 threads at a time without any issues

Comment 15 by Brian posted on 5/19/2009 at 8:14 PM

I have an application that pulls user data out of an 800K+ record Active Directory tree. Due to imposed limitations, (not mine) I can only pull x records at a shot. So, cfthread parallels multiple requests for me, and then joins at the end to locally process the retrieved data.

Now, if I could just figure out how to do LDAP paged queries using CFLDAP (which i don't think I can -- seems I have to go right to the JVM), I could mightily improve the process. <sigh>

Comment 16 by Don posted on 5/20/2009 at 8:49 PM

I started using the cfthread tag in a production application that runs 2 massive queries to build 2 graphs. When they run consecutively it takes over 2 minutes for the page to load. Now I have them run concurrently and the time is more than cut in half. But I have to use a join.

Here is my problem tho. How to display the results when they are ready but display something else while they are running. Somebody said this can be done with cfthread but I'm not sure.

Comment 17 by Raymond Camden posted on 5/20/2009 at 8:53 PM

@Don: I can do this. Let me finish my hamburger and I'll show you.

Comment 18 by Don posted on 5/20/2009 at 8:58 PM

Hamburger? How about http://www.icanhascheezburg...

:)

Comment 19 by Raymond Camden posted on 5/20/2009 at 9:18 PM

Comment 20 by Sebastiaan posted on 8/21/2009 at 11:58 AM

Hi Ray,

I have the following challenge where I suspect CFTHREAD could come in handy.

In our CMS we update a record where the user has changed its documenttype. This change needs to be pushed to another database (say deltadatabase) so another system can import these changes. Now in the transaction to update the changed record in the CMS I've included a customtag that does the update to the deltadatabase. But the logic in this customtag calls the CMS to get its information. But this information has just been changed, so the customtag returns zero.

The challenge is to do the update to the deltadatabase and wait for this to have been processed completely before actually updating the record in the CMS. I have changed the transaction code in the CMS to post the old as well as the new documenttype, so I can use this as a trigger in the transaction that an extra update is necessary and that one process has to wait for the other to finish.

Am I making the challenge clear enough?

How to set up these threads? I'm a newbie to threads, just so you know ;-)

Comment 21 by Don posted on 9/22/2009 at 11:20 PM

Screw it. I moved to cflayout stuff and it shows a nice roundy roundy thing while the page loads. :)
BUT I'm also now using a cfthread that does not join. Instead of waiting for a page to be called that runs this huge query, I go ahead and run it so the results are ready when the page is called.
However, there is also a place for a join. This app pulls data from 6 other applications and puts it all together. Right now the app does each one in sequence. Instead I'm working on the cfthread with the join at the end to put it all together in one shot.

Comment 22 by Don posted on 9/23/2009 at 2:30 AM

Now I have another highly important question. Can I view the status of a thread from a page that didn't start it? So like if on the index.cfm page I start the thread, then on another page like checkthread.cfm can I check on the status? So far I can't but then I may be doing something wrong. (ya think?) What I really want to do is check the status of the thread occasionally and see if it is done but I don't want to wait for it to be done.

Comment 23 by Raymond Camden posted on 9/24/2009 at 12:54 AM

It's possible, but only if the threads update an Application variable.
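
A rough sketch of that idea, assuming the two pages share an application (an Application.cfc or Application.cfm with a name) and using a made-up variable, application.jobStatus, and a made-up thread name, bigquery. The status strings are arbitrary labels for this example, not values ColdFusion sets for you.

```cfml
<!--- index.cfm: start the work and record its progress in the Application scope. --->
<cfset application.jobStatus = "RUNNING">
<cfthread name="bigquery">
    <cftry>
        <!--- Stand-in for the slow query. --->
        <cfset sleep(10000)>
        <cfset application.jobStatus = "COMPLETED">
        <cfcatch>
            <!--- Without this, a failed thread would leave the flag stuck on RUNNING. --->
            <cfset application.jobStatus = "FAILED">
        </cfcatch>
    </cftry>
</cfthread>

<!--- checkthread.cfm: a later request reads the flag without waiting on the thread. --->
<cfparam name="application.jobStatus" default="NOT_STARTED">
<cfoutput>Job status: #application.jobStatus#</cfoutput>
```

A single flag like this is shared by every user of the application; if each user can start their own job, you would need to key the status by something like a job ID, and anything more involved than a simple assignment may call for cflock.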

Comment 24 by Don posted on 9/24/2009 at 1:38 AM

Yeh. That's what I was thinking (and did). I always thought threads were sort of asynch so any page could access their status. But it appears it isn't so.