Yesterday Terrence Ryan blogged about updates to the cfcache tag in ColdFusion 9. I thought I'd follow up on his blog post and talk a bit more about the other caching changes in ColdFusion 9, specifically the new cache-based functions.
First, here is a quick list of all the functions, just to give you a mile-high view of what's available. For complete descriptions, please see the CFML9 Reference.
- cachePut: Puts an item in the cache.
- cacheGet: Gets an item from the cache. What isn't documented is that if you ask for an item that isn't in the cache, you get a null value back. I'll show how to work around that.
- cacheRemove: Forces an item out of the cache. Supposedly, the poor data is stripped of its home, belongings, and job, and then forced to sell coffee at a gas station.
- cacheGetAllIds: Returns a list of all the IDs in your cache (as an array). These are in upper case!
Ok, so those are your basic CRUD functions for caching. Along with those functions you have:
- cacheGetMetadata: Returns detailed information about one item in a cache. This is great for checking to see how much your cached information is actually hit. If it isn't hit too often, then maybe it isn't worth caching? If it's hit a lot, then maybe it's worth increasing the lifespan?
- cacheGetProperties: Returns information about the caching system as a whole (well, for your application). Can be focused on either object cache (what I'm discussing), the template cache (what Terry showed with the cfcache tag), or both.
- cacheSetProperties: Allows you to specify cache settings at a high level.
Alright, everyone with me so far? Let's begin with a super simple, but full-featured, example. This code will check to see if my stuff is cached and retrieve it if so, or otherwise set the value in the cache. It also provides programmatic access to clear the cache.
<cfif structKeyExists(url, "clear")>
<cfset cacheRemove("slowprocess")>
</cfif>
<cfif not arrayFindNoCase(cacheGetAllIds(), "slowprocess")>
<!--- fake slow --->
<cfset sleep(1000)>
<cfset data = now()>
<cfset cachePut("slowprocess", data, createTimeSpan(0,0,0,30))>
<cfelse>
<cfset data = cacheGet("slowprocess")>
</cfif>
<cfoutput>
<p>
Result is #data#
</p>
<p>
<a href="test1.cfm?clear=yes">Clear Cache</a>
</p>
</cfoutput>
So going from the top to the bottom, here is what we have. I added a URL 'hook' on top to look for the request to clear the cache. cacheRemove is just as simple as that. You can also pass in a list of IDs to clear a bunch of cached items at once.
Next up is the core caching system. Unfortunately, there is no simple cacheExists function (here is my ER). You can use cacheGetAllIds along with the new arrayFindNoCase() function to see if the item exists in cache. (I used arrayFindNoCase as the array returns the keys in upper case.) If the item isn't found, we run our slow process and stuff it in. When you store an item in the cache, you have 3 options for how long it will persist.
If you don't specify any value for the timespan, it will last as long as the application exists. You can specify a timespan, as I have above. And lastly, you can also specify an idleTime value. This allows you to say, "if the cache isn't used in this amount of time, kill it, no matter what the timespan is."
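The three persistence options described above could look roughly like this. It is only a sketch: the cache IDs are made up for illustration, and it assumes idleTime is passed as the fourth argument to cachePut, after the timespan.

```cfml
<!--- Option 1: no timespan, so the item lasts as long as the application --->
<cfset cachePut("appLifetimeItem", now())>

<!--- Option 2: a timespan; the item expires 30 seconds after it is stored --->
<cfset cachePut("thirtySecondItem", now(), createTimeSpan(0,0,0,30))>

<!--- Option 3: timespan plus idleTime; expires after 10 minutes,
      or sooner if it goes 1 minute without being used --->
<cfset cachePut("idleAwareItem", now(), createTimeSpan(0,0,10,0), createTimeSpan(0,0,1,0))>
```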
The other part of the CFIF is simple - just get the item from the cache. Next I display the result. Finally, I added a link so I could easily clear the result.
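Since there is no built-in existence check, the cacheGetAllIds plus arrayFindNoCase test could be wrapped in a small UDF. This is a minimal sketch; the name cacheHasId is hypothetical and not part of ColdFusion.

```cfml
<!--- Hypothetical helper: cacheHasId is an illustrative name, not a built-in --->
<cffunction name="cacheHasId" returntype="boolean" output="false">
    <cfargument name="id" type="string" required="true">
    <!--- cacheGetAllIds() returns upper-case keys, so search without regard to case --->
    <cfreturn arrayFindNoCase(cacheGetAllIds(), arguments.id) gt 0>
</cffunction>

<cfif not cacheHasId("slowprocess")>
    <cfset cachePut("slowprocess", now(), createTimeSpan(0,0,0,30))>
</cfif>
```

Keep in mind that this fetches every ID in the cache on each call, so it is best suited to small caches.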
You can view a live demo of this here. Obviously with multiple people hitting it and clearing, the results will be random. Just like life. Wonderful.
Let's look at one more example. Here is a super simple way to report on all items in the cache. It simply loops over the IDs and runs cacheGetMetaData():
<cfloop index="cache" array="#cacheGetAllIds()#">
<cfdump var="#cacheGetMetadata(cache)#" label="Cache Metadata for #cache#">
</cfloop>
You can view this here. Notice that it returns some pretty interesting information. If you see nothing, it means the earlier cache died out. Since my application has only one item in the cache, and that item expires after 30 seconds, you may end up with no results. Simply visit the first demo again and then reload.
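If all you want is a quick hit report rather than a full dump, something like the following might do. It is a sketch that assumes the metadata struct exposes a key named hitcount; the code checks for the key before using it, so dump the struct first to confirm what your server returns.

```cfml
<cfloop index="id" array="#cacheGetAllIds()#">
    <cfset meta = cacheGetMetadata(id)>
    <cfoutput>
        <p>
        #id#:
        <!--- "hitcount" is an assumed key name; guard against it being absent --->
        <cfif structKeyExists(meta, "hitcount")>
            #meta.hitcount# hits
        <cfelse>
            no hitcount key reported
        </cfif>
        </p>
    </cfoutput>
</cfloop>
```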
You may be wondering - where exactly is this information being stored? If my application caches something called Ray (and yes, I encourage all developers to use Ray as their cache keys, my ego needs constant inflating!), what happens if another one uses the same key? Well luckily, all caching APIs are based on the current application. Rename your application and the cache will go away. (Err, well, not technically. It's still there. If you switch back you will have access to the old cache.)
That leaves us with some interesting questions. How would I handle using this system with session-based data? If I commonly use a URL hook to restart my application, how do I take care of the cache? I'll cover those in my next entry.
I'll also remind folks - caching should not, in general, be your first fix to a slow process. Your first fix should be to try to speed the darn thing up. There is a good chance that the process is slow because of some inefficiency that could be addressed through coding instead of simply "hiding" it with a cache. Let's be honest though. We don't always have the time to spend tweaking our code to squeeze out more performance. Also, even if some process isn't necessarily slow per se, if we can cache the result for the lifespan of the application, why not? The flip side to that is that we may need to monitor the size of the cache as well. I'm planning a third blog entry that will look at the values returned from cacheGetProperties and talk about what they mean for your server and your application.
Archived Comments
Good stuff. This should be helpful when we start using CF9 at work.
These really are some excellent additions.
Like Kumar, I look forward to using them at work.
so rip scopecache
Personally I would have liked to see memcached used - don't get me wrong, happy to see this.
Heh, well, read my comments on Terry's entry. I like how scopeCache lets you bind one cache to another... but yeah, if I'm on cf9, I'm going to use the built in stuff.
I had never seen memCache in action until this week. The client I'm doing a review for makes heavy use of it.
@Scott, ehcache 1.6, which is what CF 9 implements, is much, much faster than memcached. The main advantage to memcached over ehcache is that memcached runs out-of-process, so it's not tied to the same jvm as your app server. You can now do this in ehcache as well using the ehcache server, but that isn't part of the CF implementation. If you want to know more and are planning to be at MAX, check out my session on advanced caching. I cover all this and a lot more.
Really could use a cacheExists function... do you constantly want it to build a list of values?
Ray, is there a reason you're pulling back all of the cache ID's and then looking for your key there? Seems to me this will become a problem if you load your cache up with tens or hundreds of thousands of items, which is not uncommon for large apps.
Wouldn't it make more sense just to do a cache get and if the item isn't there then put it in the cache? That's the typical way I've always worked with external caching providers.
@rob - cool thanks - looking at the docs for ehcache now - http://ehcache.sourceforge....
@Rob: Because the result of getting an item not in the cache is a null value, and I didn't want to confuse folks. Sure, there is an easy workaround (isDefined), but I thought that this was a bit simpler.
Ray, thanks for the post and the upcoming series. This is my favorite new (& overdue) feature of Cf9. I hope adobe works in your suggestions.
Personally I believe in caching whatever is served static. Memory is cheap. The best-tuned code will never outperform microsecond cache retrieval. It gives the best end user experience. Focus your performance optimization on what has to be dynamic.
@Rob
Can CF9 talk to an out-of-process ehcache server? This would be the best of both worlds.
@ray - was just checking out the CF9 new functions and there's an IsNull now, so you don't have to do a (probably less efficient) isdefined.
Personally I think that doing a cacheGet and then an if(isnull(data)) {...} to generate and push the data is cleaner than an if/else. *shrug* personal preference I guess :)
I'm disappointed there doesn't seem to be any dependency support for the new caching functions.
Along with a timeSpan dependency this would really rock if there were options to tie cache to a file (stale if put before the last modified ts), or a list of other cache id's to support cascading cache.
Also, I don't see wildcard support for cacheRemove which is something else I would find very useful. For instance say I stored generated page output in cache, I prefixed each id with pages_ there doesn't seem to be a way to clear them in one shot. This makes any cache partitioning very difficult to implement.
@Brett: Definitely agree w/ the dependency stuff - it's one thing I think ScopeCache does better.
As for wildcard removal - you could easily add that with a UDF. Have you filed an ER for that though?
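A prefix-based removal UDF along those lines might look like the sketch below. The function name removeCacheByPrefix is hypothetical, and it simply loops over cacheGetAllIds(), so it assumes the cache is small enough for that to be reasonable.

```cfml
<!--- Hypothetical UDF: removeCacheByPrefix is not part of ColdFusion --->
<cffunction name="removeCacheByPrefix" returntype="numeric" output="false">
    <cfargument name="prefix" type="string" required="true">
    <cfset var id = "">
    <cfset var removed = 0>
    <!--- Nothing to match against, so do nothing --->
    <cfif not len(arguments.prefix)>
        <cfreturn 0>
    </cfif>
    <cfloop index="id" array="#cacheGetAllIds()#">
        <!--- eq ignores case, which matters because the IDs come back upper-cased --->
        <cfif left(id, len(arguments.prefix)) eq arguments.prefix>
            <cfset cacheRemove(id)>
            <cfset removed = removed + 1>
        </cfif>
    </cfloop>
    <!--- Number of items that matched the prefix and were removed --->
    <cfreturn removed>
</cffunction>

<!--- Clear every cached item whose ID starts with pages_ --->
<cfset removeCacheByPrefix("pages_")>
```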
@Marcin: Agreed - I totally forgot about isNull, and it IS cleaner:
<cfif structKeyExists(url, "clear")>
<cfset cacheRemove("slowprocess")>
</cfif>
<cfset data = cacheGet("slowprocess")>
<cfif isNull(data)>
<cfset sleep(1000)>
<cfset data = now()>
<cfset cachePut("slowprocess", data, createTimeSpan(0,0,0,30))>
</cfif>
<cfoutput>
<p>
Result is #data#
</p>
<p>
<a href="test1.cfm?clear=yes">Clear Cache</a>
</p>
</cfoutput>
In the examples I've seen people have been doing something like this:
<cfset data = cacheGet("slowprocess")>
<cfif StructKeyExists( VARIABLES, "data" )>
...
</cfif>
@Jason - Right, that's how you would do it in CF8 and earlier. With the isNull function added, you would use that instead. (Or should use it IMHO)
Any idea if this stuff will be any more efficient than cf_accelerate?
We use a modified version of the old cf_accelerate tag for a massive amount of caching.
I would assume this would be better than that - and my scopeCache, because it is making use of ehcache and works at a much lower level.
This feature alone is worth upgrading. Just this one feature. I'm a bit concerned about memory control though. JVMs usually can handle about 1.7gb of ram. So my question is: is the cache shared between CF instances? (I'm talking in the enterprise edition). Thanks.
I believe it is unique per instance. Rob, do you know better?
Ehcache's implementation in ColdFusion 9 is in-process so your cache space is shared with the RAM your CF server is using in the JVM. You /can/ implement out-of-process cache with CF9 but this is much more involved as it involves multiple instances of Ehcache running on multiple servers. Even then, it is sort of faux out-of-process because each CF server is still running Ehcache inside the same JVM as CF itself.
Aaron, I think you misunderstood the question from Alexander. He is talking about distributed caching and not out-of-process caching. These are two different things.
Distributed caching is way cool! With distributed caching, you can access the same caches across multiple servers and instances. If you update a cache on one instance, the same data and cache is available on the other instances, enabling you to keep your cache data in sync.
Let's say you have 2 servers with 3 instances on each, and those 3 instances are clustered together, or maybe all 6 instances are clustered together. Without distributed caching, the cache on each server and instance would be different and out of sync. With distributed caching, the same cache and data is available on all the servers and instances. This will work even if you are not clustering your instances.
The CF9 ehcache is not distributed by default, but you can enable it by modifying the ehcache.xml file down in the cfusion/lib directory. This file has lots of nice comments to get you started. If you're trying to get it to work, you should visit ehcache.org and read about the xml configuration for distributed caching. You will also need to enable RMI Multicast Protocol on the NIC for the servers.
Dave, thanks so much for pointing out the file. Will post back if it all works as you say.
It looks like you can also add "user-defined" caches and access them by using the key="user-defined" attribute of the <cfcache> tag. Using the user-defined cache, you can create your own caches; those caches can be distributed, and the default can be local if you want.
The user defined cache would look something like:
<cache
name="myDistributedCache"
maxElementsInMemory="20000"
timeToIdleSeconds="0"
timeToLiveSeconds="0"
eternal="false"
overflowToDisk="false"
diskSpoolBufferSizeMB="0"
maxElementsOnDisk="0"
diskPersistent="false"
diskExpiryThreadIntervalSeconds="0"
memoryStoreEvictionPolicy="LRU">
<!-- distributed listener -->
<cacheEventListenerFactory class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
properties="replicateAsynchronously=true,
replicatePuts=true,
replicateUpdates=true,
replicateUpdatesViaCopy=true,
replicateRemovals=true,
asynchronousReplicationIntervalMillis=1000"
propertySeparator="," />
<!-- bootstrap cache on bootup -->
<bootstrapCacheLoaderFactory
class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory"
properties="bootstrapAsynchronously=true, maximumChunkSizeBytes=5000000"
propertySeparator="," />
</cache>
More about user-defined caches:
http://help.adobe.com/en_US...
The two other elements in the ehcache.xml you need to enable should look something like:
<cacheManagerPeerProviderFactory class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory" properties="peerDiscovery=automatic, multicastGroupAddress=230.0.0.1, multicastGroupPort=4446, timeToLive=1"/>
And
<cacheManagerPeerListenerFactory class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory" properties="hostName=127.0.0.1, port=44611,
socketTimeoutMillis=120000" propertySeparator="," />
**Don't forget to enable "Reliable Multicast Protocol" on your server NIC.
Ray,
Thanks for posting this, it's really helped me a lot!
Only thing is that when I switched to the isNull instead of ArrayFindNoCase, it's always writing to cache, never getting the cached data. So I had to switch back to the syntax in your original example of getting the cache IDs from the metadata and doing an ArrayFindNoCase. Is there something I'm missing in the isNull?
Thanks again.
Ray,
I posted my last comment in haste.
I didn't follow the isNull example correctly.
I was doing an isNull on the cache variable, rather than the data variable.
Also I was not doing the get before the if statement.
It works beautifully.
By the way I attended your ORM presentation at MAX, it was great. Thanks for all the support to the community over the years. I've learned a lot from your books and blog.
@Ali: You are welcome!
Hi Ray,
any chance to get all IDs stored in cache? Not only for the current Application? cachegetallids() only returns the cache of the current App.
Not as far as I know.
Hi! Is there a way to flush a cache from another directory?
For example, in:
"directory1/page1.cfm",
there is:
<cfcache action="get" id="1" name="qQ">
<cfif isNull(qQ)>
<cfset sleep(2000)>
<cfquery name="qQ" datasource="some">
SELECT firstName, lastName
FROM members
WHERE id=1
</cfquery>
<cfcache action="put" id="1" value="#qQ#">
</cfif>
How could I flush that from:
"directory1/directory2/flush.cfm"?
Silviu, I'm not quite sure I understand you. You can use the caching stuff anywhere. If you don't use the old system where you wrap code in cfcache tags, but rather use the IDs, then your question doesn't really make sense. You just remove the IDs. Period.
Hi Ray, thanks for the reply!
I call myself a newbie in ColdFusion!
All the tests I've made show you're right as long as the "flush" command comes from within the same directory as the "put" one. If the "flush" comes from a page located in another directory, it does nothing.
I'd like to flush the "cfcache action=put Id=1..." from "directory1\put.cfm"
with a "<cfcache action="flush" ID=1..." from "directory1\directory2\flush.cfm"!
I really don't know how to do it! I've tried giving an absolute path in "directory" or "expireurl" with no result!
Thank you for your time!
Silviu.
Oh - don't forget the cache is Application based. If your other directory has a different Application.cfc/cfm, then this would be expected. You would need to use the same Application context.
Updated link to Terrence Ryan's blog post: ( http://blog.terrenceryan.co... )
Hi, I come from the future. Since version 10, you can achieve something similar to wildcards with the 'exact' attribute set to false: cacheRemove("pages_",false,myRegion,false) will remove all cached items within myRegion whose ids contain "pages_".
Thanks for updating us!