CF901: Guide to Amazon S3 support in ColdFusion 9.0.1
One of the most exciting parts of ColdFusion 9.0.1 is Amazon S3 support. If you've never heard of S3, here is the marketing copy. I'll follow this up with my take on it.
Amazon S3 is storage for the Internet. It is designed to make web-scale computing easier for developers.
Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, secure, fast, inexpensive infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers.
Ok, so in English, what in the heck does this mean?
You can think of S3 as a remote hard drive of infinite size. If you've ever built a web application that lets users upload files (images, attachments, etc.), then you know that disk size is a concern. Sure, it may take months or years before you have to be worried, and sure, hard drives are pretty cheap, but the point is you still have to think about it! Amazon S3 gives us a way to not worry about it. (Not that it's totally without worry; see my closing notes below.)
To begin playing with S3, you first need to sign up for an account. You can do that here: http://aws.amazon.com/s3/. You will need to provide a credit card. This worried me. Pricing is based on how much stuff you put on S3 and how much it's accessed. Details can be found here. But to be honest, the pricing details kind of read like the Handbook for the Recently Deceased. However, in all my testing, I've yet to incur a bill over 5 cents. Obviously your bill will be different, but for testing, I'd say this is a non-concern.
Once you've set up your account, you need to get two special pieces of information: Your access key and your secret access key. Once you have logged into S3, this may be found under Account/Security Credentials. Copy these down in a safe place. You will need them later.
Ok, so let's talk about S3 usage in ColdFusion. The code comes down to 4 distinct areas:
- Credentials: You have two ways to "tell" ColdFusion your access credentials. I'll cover that below.
- File operations: This is the nuts and bolts of using S3: creating your storage and copying and updating files.
- Permissions: This is where you set up who can access resources.
- Metadata: This is where you can add additional information about your resources. S3 allows you to set anything you want.
Let's begin with credentials. For all the code below, assume K1 and K2 are my real values. I trust you guys, but not all of you guys. ;)
component {

    this.name="s3test2";
    //s3 info
    this.s3.accessKeyid = "K1";
    this.s3.awsSecretKey = "K2";

}
As you can see, I've provided Application-level credentials. This means any S3 calls in my code will use these values. However, you can also provide the credentials in the path itself. If your application needs multiple different connections then that is the option you would use.
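As a rough sketch of the second option, the keys go between the s3:// prefix and the bucket name, separated by a colon and followed by an @ sign. The variable names and the bucket name "somebucket" below are placeholders for illustration, not real values.

```cfml
<!--- Placeholder credentials; substitute your own access key and secret key. --->
<cfset accessKeyId = "K1">
<cfset awsSecretKey = "K2">

<!--- "somebucket" is a made-up bucket name. The credentials in the path are used for this call. --->
<cfset files = directoryList("s3://#accessKeyId#:#awsSecretKey#@somebucket")>
<cfdump var="#files#">
```

Because the keys travel with each path, two calls in the same request can talk to two different S3 accounts.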
Now let's talk about file operations. The most important thing you must learn is the concept of buckets. A bucket is like a root folder. Your account can have multiple buckets (Amazon caps how many a single account may create), and you must have at least one if you want to store files. Bucket names are unique across all of S3, not just within your account. So that means you won't be using a bucket name like "images". From the S3 documentation, we have the following rules for bucket names:
- Only contain lowercase letters, numbers, periods (.), and dashes (-)
- Start with a number or letter
- Be between 3 and 63 characters long
- Not be in an IP address style (e.g., "192.168.5.4")
- Not end with a dash
- Not contain dashes next to periods (e.g., "my-.bucket.com" and "my.-bucket" are invalid)
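The rules above can be checked locally before you ever call S3. Here is a minimal sketch; isValidBucketName is a hypothetical helper written for this example, not something ColdFusion provides, and it only checks the syntax rules listed here (it can't tell you whether someone else already owns the name).

```cfml
<cfscript>
// Hypothetical helper (not part of ColdFusion): checks a name against the rules listed above.
function isValidBucketName(required string bucketName) {
    // Be between 3 and 63 characters long
    if (len(arguments.bucketName) < 3 || len(arguments.bucketName) > 63) return false;
    // Only lowercase letters, numbers, periods, and dashes; start with a letter or number.
    // reFind is case-sensitive, so uppercase letters fail here.
    if (!reFind("^[a-z0-9][a-z0-9.-]*$", arguments.bucketName)) return false;
    // Not be in an IP address style
    if (reFind("^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$", arguments.bucketName)) return false;
    // Not end with a dash
    if (right(arguments.bucketName, 1) == "-") return false;
    // Not contain dashes next to periods
    if (find("-.", arguments.bucketName) || find(".-", arguments.bucketName)) return false;
    return true;
}
</cfscript>

<cfoutput>
images.foo.com: #isValidBucketName("images.foo.com")#<br>
my-.bucket.com: #isValidBucketName("my-.bucket.com")#
</cfoutput>
```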
The generally accepted practice is to use a bucket name that matches your domain. So imagine you want to use S3 to store images for foo.com. Your bucket name could then be: images.foo.com. You can also create a bucket for your development work: dev.images.foo.com. And obviously staging too: staging.images.foo.com.
Once you've decided on your bucket, creating it is as simple as creating any directory. The only difference is that you will use the s3:// prefix. Much like the VFS system, this tells ColdFusion that the operation is for S3, and not your local file system. Here is a quick example:
<cfif not directoryExists("s3://demos10.coldfusionjedi.com")>
    <cfset directoryCreate("s3://demos10.coldfusionjedi.com")>
</cfif>

Done
The example above makes use of both directoryExists and directoryCreate. If not for the s3 prefix, this code would look just like code we've used for years. (Well ok, the directoryCreate function is kind of new!) Writing files is just as easy. Consider this next example:
<cfset fileWrite("s3://demos10.coldfusionjedi.com/#createUUID()#.txt", "Foo")>
<cfset files = directoryList("s3://demos10.coldfusionjedi.com")>
<cfdump var="#files#">
On every reload, a new file is created (with the text 'Foo'). I use the directoryList command to get all the files from the folder. Here is an example result:

I could read these files as well, and copy them to my local server. Deletes too. For the most part, you can almost forget that S3 is being used at all. However, you probably want to actually share these files with others. Images being the best example. Just because you copy the image to S3 doesn't mean folks can actually see the image. Enter the ACL, or Access Control List. This is how S3 does permissions.
ACLs consist of an entity (the thing you are granting permissions to) and the actual permission. The entity is either a group (like all), an email address, or an ID value (the docs call this a canonical value). Think of that like a user ID. Permissions are:
- read
- write
- read_acp (basically the ability to get permissions)
- write_acp (basically the ability to set permissions)
- full_control
You have four main ways to work with ACLs. You can get an ACL, using storeGetACL(url). This returns an array of ACLs for the object. That can either be the main bucket (again, think root directory), or an individual file. You can set ACLs using storeSetACL(url, array). This will set all the permissions of the object. This is important. It's going to bite us in the rear in the next example. Next, you can add an ACL using storeAddACL(url, struct). Finally, you can set ACLs when creating a bucket using the cfdirectory tag. Let's look at an example of this, and as I said, this code is going to have a problem. You will see why in a minute.
<cfif not directoryExists("s3://demos11.coldfusionjedi.com")>
    <cfset perms = [
        {group="all", permission="read"}
    ]>
    <cfdirectory action="create" directory="s3://demos11.coldfusionjedi.com" storeacl="#perms#">
</cfif>

<cfset fileWrite("s3://demos11.coldfusionjedi.com/#createUUID()#.txt", "Foo")>
<cfset files = directoryList("s3://demos11.coldfusionjedi.com")>

<cfdump var="#files#">
So going from the top, we first see if our bucket exists. If not, we are going to create it. But we begin by creating an array of ACLs. ACLs are just structures really, and in this case I have one. I'm giving everyone read access to it. I then pass this array to the cfdirectory tag when I create the bucket. Easy peasy. I follow this up with a simple write command, like our last command. However, now this returns an error:
S3 Error Message. Access DeniedCould not close the output stream for file "s3://demos11.coldfusionjedi.com/62ECAED9-EB00-C07E-3C0F435E525D1B06.txt"..RootCause - org.apache.commons.vfs.FileSystemException: Could not close the output stream for file "s3://demos11.coldfusionjedi.com/62ECAED9-EB00-C07E-3C0F435E525D1B06.txt".
What the heck? Turns out, when we created the directory, we set the permissions. In other words, we told S3 that those were the only permissions allowed for the bucket, and I didn't give myself access! This is easy enough to fix: if I get my user ID I can give myself full control. But what in the heck is my user ID? It isn't my logon. Luckily, if you go back to the Account/Security Credentials page, towards the bottom, there is a link: View canonical user ID. Clicking that gives you a long ID value. I'm going to modify my code now to ensure I've got permission as well.
<cfif not directoryExists("s3://demos12.coldfusionjedi.com")>
    <cfset perms = [
        {group="all", permission="read"},
        {id="this is not the real value", permission="full_control"}
    ]>
    <cfdirectory action="create" directory="s3://demos12.coldfusionjedi.com" storeacl="#perms#">
</cfif>

<cfset fileWrite("s3://demos12.coldfusionjedi.com/#createUUID()#.txt", "Foo")>
<cfset files = directoryList("s3://demos12.coldfusionjedi.com")>

<cfdump var="#files#">
For the most part this is the same code as before. Note that I'm using a new bucket though. (Remember, you can add ACLs too. So we could have fixed that last bucket.) Now my array of ACLs includes one for myself as well, and I've given myself full control. This is not necessary if you don't pass any ACLs at all, as I did the first time I created a bucket. So what do the ACLs look like if you get them? If I modify the previous template to add this:
<cfset acls = storeGetACL("s3://demos12.coldfusionjedi.com")>
<cfdump var="#acls#">
You end up with this:

Forgive my horrible attempt to obscure my ID. As you can see, I pretty much just got the same set of structs back. I do get a display name though. Depending on your needs, that could be useful to display on your site's front end.
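A file written to a bucket does not pick up the bucket's ACL, so each file needs its own. Here is a minimal sketch of making one file publicly readable with storeSetACL. It assumes the bucket already exists, that Application.cfc supplies the credentials, and that you swap the placeholder id for your own canonical user ID.

```cfml
<!--- Assumptions: the bucket exists, Application.cfc supplies the credentials,
      and the id below is replaced with your own canonical user ID. --->
<cfset theFile = "s3://demos12.coldfusionjedi.com/#createUUID()#.txt">
<cfset fileWrite(theFile, "Foo")>

<!--- storeSetACL replaces every existing grant, so the owner's grant is repeated here. --->
<cfset perms = [
    {group="all", permission="read"},
    {id="this is not the real value", permission="full_control"}
]>
<cfset storeSetACL(theFile, perms)>

<!--- Dump what S3 now reports for the file. --->
<cfdump var="#storeGetACL(theFile)#">
```

Leaving the owner's entry out of the array would repeat the Access Denied problem from earlier, this time at the file level.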
So you may ask - how would you use S3 to display images? Or how would you link to them in general? Well obviously you have to give everyone permission to read the file. Without that an anonymous request will be blocked. Once you've done that, you have a few options.
The first option is to simply use an Amazon URL. Here is the generic form: http://s3.amazonaws.com/#bucket#/#yourfile#. Here is an example: http://s3.amazonaws.com/demos3.coldfusionjedi.com/dalek_war_low.jpg. And here it is viewed on the web page (notice I set the width a bit smaller so it wouldn't be so huge on the page):

The second option is to make use of a CNAME. This is a domain alias: you basically point one of your own domains at Amazon. I've set up demos3.coldfusionjedi.com as an alias to s3.amazonaws.com. That means the image above can also be found at http://demos3.coldfusionjedi.com/dalek_war_low.jpg. That way no one knows you are even using S3. (Although I included 's3' in the name so that kinda gives it away.)
Finally, you could copy media down when requested, although I wouldn't recommend it. For attachments that may be ok. For images it would be overkill. But interestingly enough, imageNew() works just fine using S3 for a source image.
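The first two options come down to string building, which can be sketched as a small function. s3PublicUrl is a hypothetical helper made up for this example, not part of ColdFusion, and the CNAME form assumes the bucket is named after a domain you have aliased to S3.

```cfml
<cfscript>
// Hypothetical helper (not part of ColdFusion): builds the two public URL forms described above.
function s3PublicUrl(required string bucket, required string fileName, boolean useCname=false) {
    if (arguments.useCname) {
        // Assumes the bucket is named after a domain that is aliased to s3.amazonaws.com.
        return "http://" & arguments.bucket & "/" & arguments.fileName;
    }
    return "http://s3.amazonaws.com/" & arguments.bucket & "/" & arguments.fileName;
}
</cfscript>

<cfset imgUrl = s3PublicUrl("demos3.coldfusionjedi.com", "dalek_war_low.jpg")>
<cfoutput><img src="#imgUrl#" width="300"></cfoutput>
```

Either URL only works for anonymous visitors if the file itself grants read permission to everyone.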
So now let's look at another interesting aspect: Metadata. S3 automatically creates a set of default metadata for every file you put up there. This includes: (and I should credit Adobe for this - it comes from their new CF901 reference)
- last_modified
- date
- owner
- etag
- content_length
- content_type
- content_encoding
- content_disposition
- content_language
- content_md5
- md5_hash
Buckets also include metadata, but only date and owner. What's cool though is that you can add any additional metadata you want. So first, let's look at an example of getting metadata.
<cfset files = directoryList("s3://demos12.coldfusionjedi.com")>

<cfloop index="theFile" array="#files#">
    <cfset md = storeGetMetadata(theFile)>
    <cfdump var="#md#" label="#theFile#">
</cfloop>
This template simply lists the contents of my last bucket and for each file grabs the metadata. Here is the result:

Adding metadata is as simple as passing a structure of name/value pairs.
<cfset files = directoryList("s3://demos12.coldfusionjedi.com")>

<cfloop index="theFile" array="#files#">
    <cfset md = storeGetMetadata(theFile)>
    <cfdump var="#md#" label="#theFile#">

    <!--- Add coolness --->
    <cfset storeSetMetadata(theFile, { coolness=randRange(1,100) })>

</cfloop>
This is kind of a silly example since I'm using a random number, but after running this a few times, you can see the impact in the metadata.

So - at a high level - you can see how to read and write to S3. You can see how to set permissions and metadata. What's next? Here are some things to consider.
- Amazon S3 is not perfect. It has failed. The flip side to this is that nothing is perfect. I'd rather let Amazon worry about hard drive safety and backups. I trust them more than I trust myself. If I were the CIA, I wouldn't use them. If I were Flickr... I'd use them with caution and ensure I had a strong plan to cover any issues with Amazon. If I were a forum, or a blog, or simply making use of them to store media that isn't mission critical (forum avatars certainly are not, message attachments certainly are not), then I'd strongly recommend them. If you are ok with Amazon perhaps being down once a year, but want to ensure you never lose anything (and to be clear, I've heard of S3 being down, but not ever losing data), then I'd consider a monthly backup to a local hard drive. The takeaway from this though is that anytime you make use of a remote resource, please be sure to code for that resource not being available. I'll demonstrate an example of that later this week. (Remind me if I forget.)
- Custom metadata is cool, but right now I'm not aware of a way for you to search it via ColdFusion's API. So if you wanted to create a link between resources in a bucket and a user on your system, consider having a database table that links the user (and any other metadata) and the S3 file ID. This way you can search locally and fetch the files as needed.
- One of the cooler features of S3 is the ability to create temporary links. This allows you to keep a file protected, but say that anyone with this URL can grab a resource for N minutes of time. CF901 does not provide support for that. However, you can grab Barney Boisvert's cool Amazon S3 CFC here. It has a method just for that purpose. And while I recommend everyone upgrade to ColdFusion 9.0.1, I'll point out that Barney's CFC provides full read/write support. (I don't see ACL support but maybe I'm missing it.)
- Amazon provides a web based interface to your S3 account. When you screw up, and you will, it can help you figure out what's going wrong. Once logged in, you can hit it here: https://console.aws.amazon.com/s3/home
- Random link - this blog entry contained some cool ideas and tips for S3: 9 Hidden Features of Amazon S3
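As a minimal sketch of coding for S3 being unavailable (the first point above), a write can be wrapped in cftry with a local fallback. The assumptions here are that Application.cfc supplies the credentials, that the bucket exists, and that a "pending" folder next to the template is an acceptable place to park files for a later retry. That folder name and the "s3errors" log name are made up for this example.

```cfml
<!--- Assumptions: Application.cfc supplies the credentials and the bucket exists.
      The "pending" folder is a hypothetical local fallback location. --->
<cfset fileName = "#createUUID()#.txt">
<cfset localDir = expandPath("./pending")>

<cftry>
    <cfset fileWrite("s3://demos12.coldfusionjedi.com/#fileName#", "Foo")>
    <cfset storedOn = "s3">
    <cfcatch type="any">
        <!--- S3 was unreachable or refused the write: keep a local copy to retry later. --->
        <cfif not directoryExists(localDir)>
            <cfset directoryCreate(localDir)>
        </cfif>
        <cfset fileWrite("#localDir#/#fileName#", "Foo")>
        <cfset storedOn = "local">
        <cflog file="s3errors" text="S3 write failed for #fileName#: #cfcatch.message#">
    </cfcatch>
</cftry>

<cfoutput>Stored on: #storedOn#</cfoutput>
```

A scheduled task could later sweep the fallback folder and push anything it finds up to S3.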
Any questions?

this.mappings["/testfoo"] = "s3://demos3.coldfusionjedi.com";
I then copied a CFM over to it and did:
<cfinclude template="/testfoo/test.cfm">
and it worked fine.
Rupesh also shared that *all* S3 operations occur over HTTPS. While that's great for security, for high-volume binary file transfers that could be a bit of a performance drag. So it is not possible through the built-in S3 support in CF 9.0.1 to switch to HTTP for faster, non-SSL transfers.
When reviewing Barney's S3 CFC I noticed that in his s3url function he provides the option to use either HTTP or HTTPS, so that's an advantage over the S3 support offered by CF 9.0.1 if performance is a concern. Even so, Barney's code is an excellent demonstration of how to maintain secure credentials by utilizing a signature generated with your Secret Access Key to generate a Hash Message Authentication Code (HMAC) as fully explained here: http://bit.ly/8ZeqTm
So the main point of my comment is to share what I learned about securing credentials with s3 in ColdFusion and to point out that Barney's code may be a more performant alternative.
Thanks!
-Aaron
This setting is a Java system property which is "s3service.https-only", which if set to false (default is true), will do the communication over http and not over https.
I haven't seen CF specific docs on this, but have found the following docs regarding the JetS3t Toolkit:
http://jets3t.s3.amazonaws.com/toolkit/configurati...
http://jets3t.s3.amazonaws.com/api/org/jets3t/serv...
It appears to be integrated into CF:
$ pwd
/opt/ColdFusion9/lib
$ ls jets*
jets3t-0.7.3.jar
I'm interested to find out if that system property is configurable at runtime. An advantage of Barney's code is that the option of HTTP/HTTPS can be selected at runtime.
source = "#backupPath##filename#"
destination = "s3://myID:myKey@myBucket/#filename#"> and all is good. However, the production server at work is behind a router/firewall controlled/managed by a 3rd party. I read somewhere that S3 needs port 843 open to work (and then lost that reference) but does the CF built in function connect to a particular IP at amazon so I could ask for that port open for just that IP?
I also found that the Amazon IP would vary depending on which bucket you are connecting to. I don't know how reliable or constant the IP for your host would remain.
http://aws.amazon.com/console/
Q: Are there any requirements for use of the Amazon S3 Console?
Yes, the Amazon S3 Console requires that you have a 10.x version of Adobe Flash and access to TCP port 843. These requirements do not apply to the management of other services in the AWS Management Console.
8634 19:49:02 31/07/2010 60.7759789 jrun.exe s3-3-w.amazonaws.com 172.21.62.239 TCP TCP:[ReTransmit #7672]Flags=...AP..., SrcPort=HTTPS(443), DstPort=50768, PayloadLen=677, Seq=1389272228 - 1389272905, Ack=3800477662, Win=400 (scale factor 0x6) = 25600 {TCP:15, IPv4:14}
Thanks in advance
Derek
Cheers
Derek
I've just started experimenting with S3 for some image upload stuff (user uploads an image and I use CFImage to create thumbnails etc).
I've stumbled across an issue that I can't figure out - and it doesn't appear to be documented anywhere as far as I can tell.
I have a number of buckets already in place, e.g. 'thumbnails.mysite.com'. User uploads their image, I do some CFImage processing, then I'm using ImageWrite to send the thumbnail to S3 ( <cfset ImageWrite(img1, "s3://thumbnails.mysite.com/#thumbnail#")> ). That all works fine, and using the S3 console shows the file there no problem.
However, no-one else can see the image (which I'm trying to display on the page after processing, as well as reference in a DB for later viewing). The issue is obviously related to permissions - if I manually 'make public' the image in AWS console, it shows up fine.
My above ImageWrite code won't accept your example ( storeacl="#perms#" ) as a variable, so I'm not sure how I can tell S3 to make the image public without having to manually go in and do it for each and every image (which, obviously, isn't workable).
Am I missing something or do I need to perhaps look at doing ImageWrite locally, then cffile copy to S3 (which, I presume will allow me to set permissions?).
Thanks for the reply. It appeared to me to make sense to do it all at once, if it was possible, in the same way you demonstrate creating a directory and setting permissions.
Is there any documentation anywhere to show how you set the ACLs in a CF call? I'm not even sure which tag I would use to do it (cffile, perhaps?).
"You can set ACLs using storeSetACL(url, array). This will set all the permissions of the object."
Note - you won't find this in the _main_ CFML Reference. It's only in the 901 release notes.
I did pick up on the storeSetACL part in the main entry - however my confusion came about because your demo shows it as part of the cfdirectory, hence me heading down the road of doing it as part of ImageWrite.
If I'm following this correctly then, I should be able to do it like this;
<cfset ImageWrite(img1, "s3://thumbnails.mysite.com/#thumbnail#")>
<cfset storeSetACL("s3://thumbnails.mysite.com/#thumbnail#", perms)>
(above assumes I've set 'perms' in the same manner you demonstrate in the blog entry).
https://s3.amazonaws.com/test1.coldfusionjedi.com/...
But trying to view one file: https://s3.amazonaws.com/test1.coldfusionjedi.com/...test2.txt
gives you an access denied. So you definitely do want to follow up with an ACL set to give everyone permission.
To be honest, it's probably better s3 err on the side of caution.
Thanks for bringing this up!
My initial assumption was that setting permissions at bucket level would apply to all items within the bucket. If that had been the case, because I'm only dealing with a handful of buckets, I wouldn't need to deal with ACLs with ColdFusion at all.
As you say, it's probably best that S3 does it the way it does. Thanks for the clarification on how to do this - it's certainly an area that could use more examples - I was surprised at not finding the same question asked elsewhere before.
Any ideas what I may be doing wrong here?
This is awesome!
Congratulations to all involved and thanks again Ray for this great post.
and also thanks to author of this blog for writing good content.
I am from development team side. Bucket Explorer is one of the best tool for amazon s3 and cloudfront services.
Thanks again for this great post. I know it's a bit old now, and you're probably flat strap at Adobe these days, but I've run across something with large files and Amazon S3, and was wondering if anyone else has done something like it.
My little app pulls a whole bunch of files together into one zip file and then transfers them to my Amazon bucket. All works like a charm when the files are less than say 100MB, but when they get larger (200, 300+MB) then I run into all sorts of headaches. Mainly with the file transfer timing out, with my process taking anywhere up to 1 hour to run. Not nice.
I'm trying to trigger several of these sends at once, each with similar size zip files to send to the same bucket (different names obviously).
It seems that even though each one of the sends is triggered separately, ColdFusion is waiting for one to finish before sending the next one.
The process that does the sending is a function in a CFC.
I would've thought each send would've been handled in a separate thread.
Perhaps the S3 implementation in CF only allows one connection to the amazons3 bucket at once, and while it's being used no other connection can get through?
Does that make any sense? Has anyone seen anything like this happening, or has any more info?
Thanks again!
First time I've posted a there.
On further thought about it, we're running on ColdFusion Standard, and I know that it only has one thread to use (when using CFTHREAD) and then they get queued. Could that be what the problem is, even though I'm not using CFTHREAD in this case? I.e., does CF still handle the request in the same way?
Thanks again.
S3 does not use the threadpool used by CFThread. So it can't be because of that. We will investigate this.
Does this bring the file down to the local server (temporary location) - then copy it - and then post it back to the Amazon S3 bucket. Or does it do it within the S3 bucket itself?
I've noticed that it seems to take a while for the copy operation to complete if the file is large (>10MB).
Any ideas anyone?
Bit of backwards and forwards with Adobe about this. Seems we have to purchase platinum support in order to get any action? Just can't believe this.. to have to pay $$$k to get support for a bug with a product that we only paid ~$1.5k for in the first place. wow.
I have started using the REST wrapper available on RIAFORGE http://amazons3.riaforge.org/ and that works much better in terms of threading etc. large file going up to my bucket in seconds rather than minutes. And seems to be threaded correctly with multiple threads firing off at once. Problem with large files still though being held in memory as a result of the PUT operation (mainly due to available memory on our server).
So I am now attempting to do this with a Multipart Post operation via the REST Wrapper.
My concern is - by doing this am I essentially going to recreate the issues that I'm seeing using the native CF9.0.1 techniques described here, and end up with the threading problem/bug again.
Here goes.
Didn't mean to have a go at Adobe there. I did mean to say "Is this correct?" afterwards.
So far both of the Adobe support guys I've spoken to/emailed with have asked for our platinum support number in order to get this investigated. I was told if I didn't have platinum support then there would be very little chance of the bug being fixed promptly.
If that's incorrect then it makes me very happy. :)
We are looking into this issue. We would like to get some more details on this. It would be great if you could give us your mail address.
Also what is the bug number?
Bug number is 86983 - all the details are in there.
I haven't seen anything happening on it - so I just presumed there would be no action.
I am now almost completed building a java process that does the heavy lifting for these large files. This is housed on an EC2 instance and means that there is no need for huge files to be handled by ColdFusion. I decided to get the Amazon S3 to get them itself rather than have them pushed by CF.
It would be so much simpler to run with the out of the box solution - but unfortunately I couldn't wait for that solution to present itself.
I think going forward that this bug needs to be sorted out by Adobe - as eventually others are going to hit the problem. More and more of our clients are expecting this stuff to be "easy", with all this "cloud" marketing that is out there.
Thanks to all who had suggestions. And thanks again to Ray for the original post and soapbox to get my problem heard/discovered. Here's hoping Adobe fix this up in the next release/patch/whatever.
http://www.therealtimeweb.com/index.cfm/2011/11/25...
dlist = directoryList("s3://mybucket");
However, I get an error that says this in the exception log (truncated):
<Message>The AWS Access Key Id you provided does not exist in our records.</Message><AWSAccessKeyId>null</AWSAccessKeyId>
However when I do this:
directoryList("s3://#accessKeyId#:#awsSecretKey#@mybucket");
It's okay! (although, this way spews superfluous data that I don't need)
I implemented the exact code Ray has here in his blog. Am I missing something?
As it turns out...
I was doing everything in cfscript inside a cfc and calling the component directly without referencing it in a cfm page which is why it wasn't working. grrrr.
Once I saved it as a .cfm template and replaced the component container with cfscript, it worked like a charm.
org.apache.commons.vfs.FileSystemException: Unknown message with code "S3 Error Message."..
I've just signed up to Amazon S3 and I've found your post great to get started with it.
However, the application I'm working on has a lot of legacy code and it still uses Application.cfm instead of Application.cfc. So the question is, how do I provide the credentials? I've tried setting them in the application scope and the this scope, but it doesn't work.
Is it possible to provide them or should I switch to Application.cfc?
It's just that it would be great if I didn't have to pass the credentials with every single request...
I'm having trouble with the directoryList("s3://mydomain.com") yielding a result. When I dump the variable it's empty. However, my S3 bucket is filled with files. When I cfdump storeGetACL("s3://mydomain.com") I get back the values I would expect (correct displayName, ID, and permissions at "FULL_CONTROL") so I know I can connect to my buckets. Any advice on why my directoryList is coming back empty?
dlist = directoryList(MyAccessKeyId:MyAWSSecretKey@MyRootFolder/subdirectory,true, "name","","asc");
Application.cfc
<cfcomponent output="no">
<cfset this.name = "lny" />
<cfset this.s3.accessKeyId = "[my access key id]" />
<cfset this.s3.awsSecretKey = "[my secret key]" />
...
Test Page
<cfif not directoryExists("s3://media.lightingnewyork.com")>
<cfset directoryCreate("s3://media.lightingnewyork.com")>
</cfif>
<cfset local.dir = directoryList("s3://media.lightingnewyork.com") />
<cfdump var="#local.dir#" />
dump returns: array empty
I noticed that if I used your code to put a new text file into the directory that the array would then populate with that newly added file, but still not show my other existing files on the bucket. Likewise, I could not see the newly created text file in my s3 console. It was that test that brought me to the conclusion that I must be missing an important piece of information.