Daniel emailed me last night with some questions about ColdFusion 9's Virtual File System (VFS). I thought it would be worthwhile to discuss them with the class (you guys ;) and see what people thought.
I have read some of your blog postings about the virtual file system, but I still have a question. In our system, the users upload a lot of files, and many of them are quite large, such as architectural drawings. I figure some of the uploading time is because of the network; however, I was wondering if the VFS could help speed up our doc upload times, or is that not what this concept is for?
In this case, the VFS won't help you. When you upload a file, the time it takes to get from the client to the server depends entirely on the size of the file and the connection between the two parties. The VFS can't help out here at all. Once the file is uploaded, however, the VFS is an excellent place to store it, and it works as the destination for the cffile tag with action="upload". You get the performance benefit of having the file in RAM and being able to work with it there, and you don't have to worry about the security issues of an unchecked upload sitting in a web-accessible folder. Do know, though, that the default size of the VFS is 100 megs. You mention files that are "quite large" and involve drawings. I can easily see one detailed image being larger than 100 megs. Don't forget you can check the size of the VFS and determine how much free space you have in it.
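Here is a rough sketch of what that could look like. The form field name ("upload") is just something I made up for the example, and I'm assuming the VFS is enabled with enough free space for the file.

```cfml
<!--- Sketch only: "upload" is a hypothetical form field name. --->
<cfif structKeyExists(form, "upload") and len(form.upload)>

    <!--- Send the uploaded file to the in-memory file system instead of disk. --->
    <cffile action="upload" filefield="upload" destination="ram:///"
            nameconflict="makeunique" result="uploadResult">

    <!--- getVFSMetaData() returns a struct describing the state of the VFS. --->
    <cfset vfs = getVFSMetaData("ram")>

    <cfoutput>
        Stored as: ram:///#uploadResult.serverFile#<br>
        VFS limit: #vfs.limit#, used: #vfs.used#, free: #vfs.free#
    </cfoutput>
</cfif>
```

The transfer itself is no faster this way. The only difference is where the file lands once it arrives.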
I thought maybe the doc, when the user uploads it, would go directly into RAM since the documentation makes it sound like that would be quicker...maybe I read its purpose wrong... Then, when the user actually navigates away from the page, CF moves the document from RAM to its actual storage place.
Again, with the file in RAM, some things should be quicker. So for example, using imageGetInfo might be trivially faster since there isn't the need to go to disk. I say "trivially" because I've not done any speed tests myself. But if you do a few operations on the image (get info, resize, etc.) then your benefits may stack up. Now to the second part of your question above, "when the user .. navigates away" - that's a whole other question: "Can we do something when the user leaves the page?" I assume you don't really mean that - but if you do, let's follow that up with another blog post. I assume you meant that you could move the file to physical storage when you are done "messing" with it, and that is absolutely true.
Is this one use of VFS, or am I totally missing its function? I was not too clear on why you would use VFS to store cfm pages, or cfc pages, or images...
One reason to use VFS for CFM pages would be to allow for dynamic CFML to execute. While not typically recommended, you could store CFML in your database, and then use the VFS to write it out as a temporary file and execute it with a cfinclude. Again, this is not typically recommended, but it's something you can do if you choose. I built a simple ORM based CMS when ColdFusion 9 came out. I used this technique to allow for dynamic headers and footers with CFML within them.
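As a minimal sketch of that technique (not the actual code from my CMS): the string below stands in for CFML pulled from a database, and the "/inmemory" mapping name is hypothetical. The cfinclude relies on a mapping that points at the VFS, which I'm assuming is set in Application.cfc.

```cfml
<!---
    Assumes Application.cfc contains a mapping to the VFS root, for example:
        <cfset this.mappings["/inmemory"] = "ram://">
    "/inmemory" is a made-up name for this sketch.
--->

<!--- Stand-in for CFML loaded from a database. The doubled hashes are escaped
      so the expression is evaluated later, when the file is included. --->
<cfset dynamicCode = "<cfoutput>Rendered at ##timeFormat(now())##</cfoutput>">

<!--- Write the code to a uniquely named temporary file in the VFS. --->
<cfset tempName = createUUID() & ".cfm">
<cffile action="write" file="ram:///#tempName#" output="#dynamicCode#">

<!--- Execute it through the mapping, then clean up. --->
<cfinclude template="/inmemory/#tempName#">
<cffile action="delete" file="ram:///#tempName#">
```

The unique file name keeps concurrent requests from stepping on each other, and the delete matters because nothing in the VFS expires on its own.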
As for other uses - so far my only production use of the VFS is for file uploads at Adobe Groups. All file uploads are put in VFS and "checked" there (for images, for size, etc). Once I'm happy with them I move them to Amazon S3. (Which will be a heck of a lot easier in ColdFusion 9.0.1.)
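The check-then-move idea can be sketched roughly like this. It is not the Adobe Groups code: the file name, the width limit, and the "uploads" folder are all invented for the example, and I'm moving to local disk rather than S3. I'm also assuming the target folder already exists.

```cfml
<!--- Sketch only: file name, size limit, and target folder are hypothetical. --->
<cfset vfsName   = "drawing.jpg">
<cfset vfsPath   = "ram:///" & vfsName>
<cfset finalPath = expandPath("./uploads/") & vfsName>
<cfset maxWidth  = 4000>

<cfif fileExists(vfsPath)>

    <!--- cfimage throws if the file is not a readable image. --->
    <cfset isValidImage = true>
    <cftry>
        <cfimage action="info" source="#vfsPath#" structName="imgInfo">
        <cfcatch type="any">
            <cfset isValidImage = false>
        </cfcatch>
    </cftry>

    <cfif isValidImage and imgInfo.width lte maxWidth>
        <!--- Passed the checks: move it out of RAM to permanent storage. --->
        <cfset fileMove(vfsPath, finalPath)>
    <cfelse>
        <!--- Failed the checks: discard it so it doesn't take up VFS space. --->
        <cfset fileDelete(vfsPath)>
    </cfif>
</cfif>
```

Either way the file leaves the VFS at the end, which matters given the size limit discussed above.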
Archived Comments
I had a light just half come on! For downloading stats extracts I write my database query to a file using cfspreadsheet, then temporarily copy it to a location below webroot whilst giving the user the link. The accessible file is autodeleted 30 seconds later. Sounds like I could simplify this. Is there some sort of cache/expire option on the file in memory, Ray?
Nope, the "file" exists forever - or until the server restarts. If the server never restarts, it will never go away. You can check the 'last mod' property of the file (it should reflect when it was copied to ram) and delete it with a scheduled task.
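A scheduled task for that could look something like the sketch below. The five-minute cutoff is arbitrary, and it only looks at files in the root of the VFS.

```cfml
<!--- Sketch only: the age threshold is an arbitrary choice. --->
<cfset maxAgeMinutes = 5>

<!--- List the files (not directories) in the root of the VFS. --->
<cfdirectory action="list" directory="ram:///" name="vfsFiles" type="file">

<cfloop query="vfsFiles">
    <!--- Delete anything last modified more than maxAgeMinutes ago. --->
    <cfif dateDiff("n", vfsFiles.dateLastModified, now()) gte maxAgeMinutes>
        <cffile action="delete" file="ram:///#vfsFiles.name#">
    </cfif>
</cfloop>
```

Point a scheduled task at a template like this and old files get cleared out without waiting for a server restart.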
I want to point out a misleading statement I made above. I brought up the example of how VFS could be used to boost things when doing multiple image operations. You don't need the VFS for that: you can hold an image in RAM (and by that I don't mean VFS, just a normal image variable) and perform multiple operations on it there.
G'day.
I would be very very cautious about putting large files in the VFS. The VFS impinges on heap space, which - despite a 64-bit architecture's advances in addressable space - could very quickly be swallowed up by a couple of large files. Obviously one can throttle the total available space to the VFS so one can allocate a finite storage area, but it's still an important consideration.
If dealing with large files as per the original poster's query, one is going to have to set the VFS size large enough to accommodate the maximum expected file size, which - I think (I should check before saying this, but I haven't) - will allocate that amount of RAM on server start-up, which'll mean that amount of RAM will be unavailable to CF for "normal" heap usage right from the outset, whether the VFS space is being used or not. Not a good use of resources.
Until the VFS can be homed in a different heap from the one CF itself uses, I think its usefulness is limited.
As Ray has cited, it could be leveraged for dynamic CFM files (and I agree with Ray's caveats here), but I question even the merits of this. Most of the performance overhead with dynamic CFM files is the compilation, not the file-system fetch. On a decent server, file-system fetching is really really quick anyhow: nothing like fetching from RAM, but still - when considering the time-cost of fulfilling a request - I don't think it's usually a major consideration. So I would think gains in using the VFS would be minimal here, when played off against resource consumption and management. For dynamic file reuse, I'd just leverage the fact that CF caches its templates in RAM already anyhow, and has intelligent management of how it ages / clears seldom-used templates from that cache.
I'd love to hear of someone's experiences in using the VFS in a production environment where the gains over using the normal file system have been measured, and a true benefit identified. So far all I've really heard is people citing the theory that "well RAM is faster than disk therefore using the VFS is a great boon for operations that need that edge". I've heard nothing "real".
If I was wanting to use a RAMdisk, I think I'd set it up at OS level, rather than in CF. Leave CF to processing CF files. That's what it's good at.
--
Adam
My gut tells me to agree with Adam, but I would like to see some real world case studies.
Does anyone know what the max limit of RAM that can be allocated to the VFS is? I've tried over 1 GB, and the free space gets returned as a negative number, and any attempt to write to the RAM results in an error message that the global limit is exceeded.