Using ColdFusion 9's new FileSeek

Another new feature in ColdFusion 9 (and unfortunately not yet documented) is FileSeek. The basic idea of seeking in a file is jumping to an arbitrary position without reading everything before it. This is useful in a variety of cases. For example, certain binary files store information at the end of the file. Another example is getting the last few lines of a long log file. I blogged about this back in April using Java via ColdFusion. ColdFusion 9 makes this somewhat easier with the addition of FileSeek.

As I said though, this is currently undocumented. Thanks to Rupesh for sending me the basics which I'll cut and paste right here:

FileOpen(path, mode, charset, seekable) - If seekable is true, you will be able to call fileSeek() and fileSkipBytes(). Returns a file handle.

FileSeek(fileObj, pos)

FileSkipBytes(fileObject, noOfBytesToSkip)
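To make those three signatures concrete, here is a minimal sketch of how they might fit together. The scratch file name seekdemo.txt and the variable names are made up for illustration. Since the functions are undocumented, the comments describe the behavior one would expect (an absolute position for fileSeek, a relative jump for fileSkipBytes) rather than anything confirmed.

```cfml
<!--- scratch file with known contents (hypothetical name) --->
<cfset demoFile = getTempDirectory() & "seekdemo.txt">
<cfset fileWrite(demoFile, "ABCDEFGHIJ")>

<!--- the fourth argument (seekable) must be true to use fileSeek() and fileSkipBytes() --->
<cfset demoOb = fileOpen(demoFile, "read", "utf-8", true)>

<!--- jump straight to an absolute position, then read one character --->
<cfset fileSeek(demoOb, 5)>
<cfset afterSeek = fileRead(demoOb, 1)>

<!--- skip forward relative to wherever the pointer is now --->
<cfset fileSkipBytes(demoOb, 2)>
<cfset afterSkip = fileRead(demoOb, 1)>

<!--- seeking also works backwards, even after reading ahead --->
<cfset fileSeek(demoOb, 0)>
<cfset fromStart = fileRead(demoOb, 1)>

<!--- clean up --->
<cfset fileClose(demoOb)>
<cfset fileDelete(demoFile)>

<cfoutput>#afterSeek# #afterSkip# #fromStart#</cfoutput>
```

Running this against a file whose contents you know is a quick way to check how positions are counted on your own server.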

Seems easy enough, right? Here is an example that mimics the Java code from my previous example. First, define the file and create a file object for it:

<cfset theFile = "/Applications/ColdFusion9/logs/server.log">

<cfset fileOb = fileOpen(theFile, "read", "utf-8", true)>

Notice the new seekable argument there. Next, let's define a few variables:

<!--- number of lines --->
<cfset total = 10>
<cfset line = "">

Total is pretty obvious. The line variable will store the characters as I read them in. I should have called it buffer, or buffy, or maybe pinkpajamas.

<!--- go to the end of the file --->
<cfset pos = fileOb.size-1>
<cfset fileSeek(fileOb, pos)>

This is where fileSeek comes in. Notice that I define my position as the size of the file minus one. That puts the pointer just before the last character, so the first read in the code coming up returns it.

<!--- go backwards until we get 10 chr(10) --->
<cfloop condition="listLen(line,chr(10)) lte total && pos gt 0">
    <cfset c = fileRead(fileOb, 1)>
    <cfset line &= c>
    <cfset pos-->
    <cfif pos gt 0>
        <cfset fileSeek(fileOb, pos)>
    </cfif>
</cfloop>

So this CFML code is pretty much exactly the same as the Java-based code. Get a character. Add it to the line. Move backwards, and loop until we hit the beginning of the file or have collected 10 lines. ColdFusion will close the file for us, but it is a good idea to do it explicitly:

<!--- close the file --->
<cfset fileClose(fileOb)>

Now we need to manipulate the string a bit. It is reversed, and it has one additional character in it, since the loop only notices it has enough lines after reading one character too many:

<!--- will always have one additional char --->
<cfset line = trim(mid(line, 1, len(line)-1))>

<!--- reverse it --->
<cfset line = reverse(line)>

And that's it! We now have a string with 10 lines from the end of the file. The complete template may be found below.

<cfset theFile = "/Applications/ColdFusion9/logs/server.log">
<cfset fileOb = fileOpen(theFile, "read", "utf-8", true)>

<!--- number of lines --->
<cfset total = 10>
<cfset line = "">

<!--- go to the end of the file --->
<cfset pos = fileOb.size-1>
<cfset fileSeek(fileOb, pos)>

<!--- go backwards until we get 10 chr(10) --->
<cfloop condition="listLen(line,chr(10)) lte total && pos gt 0">
    <cfset c = fileRead(fileOb, 1)>
    <cfset line &= c>
    <cfset pos-->
    <cfif pos gt 0>
        <cfset fileSeek(fileOb, pos)>
    </cfif>
</cfloop>

<!--- close the file --->
<cfset fileClose(fileOb)>

<!--- will always have one additional char --->
<cfset line = trim(mid(line, 1, len(line)-1))>

<!--- reverse it --->
<cfset line = reverse(line)>

<cfoutput>
<pre>
#line#
</pre>
</cfoutput>
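As a rough sketch, the same steps could also be wrapped in a reusable function. The name tailFile and its arguments are hypothetical, not anything that ships with ColdFusion. The sketch also departs from the template in two small ways: it reads the character at position 0, and it only drops the extra character when the loop actually stopped on the line count, so a file with fewer lines than requested comes back whole. It assumes single-byte characters, which is typical for a log file; reading one byte at a time could split a multi-byte UTF-8 character.

```cfml
<!--- Hypothetical helper, not part of ColdFusion: returns up to the last "total" lines of a file --->
<cffunction name="tailFile" returntype="string" output="false">
    <cfargument name="path" type="string" required="true">
    <cfargument name="total" type="numeric" required="false" default="10">

    <cfset var fileOb = fileOpen(arguments.path, "read", "utf-8", true)>
    <cfset var buffer = "">
    <cfset var pos = fileOb.size - 1>

    <!--- walk backwards one character at a time, including position 0 --->
    <cfloop condition="listLen(buffer, chr(10)) lte arguments.total and pos gte 0">
        <cfset fileSeek(fileOb, pos)>
        <cfset buffer &= fileRead(fileOb, 1)>
        <cfset pos-->
    </cfloop>

    <cfset fileClose(fileOb)>

    <!--- only drop the extra character if the loop stopped on the line count --->
    <cfif listLen(buffer, chr(10)) gt arguments.total>
        <cfset buffer = mid(buffer, 1, len(buffer) - 1)>
    </cfif>

    <!--- the buffer was built back to front, so flip it --->
    <cfreturn reverse(trim(buffer))>
</cffunction>

<cfoutput><pre>#tailFile("/Applications/ColdFusion9/logs/server.log", 10)#</pre></cfoutput>
```

Keeping the file handle and buffer local to the function means the caller only has to supply a path and a line count.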

About Raymond Camden

Raymond is a senior developer evangelist for Adobe. He focuses on document services, JavaScript, and enterprise cat demos.

Lafayette, LA

Archived Comments

Comment 1 by Neil Bailey posted on 8/21/2009 at 11:30 PM

Wanted to say a quick g'bye, Ray. Your blog has been INVALUABLE to me for...more years than I can remember. However, we've made the jump to BD.Net, and from there to pure .Net (C# - VB isn't a real language :) I'm kidding...kidding, I tell you!), and it hasn't been nearly as bad as I thought. CF was my first web scripting language, and I love it, but it is too limited and doesn't have enough third-party libraries. I appreciate all of your help over the years, and wish the entire CF community well in the future :)

Comment 2 by Raymond Camden posted on 8/21/2009 at 11:32 PM

You'll be back... I know it. ;)

Comment 3 by Neil Bailey posted on 8/21/2009 at 11:36 PM

hahahahaha I wouldn't bet on it. C# is....fantastic. Clean, organized, and SOOO many people are using it that no matter WHAT you are trying to do, someone has already written a library that does _exactly_ that. Some things aren't _quite_ as easy (DB interaction, for example), but it's not NEARLY as bad as you might think (instead of one command, you need three). CF has served me _extremely_ well to this point, and I have made a very good living for...12 years or so...w/ it. But... in the end, I have to tell you, I honestly believe C# is better.

Comment 4 by Emmet posted on 8/21/2009 at 11:38 PM

libraries for what? I'm still trying to get my head around that argument. and fileseek is coolbeans.

Comment 5 by Neil Bailey posted on 8/21/2009 at 11:47 PM

Libraries for literally ANYTHING. For example, if you want charting packages, in CF, your choices are fairly limited, ESPECIALLY if you don't want to use flash charts. In C#, there are DOZENS of choices. Want UI controls like WYSIWYG editors and what not, boom - dozens. What's that you say, you need....almost _anything_, and there is a library that does it. The community is absolutely ENORMOUS, and while _obviously_ not as knowledgeable, helpful and friendly as the CF community (:)), the very size of the community normally means that if you are struggling w/ something, someone else has already found a solution.

We made the switch to .Net w/ our dev team kicking and screaming, but we discovered that anyone w/ any kind of Java, C, C++, etc experience will have no problem at all.

I don't mean to sound all doe-eyed about it, but we've been using it for about six months now, and so far we've been _very_ impressed.

Comment 6 by Raymond Camden posted on 8/21/2009 at 11:49 PM

Well first off - I don't want to get too off topic here, but let's be fair. There are multiple charting libraries you can use with CF. It _ships_ with one. Does .Net ship with multiple or do you have to download them? If you have to download them, then this is no better/worse than CF. Actually, I'd say worse if there isn't one built in.

Comment 7 by Neil Bailey posted on 8/21/2009 at 11:54 PM

haha I am definitely _not_ bashing CF. At _all_. And yes, CF does ship w/ a charting package, but we had always found it a bit limited - though our last version of Adobe CF was 7. But I would be willing to argue that anyone who claims that the CF charts are as good as the Dundas package is just...biased.

And, while .Net doesn't ship w/ a charting package, it's going to take a LOT of packages to make up the difference in licensing costs between a base .Net server and a base CF server...

I really didn't mean to hijack your blog post, Ray. My apologies.

Comment 8 by Raymond Camden posted on 8/21/2009 at 11:56 PM

Heh no problems. Flame War Mode: Off. ;)

Comment 9 by Leigh posted on 8/22/2009 at 12:50 AM

Too bad there is no fileSeekLine (..) function in CF9. That would be sweet ;-)

Comment 10 by david buhler posted on 8/22/2009 at 1:02 AM

Nothing surpasses the capability and control of the Flex Charting Components.

I do like Adobe's approach of becoming a major vendor packager for CF. I like the cleaner syntax and the IDE for improved productivity. But it remains to be seen if a scripting language that comes fully bundled with some impressive plug-ins can make the language more compelling.

I have seen plenty of companies migrate from CF to .NET and Java, but not the other way around. To do so, Adobe would have an impressive argument. That's my 2cents worth of hijacking.

Comment 11 by Raymond Camden posted on 8/22/2009 at 1:23 AM

@Leigh: I would assume that would be impossible. By that I mean impossible w/o crawling. You _have_ to check every char for a new line since a newline is just like any other char. The file data itself just treats it all like one long string. (afaik)

Comment 12 by jh posted on 8/22/2009 at 5:07 AM

Ray, as you already said, "...this CFML code is pretty much the exact same as the Java-based code." So what's the value-add of using FileSeek? Why not just use the Java code? I don't understand why Adobe keeps adding functions that are just simple "wrappers" around Java, and serve no purpose other than to hide the underlying Java code. Why not just teach people how to use Java. "CF is Java", right?

Comment 13 by Raymond Camden posted on 8/22/2009 at 5:30 AM

Well by that token, since CF is Java, it is 100% useless. :) Why have cffile. Why have cfhttp. Etc. Now fileSeek isn't that much simpler than the Java version, but it does save you from having to create the Java objects.

Comment 14 by Adam Cameron posted on 8/22/2009 at 12:05 PM

Hi Ray
As per an earlier conversation (on another "forum"), Java takes care of the line reading, so CF doesn't need to read char-by-char to find the EOL marker. All it needs to do is call readLine() as many times as requested, and discard all bar the last line.

It's easy.

As for why one should augment CF to do this when it can be done in Java? I think the person asking the question assumes an awful lot about the capabilities of the "average" CF user. Java file ops are somewhat more complicated than CF's ones.



Comment 15 by Raymond Camden posted on 8/22/2009 at 5:27 PM

@Adam - I was reading Leigh as saying you could get a line "magically" - I'm sure (not 100%) that Java's readLine still checks chars. S/He had said: "Too bad there is no fileSeekLine (..)" To me, that isn't readLine N times, but go to line N immediately, which is very different if you are asking for line 1M out of a 10M line log file. ;)

Comment 16 by Rupesh Kumar posted on 8/22/2009 at 8:35 PM

+1 to Ray.

@Adam, as Ray mentioned already, FileSeek(pos) is not the same as reading that many bytes. It takes the file pointer to that position. Also, FileSeek can take the file pointer anywhere without reading anything from the stream. It can even take the pointer back to the beginning of the file after you have reached the end of the file. You can't do that with fileRead or fileReadLine.

Comment 17 by David McGuigan posted on 8/22/2009 at 9:28 PM

Neil, do you really feel like Visual C# is "clean and organized"?

It feels REALLY hacky to me that my HTML elements get littered with .net directives before I use any features that would warrant that and that my classes require so much setup, verbosity, and conformity. That I can only have one "codeFile" per page. Just the fact that it's strictly typed is already a thorn-in-your-heel for rapid and concise web development.

It's cute that VisualStudio will present me an enhanced tree view of my files' relationships to each other, but little bandaids and lollipops like that can't get it anywhere near as rapid and flexible to develop with as CFML, however much I applaud them for trying.

Beyond the fact that it is noticeably, measurably slower to develop with than CFML ( my opinion? harsh reality? ), I feel like you're making the same argument for .NET that people make for PHP, with the exception that you're sidestepping the issue that as a .NET developer you have to drop $500-4,000.00 PER VIRTUAL MACHINE on your operating system licenses. ColdFusion is one license per 2 CPUs ( unlimited VMs!!!! ), can run on a shizload of free OSs, and by that metric is exponentially CHEAPER than ASP.NET, before we even get into the developmental savings suggested in the ColdFusion evangelism kit. And if you really want to bring price into the discussion, *cough* railo *cough* CFML doesn't cost a dime.

Now PHP is totally free, and I'd bet that even MORE people are using it than are using .NET, so maybe you should check PHP out. It kind of sounds like your dream come true. Especially if your primary concern is being able to plug in code that other people have written. Open Sore Software definitely has its benefits.

Now I'm not saying ASP.NET is hard. It's not. Neither is PHP. Neither is Ruby on Rails. Lasso, on the other hand, is incredibly difficult ( kidding ).

But what I am saying is that ASP.NET is hacky and much, much, much less flexible and dynamic than ColdFusion. The irony is that I'm a HUGE Microsoft fan.

Microsoft does make some ridiculously great products: Windows 7. Server 2008. SQL Server 2008 ( I love and use MySQL but SQL Server absolutely dominates it in every way possible ). 360. Zune HD. They're all the best products in their classes ( opinion, you OS X and iPod kids don't need to chime in here thanks ).

But from where I'm standing, ASP.NET is one of its ugly step children. It just can't compete with ColdFusion for fast, quality, maintainable web application development.

Comment 18 by Raymond Camden posted on 8/23/2009 at 12:42 AM

Folks - please discontinue any more off topic comments. A few were definitely ok, but let's consider that part of the thread done with. Thanks!

Comment 19 by Adam Cameron posted on 8/26/2009 at 7:50 AM

Yes, thank-you Rupesh.

All I meant is that Ray was slightly off in that /the developer/ (be it the person writing some CF code, or the bod on the CF development team writing the fileSeekLine() function for CF itself) /does not/ need to check char by char.

And, yes, obviously readLine(), under the hood, is going to have to check the value of each byte. That kind of goes without saying.

Vis-a-vis scanning the file (line by line, or char by char) as opposed to moving the file pointer directly... other than a performance hit (and I'm not trivialising that), what's the difference to the end user of simply setting the file pointer compared to scanning the file (by doing a series of readLine() calls for example), and then setting the pointer at the desired location? Or is it simply that the overhead of doing the scan is sufficiently prohibitive compared to normal random-access operations?


Comment 20 by Leigh posted on 8/27/2009 at 1:50 AM

All the comments you miss when you forget to subscribe...


No, I was not suggesting you could avoid crawling the file (by character, chunk of bytes, whatever..). Eventually you will have to examine individual characters to identify a 'line of text'.

What I meant was it would be nice to have a built-in function that handled the mundane parts of that task. In particular, I was thinking of your example of reading the "tail" of a file. Having a built-in function for that would certainly simplify things. It would also help users avoid some of the common pitfalls of working with lower level file i/o.