In the ColdFusion IRC channel today, someone asked about reading just the top portion of a file. While she was looking for a command line solution and not ColdFusion, I thought it would be interesting to share how easy it is in ColdFusion 8 using the new file attribute to CFLOOP. This code will loop over the first ten lines of a file and display them:
<cfset myfile = server.coldfusion.rootdir & "/logs/server.log">
<cfset c = 0>
<cfloop file="#myfile#" index="line">
<cfoutput>#line#<br /></cfoutput>
<cfset c++>
<cfif c gte 10>
<cfbreak>
</cfif>
</cfloop>
I first create a variable to point to my server.log file. I then create a counter variable "c". Then I simply use the file attribute for cfloop to loop over the file. When I hit 10 lines, I break. No matter how big the file is, this code will run extremely fast as it won't need to parse in the entire file. My server.log file could be 10 gigs and this would still run quickly.
But wait, it gets better. TJ Downes pointed out that you can provide a FROM and TO and the tag will loop over just a slice, or portion, of the file. This is not documented as far as I know. The following code is shorter and, for a file with at least ten lines, equivalent to the earlier listing:
<cfset myfile = server.coldfusion.rootdir & "/logs/server.log">
<cfloop file="#myfile#" index="line" from="1" to="10">
<cfoutput>#line#<br /></cfoutput>
</cfloop>
One thing to watch out for: if you try to read beyond the end of the file, you will get an error. In that case, the first listing would be safer as it would support a file of any length.
Archived Comments
That would have come in really handy back when I was doing a flat-file db conversion. Much easier to just read it in one line at a time instead of parsing it out of one big chunk of text.
If I recall, the person who told me about this said that performance-wise it was far faster than parsing line by line, especially if you do not need to start at the top of the file. I haven't run the tests to be certain.
Given that, I think I would just toss in a cftry to catch the EoF error and handle it elegantly using a break. I think I'll run some tests to see how much of a performance gain you get. Now just to find a massive log file......
TJ, you are indeed right. cfloop/file and the new file functions are all faster than the cffile tag, which reads the entire file into memory.
Unfortunately you can't break out of the loop with an EoF error. It's the tag itself reaching the EoF, and catching that simply stops processing of the page. So I guess the rule of thumb for using the from and to attributes when reading a file is that you must know the file's length.
Couldn't you just get the file size and use that as a limiting factor for the "to" attribute (make it a variable and throw in some logic before the loop: if x gt file length, x = file length)?
Eric
File size though has nothing to do with the number of lines. You could have one VERY long line and a bunch of small ones.
It is documented in the error/debug information sent to the browser. Now to find the time to break every tag to find hidden documentation.
Attribute validation error for the CFLOOP tag.
# The tag has an invalid attribute combination: condition,file,index. Possible combinations are:Required attributes: 'file,index'. Optional attributes: 'charset,from,to'.
TJ,
You can do the following:
<code>
<cfset myfile = server.coldfusion.rootdir & "/logs/server.log">
<cftry>
--------BoF----------<br />
<cfloop file="#myfile#" index="line" from="1" to="10">
<cfoutput>#line#<br /></cfoutput>
</cfloop>
<cfcatch>
--------EOF-------<br />
</cfcatch>
</cftry>
....More Processing....
</code>
All lines are output, no error is displayed to the user, and it allows you to process the rest of the page.
Mike, to make your code more safe, you should check the exception type in catch. For example, if I provide a file that doesn't exist, an error will be thrown, but you don't want to ignore that error.
Thanks Mike, I thought I tried that method and got the EoF error... I'll have to try it again.
TJ,
I tried it on my fusion reactor logs and it worked great. Couldn't be better timing as I was working with files.
Ray,
In this instance couldn't I just do fileExists() before the loop? Otherwise, how would you recommend doing the try/catch in this instance?
Mike - sure, fileExists would catch it, but it wouldn't catch CF not being able to read it. getFileInfo would check that. My point is though - you can have cfcatch look for a specific type of exception. You would want to do that so you ONLY catch that error, and not others.
One way to break out of the loop at the end of the file is to catch the actual exception thrown. Here is the catch statement that you can use:
<cfcatch type="coldfusion.tagext.io.FileUtils$EndOfFileException">
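To put that catch in context, here is a minimal sketch that combines it with the from/to loop from earlier in the thread. It assumes the exception type named above is what your ColdFusion version actually throws for this case, which I have not verified across versions. The maxLines variable is a hypothetical name added for illustration.

```cfm
<cfset myfile = server.coldfusion.rootdir & "/logs/server.log">
<!--- maxLines is a hypothetical name, not part of the original listings --->
<cfset maxLines = 10>
<cftry>
    <cfloop file="#myfile#" index="line" from="1" to="#maxLines#">
        <cfoutput>#line#<br /></cfoutput>
    </cfloop>
    <!--- Handle only the end-of-file case. The type string is taken from
          the comment above and is assumed to match your CF version. --->
    <cfcatch type="coldfusion.tagext.io.FileUtils$EndOfFileException">
        <!--- The file had fewer than maxLines lines; nothing more to read. --->
    </cfcatch>
</cftry>
```

Because the cfcatch names one specific type, other problems (a missing or unreadable file, for example) are not swallowed and should still surface as errors.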
On a related note, I blogged about it a while back at
http://coldfused.blogspot.c...
http://coldfused.blogspot.c...
A better and more elegant way to do what you want is to use the new file I/O functions with a file handle, where you can actually check whether you have reached the end of the file.
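As a rough sketch of that file-handle approach, the following reads up to ten lines using the ColdFusion 8 functions fileOpen, fileIsEOF, fileReadLine, and fileClose. The variable names (maxLines, fileObj) are my own and only illustrative.

```cfm
<cfset myfile = server.coldfusion.rootdir & "/logs/server.log">
<!--- maxLines and fileObj are illustrative names, not from the original post --->
<cfset maxLines = 10>
<cfset c = 0>
<cfset fileObj = fileOpen(myfile, "read")>
<cftry>
    <!--- Stop at maxLines or at end of file, whichever comes first --->
    <cfloop condition="c lt maxLines and not fileIsEOF(fileObj)">
        <cfset line = fileReadLine(fileObj)>
        <cfoutput>#line#<br /></cfoutput>
        <cfset c++>
    </cfloop>
    <cfcatch type="any">
        <!--- Release the handle, then let the original error propagate --->
        <cfset fileClose(fileObj)>
        <cfrethrow>
    </cfcatch>
</cftry>
<!--- Normal path: release the handle once reading is done --->
<cfset fileClose(fileObj)>
```

Since the loop condition checks fileIsEOF before every read, a file shorter than ten lines should simply end the loop early instead of raising the end-of-file error.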
Thanks for adding those links Rupesh.
The _one_ function I wish existed was a FileSeek. That would be useful for jumping to a position in a file (like to examine MP3 files)
Just curious, does anyone know how cf8 knows the end of a line? chr(10) or chr(13) or both?
There is a Java property, or method for it, so I assume they use that. Or they sniff the first part of the file.