Sal asks:
just curious what's the best way (or how you handle) to truncate a paragraph to only show say perhaps 500 chars.? I have a newsletter that I'm emailing out, and I only wanna show 500 chars. of each article in the email.
Ah, I love it when folks ask me the "best" way to do things since no matter what I say, I'm not wrong (grin). Seriously though - here are multiple ways to trim text.
Let's first start off with a block of text that we will use for our tests:
<cfsavecontent variable="quote">
The Constitution is not an instrument for the government to restrain the people, it is an instrument for the people to restrain the government -- lest it come to dominate our lives and interests. Patrick Henry.
</cfsavecontent>
So the quickest way to trim text is with left:
<cfoutput>#left(quote,100)#</cfoutput>
However if you use this on the text, you get:
The Constitution is not an instrument for the government to restrain the people, it is an instrumen
As you can see, the last word in the trimmed text, instrument, was cut off before the final t. This isn't a horrible thing of course, but it could be done better. ColdFusion does ship with a Wrap function, but that won't crop the text, it will simply break the text into lines of a certain length. It will break the text nicely though, so why not use list functions?
<cfoutput>#listFirst(wrap(quote,100),chr(10))#</cfoutput>
This returns a nicer trim:
The Constitution is not an instrument for the government to restrain the people, it is an
This works nicely, but I kinda feel 'dirty' doing it like this, so why not see if a UDF exists for this? Turns out one does: FullLeft. This UDF lets me do this instead:
<cfoutput>#fullleft(quote,100)#</cfoutput>
In theory it's doing a lot less work than wrap so it should be quicker.
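If you would rather not pull in a library, the same idea can be sketched by hand. To be clear, this is not the FullLeft source from cflib. It is a rough illustration, and the name wordSafeLeft is made up for this example. It assumes maxLen is at least 1.

```cfml
<cfscript>
// Hypothetical helper (not the cflib FullLeft code): returns at most
// maxLen characters of str without cutting the last word in half.
// Assumes maxLen >= 1.
function wordSafeLeft(required string str, required numeric maxLen) {
    var text = trim(arguments.str);
    var head = "";
    var posFromEnd = 0;

    // Short enough already: nothing to cut.
    if (len(text) lte arguments.maxLen) {
        return text;
    }

    head = left(text, arguments.maxLen);

    // If the next character is whitespace, the cut already falls
    // on a word boundary.
    if (reFind("[[:space:]]", mid(text, arguments.maxLen + 1, 1))) {
        return rtrim(head);
    }

    // Otherwise back up to the last whitespace inside head.
    posFromEnd = reFind("[[:space:]]", reverse(head));
    if (posFromEnd eq 0) {
        // One long word with no whitespace: fall back to a hard cut.
        return head;
    }
    return rtrim(left(head, len(head) - posFromEnd));
}
</cfscript>
<cfoutput>#wordSafeLeft(quote, 100)#</cfoutput>
```

Note that this sketch trims the input first, so the leading line break that cfsavecontent captures does not count toward the limit.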
Ok, so we're done, right? Well, what if we modify the quote a bit:
<cfsavecontent variable="quote">
The <a href="http://www.raymondcamden.com">Constitution</a> is <b>not</b> an instrument for the government to restrain the people, it is an instrument for the people to restrain the government -- lest it come to dominate our lives and interests. Patrick Henry.
</cfsavecontent>
As you can see I've added some HTML to the text. This HTML messes up my count. If I wanted to show 100 characters, I don't think I'd want HTML to count at all. In fact, I probably don't want to show HTML at all. I can fix that easily enough:
<cfset quote = rereplace(quote, "<.*?>", "", "all")>
Another issue is space. Now this is a contrived example, but it could happen in a live system:
<cfsavecontent variable="quote">
The <a href="http://www.coldfusionjedi.com">Constitution</a> is <b>not</b>
an
instrument for the government to restrain the people, it is an instrument for
the people to restrain the government -- lest it come to dominate our lives and interests.
Patrick Henry.
</cfsavecontent>
You can use another regex to handle this:
<cfset quote = rereplace(quote, "[[:space:]]+", " ", "all")>
Alternatively, if you use the wrap() function, it takes a 3rd argument to strip out existing line breaks and carriage returns.
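Here is a quick sketch of that wrap() variant. I'm assuming the third argument (strip) is passed as true, and I'm listing both chr(13) and chr(10) as delimiters since the line break that wrap() inserts can differ by platform. The variable name firstLine is just for illustration.

```cfml
<!--- strip=true replaces existing line breaks before wrapping --->
<cfset firstLine = listFirst(wrap(quote, 100, true), chr(13) & chr(10))>
<cfoutput>#firstLine#</cfoutput>
```

This only deals with line breaks. Runs of spaces or tabs would still need the regex above.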
Lastly - it sometimes helps to visually flag text that has been trimmed. Normally this is done with a "...". You can mimic this effect like so:
<cfif len(quote) gt 100>
<cfset trimmedQuote = fullLeft(quote, 100)>
<cfset trimmedQuote &= "...">
<cfelse>
<cfset trimmedQuote = quote>
</cfif>
<cfoutput>#trimmedQuote#</cfoutput>
I just check the length of the original quote and conditionally perform a trim and add the "...".
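If you want all of these steps in one place, something like the following could work. Treat it as a sketch: makeExcerpt is a made-up name, the default of 100 is arbitrary, and it uses the wrap() trick for the word-safe cut so it doesn't depend on any UDF being loaded.

```cfml
<cfscript>
// Hypothetical helper combining the steps from this post.
function makeExcerpt(required string html, numeric maxLen = 100) {
    // 1. Remove anything that looks like an HTML tag.
    var text = reReplace(arguments.html, "<.*?>", "", "all");

    // 2. Collapse runs of whitespace into single spaces.
    text = trim(reReplace(text, "[[:space:]]+", " ", "all"));

    // 3. Short text needs no trimming and no "...".
    if (len(text) lte arguments.maxLen) {
        return text;
    }

    // 4. wrap() breaks on word boundaries, so the first line
    //    is a clean cut. Flag the result as trimmed.
    return trim(listFirst(wrap(text, arguments.maxLen), chr(13) & chr(10))) & "...";
}
</cfscript>
<cfoutput>#makeExcerpt(quote, 100)#</cfoutput>
```

The "..." is added after the cut, so the result can run three characters past maxLen. Subtract 3 from the wrap limit if that matters for your layout.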
Archived Comments
nice, detailed post, might come in handy soon. thanks ray!
Wow. I've had to do this many times but have never put that much thought into it. Thanks for the great solution.
could be better off trimming it at source to avoid unnecessary db traffic.
in mssql select the column with something like:
substring(yourTextyCol,1,100)
then stick your "..." after it
@Luke - Well this suffers the same problem as Left() does. However, you do have a point - it may make sense to do the 'nice left' once and store the result.
thanks yo!
;-)
I never thought of using the Wrap function to do this - that's neat. My custom function uses Find and Left:
plaintext = ReReplaceNoCase(htmltext, "<[^>]+>", " ", "all");
Return Left(plaintext , Find(" ", plaintext, 100)) & "…";
Another thing to keep in mind is HTML entities. You may wish to convert the entities to ASCII text to get a better character or word count. So if your user entered something like this:
Using characters like “é”, “ü”, “etc”. is ok.
...would be converted to this...
Using characters like “é”, “ü”, “etc”. is ok.
I ran into this problem awhile back and created a nice little JavaScript function to do this, but it could easily be done in ColdFusion as well.
oops, that first line got converted. It should have been:
Using characters like &ldquo;&eacute;&rdquo;, &ldquo;&uuml;&rdquo;, &ldquo;etc&rdquo;. is ok.
Excellent point Doug. It may even be worthwhile to just delete them. Now that may result in some odd misspellings - but it may be the simplest solution.
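A rough sketch of the "just delete them" idea, assuming the entities are well formed (an ampersand, a name or number, then a semicolon). The function name stripEntities is invented for this example. Note the doubled pound signs, which ColdFusion needs inside a string literal.

```cfml
<cfscript>
// Hypothetical helper: removes named, decimal, and hex HTML entities.
function stripEntities(required string text) {
    return reReplaceNoCase(
        arguments.text,
        "&(##x[0-9a-f]+|##[0-9]+|[a-z][a-z0-9]*);",
        "",
        "all"
    );
}
</cfscript>
<cfoutput>#stripEntities(quote)#</cfoutput>
```

Run this before counting characters, so that an entity does not count as six or seven characters when it displays as one.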
yeah, if you don't have to worry about HTML entities, special characters etc., you could do this in MySQL via
SELECT CONCAT( LEFT( TextToSelect, 500 ), '...' ) FROM Blah
Awesome post, with consideration of the HTML. Nice.
What if the text contains HTML-tags like <b>, <i>, <a> etc.
I have had trouble wrapping text containing these kind of tags. The problem is when it cuts the text between a start tag and an end tag.
@Mikkel: Um.... you did read the blog entry, right? I cover HTML.
@Ray: I did read the part where you replace any tag with "blank".
My "question" should have been: What if I want to keep the html-tags without breaking the start/end-tag when wrapping the text.
Ah - that gets significantly more complex perhaps. You could do this:
1) Remove html
2) Find FullLeft(N)
3) If fullLeft(n) ends at "the", go back to original content (with html), find "the", and end there.
That would let you keep the html and wrap at text not including html, but the N value would be <N as you didn't count the html. Another issue is that it wouldn't stop you from ending with <b>the and having an unmatched tag.
You could write code to determine if your fullleft(n) result is inside HTML. This is done by looking for <X> </X> around your result. If you find it, you either move to the end of </x> or go to before <x>.
@Mikkel: My "question" should have been: What if I want to keep the html-tags without breaking the start/end-tag when wrapping the text.
You would almost need to create some sort of HTML parser for that. Have you ever looked at the HTML source for a ColdFusion error message? If you notice, it adds a bunch of close tags (</b></p></td></tr></table>...) before it adds the Error message source. It's not calculating those tags, it's just adding a bunch of them to be safe and they don't always work.
Most likely you could create a regular expression to find all the <BLOCK> tags, and if any of them were still open, you could add their closing tags to the end. I think that would be crazy complicated and would have to ask if it's worth it.
I used this solution that Ben Nadel came up with to close truncated html. It does a pretty good job.
http://www.bennadel.com/blo...
Is it possible to trim all text around a tag. For instance trim all the text in your example before or after "<a href="http://www.coldfusionjedi.com">Constitution</a>" ?
In theory. You would write a regex to match
(1 or more spaces)(link including closing a tag)(1 or more spaces)
and replace with
(link)
Let me give it a try.
Not heavily tested, but this seems to work. I assumed you meant replace two or more with one:
<cfset text = rereplacenocase(text, "[[:space:]]+(<a.*?>.*?</a>)[[:space:]]+"," \1 ")>
If you really want NO space, period, just change the 3rd arg to be just \1, not (space)\1(space).
What I am ultimately trying to do is add 'target="_blank"' to an a tag with an external href. I was looking at trying to trim a string provided by a webservice down to just the <a> tag and using javascript for all external links. Possibly I could add 'target="_blank"' with coldfusion? Do you know any methods?
Oh, that's simpler. You can't do it (afaik) in one line, but just get all the links (use reMatch in cf8) and then replace any non-local link with the modified version.
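Something along these lines, as an untested sketch. The function name addBlankTarget is made up, and it assumes that "non-local" simply means an href starting with http:// or https://, that href values are double-quoted, and that links which already have a target attribute should be left alone.

```cfml
<cfscript>
// Hypothetical helper: adds target="_blank" to absolute http(s) links.
function addBlankTarget(required string html) {
    var result = arguments.html;
    var link = "";
    // Collect every opening <a> tag whose href is an absolute URL.
    var links = reMatchNoCase(
        "<a[[:space:]][^>]*href=""https?://[^""]*""[^>]*>",
        result
    );

    for (link in links) {
        if (not findNoCase("target=", link)) {
            // Rebuild the tag with the attribute added just before
            // the closing bracket, then swap it into the text.
            result = replace(
                result,
                link,
                left(link, len(link) - 1) & " target=""_blank"">",
                "all"
            );
        }
    }
    return result;
}
</cfscript>
<cfoutput>#addBlankTarget(quote)#</cfoutput>
```

If your own site's links are also written as absolute URLs, you would need an extra check on the host name before modifying each tag.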
I know it's not ColdFusion, but you could use jQuery to do this for you quite easily (assuming all external links start with http):
$('a[href^="http://"]').attr("target", "_blank");
Or if you want to get fancy:
$('a[href^="http://"]').attr({target: "_blank", title: "Opens in a new window"});
Hope that's of interest.
@John: Very much of interest. To quote the great Paris: "That's hot."
It's nice to teach you something Raymond after all I've learnt from you :)
Just noticed my comment didn't come out right. There shouldn't be a semi-colon after the http. I'll try posting again in case it was my typo!
$('a[href^="http://"]').attr({target: "_blank", title: "Opens in a new window"});
I've posted the code here, (with a bonus feature!) if anyone's interested
http://www.aliaspooryorik.c...
:)
My front end is a flex application. I thought if I did the modification on the backend before the links got called that it would save time and coding on the front end.
I assume I would have to have an ExternalInterface in the flex actionscript to communicate with the jQuery code? It would be great if it would automatically detect and append the code.
I do have to append a user code to the end of each external link, so I am interested in how jQuery works. I haven't started this part of the project yet, where is the best source for this resource?
Thanks for your help.
Nice one Raymond. I often find myself on your website having googled 'coldfusion #whatever-i'm-stuck-on-at-the-time#' and generally i've got a head full of code so once ive got the solution i need i'm back to sublime text to implement it and crack on (without commenting) - but i just wanted to say 'thank you' as your posts have been really helpful over the years on quite a few occasions - thank you.
You are most welcome!
I know this one is "ancient", but I just stumbled upon it and found it to fit exactly what I needed. Thank you
You are welcome. I cleaned up the code samples.
Thanks again, I just wanted to let you know that this "old" information is still helping people.
The link for the UDF above is broken, I found it at http://cflib.org/udf/FullLeft.