Ask a Jedi: ColdFusion Search Engine Safe URLs versus URL Rewriting

This post is more than 2 years old.

Dan the Man asks:

Is there any advantage to using the CF SEF URLs versus just doing a mod_rewrite in Apache?

I've used various forms of mod_rewrite in the past, and I prefer not having a period in the middle of the URL, however, I'm wondering if there are any advantages that I'm missing out on.

I'm currently using the following for my mod_rewrite:

[deletia]

This results in "www.website.com/page/url/param" being translated to "www.website.com/page.cfm?id=url/param". It also checks to make sure that a real directory or file doesn't exist before writing the subdirectory to the URL parameter.

So the short answer is that I think using the web server's SES facilities will give you more power than trying to do it in ColdFusion. But let's look at some pros and cons.

If you decide to use ColdFusion for SES URLs, then the main advantage is that you can remove one more dependency from your application. URL rewriting is built into Apache (via mod_rewrite), but for IIS you have to use a plugin, and even then you still have one more thing to manage. Now this is not really a big deal, but if you are building software to sell, or doing open source work, then it matters since you have to both document it and support it.

If you use the web server, you get the benefit of using code that was expressly written for that purpose. Your ColdFusion code gets simpler as well since you don't have to worry about handling it all. You can also get more fancy with your URLs. If you look at CFLib you will notice no .cfm or url parameters anywhere in sight. This wouldn't be possible (as far as I know) with just ColdFusion itself.

About Raymond Camden

Raymond is a senior developer evangelist for Adobe. He focuses on document services, JavaScript, and enterprise cat demos.

Lafayette, LA https://www.raymondcamden.com

Archived Comments

Comment 1 by Rick Smith posted on 6/12/2008 at 6:20 PM

[quote]If you look at CFLib you will notice no .cfm or url parameters anywhere in sight. This wouldn't be possible (as far as I know) with just ColdFusion itself.[/quote]

So how did you accomplish that?

Comment 2 by Raymond Camden posted on 6/12/2008 at 6:24 PM

IIRF. Free URL rewriter plugin for IIS. I also used that on ColdFusionCookbook.com

Comment 3 by Misha posted on 6/12/2008 at 6:54 PM

Another way that is possible using only CF:

Custom Tag:
*******************************************************
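<!--- Look for ".cfm/" in PATH_INFO; whatever follows it is read as /name/value pairs --->
<CFSET strPos = Find('.cfm/', CGI.PATH_INFO, 2)>
<CFIF strPos>
    <!--- Number of characters that follow ".cfm/" --->
    <CFSET n = Len(CGI.PATH_INFO) - strPos - 4>
    <CFIF n GT 0>
        <CFSET URL_QUERY_STRING = "">
        <CFSET str = Right(CGI.PATH_INFO, n)>
        <!--- num flips between 0 (current item is a name) and 1 (current item is a value) --->
        <CFSET num = 0>
        <CFLOOP INDEX="loop" LIST="#str#" DELIMITERS="/">
            <CFIF num>
                <CFSET num = 0>
                <CFTRY>
                    <!--- mem holds the previous item (the name); loop is its value --->
                    <CFSET "URL.#mem#" = loop>
                    <CFSET URL_QUERY_STRING = ListAppend(URL_QUERY_STRING, "#mem#=#loop#", "&")>
                    <CFCATCH TYPE="Any">
                        <!--- Skip names that are not valid variable names --->
                    </CFCATCH>
                </CFTRY>
            <CFELSE>
                <CFSET num = 1>
            </CFIF>
            <CFSET mem = loop>
        </CFLOOP>

        <!--- Hand the rebuilt query string back to the calling template --->
        <CFSET QUERY_STRING = URL_QUERY_STRING>
        <CFSET CALLER.URL_QUERY_STRING = URL_QUERY_STRING>
        <CFSET CALLER.QUERY_STRING = URL_QUERY_STRING>

    </CFIF>
</CFIF>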
**************************
in Application.cfm file put:

<CF_URLARGS struct="URL">

Comment 4 by Raymond Camden posted on 6/12/2008 at 7:01 PM

Misha - no one denies you can do it in CF. The point was: what are the differences? As my blog says, you can do it a bit better if you do it at the server level.

Comment 5 by Dan posted on 6/12/2008 at 7:04 PM

I'm the one who originally asked the question...

My current Apache setup uses the following rewrite:

RewriteEngine on
RewriteCond E:/directory/%{REQUEST_FILENAME} !-f [NC]
RewriteCond E:/directory/%{REQUEST_FILENAME} !-d [NC]
RewriteCond E:/directory/%{REQUEST_FILENAME} !-l [NC]
RewriteRule /([A-Za-z0-9-]+)/(.*) /$1.cfm?param=$2 [L,PT]

It seems to be working pretty well, as it only rewrites if a real file does not exist at the path you specify.

Comment 6 by Justin posted on 6/12/2008 at 7:16 PM

Misha -

That requires a cfm page to process. It would be great if we didn't need the index.cfm.

There are a lot of SES rewriters out there. Fortunately, it gets easier, more robust, and standardized the more we go along. ColdBox has excellent SES tooling and documentation.

More on par with the post, another situation is whenever you have administrators that use IIS and are afraid of url rewriting plugins. Furthermore, without having control over the plugin or Apache because of separation of duties or departments, would you want to develop applications that rely on it? More "it depends" situations.

Comment 7 by Todd Rafferty posted on 6/12/2008 at 7:29 PM

Not sure if anyone saw this, but MS finally started supporting a rewrite module:
http://learn.iis.net/page.a...

Comment 8 by Raymond Camden posted on 6/12/2008 at 7:37 PM

Wow. Next thing you know they will (easily) support virtual hosts on the non-Server edition. Then pigs will fly...

Comment 9 by Jeremy Prevost posted on 6/12/2008 at 7:47 PM

@Todd. I had missed that and it's good to finally see, but it's disappointing to see they made a mod_rewrite importer rather than just supporting the gosh darn mod_rewrite syntax directly. <sigh>.

Comment 10 by John posted on 6/12/2008 at 8:31 PM

While we're on the subject (and since I'm not a regex guru), could someone please explain how to translate the following with IIRF?

folder/id/000
into
folder/scriptname.cfm?id=000
(scriptname may be index.cfm or otherscriptname.cfm, with an occasional &subid=000)

I've been futzing around with IIRF for a couple of hours now and I haven't been able to get it working.

Thanks!

Comment 11 by Raymond Camden posted on 6/12/2008 at 8:35 PM

This should do it.

RewriteRule /folder/id/([0-9]+) /folder/scriptname.cfm\?id=$1

Comment 12 by James Moberg posted on 6/12/2008 at 8:50 PM

I used IIRF on a project and then had to move the CF-driven website to another web server that didn't have IIRF installed. It was a pain researching a new method just because of a server change.

As a result, I do it primarily with CF only so that I don't have to revisit this problem ever again.

I'm using "index.cfm" and then appending extra data to the end of it. I have a custom UDF to ensure that all dynamically generated SES data is SES safe.

For an example of what I'm doing, check out:
http://www.hulaisland.com/

Everyone should review their logfiles when setting this up. I had to write my own logging tool since IIS isn't capable of logging anything that isn't the actual script name or CGI parameter. (In other words, all page views are logged by IIS as "/index.cfm" with no additional information.)

Ray, how do the log files look using SES URLs on Apache?

Comment 13 by Raymond Camden posted on 6/12/2008 at 9:54 PM

James - the logs look like you would expect - the SES urls. You can additionally do logging for SES as well to log the rules run, etc. Useful for debugging.

Comment 14 by John posted on 6/12/2008 at 10:12 PM

Thanks for the sample Ray. I was able to get it working with just a quick change to accommodate alphas & periods. It didn't play too well with my existing SES code, which kept "index.cfm" in the URL, so I killed that :)

Comment 15 by Jake Munson posted on 6/12/2008 at 10:44 PM

"As my blog says, you can do it a bit better if you do it at the server level."

It seems to me that with ColdFusion's functions combined with regex, you can't get more powerful than that. So I'm curious how the web server can do it better. I am not disagreeing, I'm just wondering if I'm missing out on something...

Comment 16 by Raymond Camden posted on 6/12/2008 at 10:50 PM

With CF, you can't have a URL w/o index.cfm in it. I'd have to do URLs like you see in the blog here (index.cfm/etc) as opposed to urls you see at cflib. That isn't the end of the world of course, but if you want 100% control, you have to go out of CF.

Comment 17 by James Moberg posted on 6/12/2008 at 11:33 PM

Yeah... the "dot-in-path" because of "index.cfm" can sometimes be a showstopper with IIS depending upon how IIS lockdown is configured.
http://support.microsoft.co...

The reasons I prefer using CF for this is because it can sometimes do things that the server can't do. For example, URL rewriters are incapable of performing non-WWW functions like querying a database to check if an IP is in a dynamic blacklist.

It also allows our team to quickly add new rules without having to additionally mess with a sensitive server-wide configuration file. (Sometimes a development team may not even have access to this, or has to rely on a third party to make the changes.)

That being said, I also depend heavily on IIRF to block access to non-CF resources in directories based on static access rules. Any unauthorized users get an authentic 404 Page Not Found error and not just a "200" status with a page that states "Not Found".

Comment 18 by Elliott Sprehn posted on 6/13/2008 at 11:46 PM

"The reasons I prefer using CF for this is because it can sometimes do things that the server can't do. For example, URL rewriters are incapable of performing non-WWW functions like querying a database to check if an IP is in a dynamic blacklist."

This isn't actually true, as Apache's mod_rewrite allows you to call back into an arbitrary process on the server. See the RewriteMap directive and the sample Perl program.

http://httpd.apache.org/doc...

A simple program written in Perl could very easily hit the CF server (or just the database directly) to work out something about the rewriting.

I created a CF version of the Pylons routing engine for our sites, soon to be open sourced along with the rest of our web platform. It's quite powerful, and features a route-to-regexp compiler (not just an interpreter like ColdCourse) and a comprehensive routing system (it does essentially everything Pylons Routes does, plus several CF-specific features).

http://routes.groovie.org/
http://routes.groovie.org/m...

Comment 19 by James Moberg posted on 6/14/2008 at 12:35 AM

While it may not be true specifically for Apache's mod_rewrite, it is true for many other URL Rewriters like Ionic's ISAPI Rewrite Filter (IIRF), UrlRewriter.NET, ISAPI_Rewrite and IISRewrite.

How much overhead is added to a request when using rewrite, perl, cf+db?

Rewriting modules can sometimes be specific to certain server configurations or be cost-prohibitive regarding licensing. Using "CF only" has not hurt any of my clients' positioning in search engines and makes it extremely easy to add new portable rewrite rules. ("Portable" meaning that no additional third-party tools would be required if the web application were moved to another hosting environment.)

Comment 20 by justin posted on 6/14/2008 at 12:43 AM

The real annoying problem is index.cfm. If only i.cfm could be automatically recognized by default! That would sweeten the CF only pot by at least 50%.

Comment 21 by Raymond Camden posted on 6/14/2008 at 12:46 AM

@justin: Eh? All web servers let you specify a default page. You can easily say i.cfm is your default index page.

Comment 22 by James Moberg posted on 6/14/2008 at 12:48 AM

Unfortunately the URL has to have .cfm in the name (or another mapped extension) or ColdFusion won't parse the template.

CFMX already does some hidden URL rewriting of its own when it comes to CFGraph and the non-existent "/CFIDE/GraphData.cfm" file.
http://www.bpurcell.org/blo...

Comment 23 by Jeremy Prevost posted on 6/14/2008 at 12:50 AM

@ James: This discussion mostly comes down to URL aesthetics.

Which looks better:

1) http://www.site.com/2008/01...

2) http://www.site.com/index.c...

As Ray stated, OpenSource apps should likely choose #2 to ensure they can be run everywhere. Any app I write in house uses #1.

Comment 24 by Jeremy Prevost posted on 6/14/2008 at 12:55 AM

@ James: uh. sorry, I left out the part I meant to type after @James. Here it is:
"How is Apache and mod_rewrite, a free solution on any sane OS server choice cost-prohibitive?"

The other stuff was just part of the general conversation.

Comment 25 by James Moberg posted on 6/14/2008 at 1:07 AM

I wasn't directing my comments to the free Apache administrators.

I'm running under the assumption that not all ColdFusion-driven websites are running on top of Apache. I'm not using Apache and cannot use mod_rewrite... and I'm guessing it's possible that there may be others out there in the same boat. I've done the research for IIS and many rewrite APIs are licensed "per hostname" and some of them don't even work that well.

I'm sorry if you thought I was dissing Apache. I'm in a love/hate relationship with Microsoft.

Comment 26 by Raymond Camden posted on 6/14/2008 at 1:36 AM

IIRF is for IIS and it's 100% free. For as many hosts as you want.

Comment 27 by Elliott Sprehn posted on 6/14/2008 at 1:46 AM

@james

ISAPI_Rewrite and mod_rewrite for IIS both support RewriteMap, and CF can easily generate and update a flat text file that they use, meaning you're not burning any HTTP requests on a round trip for rewriting.

Round trip through perl, cf and the db might be unacceptably slow if you don't do any caching (seriously, why wouldn't you?).

UrlRewriter.NET allows you to easily call any .NET code so that works fine too.

And yes, our rewriting engine is CF only. We use a free rewriter to pass the entire path_info to CF where it uses the routing engine to process and figure out what to do. The overhead is effectively nil for both the rewriting and the url generation with caching enabled. If you're on a server without rewrite support you just tell it to prefix with the script name.

@Jeremy

I don't see any reason that Open Source apps shouldn't all provide a way to choose one or the other.

---

As an aside, I really really don't get why this is such a huge issue in the CF community. Every other major language + platform supports pretty URLs now. Pylons, Django, Rails, Grails, TurboGears, Symfony, Plone, CodeIgniter, CakePHP, ...

Everyone remember this argument: http://www.pbell.com/index....

Now they kind of support it, though...

index.cfm/year/2008/month/6/day/12/name/Ask-a-Jedi-... is not the same as proper routing...

This is the kind of stuff that makes us look behind the curve to other communities and to the younger generation (college students, young startups, kids doing the "cool" stuff) when you try and sell CF.

Comment 28 by Raymond Camden posted on 6/14/2008 at 1:57 AM

Elliott - are you saying PHP has the ability to work with no file extensions? I.e., I can do

server.com/foo/moo

And PHP can correctly process that? (Of course I'm implying foo and moo don't exist as folders.) I didn't think PHP could do that.

Comment 29 by Elliott Sprehn posted on 6/14/2008 at 2:04 AM

@Ray

Certainly not. I was talking about frameworks (which are what I listed). That's why I said language + platform. Sorry if I wasn't clear.

Comment 30 by James Moberg posted on 6/14/2008 at 2:09 AM

@Raymond: I use IIRF and think it's the best free solution for IIS.

@Elliott: If you re-read my posts, you'll see why I'm doing it. Some of the projects that I develop get re-sold and migrated to different hosting environments that don't share the same OS, web server, CF rendering engine or version, or database. For the applications I've written, it's critical that they be able to be configured in different environments with as little third-party impact (and personal involvement) as possible.

I agree wholeheartedly with what you are saying and would love to implement it in all projects... I just don't have the time and ability to figure out the best way to do it in multiple environments that I do not have access to.

Comment 31 by Brad Fraser posted on 6/16/2008 at 11:09 PM

What is the best way to do this with just ColdFusion?

Comment 32 by Raymond Camden posted on 6/16/2008 at 11:36 PM

Not sure if this is the 'best', but it's my take on it.

http://www.coldfusionjedi.c...

Comment 33 by Omer posted on 11/25/2008 at 12:49 AM

I am trying to create URLs the same way you do in your BlogCFC application, with the publishing date and then the title in the URL. The problem I am facing is how you handle passing the entryid when a link is clicked, as I am not able to see the entryid anywhere in the URL. In my application, I just have to pass MediaID. When I have the MediaID, I can grab the record information through a query. Please help me with this. I would greatly appreciate your time in this regard.

Comment 34 by Raymond Camden posted on 11/25/2008 at 5:41 PM

I just used the CGI scope. When you have a url like so:

index.cfm/foo/goo/moo

The stuff after the file name ends up in CGI.PATH_INFO. Try cfdumping CGI and you will see.
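
As a rough sketch of that idea, assuming a URL shaped like index.cfm/2008/6/12/my-entry-title: the segment order and the variable names below are illustrative assumptions, not necessarily what BlogCFC does. The pieces can be read out of CGI.PATH_INFO as a slash-delimited list:

```cfml
<!--- Sketch only: assumes URLs shaped like /index.cfm/2008/6/12/my-entry-title --->
<cfset pathInfo = cgi.path_info>

<!--- Some server setups repeat the script name at the front of PATH_INFO; drop it if present --->
<cfif len(cgi.script_name) and left(pathInfo, len(cgi.script_name)) eq cgi.script_name>
    <cfset pathInfo = removeChars(pathInfo, 1, len(cgi.script_name))>
</cfif>

<!--- Treat the remainder as a slash-delimited list (list functions skip empty items) --->
<cfif listLen(pathInfo, "/") gte 4>
    <cfset entryYear  = listGetAt(pathInfo, 1, "/")>
    <cfset entryMonth = listGetAt(pathInfo, 2, "/")>
    <cfset entryDay   = listGetAt(pathInfo, 3, "/")>
    <cfset entryAlias = listGetAt(pathInfo, 4, "/")>
    <!--- From here, a query could look the record up by date and alias instead of by ID --->
</cfif>
```

If you would rather keep an ID such as MediaID, one option is to put it in the path as its own segment and read it back the same way.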

Comment 35 by Adam Polon posted on 1/23/2009 at 7:57 PM

Hi guys,

Figured I'd add my 2 cents to Raymond's comment stating that you would need to include index.cfm in your URL in order to have CF process the SES URL.

A solution to this is to set the 404 error handler in IIS to point to a CF file (like /404.cfm). From there, you can then use CF to process the data however you like. Note that the url requested by the user is accessible in the query_string. It comes across like this:

QUERY_STRING=404;http://www.mysite.com/research

With a few lines of code, it's easy to figure out what the user actually requested and process accordingly.
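
A minimal sketch of such a handler, assuming IIS passes the query string in the 404;http://host/path form shown above. The file names 404.cfm and research.cfm, the variable names, and the single hard-coded route are illustrative assumptions only:

```cfml
<!--- 404.cfm - sketch only; assumes QUERY_STRING looks like "404;http://www.mysite.com/research" --->
<cfset requestedPath = "">

<cfif listLen(cgi.query_string, ";") gte 2>
    <!--- Everything after the first semicolon is the URL the visitor asked for --->
    <cfset requestedUrl = listRest(cgi.query_string, ";")>
    <!--- Strip the scheme and host (and optional port), leaving just the path, e.g. "/research" --->
    <cfset requestedPath = reReplaceNoCase(requestedUrl, "^https?://[^/]+", "")>
</cfif>

<cfif requestedPath eq "/research">
    <!--- A page we recognize: answer with a normal 200 and include the real template --->
    <cfheader statuscode="200" statustext="OK">
    <cfinclude template="research.cfm">
<cfelse>
    <!--- Anything else is a genuine miss, so send back a real 404 status --->
    <cfheader statuscode="404" statustext="Not Found">
    <p>Page not found.</p>
</cfif>
```

Setting the status code explicitly matters here, since otherwise recognized pages could be served with a 404 status and unknown ones with a 200.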

From there, the following would be very easy to handle in different ways.

http://www.mysite.com/aboutUs
http://www.mysite.com/conta...

http://www.mysite.com/surve...
http://www.mysite.com/surve...

Hope this helps,
Adam

Comment 36 by Raymond Camden posted on 1/23/2009 at 7:59 PM

I'd argue though that if you have IIS admin access you could simply use a 'real' rewriter like IIRF. Then again, some hosts _do_ give you a 404 handler. Good comment, Adam.

Comment 37 by Danny posted on 7/14/2010 at 8:37 AM

How would I go about making a search engine like the one on the nvidia site?

http://www.nvidia.com/Downl...

Comment 38 by Raymond Camden posted on 7/14/2010 at 5:17 PM

Can you please explain what you mean specifically? Does it apply to the topic at hand?

Comment 39 by Danny posted on 7/14/2010 at 6:48 PM

Ya, sorry. I'm fairly new to CF and needed to make a search engine like the one on the nvidia site. I tried setting it up so that when someone chooses a value in the first box, the next one shows the corresponding info. For example: someone chooses a state in the first box, and the second one shows the corresponding cities. Hope that makes sense.

Thanks

Comment 40 by Raymond Camden posted on 7/14/2010 at 6:55 PM

Sure, it makes sense, but isn't really relevant to this blog entry. CF makes 'related' items via Ajax pretty easy. I'd google for that first.

Comment 41 by Alec E. posted on 10/26/2010 at 1:16 AM

I know this is an old post, but I just wanted to thank everyone who has contributed, especially Adam and Raymond. Without Adam's tip to use the 404 handler in IIS I would be stuck with index.cfm in my SES URLs. Sadly my host doesn't support any IIS mods without literally multiplying my hosting bill 6x higher than it already is, and it really bugged me that PHP and Apache (namely WordPress) created nicer URLs. It's a ColdFusion pride thing! Anyhow, after some tinkering and false starts, I managed to get everything working and now my ColdFusion-based CMS software has pretty URLs! Oh happy day!!!! Thanks again!!!