Joe asks:
I really like how you did the udf.cfm/NAME on your site. Can you point me to where and how you did this and/or if this is a UDF. Thanks!!!
Joe is talking about the "shorthand" URL tool I built at CFLib.org. It allows for simpler URLs like so: http://www.cflib.org/udf.cfm/isEmail. How is this done?
It's really two parts. First we have to recognize the "weird" URL form - and once we do - we then parse it. The second part is specific to CFLib and I'll only cover it briefly.
To begin with, whenever a URL comes in with the form http://host/filename.cfm/stuff/at/the/end, your web server will recognize that "filename.cfm" is the file you want. It will then take the "extra" stuff and store it in a CGI variable, path_info. So consider this blog entry. Everything after index.cfm is located in the path_info CGI variable. Sometimes this CGI variable will also contain the filename. Luckily, Michael Dinowitz wrote a nice little article showing sample regex to "clean" this value. I don't seem to see a "direct" link to his article, but it is on the front page at House of Fusion. (Look for the article, "Search Engine Safe (SES) URLs.") In this article he has a full-blown UDF for dealing with the values, but I'm going to focus just on the regex. This example below shows it in action:
<cfset pathInfo = reReplaceNoCase(trim(cgi.path_info), '.+\.cfm/? *', '')>
<cfoutput>
cgi.path_info=#cgi.path_info#<br>
stripped: #pathInfo#
</cfoutput>
You don't have to worry too much about the regex, it basically just handles removing any potential filename from the CGI variable. I'm not seeing any filename on my Apache or IIS server, but I know I've seen it in the past.
At this point we have a pathInfo variable that stores any information that was added to the end of our filename. How do we parse this? Essentially you have a ColdFusion list using the / character as a delimiter. In my example above, "http://host/filename.cfm/stuff/at/the/end", my pathInfo variable would have: "/stuff/at/the/end". How I parse that is up to the application. In BlogCFC, I check the length of the value (using listLen and / as the delimiter) to make sure the length is 4. The first three values refer to the date and the last item refers to the alias.
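As a rough sketch of that four-part check (the variable names here are illustrative assumptions, not BlogCFC's actual code), the parsing could look something like this:

```cfml
<cfscript>
// Sketch only: entryYear, entryMonth, entryDay and entryAlias are made-up names.
pathInfo = reReplaceNoCase(trim(cgi.path_info), '.+\.cfm/? *', '');

// listLen ignores the empty item before the leading slash,
// so "/2008/03/27/some-title" counts as 4 items
if (listLen(pathInfo, "/") is 4) {
    entryYear  = listGetAt(pathInfo, 1, "/");
    entryMonth = listGetAt(pathInfo, 2, "/");
    entryDay   = listGetAt(pathInfo, 3, "/");
    entryAlias = listGetAt(pathInfo, 4, "/");
}
</cfscript>
```

Anything that does not have exactly four items is simply ignored here; a real application would decide what to show in that case.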
You may want to use a format that is like typical URL variables. Something like: http://host/filename.cfm/product/323. In this form, the URL is simply another way of saying: http://host/filename.cfm?product=323. To parse this form, I would have to loop over the list and set URL variables. Here is a sample that will do that:
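<cfscript>
function parseSES() {
// strip everything up to and including the filename from path_info
var pathInfo = reReplaceNoCase(trim(cgi.path_info), '.+\.cfm/? *', '');
var i = 1;
var lastKey = "";
var value = "";
// nothing after the filename, so nothing to do
if(not len(pathInfo)) return;
for(i=1; i lte listLen(pathInfo, "/"); i=i+1) {
value = listGetAt(pathInfo, i, "/");
// even items are values, odd items are keys
if(i mod 2 is 0) url[lastKey] = value;
else lastKey = value;
}
//did we end with a "dangler?"
if((i-1) mod 2 is 1) url[lastKey] = "";
return;
}
</cfscript>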
What are we doing here? As I mentioned before, we begin by grabbing whatever comes after the filename. If we find nothing, we exit the function. (Normally a UDF returns something. A return statement by itself just means to leave the function without returning anything at all.)
Next we treat the value as a list and loop over it. We want to do things in twos - in other words, the first item is a variable, the second is a value. We simply check our list counter, i, and on odd numbers, we store the value as "lastKey", and on even numbers, we write to the URL scope. (UDFs should never directly access variables outside their own scope. Except when they should. ;) This code assumes an even number of values. So what happens if the pathInfo variable has an odd number of items? (Ex: /products/5/foo) We treat the last item as an "empty" variable and create it in the URL scope with an empty string. This could be used as a flag. So for example, /productid/5/short could mean set url.productid to 5, which is the database record to load, and "short" simply means show the shorthand version of the content.
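To try the pairing logic without a real request, here is a small variation. It is a sketch under assumptions: parsePairs is a hypothetical name, and unlike the function above it takes the path as an argument and returns a struct instead of writing to the URL scope. It also defaults each key to an empty string as soon as it is seen, which covers the "dangler" case without a separate check.

```cfml
<cfscript>
// Hypothetical helper: same key/value pairing idea, but side-effect free.
function parsePairs(pathInfo) {
    var result = structNew();
    var i = 1;
    var lastKey = "";
    var total = listLen(arguments.pathInfo, "/");
    for (i = 1; i lte total; i = i + 1) {
        if (i mod 2 is 1) {
            // odd position: a key, defaulted to an empty string
            lastKey = listGetAt(arguments.pathInfo, i, "/");
            result[lastKey] = "";
        } else {
            // even position: the value for the key just seen
            result[lastKey] = listGetAt(arguments.pathInfo, i, "/");
        }
    }
    return result;
}

parsed = parsePairs("/productid/5/short");
// parsed.productid holds "5"; parsed.short exists and is an empty string
</cfscript>
```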
Archived Comments
It's possible, and a lot easier, to do this on the Web server level, for example on IIS, using the ISAPI Rewrite product (free version will do!), and a rule like:
RewriteRule (/servlet/.*?)(\?[^/]*)?/([^/]*)/([^/]*)(.+?)? $1(?2$2&:\?)$3=$4?5$5: [N,I]
RewriteRule /servlet(.+?)/?(\?.*)? $1.cfm$2 [I]
(two lines)
Then you can do URL's like:
http://www.mydomain.com/ser...
Which would then be:
http://www.mydomain.com/vie...
.. replace the 'servlet' trigger in the rule with whatever you want.
With a bit of template planning, you can do some seriously snazzy URL's with this.
Oh - I _was_ going to mention web-server-side SES parsing but I totally forgot. Thanks for bringing it up. Apache has this feature as well.
Ray, once again, I'm glad I follow along. I'm working on this problem right now! Thanks so much for the overview.
There is a gotcha with this if you set up CF in J2EE mode.
The fix is simple enough, you just need to hack web.xml a little. See http://www.doughughes.net/i... for details.
I really want to do www.domain.com/USERNAME and have it read it as www.domain.com/index.cfm/US... and THEN do the trick in this article to take username as the variable, but I can't seem to get a way to do it...
Anyone have a suggestion on if this is possible to achieve using a method like this?
Really nice to see this article! I have been wanting to figure out how to do this on a shared server site for quite some time now, but never could get it working. This example is perfect, as I am only dealing with 1 url variable.
justin. Just curious, but if you go to www.domain.com/FOO, does index.cfm load anyway? If so, then you are good. I'm sure cgi.path_info will have /FOO. If not, you can look at rewriting techniques on the server level. I'll blog about that as well, but you really have 2 choices:
a) In Apache, use mod_rewrite. It's free. It's awesome.
b) In IIS, use ISAPIRewrite. It is free if you use the simple edition. The simple edition applies all rules to all domains on a box. If you only have one site, or only one site using rewriting, then you are fine. However, the "expensive" edition is only 75 dollars. Not bad.
I question whether or not this method really creates a "Search Engine Safe" URL any better than just leaving it as www.domain.com/index.cfm?fo.... I implemented such SES URL functionality a couple months ago, and the search engines still do not index the content. (I know this because every page on my site that does not use these SES URLs has a PageRank, even if the PR is only 2).
I'm guessing at this point that Google and company have caught on to this workaround and consider it to be the same as just using a query string.
Keep in mind that once your site becomes "important" enough, the search engines will start indexing _all_ of your content, query string or no.
Good point Chris - definitely your mileage will vary. I'd add though that it may also be nice to provide a cleaner URL just for the heck of it, or to allow for stuff like CFLib, where a user can load a UDF w/o knowing the ID of it.
To Jack Dala... What does the second line of your ISAPI_Rewrite regex do?
RewriteRule /servlet(.+?)/?(\?.*)? $1.cfm$2 [I]
I wish I knew. :-)
.. I just took one of the existing rules from their documentation (or forum? Don't remember), tweaked it until it worked as I wanted, and left it at that. It's regex far beyond my brain-power. :-)
It's a piece of beauty though, works *great*.
Doing SES through the web server (e.g. mod_rewrite) works best, but if you want to do it through code, Bert D. and I wrote something several years ago that works great - http://www.fusium.com/go/ses
[quote]
It's possible, and a lot easier, to do this on the Web server level, for example on IIS, using the ISAPI Rewrite product (free version will do!), and a rule like:
RewriteRule (/servlet/.*?)(\?[^/]*)?/([^/]*)/([^/]*)(.+?)? $1(?2$2&:\?)$3=$4?5$5: [N,I]
RewriteRule /servlet(.+?)/?(\?.*)? $1.cfm$2 [I]
(two lines)
Then you can do URL's like:
http://www.mydomain.com/ser......
Which would then be:
http://www.mydomain.com/vie...
.. replace the 'servlet' trigger in the rule with whatever you want.
With a bit of template planning, you can do some seriously snazzy URL's with this.
[/quote]
Logically it would be better for dynamic sites to do it with CFML rather than on the server side, because it lets you organize template systems by database rather than relying solely on "cfml". When parsing it with ColdFusion you can effectively reduce the amount of data passed through the URI, which lets you separate out templates and variables. It's what we do in PHP most of the time when doing SEO URIs. A little bit of regex and database testing, or even file testing, and you could get it done a lot more dynamically.
I don't know I agree Ajaxsaur. I mean, my approach was to do it with CF, but I don't think it is bad to do it server side. I don't agree with you that it is better organized - unless I'm misreading you. You can reduce the amount of info _either_ way.
Where and how do you call this function? Right now I just do this at the top of the page:
<cfif CGI.PATH_INFO DOES NOT CONTAIN ".cfm">
<cfset test = parseSES()>
</cfif>
I'm sure there's a better way than this...
I would include it in onRequestStart.
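A minimal sketch of that, assuming the parseSES() function from the post has been saved in a file named udfs.cfm next to Application.cfc (both the file name and the application name are made up for illustration):

```cfml
<cfcomponent output="false">
    <cfset this.name = "sesDemo">

    <!--- Hypothetical file that contains the parseSES() UDF --->
    <cfinclude template="udfs.cfm">

    <cffunction name="onRequestStart" returnType="boolean" output="false">
        <cfargument name="targetPage" type="string" required="true">
        <!--- Copy any /key/value pairs from path_info into the URL scope --->
        <cfset parseSES()>
        <cfreturn true>
    </cffunction>
</cfcomponent>
```

Since parseSES() returns early when there is nothing after the filename, it should be safe to call on every request without first checking path_info.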
How might you handle a case where the name has a forward slash in it like this:
http://host/filename.cfm/phrase / with a slash in it/777
You have to escape it. I typically use a function that escapes my data based on the SES form I'm using. Escape it or simply remove it.
Thanks for the good idea, removing it works just fine for me.
Here's my code if anyone else is interested:
<cfset newlink = Replace(link,'/','','all') >
<a href="/results.cfm/#tagid#/#newlink#">#link#</a>
Anyway to get this to work with Flash Forms? They seem to break when running under SES urls!
I'd be willing to bet you have an invalid CFIDE mapping.
Thanks for your response Ray! I have a CFIDE mapping in CF Administrator, and I have a CFIDE virtual directory in IIS for this domain. If I run it outside the fusebox (and SES) it works fine, but inside I just get the white box.
Odd. Now you got me. :) 99% of the time it is just a missing CFIDE virtual path.
I had tried a solution similar to this once before and had a problem using relative paths. Have you run into this before?
Also, is there a link to this UDF on cflib.org at all?
The UDF is Michael Dinowitz's, he would have to post it.
Thanks for your quick reply...
An example of what I have used in the past is here:
http://www.cftopper.com/ind...
The problem I have is when using relative URLs my url gets added to as you click links...
For Example...
you go to index.cfm/page/5
then you click a link for page 23...
now you are at index.cfm/page/index.cfm/page/23
The request still works, but it defeats the purpose.
The thing that I like about the CFML solution is that the key/value pairs are passed in the URL and you could have a varying number of query items on each page... can you do this with the IIS Re-Write as well?
What I've done in the past is just always use full URLs.
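For example, something along these lines. This is only a sketch: it builds a root-relative link from cgi.script_name, which normally holds the path to the running file (such as /index.cfm), so the link never inherits the current path_info. Check what your web server actually reports in that variable before relying on it.

```cfml
<!--- Sketch: anchor every link at the script itself instead of the current URL --->
<cfset baseURL = cgi.script_name>
<cfoutput>
<a href="#baseURL#/page/23">Page 23</a>
</cfoutput>
```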
Hi,
I'm wondering: how to handle a site with SES urls in combination with CF administrator. E.g. when I schedule a template in the CF Admin with timeout, the URLs look like
http://www.domain.com/path/...
And this goes wrong, it should be
http://www.domain.com/path/...
Sorry 'bout that post with too long URLs
I'll try again: CF Administrator creates this
URL when scheduling a page with a timeout
.../index.cfm/fuseaction/dothis?RequestTimout=60
And this goes wrong when using SES Urls.
How to handle this?
Can you not use a non-SES version?
The other option is to use cfsetting requesttimeout in the code instead of setting a requesttimeout in the CF Admin for the task.
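That would look something like this near the top of the scheduled template (60 seconds simply mirrors the value from the comment above):

```cfml
<!--- Set the timeout in code so the scheduled task URL needs no query string --->
<cfsetting requesttimeout="60">
```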
Ray, I was testing the SES fix you posted to see if it's something I want to do in the app I'm building. The only thing I can't get to work is when you don't specify the index.cfm page. Example below, with the error.
test.com/testing/id/555
this really should be pulling
test.com/testing/index.cfm?...
Is there a way to fix this? The reason I'm not using the server-level rewrite is I don't know if this application is going to be on a shared or dedicated server yet.
the error I'm getting is
404
/testing/test/1234
java.io.FileNotFoundException: /testing/test/1234
at jrun.servlet.file.FileServlet.service(FileServlet.java:349)
at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:106)
at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:284)
at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543)
at jrun.servlet.http.WebService.invokeRunnable(WebService.java:172)
at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:320)
at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:428)
at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:266)
at jrunx.scheduler.WorkerThrea...(WorkerThread.java:66)
No, you have to use a real file name. Now - if you don't like that - you can use a server-side rewrite like what Apache has built-in support for - or what you can add on to IIS.
This blog entry was meant to describe a solution that did not require any web server mods.
I have run into the same problem as Seth. I'm using Fusebox with SES URLs however my Flash forms aren't displaying.
Works..
http://mysite.com/index.cfm...
Doesn't work..
http://mysite.com/index.cfm...
I'm running CF8 and Fusebox 5.1.
Same problem here...
Did some experiments:
- cfform type="flash" doesn't work
- cfform type="html" does work
- cfchart works
- cftextarea also works
The CFIDE Virtual Directory does exist but no flash forms
Any suggestions?
Flash Form that would not load the movie got a wrong path I think:
src='/ict/jvl/mod_rewrite/1799017575.mxml.cfswf'
But if I'm viewing it directly it works:
src='/ict/jvl/mod_rewrite/content/services/ce/information/index.cfm?ID=2005'
Hi Ray,
I am a newbie to ColdFusion. I am trying to understand the BlogCFC 5 I downloaded. How did you hide the blog id created with the CreateUUID() function?
You have something like http://www.domain.com/index... , how did you translate /2008/03/27/some-blog-title back to blog id and get the details for some-blog-title?
Please help.
Thank you
Every blog entry has an 'alias' based on the title. So if the title was, "I adore Paris Hilton", the alias is something like i-adore-paris-hilton.
I then make links with the date plus the alias, which you have to assume is unique. When you view the entry, I just look up based on alias+date.
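To give a rough idea of how a title could be turned into such an alias, here is a sketch. makeAlias is a hypothetical helper, not BlogCFC's actual implementation, and it does nothing to guarantee uniqueness.

```cfml
<cfscript>
// Hypothetical helper: lowercases the title and dashes everything else.
function makeAlias(title) {
    var alias = lCase(trim(arguments.title));
    // collapse every run of non-alphanumeric characters into a single dash
    alias = reReplace(alias, "[^a-z0-9]+", "-", "all");
    // drop a leading or trailing dash
    alias = reReplace(alias, "(^-|-$)", "", "all");
    return alias;
}

writeOutput(makeAlias("I adore Paris Hilton")); // i-adore-paris-hilton
</cfscript>
```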
Thanks Ray,
I try it, it works. Another question, instead of this
www.domain.com/index.cfm/20...
if I want to achieve this,
www.domain.com/2008/04/01/b... or
www.domain.com/blog/2008/04...
How would I do it?
Please help.
Many thanks.
For that to work you need to do it at the web server. Apache has this support built in, and for IIS you need an add on. But the point is - it has to be done _below_ ColdFusion.
Use ISAPI Rewrite when on IIS. Fairly easy, excellent support on their forums, and works really really well for nearly all search-engine friendly URLs.
I'll also recommend IIRF for IIS. The price is good too (free!)
Hi,
I'm trying to introduce the use of SEF URLs on this site and have been very successful doing so. It's helped a ton, as I've been able to build some much-needed redirect applications that get users into certain areas faster. But my problem is similar to several others already listed that appear not to have been resolved, unless I missed something.
It is the flash form problem,
Works:
...roster-edit.cfm?action=update&id=123456
Doesn't Work:
...roster-edit.cfm/action/update/id/123456
Just by swapping out the characters in the URL I can get the form to load. But when I introduce the SEF link naming convention into the site (meaning once I code links the SEF way) all my flash forms that are dependent on a URL parameter break.
Did I miss the solution here somewhere? Could someone share it again, please?
P.S. I do not have access to the CF Admin or IIS - I have to do this all in code (shared server).
Thanks in advance.
OK - so this is really weird but since this is a pretty old application, it is still using Application.cfm - and I wish I could easily convert it to a cfc. Anyway, I had added this script:
<cfscript>
pathInfo = reReplaceNoCase(trim(cgi.path_info), '.+\.cfm/? *', '');
i = 1;
lastKey = "";
value = "";
if(not len(pathInfo)) break;
for(i=1; i lte listLen(pathInfo, "/"); i=i+1) {
value = listGetAt(pathInfo, i, "/");
if(i mod 2 is 0) url[lastKey] = value;
else lastKey = value;
}
break;
</cfscript>
to introduce SEF URLs. As I mentioned, that was successful but it broke my flash forms.
Before I make my next point, I should let you know that I version my cfapplication names so that I can reset applications and have CFadmin essentially create new application logs for me to compare should I need to.
So, I found it odd when I went back into the Application.cfm file and simply removed that same code (without renaming my cfapplication name). I loaded the form page and it errored out. When I reintroduced that code and saved my Application.cfm file, the refresh of the form loaded the flash form correctly.
I *think* I know why this happened and I am wary that it might not last so I am watching it closely. I have my test and development environments to try this out on too.
When you view source, look at the src for the Flash. What is the URL?
Ok, so I've been able to narrow down and recreate my issue consistently. Here is how it goes down . . .
a.) I have a link to a page that edits a user record via flash form
b.) the link is coded with traditional question mark, equal and ampersand signs for the query string.
c.) having introduced the SEF script, when the page loads I can see the request actually changing from the old format to the new format.
d.) the reason why I see it change is because I have a SSL redirect script in place to ensure the page is loaded and processed using SSL. Since the original request is being made from a non-secure page, my redirect script now rewrites URLs using the SEF format.
e.) If I take the URL it redirects to (the SEF one) after the page is done loading (with no visible flash form) then replace the characters manually with traditional format (for example: ...-action.cfm/userid/123 to ...-action.cfm?userid=123) the flash form loads fine.
f.) I've modified the page that offers the link to the form by prefixing the anchor with my secure url application variable instead of solely relying on the page requested to determine it was not being loaded in an SSL - I still have that code there of course.
So, the issue is bandaided in the respect that I've tracked down the links that point to the page and have made sure they are built with SSL. And if someone were to decide to change the SSL to a regular http request, at least there is comfort knowing the page will not load. But it still leaves me having to look at my SSL redirect page to see if I need to do something there (which I can't imagine I need to since it does its job nicely).
Therefore, this leads me (finally) to answering your question . . . there is no difference between the rendered source code of the form when it doesn't load and when it does - other than the swf cacheid.
So by manually reverting the URL from SEF to traditional, then back to SEF, everything works. But as it's taken me 30 min. to write this post and double check everything, I am faced with starting my troubleshooting all over since FF and IE seem to be doing things differently.
I think it's a client side issue because I have plenty of other flash forms that have not been affected by this. However, all the other flash forms that do work do not have URL query strings being re-imagined into a SEF url format.
(sorry for the lengthy posts - this is just really boggling me - all the a flash forms work until they happen to be on a page that uses SEF url formatting.)
Wow thanks for posting this. I've been looking at all kinds of solutions (on and off for weeks... and a few hours today) and this one works really well for a project I'm working on! Thanks!
glieu - I think you will achieve what you're after by setting up a custom 404 page, then in that page use CGI.QUERY_STRING to grab what was entered into the url - work out what the user typed and what they need to see, and use a cfinclude to include that page - it will appear to the user as if they typed the original url. Use cfdivs to navigate and keep that url in their address bar.
We use this function in tandem with onMissingTemplate to get URL's where the pagename is one of the variables e.g.
a.com/var/value/value.cfm
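One way this could be wired up, sketched under assumptions: the method goes in Application.cfc, index.cfm is a hypothetical front controller, and url.sesPath is a made-up variable name that the front controller would then parse.

```cfml
<cffunction name="onMissingTemplate" returnType="boolean" output="true">
    <cfargument name="targetPage" type="string" required="true">
    <!--- Strip the trailing .cfm so /var/value/value.cfm becomes /var/value/value --->
    <cfset var pathInfo = reReplaceNoCase(arguments.targetPage, "\.cfm$", "")>
    <!--- Hand the path to the front controller through the URL scope --->
    <cfset url.sesPath = pathInfo>
    <cfinclude template="index.cfm">
    <!--- Returning true tells ColdFusion the missing template was handled --->
    <cfreturn true>
</cffunction>
```

Keep in mind that onMissingTemplate only fires for requests that ColdFusion itself receives, so the URL still has to end in .cfm.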
this is a great help.
so how would you get it to pull the values for the cleanURL from a database? (so that you could set any value at any time for a particular page)
I'm not sure I understand you right, but if you are asking, how do you format the URL, then it is entirely up to you really. If you decide you want to use
page.cfm/X
then you store X in the database or generate it when you make your links.
i guess i'm not understanding how to relate the regular link with a value i've placed into a database as i've only seen formulaic examples (unless i've missed the simpler "whatever you have in a table" version)
yes, the link would be a simple page.cfm/article format.
The thing is - there is no one answer to this. You decide how you want your urls to look, and you need to figure out what a good unique value is. So for CFLib, I knew I had unique UDF names. Therefore making the link was simple, I just link to /udf/#NAMEHERE#. Where #NAMEHERE# is the udf name from the database.
Did that answer your question?
it did, it just took me a long time to understand.
thanks Ray
@Ray - Thank you for this example. Really saved me a lot of time!
Not sure if this helps anyone, but I am running this on every request for the front-end of my e-commerce application. Then, in different .cfm files, I use different rules for each of the variables passed.
ie: productBrowser.cfm/brand/brandName/category/snowboards/
productViewer.cfm/product-name/brand/brandName/
Great stuff but I am a CF SES noob.
I am looking for a solution for categories and the product detail page.
Category URL: site.com/Results.cfm?catego...
I would like to end up as site.com/category-name
(Need something for Pagination also)
Product details page: site.com/Details.cfm?ProdID...
I would like it to end up as site.com/product-name
Any ideas?
Thanks!
Jeff
If you don't want ".cfm" to show at all, you need to use a server-side rewriting tool like Apache rewrites. IIS has a rewriting tool too.
The site is hosted on IIS but also has the option to use .htaccess or use an IIS rewriting tool. I believe it gets written to a web.config. I use both but have only used each for the basics up to this point.
I would be interested in a quote for that and possibly even integrating a redirection and canonical option in the back end.
Thanks,
Jeff
If you have used server-side rewriting before, then you know it is easier than what I describe here. This post is for folks who don't have access to those options. Since you can do _any_ rewrite to simple URL params (ie, /cat/5 == index.cfm?cat=5), then your CF code can assume ugly, plain URL variables while the user is presented with nice, simpler URLs.
Right? If that doesn't make sense, let me know.
Right, I follow you for the most part and just looking for the clean URL that folks and SE's see without breaking anything in code.
How can we move forward?
Sorry - are you asking if you can just pay me to do this for you?
Yes, I will need a quote. I will email you.
Thanks,
Jeff