A few days ago I blogged an example of "search as you type" implemented with jQuery and multiple types of data. ColdFusion was used to serve up data based on searches against two types of data. The front end client was rather simple. Because ColdFusion returned an array of results for one data type and another array for the second, it wasn't too difficult to render that out. I wanted to build upon that demo and work with data that was a bit more complex. In this example I'll show how you can work with data that comes back in one main "chunk" but contains different types of results.
To begin, let me talk about the data. I created a quick template to index blog entries and comments for coldfusionjedi.com. While not exactly relevant to this blog entry, here is the code I used. Do note I had to 'massage' the data a bit to make things work within ColdFusion's limit of 4 custom fields. Solr itself does not have that restriction.
<cfset col = "blogcontent">
<cfset dsn = "blogdev">

<cfcollection action="list" engine="solr" name="collections">

<cfif not listFindNoCase(valueList(collections.name), col)>
	<cfoutput><i>creating #col# collection</i><p></cfoutput>
	<cfcollection action="create" collection="#col#" path="#server.coldfusion.rootdir#\collections" engine="solr">
</cfif>

<!--- remove existing --->
<cfindex action="purge" collection="#col#">

<cfquery name="getentries" datasource="#dsn#">
select id, title, body, morebody, posted, "entry" as type
from tblblogentries
</cfquery>

<cfoutput>Adding #getentries.recordcount# blog entries to index.<p></cfoutput>
<cfflush>

<cfindex action="update" collection="#col#" key="id" title="title" body="body,morebody" custom1="posted" custom2="type" query="getentries">

<cfoutput>Done with entries.<p></cfoutput>
<cfflush>

<cfquery name="getcomments" datasource="#dsn#">
select c.id, c.entryidfk, concat(c.name," ",c.email) as nameemail, c.comment, c.posted, "comment" as type, e.title as entrytitle
from tblblogcomments c
left join tblblogentries e on c.entryidfk = e.id
</cfquery>

<cfoutput>Adding #getcomments.recordcount# comments to index.<p></cfoutput>
<cfflush>

<cfindex action="update" collection="#col#" key="id" title="entrytitle" body="comment" custom1="posted" custom2="type" custom3="nameemail" custom4="entryidfk" query="getcomments">

<cfoutput>Done with comments.<p></cfoutput>
<cfflush>
I'm not going to cover every line of this code, but the important thing to note is that it indexes my blog entries and blog comments, along with the commenter's name and email address. I also create a 'fake' column called type that will be a static value. Altogether this leaves me with a Solr collection containing one index that covers two types of data. Now let's go to the service component that's going to be used by the front end.
<cfcomponent output="false">

<cffunction name="search" access="remote" returnType="query" output="false">
	<cfargument name="string" type="string" required="true">
	<cfset var initialResults = "">
	<cfset var results = queryNew("key,type,title,summary,posted,author,gravatar")>

	<cfsearch collection="blogcontent" criteria="#arguments.string#" name="initialResults" maxrows="20">

	<cfloop query="initialResults">
		<cfset queryAddRow(results)>
		<cfset querySetCell(results, "key", key)>
		<!--- custom2 holds the type flag: "entry" or "comment" --->
		<cfset querySetCell(results, "type", custom2)>
		<cfset querySetCell(results, "title", title)>
		<cfset querySetCell(results, "posted", dateFormat(custom1) & " " & timeFormat(custom1))>
		<cfif custom2 is "comment">
			<cfset querySetCell(results, "summary", summary)>
			<!--- custom3 is "name email"; split on the last space --->
			<cfset var spacemarker = len(custom3)-find(" ",reverse(custom3))>
			<cfset querySetCell(results, "author", left(custom3, spacemarker))>
			<cfset var email = right(custom3, len(custom3)-spacemarker-1)>
			<!--- Gravatar expects the hash of the trimmed, lowercased email --->
			<cfset querySetCell(results, "gravatar", "http://www.gravatar.com/avatar/#lcase(hash(lcase(trim(email))))#?s=64")>
		<cfelse>
			<cfset querySetCell(results, "summary", htmlEditFormat(summary))>
		</cfif>
	</cfloop>

	<cfreturn results>
</cffunction>

</cfcomponent>
Ok - so I've got something interesting going on here. The beginning of the method is simple. Take in the search string and run the cfsearch tag. Solr takes over - does its voodoo - and returns the result. But before I send this back out I want to manipulate the data a bit. I want jQuery to have a simpler time working with the results. I created a new query called results. I copy some things - for example I copy the custom2 column which stores whether the result is a blog entry or a blog comment.
For comments - I take the personalized data, the name and email, and break it out of the custom column I used. Note too I remove the email address and just return a Gravatar url. I could have done that client side. But that would mean people searching on my site would be able to get other people's email addresses. Always assume people are looking at your data sent over Ajax calls. Whether I actually printed out the email address or not wouldn't matter. If I send it over the wire, someone is going to see it. Now let's take a look at the front end.
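To make the name/email split a bit more concrete, here is a minimal sketch of the same idea in JavaScript rather than CFML. It assumes Node.js (for the built-in crypto module), and the function names splitNameEmail, gravatarUrl and toPublicComment are made up for this illustration; they are not part of the component above. The point is simply that only the author name and the hashed URL leave the server.

```javascript
const crypto = require("crypto"); // Node.js standard library (assumed environment)

// Hypothetical helper: split "name email" at the LAST space,
// the same idea as the reverse()/find() trick in the CFC.
function splitNameEmail(nameEmail) {
  const idx = nameEmail.lastIndexOf(" ");
  if (idx === -1) {
    // No space at all: treat the whole value as the email.
    return { author: "", email: nameEmail };
  }
  return { author: nameEmail.slice(0, idx), email: nameEmail.slice(idx + 1) };
}

// Hypothetical helper: build the avatar URL from the MD5 hash of the
// trimmed, lowercased email, so the address itself is never sent out.
function gravatarUrl(email, size) {
  const hash = crypto
    .createHash("md5")
    .update(email.trim().toLowerCase())
    .digest("hex");
  return "http://www.gravatar.com/avatar/" + hash + "?s=" + size;
}

// Only these two fields would go over the wire for a comment.
function toPublicComment(nameEmail) {
  const parts = splitNameEmail(nameEmail);
  return { author: parts.author, gravatar: gravatarUrl(parts.email, 64) };
}
```

Calling toPublicComment("Jane Q. Doe jane@example.com") would give back the author "Jane Q. Doe" and a Gravatar URL, with no trace of the raw address in the result.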
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js"></script>
<script>
$(document).ready(function() {

	//http://stackoverflow.com/questions/217957/how-to-print-debug-messages-in-the-google-chrome-javascript-console/2757552#2757552
	if (!window.console) window.console = {};
	console.log = console.log || function(){};
	console.dir = console.dir || function(){};

	//listen for keyup on the field
	$("#searchField").keyup(function() {
		//get and trim the value
		var field = $(this).val();
		field = $.trim(field);

		//if blank, nuke results and leave early
		if(field == "") {
			$("#results").html("");
			return;
		}

		console.log("searching for "+field);
		$.getJSON("search.cfc?returnformat=json&method=search&queryformat=column", {"string":field}, function(res,code) {
			var s = "";
			s += "<h2>Results</h2>";
			for(var i=0; i < res.ROWCOUNT; i++) {
				//display a blog entry
				if(res.DATA.TYPE[i] == "entry") {
					s += "<p><img src=\"blog.png\" align=\"left\">";
					s += "<b>Blog Entry: <a href=\"\">" + res.DATA.TITLE[i] + "</a></b><br/>";
					s += res.DATA.SUMMARY[i];
					s += "<br clear=\"left\"></p>";
				//display a blog comment
				} else {
					s += "<p><img src=\"" + res.DATA.GRAVATAR[i] + "\" align=\"left\">";
					s += "<b>Comment by " + res.DATA.AUTHOR[i] + "</b><br/>";
					s += "<b>Blog Entry: <a href=\"\">" + res.DATA.TITLE[i] + "</a></b><br/>";
					s += res.DATA.SUMMARY[i];
					s += "<br clear=\"left\"></p>";
				}
			}
			console.dir(res);
			$("#results").html(s);
		});
	});
})
</script>

<style>
#results p {
	border-style:solid;
	border-width:thin;
	padding: 10px;
}
</style>

<form>
Search: <input type="text" name="search" id="searchField">
</form>

<div id="results"></div>
I'm going to focus specifically on what's changed since the last entry. The main change begins with the loop in the callback handler of the getJSON call. Because I asked for queryformat=column, I get back a single DATA object holding one array per column (TYPE, TITLE, SUMMARY and so on), with ROWCOUNT telling me how many rows there are. Because my back end flagged comments and blog entries I can use a simple IF clause to branch between them. For blog entries notice I render a static image. (Also note the URLs are intentionally blank. I do store enough information to render links but I wanted to keep it a bit simple.)
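If it helps to see the shape of the data, here is a small standalone sketch of that branching logic pulled out into a plain function. The function name renderResults and the sample rows are invented for this illustration; the ROWCOUNT/DATA layout is what ColdFusion returns for a query when queryformat=column is used.

```javascript
// Hypothetical helper: turn a column-format query result into an HTML string.
function renderResults(res) {
  var d = res.DATA;
  var s = "<h2>Results</h2>";
  for (var i = 0; i < res.ROWCOUNT; i++) {
    if (d.TYPE[i] === "entry") {
      // Blog entries get a static icon.
      s += "<p><img src=\"blog.png\" align=\"left\">";
      s += "<b>Blog Entry: " + d.TITLE[i] + "</b><br/>";
    } else {
      // Comments get the commenter's Gravatar and name.
      s += "<p><img src=\"" + d.GRAVATAR[i] + "\" align=\"left\">";
      s += "<b>Comment by " + d.AUTHOR[i] + "</b><br/>";
      s += "<b>Blog Entry: " + d.TITLE[i] + "</b><br/>";
    }
    s += d.SUMMARY[i] + "<br clear=\"left\"></p>";
  }
  return s;
}

// Made-up sample data: one array per column, rows line up by index.
var sample = {
  ROWCOUNT: 2,
  COLUMNS: ["TYPE", "TITLE", "SUMMARY", "AUTHOR", "GRAVATAR"],
  DATA: {
    TYPE: ["entry", "comment"],
    TITLE: ["Search as you type", "Search as you type"],
    SUMMARY: ["An example entry summary.", "An example comment summary."],
    AUTHOR: ["", "Jane"],
    GRAVATAR: ["", "http://www.gravatar.com/avatar/0?s=64"]
  }
};

var html = renderResults(sample);
```

Keeping the string building in its own function also makes it easy to try the rendering against canned data without hitting the server.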
For blog comments we get a bit fancier. Since I've got the gravatar URL I used that for each comment. This allows me to put a face to the comment. So how well does it work?
In my testing various search strings seemed to work well, but play with it and you will see (hopefully) how the results go back and forth between blog entries and blog comments. Any questions or comments on this approach?
p.s. Those of you familiar with ColdFusion and Solr may note how I 'hacked' up email and name into the collection within one custom field. Looking back at my code I could have used the category attribute to store 'comment', 'entry' instead of "wasting" one of my 4 custom fields.
Archived Comments
@Raymond:
While it's outside the scope of the demo, one thing you ought to at least mention (and possibly cover) is using debouncing techniques when doing any event-based functionality that relies on AJAX operations. I wrote a blog entry a while back covering debouncing techniques:
http://blog.pengoworks.com/...
What's really nice is it's very easy to implement the debounce technique using the function in my post. You just write your normal JS callback function, then append the debounce() method.
The benefit is that instead of firing off an AJAX request for every single keypress, you only fire off one request once a certain delay has passed without another event. This technique can drastically reduce unwanted AJAX calls to your server. It's even useful for click operations, because it can essentially "block" users who double-click on everything.
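For readers who have not seen the technique, here is a minimal, generic sketch of a trailing-edge debounce. To be clear, this is not the function from the linked post, just one common way to write it; the name debounce and the wait parameter are this sketch's own.

```javascript
// Generic debounce sketch: returns a wrapper that delays calling fn
// until `wait` milliseconds have passed without another call.
function debounce(fn, wait) {
  var timer = null;
  return function () {
    var context = this;
    var args = arguments;
    // Every new call cancels the pending one and restarts the clock.
    if (timer !== null) {
      clearTimeout(timer);
    }
    timer = setTimeout(function () {
      timer = null;
      // Only the most recent call's context and arguments are used.
      fn.apply(context, args);
    }, wait);
  };
}

// Possible usage with the search field (assumes jQuery is loaded):
// $("#searchField").keyup(debounce(function () {
//   /* read the field and run the search */
// }, 250));
```

With this version, five quick keystrokes result in a single call that fires after the typing stops.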
Ray,
Very nice. Something to add as a footnote is that something like this, in its current form, should probably not be used on high volume sites. One of the issues with auto-complete AJAX/JS routines is that if no caching exists on the client side you get huge amounts of query requests.
Facebook is the extreme example, but that is how it's handled. So anyone who wants a robust auto-complete should look to the scripts out there that implement caching and incorporate your stuff into that.
Thanks and really appreciate your examples especially on the json data read. So tired of nebulous examples for plugins.
@Dan: I didn't get it at first but damn that is slick. I'm going to do a follow up blog entry on this if you are cool with it.
@Kevin: Do you think Dan's suggestion would help? Also - caching could be used. I could remember what the results were for searching for "foo". If I remember just the generated string it shouldn't take up too much RAM.
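As a rough sketch of that caching idea (not the code from the follow-up entry), something like the following would remember the generated string per search term. The names createCachedSearch and fetchHtml are invented here; fetchHtml stands in for whatever makes the getJSON call and builds the HTML.

```javascript
// Hypothetical wrapper: remembers the rendered HTML string for each term
// so a repeated search never goes back to the server.
function createCachedSearch(fetchHtml) {
  var cache = new Map();
  return function (term, done) {
    var key = term.trim();
    if (cache.has(key)) {
      // Cache hit: hand back the stored string immediately.
      done(cache.get(key));
      return;
    }
    fetchHtml(key, function (html) {
      cache.set(key, html);
      done(html);
    });
  };
}

// Possible usage, with a stand-in fetch function that counts its calls:
var serverHits = 0;
var search = createCachedSearch(function (term, cb) {
  serverHits++;
  cb("<h2>Results</h2><p>" + term + "</p>");
});
```

This sketch never expires anything, which is fine for a single page view but would need a size or age limit if the data changes often.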
So - how about I try adding _both_ your guys suggestions?
@Ray:
Debouncing definitely helps reduce unneeded AJAX hits. I mean if someone types "smith" in 200ms, there's no reason to make 5 separate AJAX requests. By debouncing the request, you limit the requests to just when there's some delay. It's a simple way to reduce overhead on what end up being unnecessary requests.
Obviously, caching on top of that takes it to another level (and most autocomplete plug-ins I've seen implement some form of caching.)
Well then I know what my lunch time blog entry will be. :)
@Ray
Cool, will be interesting to see what you come up with in regards to caching.
@Dan,
Checked your debouncing. Nice, such an obvious thing, so of course I never tried that. It would definitely have made a recent project much easier, and I'm now looking at incorporating it into a new project. Thanks.
What's interesting is that outfits like Facebook use a two tier system because of the large number of users. In FB's case they focus on your friends first and then if you want a wider search you have to go elsewhere.
My initial stab (not online) isn't quite working. I append .debounce(2000) to the end of $.getJSON(....). I'm getting an error because the object has no method debounce.
Oh - I bet it's because jQuery's getJSON is chainable and returns jQuery itself.
I also tried at the end of the keyup handler itself. It doesn't throw an error but doesn't pause either.
Next I tried the 'standalone' version and
$("#searchField").keyup(debounce(function() {
....
},2000);
Nm - I think I got it now. Too many damn parentheses. ;)
Slick. I've got this in along with a cache as well. 10 minutes total time.
@Ray - cool. Yeah, that is the one thing with jQuery: those parentheses can get crazy to follow.
@Ray
I actually just did something really similar for the search of our galleries, using the indexes of photos and galleries.
http://go.cua.edu/galleries...
I like the debounce idea mentioned above, another reason I'd implement it is because the search of the indexes can change rank of results slightly even though all the same records are returned, which can be a bit jarring. As an example you could try typing John Garvey in the search and play around deleting and adding the -ey at the end of the name.
Same results, different ordering. Seems like debouncing might help with that provided the user doesn't keep playing with the searches like I just suggested!
@Jeremy: Well hopefully when I get my blog post up (thinking today - 3:10ish) it will help you with the debounce. I liked your demo - thanks for sharing it. I do so many POC (proof of concepts) that I rarely get to see things in production. :)
Hmm, I think I see a problem guys. When I type quickly, let's say AB, because only "A" is fired off, the results you see are not limited to AB matches but rather just A. Any ideas how to solve that? Can debounce be made to call itself again after the period? I.e., given N, the debounce period: on event 1, we fire and begin the N count. On the 2nd, 3rd, etc., we do not fire, BUT we set a flag saying after N, go ahead and fire again. So if I typed ABC quickly, I'd get the initial fire for A, BC would wait for N to run, and when N is done, we would fire with ABC.
Does that make sense?
Ok, I stand corrected. The code handles it perfectly. When I quickly type AB, it waits and fires off AB.
Updated blog entry:
http://www.coldfusionjedi.c...
Guys - as always - thank you for sharing your ideas!