Some tips for working with Data Import Handlers in ColdFusion 10

While working on the Solr chapter for CFWACK10, I ran into a few interesting quirks that I thought I would document to, hopefully, save others from pulling their hair out. If you aren't aware of what a Data Import Handler (DIH) is, it is simply a means by which you tell Solr where your data is and how it should index it. That lets you skip the normal "query and index" process in ColdFusion and basically just tell Solr to get your crap by itself.

The first issue you will run into is with the sample XML provided by the documentation (see this page). It shows dataconfig and datasource XML tags, but the tag names are case-sensitive: you should use dataConfig and dataSource instead.
How did I discover this? That leads to my next tip. To debug issues, you want to check the logs under CFINSTALL/cfusion/jetty/logs. You should see a file named stderr-YYYY_MM_DD.log. I kept this up and running during my testing and it was very helpful in figuring out what went wrong.
The docs mention using the JDBC URL for your database. Initially I got this by opening up one of the core neo-xml files, but that was a mistake. In your CF Admin, go into the Server Settings page and scroll down to the datasources. Under each one you will see the JDBC URL used for your connection.
Speaking of JDBC, in order for Solr to use a JDBC URL to MySQL, you have to copy ColdFusion's MySQL jar. Mine was named mysql-connector-java-commercial-5.1.17-bin.jar. You copy that from the cfusion\lib folder to the jetty\lib folder. You probably have to restart your Solr service in order for that to work.
Speaking of restarts, as you edit your data-config.xml file, remember you can reload just one collection in the ColdFusion Collections page in the CF Admin.
As part of the data-config.xml, you create a mapping between database fields and index fields. There are a few things you should know.

First, ColdFusion will throw a fit if you do not provide a field called uid. I mapped the primary key of my data to this field.

Secondly, you can add index fields with any name, but if a name does not follow a pattern that ColdFusion likes (title, body, *_s, *_i, etc.), it will not work. So consider the following block.
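
As a rough sketch of what such a data-config.xml might look like: the connection details, table name, and source column names here are made up for illustration, and only the dataConfig/dataSource casing and the uid, goober, and booger_s index field names come from this post. The driver class assumes the MySQL Connector/J 5.1 jar mentioned above.

```xml
<dataConfig>
  <!-- Hypothetical connection details; use the JDBC URL shown in your CF Admin. -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              user="dbuser"
              password="dbpassword"/>
  <document>
    <!-- Hypothetical table and columns, for illustration only. -->
    <entity name="art" query="select ID, TITLE, BODY, COL1, COL2 from art">
      <!-- Primary key mapped to the required uid field. -->
      <field column="ID" name="uid"/>
      <field column="TITLE" name="title"/>
      <field column="BODY" name="body"/>
      <!-- Indexed, but not returned by cfsearch. -->
      <field column="COL1" name="goober"/>
      <!-- Follows the *_s pattern, so cfsearch returns it. -->
      <field column="COL2" name="booger_s"/>
    </entity>
  </document>
</dataConfig>
```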

Note the ID/UID field. You must have this. Now note the last two lines. While I was allowed to index to a field called goober, it was not returned in cfsearch. But the one right after, booger_s, works because cfsearch will look for it.

Anyway, I hope this helps.

Archived Comments

Comment 1 by CF Stumped posted on 2/22/2014 at 2:34 AM

Ray - I know this is an older post, so I hope you will still respond to it. I followed your instructions and was able to get this to work with a MS SQL database. My collection is being populated, and if I do a cfsearch without a criteria, results are returned. If I add a criteria, no results are returned. Here is the cfsearch I am using. Do you have any suggestions on how to resolve this?

<cfsearch
name = "mySearch"
collection = "FORD OEM"
maxRows="200"
criteria="window"
type="dismax">

Thanks in advance.

Comment 2 by CF Stumped posted on 2/22/2014 at 5:50 AM

Ray

I was able to make some progress and found that if I do searches like this, where I specify the column name in the criteria, they return results:

<cfsearch
name = "mySearch"
collection = "FORD OEM"
maxRows="200"
criteria="description:window"
type="standard">

I was not able to get results from any of the examples found on http://help.adobe.com/en_US...

Comment 3 by Raymond Camden posted on 2/22/2014 at 6:26 PM

Going by memory, if you do not provide a field, I think it is hitting contents. Do you have a line like this: <field column="body" name="contents"/>

Comment 4 by CF Stumped posted on 2/23/2014 at 3:47 AM

Thanks for the response. No, I do not have a line like the one you mentioned. I will add that on Monday and see what happens.

Comment 5 by CF Stumped posted on 2/25/2014 at 12:25 AM

Ray, thanks for helping me out. The lack of contents was the problem.

Comment 6 by Raymond Camden posted on 2/25/2014 at 12:25 AM

Glad it was that easy - the rest of DIH isn't. ;)

Raymond Camden
