Twitter: raymondcamden


Date/Time issue with CFINDEX and SOLR

03-11-2013 3,513 views ColdFusion 15 Comments

I'm not sure if this is a bug or totally expected, but as it hit my blog, I figured I'd share it. A reader (thank you Aaron!) noted that searches on my blog were all returning dates in 2000 and 2001:

I noticed that the months and days were right; it was just the year that was off. I then noticed that most of my posts were in the AM, including some at 1 and 3 AM. Now, I'm not that great of a sleeper, but even I need to sleep sometime.

I first looked at the code I used to index my data. (By the way, did I mention I switched to SOLR-based searching here? Well, I did. :) This is the code used to query the database and store the results in the index. I only use this when I need to blow away everything and start fresh, but similar code is used for atomic inserts as well.

Note the use of the custom field, posted_dt. This tells cfindex to store data using the dt format, which according to the docs, is...

Note: _dt supports only the date formats supported by ColdFusion.
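As a rough sketch, a bulk index with a typed custom field could look something like the following. The datasource, table, column, and collection names (blogdsn, entries, blogcontent, and so on) are placeholders for illustration, not the code actually used on this blog. The only part taken from the post is the posted_dt attribute, and the sketch assumes the custom field takes a query column name the same way title and body do.

```cfml
<!--- Hypothetical query: datasource, table, and column names are assumptions. --->
<cfquery name="entries" datasource="blogdsn">
    SELECT id, title, body, posted
    FROM entries
</cfquery>

<!--- action="refresh" clears the collection and repopulates it from the query.
      posted_dt is a custom field; the _dt suffix asks for date storage. --->
<cfindex action="refresh"
         collection="blogcontent"
         type="custom"
         query="entries"
         key="id"
         title="title"
         body="body"
         posted_dt="posted">
```

An atomic insert for a single entry would presumably use the same attributes with action="update" and a query containing just that one row.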

Since the dates from the database were being used already by ColdFusion in my entry display, I assumed I was ok. Here is an example of one of the dates from my blog:

2011-12-22 10:22:00.0

I then went to the front end and added a dump to my search results. This is where I noticed something odd. Here is what one of my results looked like from cfsearch:

Thu Dec 15 12:30:00 CST 2011

That passes isDate, but if I parseDateTime the string, I get:

{ts '2001-12-15 02:30:00'}

So it appears that I can pass cfindex a value that ColdFusion handles correctly, but SOLR returns something that ColdFusion cannot parse correctly. Luckily, I have no issue just printing exactly what SOLR returned. It doesn't exactly match how I show dates elsewhere, but frankly, I don't care. I could have gotten around this - possibly - by storing the value with the _s suffix instead (i.e., as a simple string), but again, I'm happy just displaying the result as is.
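If the value ever did need reformatting, one possible workaround is to skip parseDateTime and hand the string to Java's SimpleDateFormat instead, since the text cfsearch returns looks like the default java.util.Date string form. This is only a sketch under that assumption: the pattern string and the variable names (raw, usLocale, fmt, posted) are illustrative and not something taken from the blog's code.

```cfml
<cfscript>
// Assumption: the value from cfsearch always looks like this sample from the post.
raw = "Thu Dec 15 12:30:00 CST 2011";

// The pattern mirrors the default java.util.Date string form. Locale.US keeps
// the English day and month names parseable regardless of the server locale.
usLocale = createObject("java", "java.util.Locale").US;
fmt = createObject("java", "java.text.SimpleDateFormat").init("EEE MMM dd HH:mm:ss zzz yyyy", usLocale);

// parse() returns a java.util.Date, which ColdFusion date functions accept.
posted = fmt.parse(raw);

writeOutput(dateFormat(posted, "mm/dd/yyyy") & " " & timeFormat(posted, "h:mm tt"));
</cfscript>
```

The displayed time depends on the server's time zone, because the parsed value is rendered in local time rather than in the zone named in the string.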

15 Comments

  • Patrick Heppler
    Commented on 03-12-2013 at 3:10 AM
    Maybe it's just a typo but the time in 2011-12-22 10:22:00.0 looks a bit weird
  • Commented on 03-12-2013 at 8:47 AM
    Afaik it is valid. The .0 at the end is just milliseconds.
  • Commented on 06-14-2013 at 5:22 AM
    Okay, questions. I want to ask you about posted_dt. Are you saying that the _dt at the end of this custom field is what tells SOLR that it is a datetime field? If yes, where can I find more of those endings for cfindex control? I looked through a lot of documentation and have not noticed those yet. SOLR was introduced in CF9, correct?
  • Commented on 06-14-2013 at 6:13 AM
    Yeah but custom fields were CF10, not 9, if I remember correctly.
  • Commented on 06-14-2013 at 6:44 AM
    I figured that was the case. I will need to do more research on those handy dandy extensions for custom fields. I'm currently creating a tag and category system along with a search box for searching blog pages by keyword, tag, or category. The only real problem I continue to battle is that it keeps wanting to show results for both the document pagename.cfm and the document title. I only want it to show the document title, turned into a link to the blog post. Basically, it is producing the same document in the results but with two different links to it.

    Any suggestions?
  • Commented on 06-14-2013 at 6:51 AM
    I'm sorry, I don't understand what you are saying. You have complete control over how you display results from cfsearch. I must not be understanding what you mean here.
  • Commented on 06-14-2013 at 8:01 AM
    Result output:
    Title = link
    pagename = link

    This is the same output for the same page formatted as ahref. Only want the first one to show.
  • Commented on 06-14-2013 at 8:18 AM
    .... um... again... I'm confused. The result of cfsearch is a query. You decide what columns to output when you loop over it.
  • Commented on 06-14-2013 at 8:49 AM
    Believe the problem to be solved at this point for the linking output. Next, It would appear I need to create more than one cfcollection. One for normal searching and a second for tags and categories. If you follow: http://www.linkworxseo.com/blog/2011/11/03/why-is-... and then click the first tag named analytics. You will then be shown a result set for the tag and a different result set for the same cfcollection. This is why the search input should be split from the tag results output. Thinking a second cfcollection to split the searches. All tags and categories are not ready yet as I am still developing.
  • Commented on 06-14-2013 at 9:39 AM
    I'm confused - why do you need a second collection for categories and tags? CF's full text indexing supports categorizing content.
  • Commented on 06-14-2013 at 12:44 PM
    Yeah, I understand that the categoryTree and category can work, but I think I am torn between running a SQL statement or a cfindex tag to populate the cfsearch for results output.

    <!--- Index each year's blog folder, from 2011 through the current year. --->
    <cfloop index="i" from="2011" to="#year(now())#">
        <cfindex
            categoryTree="tag/"
            category="#form.Criteria#"
            collection="site-search"
            action="update"
            type="path"
            key="#expandPath('\')#blog\#i#"
            custom1="pagepath"
            custom2="pagename"
            custom3="pagedescription"
            extensions="*."
            recurse="yes"
            language="english"
            urlpath="http://www.linkworxseo.com/blog/#i#">
    </cfloop>

    This cfindex is going to change my first site-search cfindex on the tag/ page. This cfindex is on the tag/analytics/ page which is a tag. Thinking about using refresh first and then run an update for the action on the tag/ page. What you think?
  • Commented on 06-14-2013 at 2:10 PM
    Um... no idea what you're doing here. :) But yeah - doing a refresh with a query (and it can be a fake query) is better than N atomic index operations.
  • Commented on 06-15-2013 at 7:11 AM
    Can you fix that link for me? Remove it or remove everything after blog/... Appreciate it.
  • Commented on 06-15-2013 at 7:57 AM
    Eh? You mean in the code?
  • Commented on 06-15-2013 at 4:35 PM
    No, just the bad link at urlpath. Think I got it all worked out, But it seems the cfindex is not updating. I ran a refresh and then changed it back to update and the results are only showing for one keyword now (google). Going to give it some time to find out if it will start showing other results now.
