Twitter: raymondcamden


Using jQuery to load HTML and filter it by N selectors

Posted 08-02-2012 in jQuery, JavaScript

Forgive the somewhat awkward title; hopefully an explanation will make things clearer. I was working on an application yesterday that needed to load an HTML file via AJAX and display it on screen. The HTML happened to be documentation, so I was going to display it as is. Since I wasn't doing any processing, my code was very simple: a single call to jQuery's load() on the element that displays the page.
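
A minimal sketch of that kind of call, assuming a hypothetical container `#content` and a hypothetical file `docs/page.html` (neither name comes from the original code). jQuery is passed in as `$` so the function can be exercised on its own:

```javascript
// Sketch only: "#content" and the URL below are hypothetical names.
// $ is the jQuery function, passed in explicitly.
function showDoc($, url) {
    // .load() fetches the URL via AJAX and injects the returned HTML
    // into every element matched by the selector.
    return $("#content").load(url);
}

// Run only where jQuery is present (for example, in a browser page).
if (typeof jQuery !== "undefined") {
    jQuery(function ($) {
        showDoc($, "docs/page.html");
    });
}
```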

Easy, right? Well, the first thing I discovered was that the HTML I was loading included things I didn't want: headers, footers, etc. Again, though, this is easy enough to handle. You can tell jQuery's load() function to filter the result down to a single DOM element by adding a selector after the URL. (As a reminder: if you are concerned about performance, don't forget that you are still asking jQuery to load all N bytes of HTML even though you display fewer than N bytes.)
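
jQuery's documentation describes the rule: everything after the first space in the string passed to load() is treated as a selector, and only the matching elements are inserted. The sketch below mimics that split in plain JavaScript to show which part is fetched and which part filters. `splitLoadArgument` is a hypothetical helper, not a jQuery function, and the file name and `#content` container are assumptions:

```javascript
// Hypothetical helper, not part of jQuery: mimics how .load() reads its
// first argument. Text before the first space is the URL; the rest is
// the selector applied to the fetched HTML.
function splitLoadArgument(arg) {
    var off = arg.indexOf(" ");
    if (off < 0) {
        return { url: arg, selector: "" };
    }
    return { url: arg.slice(0, off), selector: arg.slice(off + 1) };
}

var parts = splitLoadArgument("docs/page.html #docs");
// parts.url is the file that is downloaded in full.
// parts.selector picks the element that ends up on screen.

// The real call, assuming a hypothetical "#content" container:
if (typeof jQuery !== "undefined") {
    jQuery("#content").load("docs/page.html #docs");
}
```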

Woot. Almost there. This worked great, but the "block" of HTML this rendered was missing a nice header on top. I went back to the original source HTML and discovered that there was another div, header, that contained the title and would be perfect.

But here was a problem. How do I tell the load() function to select *two* DOM items? It turns out this was easy as well: just provide a comma-separated list of selectors.
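
A sketch of the list form, using the `#header` and `#docs` ids that come up in the comments. The `#content` container, the helper names, and the file name are assumptions for illustration:

```javascript
// Hypothetical helper: joins the selectors with commas and appends them
// to the URL after a single space, which is the form .load() expects.
function loadTarget(url, selectors) {
    return url + " " + selectors.join(", ");
}

// $ is the jQuery function; "#content" is an assumed container id.
function loadSections($, url, selectors) {
    return $("#content").load(loadTarget(url, selectors));
}

if (typeof jQuery !== "undefined") {
    // Equivalent to: $("#content").load("docs/page.html #header, #docs")
    loadSections(jQuery, "docs/page.html", ["#header", "#docs"]);
}
```

Matched elements are inserted in document order, so the header lands above the docs only if it comes first in the source file.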

This worked fine. But this leads to my question. Is this a good idea? Is there a better way? (Assuming you can't get "pure" data and must work with the HTML files.)

18 Comments

  • Commented on 08-02-2012 at 9:09 AM
    @Raymond:

    While this code is super concise, I don't find it intuitive at all. I prefer to use a complete callback and spell out the code.

    Also, keep in mind that if the source document's DOM changes, this code could break. At a bare minimum, I'd carefully comment what the code is supposed to do.
  • Commented on 08-02-2012 at 9:13 AM
    I'm going to ignore your second comment because, as I said, this was the source HTML and it would obviously be better if I had pure data. That just isn't an option for now. ;)

    To your first one... OK, so say you decide to switch to $.get or $.ajax, and X is the result HTML. How do you get N nodes? I had found this SO question:

    http://stackoverflow.com/questions/405409/use-jque...

    But it didn't work well for me. Using $(data).find('a') works, but not $(data).find('#id')
  • m13z
    Commented on 08-02-2012 at 12:27 PM
    In jQuery you can use a context instead of a filter:

    $('#header, #docs', data)

    Maybe that works better?
  • Commented on 08-02-2012 at 1:32 PM
    The issue I had with that is that it seems to 'execute' data, and if there are syntax errors in the DOM, like trying to use a script it can't reach, then I get errors in the console.
  • Tim Leach
    Commented on 08-02-2012 at 1:45 PM
    @M13z

    FYI,
    Doing $('#header, #docs', data) is just a shortcut for
    $(data).find('#header, #docs')

    Both will execute the same under the hood.
  • Commented on 08-02-2012 at 2:21 PM
    Raymond,

    What errors are you seeing in the console? I'm going to take a guess, but are you seeing a Permission Denied error? Behind the scenes, the .load() method uses $.ajax() to do its work. Internally it uses a find, but before it does that it removes any scripts that could cause problems.

    What site were you trying to get markup from? If you point me to that I can run a test in the console of that site to see what you are running into. Is the site public?
  • Commented on 08-02-2012 at 2:22 PM
    It is a "bit" private as in I'm using it for a demo at a keynote on Monday. It isn't really important though in terms of being top secret. Give me a bit to get a 'demo' of what I saw live, or at least get more of the error.
  • Commented on 08-02-2012 at 2:27 PM
    Ok, some info. First off, you can see a sample of the HTML source here:

    http://www.raymondcamden.com/demos/2012/aug/2/acos...

    I modded my code to this, just for testing:

    $.get("cfdocs/"+url, {}, function(res,code) {
       console.log('ready');
       var header = $("#header", res);
       console.dir(header);
       var header2 = $("#header", $(res));
       console.dir(header2);
    });

    As you can see, I wasn't sure if I needed to jQuery-wrap res or not. But running the above, I get no matches, even though #header is clearly part of the DOM.
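
One possible explanation, which the thread does not confirm: both `$("#header", res)` and `$("#header", $(res))` behave like `$(res).find("#header")`, and `.find()` only searches descendants of the parsed nodes. If `#header` ends up as one of the top-level nodes after parsing, `.find()` skips it, while `.filter()` would match it. The toy model below uses plain objects rather than jQuery or a real DOM to show the difference; every name in it is hypothetical:

```javascript
// Toy model: each node is { id: string, children: [nodes] }.
// filterTopLevel plays the role of .filter(): it checks only the nodes
// in the set itself.
function filterTopLevel(nodes, id) {
    return nodes.filter(function (node) { return node.id === id; });
}

// findDescendants plays the role of .find(): it checks children,
// grandchildren and so on, but never the nodes in the set itself.
function findDescendants(nodes, id) {
    var matches = [];
    function walk(node) {
        node.children.forEach(function (child) {
            if (child.id === id) { matches.push(child); }
            walk(child);
        });
    }
    nodes.forEach(function (node) { walk(node); });
    return matches;
}

// Assumed shape of the parsed result: #header and #docs are siblings at
// the top level.
var parsed = [
    { id: "header", children: [] },
    { id: "docs", children: [{ id: "example", children: [] }] }
];

var viaFind = findDescendants(parsed, "header");    // misses the top-level node
var viaFilter = filterTopLevel(parsed, "header");   // matches it

// Wrapping everything in one parent makes #header a descendant, so the
// find-style search now reaches it.
var wrapped = [{ id: "wrapper", children: parsed }];
var viaWrappedFind = findDescendants(wrapped, "header");
```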
  • m13z
    Commented on 08-02-2012 at 2:42 PM
    replace the "res" context with:

    $('<div />').append(res.replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi, ''))

    That's basically what .load() does internally.
  • Commented on 08-02-2012 at 2:45 PM
    Did you mean:

    var header3 = $("#header", $('<div />').append(res.replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi, ''));

    Throwing a syntax error.
  • Commented on 08-02-2012 at 2:46 PM
    Sorry missing ). Testing.
  • Commented on 08-02-2012 at 2:48 PM
    That worked. So if I read it right, this is what you did:

    Take the result HTML.
    Remove any script block.
    Append it to a virgin DIV block made on the fly.
    Then run my selector against it.

    Would you say that is an accurate description?
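
The four steps above can be sketched as follows. The regular expression is the one quoted earlier in the thread; `stripScripts` and `selectFrom` are hypothetical names, the URL and `#content` container are assumptions, and jQuery is passed in as `$`:

```javascript
// Step 2: remove every script block from an HTML string.
function stripScripts(html) {
    return html.replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/gi, "");
}

// Steps 3 and 4: append the cleaned HTML to a detached div, then run
// the selector with that div as the context.
function selectFrom($, html, selector) {
    var wrapper = $("<div />").append(stripScripts(html));
    return $(selector, wrapper);
}

// Step 1: fetch the HTML, then show only the selected pieces.
if (typeof jQuery !== "undefined") {
    jQuery.get("docs/page.html", function (res) {
        jQuery("#content").empty().append(selectFrom(jQuery, res, "#header, #docs"));
    });
}
```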
  • m13z
    Commented on 08-02-2012 at 2:52 PM
    Yepp. Accurate.

    Lines 7180 to 7187 of jQuery 1.7.2 for source.
  • Commented on 08-02-2012 at 2:54 PM
    So another takeaway from this is: you can always work with arbitrary HTML, but you need to remove script blocks first.

    Thanks M13z!
  • m13z
    Commented on 08-02-2012 at 3:24 PM
    ¡De nada!

    As for what triggered this discussion, I would normally agree with Dan about having more control over the callback, but we have ended up doing exactly what load() does internally, so the original code in the article is the better solution. (It's exactly what jQuery was created for in the first place: "write less, do more".)
  • Commented on 08-03-2012 at 9:22 AM
    jQuery 1.8 has a new method that may change the way this kind of processing is done. Look into the `$.parseHTML()` method. It takes a string of HTML and returns it as a document fragment, with or without scripts. See line 485 of http://code.jquery.com/jquery-git.js
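
A sketch of the same filtering with `$.parseHTML()`, assuming jQuery 1.8 or later. In the released API it returns an array of DOM nodes and drops scripts unless the third argument, `keepScripts`, is true. `pickFromHtml` is a hypothetical name, and jQuery is passed in as `$`:

```javascript
// $ is the jQuery function (1.8 or later).
function pickFromHtml($, html, selector) {
    // keepScripts defaults to false, so script elements are dropped.
    var nodes = $.parseHTML(html);
    // Wrap the nodes in a detached div so that .find() can also reach
    // elements that were top-level in the parsed HTML.
    return $("<div />").append(nodes).find(selector);
}

if (typeof jQuery !== "undefined" && jQuery.parseHTML) {
    var picked = pickFromHtml(
        jQuery,
        '<div id="header">Title</div><p>Body</p>',
        "#header"
    );
}
```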
  • Commented on 08-03-2012 at 9:26 AM
    Oh man that's pretty cool. I have not been following the development of 1.8 much. I'll have to pay more attention.
  • Commented on 08-06-2012 at 9:43 AM
    @Raymond:

    My comment about the source code changing was more about the fact that, if the source changes, comments are important because you may not remember what "#header, #docs" is supposed to do. Commenting the code to say:

    // get article header (#header) and the body of the article (#docs)

    should make such an error easier to fix. This is really more a note for someone trying to do this kind of thing in production.