jsoup adds jQuery-like parsing in Java

Earlier this week James Moberg introduced me to a cool little Java utility - jsoup. jsoup brings jQuery-like HTML manipulation to your server. Given a string or a URL, you can find all the images, look for links to PDFs, and so on. Basically - jQuery for the server. I thought I'd whip up a quick ColdFusion-based demo of this so I could see how well it works.

I began by downloading the jar file and dropping it into a folder called jars. Then, using ColdFusion 10, it was trivial to make it available to my code.

I then whipped up a demo that loaded (and cached) CNN's HTML. It creates an instance of jsoup, parses the HTML, and then runs a "select" using my selector, which in this case is just 'img'.
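For anyone working in plain Java rather than ColdFusion, the same parse-then-select steps look roughly like the sketch below. It is not the demo code from this post: the class name `ImageSources`, the helper `imageSources`, and the inline HTML are made up for illustration, and it assumes the jsoup jar is already on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class ImageSources {

    // Returns the src attribute of every <img> element found in the HTML.
    static List<String> imageSources(String html) {
        Document doc = Jsoup.parse(html);     // parse the raw HTML string
        Elements images = doc.select("img");  // CSS-style selector, as in jQuery
        List<String> sources = new ArrayList<>();
        for (Element img : images) {
            sources.add(img.attr("src"));     // read one attribute per match
        }
        return sources;
    }

    public static void main(String[] args) {
        // Stand-in markup (hypothetical); the demo in the post parsed CNN's home page.
        String html = "<html><body>"
                + "<img src=\"/logo.png\" alt=\"Logo\">"
                + "<p>No image here.</p>"
                + "<img src=\"/photos/cat.jpg\">"
                + "</body></html>";
        for (String src : imageSources(html)) {
            System.out.println(src);
        }
    }
}
```

From ColdFusion, the same `parse`, `select`, and `attr` methods are reachable once the jar is loaded, since ColdFusion can create and call Java objects directly.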

The result of a select is a set of matches that I can loop over, grabbing attributes from each one. Again, very jQuery-like. I wanted to play with this in a more free-form way, so I created an application that lets me supply any URL and any selector.
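The "any URL, any selector" idea can be sketched in plain Java along these lines. Again, this is an illustration rather than the application from this post: `SelectorTester` and `matches` are hypothetical names, the jsoup jar is assumed to be on the classpath, and there is no caching or error handling beyond a usage check.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SelectorTester {

    // Runs a CSS selector against a parsed document and
    // returns the outer HTML of each matching element.
    static List<String> matches(Document doc, String selector) {
        List<String> results = new ArrayList<>();
        for (Element match : doc.select(selector)) {
            results.add(match.outerHtml());
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: SelectorTester <url> <selector>");
            return;
        }
        // Fetch and parse the page in one step; needs network access.
        Document doc = Jsoup.connect(args[0]).get();
        for (String html : matches(doc, args[1])) {
            System.out.println(html);
        }
    }
}
```

Passing a selector such as `a[href$=.pdf]` would list links whose `href` ends in `.pdf`, which covers the "look for links to a PDF" case mentioned earlier.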

You can run this yourself by hitting the demo below. All in all - a very interesting Java library. Sure, you could do all of this with regular expressions, but I find this syntax a heck of a lot friendlier. (And that's with me having used regex for the past 15 years.)

Talk about synchronicity - within 10 minutes of each other, both Ben Nadel and I posted on the same topic! His post: "Parsing, Traversing, And Mutating HTML With ColdFusion And jSoup".
