Earlier this week I got to look at some code using CasperJS. CasperJS is a testing utility for PhantomJS, a headless (i.e. virtual) WebKit browser. This is probably unfair, but I like to think of Casper as a super-powered Curl. Hopefully you know Curl as a command line tool that lets you perform network requests and work with the result. Unlike Curl, CasperJS (and PhantomJS) can actually interact with the results like a real browser. This allows for some cool testing and utility scripts. I've only begun to scratch the surface of the tool, but I thought I'd share an interesting little issue my coworker and I ran into with it.
My coworker, Paul Robertson (smart and friendly guy who needs to start up his blog again), created a CasperJS script that crawled a set of URLs and downloaded the HTML to a local directory. That by itself sounds just like Curl. But these pages were special. They used jQuery to perform an XHR request to a JSON file. Once the XHR request was done, the data was turned into HTML and rendered into the DOM. Here is where CasperJS/PhantomJS shines. He was able to tell the headless browser to wait for a particular selector to appear (the one used by the JavaScript code to render HTML) and only then actually save the result. Cool, right?
To give you an idea, here is an example file. This one loads in a simple HTML fragment but you get the idea.
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title></title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width">
</head>
<body>
<div id="main"></div>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
<script>
$(document).ready(function() {
    var $main = $("#main");
    //fetch the content and insert it...
    //timeout here just to fake some network latency
    window.setTimeout(function() {
        $.get("data.html", function(res) {
            $main.append(res);
        });
    }, 800);
});
</script>
</body>
</html>
Nothing weird there so I won't go over it. Now let's look at the CasperJS script. This is a simplified version of what Paul wrote. Even if this is your first time ever seeing CasperJS in action, you can probably grok what's going on.
var casper = require("casper").create();
var fs = require("fs");
var outputFolder = "./output/";
var timeoutLength = 2000;
// Remove the output folder (and its contents) left over from a previous run
if (fs.exists(outputFolder)) {
    console.log("Removing " + outputFolder);
    fs.removeTree(outputFolder);
}
casper.start();
var url = "http://localhost/testingzone/trash/testtransition2.html";
var selector = "#dynamicContent";
console.log("Getting " + url);
casper.thenOpen(url);
// Wait until the dynamically loaded element is in the DOM, then save the page
casper.waitForSelector(selector, function then() {
    var htmlToWrite = this.getHTML();
    var outputFile = outputFolder + "testoutput.html";
    fs.write(outputFile, htmlToWrite, "w");
    console.log("Wrote html to " + outputFile);
}, function timeout() {
    this.echo("Selector not found after " + timeoutLength + " ms");
}, timeoutLength);
casper.run();
The result is an HTML file whose contents include the markup loaded via jQuery. Just for completeness' sake, here is the HTML fragment. Note it has the selector my script is waiting for.
<div id="dynamicContent">
This is so cool.
</div>
Woot! Ok, so like I said - that's really freaking cool. We pushed the content up for testing and someone pointed out something odd. The area where dynamic content was loaded had an odd gray dimness to it. In case you're curious, this is the second Google result for "odd gray dimness":
I did some digging and discovered something truly odd. The div that had dynamic content injected within it had this: style="display: block; opacity: 0.04429836168227741; "
WTF
I did some more clicking around - this time on the original version - and then I saw it. When the dynamic content was loaded, jQuery was doing some fancy fade in/fade out action. Because... fancy. So I modified my original code a bit to try to recreate this:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title></title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width">
</head>
<body>
<div id="main">Existing HTML...</div>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script>
<script>
$(document).ready(function() {
    var $main = $("#main");
    //fetch the content and insert it...
    //timeout here just to fake some network latency
    window.setTimeout(function() {
        $main.fadeOut(function() {
            $.get("data.html", function(res) {
                $main.html(res);
                $main.fadeIn();
            });
        });
    }, 20);
});
</script>
</body>
</html>
Note the use of fancy fading action. Fancy! I then reran my CasperJS script and immediately saw the same thing added to my output. It was the fade! CasperJS saw the DOM item because, well, it was there, but jQuery wasn't finished fading it back in. I discovered that CasperJS had a simple wait command, so I threw that in:
casper.waitForSelector(selector, function then() {
    // wait() only schedules a delayed step, so the save has to happen in its callback
    this.wait(1000, function() {
        var htmlToWrite = this.getHTML();
        var outputFile = outputFolder + "testoutput.html";
        fs.write(outputFile, htmlToWrite, "w");
        console.log("Wrote html to " + outputFile);
    });
}, function timeout() {
    this.echo("Selector not found after " + timeoutLength + " ms");
}, timeoutLength);
And that did it. Technically I could have used a smaller wait (the default duration for jQuery fade transitions is 400 ms), but this seemed simpler and did the trick.
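If you would rather not guess at a duration, one alternative is to poll until the fade has actually finished. This is only a sketch, not what we ran: the helper names fadeFinished and saveWhenFadeFinished are made up for illustration, while waitFor, evaluate, and getHTML are regular CasperJS calls. Because the fade in my test page is applied to the container (#main) and not to the injected element, the check walks up from the selector and looks at every ancestor. It also assumes nothing on the page is meant to stay partially transparent.

```javascript
// Hypothetical helper. It runs inside the page (via casper.evaluate), so it
// may only use page globals such as document and window.
function fadeFinished(selector) {
    var el = document.querySelector(selector);
    if (!el) { return false; }
    // Check the element and each ancestor element: the fade may be applied
    // to a container rather than to the element we selected.
    for (var node = el; node && node.nodeType === 1; node = node.parentNode) {
        var style = window.getComputedStyle(node);
        if (style.display === "none" || parseFloat(style.opacity) < 1) {
            return false;
        }
    }
    return true;
}

// Hypothetical wiring: casper and fs are the objects created at the top of
// the script; outputFile and timeoutMs are whatever you already use.
function saveWhenFadeFinished(casper, fs, selector, outputFile, timeoutMs) {
    casper.waitFor(function check() {
        // evaluate() serializes fadeFinished and runs it in the page context
        return this.evaluate(fadeFinished, selector);
    }, function then() {
        fs.write(outputFile, this.getHTML(), "w");
    }, function onTimeout() {
        this.echo("Fade not finished after " + timeoutMs + " ms");
    }, timeoutMs);
}
```

In the script above this would replace the waitForSelector call, for example saveWhenFadeFinished(casper, fs, selector, outputFolder + "testoutput.html", timeoutLength); placed before casper.run().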
Archived Comments
So are you running your automation tests using node.js or is it just something that you run via the browser (as it's JS)? We've been running Selenium (via a modified version of CFSelenium) to do our integration/automation tests with MXUnit. CasperJS and PhantomJS seem like a nice fit also.
We weren't using this for tests. We needed a static version of a page that also included the JSON rendered data in it.
Don't use wait() because that defeated the asynchronous purpose. Plus it's not reliable and slow things down. The best approach in general for css related problem is to remove css classes itself. You can do that in evaluate() with jquery and then do whatever you need after that.
"Don't use wait() because that defeated the asynchronous purpose."
Sorry - how?
"The best approach in general for css related problem is to remove css classes itself. "
Again, eh?
"You can do that in evaluate() with jquery and then do whatever you need after that."
Ok, I'm 0 for 3. ;) Seriously dude - nothing you said made sense to me. Can you try again?
OK my second and last before you figure out your own: css timing and effects can change! Even this.wait(1000) is not safe. What if the fading is 10 sec long (little extreme but possible)? Or what if jQuery effect depends on something else that was delayed because of connectivity? wait() should be banned and the number in it is just random arbitrary number.
Try to do this with 20k pages and 50GB of data processed and you will appreciate. Enjoy headless!
Ok, I see your point about how the transition duration can change. But we control the source files here and we know the length, so I don't see this as a problem. If we didn't, if we were slurping someone else's files, then yeah. Ditto for your comment about 20K pages. In our case, we had a very short list of pages (around 15).
So how would you change it then? If jQuery is injecting a new DOM element with ID of X, but used a transition, how would you tell Casper to wait w/o ... well waiting. ;)
When the page loads, you can inject JS and disable jQuery's fx queue (https://api.jquery.com/jque...). On my own personal projects, when I do integration testing and load my env up in test mode, I disable all animations via an application config. Reason being, I use jquery.transit (http://ricostacruz.com/jque...), which has no such notion. Hope that helps.
@Ken: Oh that is slick. I didn't know - thanks!
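For anyone who wants to try Ken's suggestion, here is a rough sketch, assuming the page exposes jQuery as a global. jQuery.fx.off is a documented jQuery setting and thenOpen and thenEvaluate are regular CasperJS calls; the helper names disableJQueryAnimations and openWithoutAnimations are made up. One caveat I'm fairly sure of but have not tested here: the flag only applies to animations that start after it is set, so an effect already running when the step executes may still play out.

```javascript
// Hypothetical helper. It runs inside the page (via casper.thenEvaluate), so
// it may only use page globals. Returns true if jQuery was found.
function disableJQueryAnimations() {
    if (typeof window.jQuery === "undefined") { return false; }
    // With fx.off set, new jQuery effects jump straight to their end state
    window.jQuery.fx.off = true;
    return true;
}

// Hypothetical wiring: open the page, then switch animations off as the very
// next step, before waiting on any selector.
function openWithoutAnimations(casper, url) {
    casper.thenOpen(url);
    casper.thenEvaluate(disableJQueryAnimations);
}
```

In the script from the post, openWithoutAnimations(casper, url) would stand in for the casper.thenOpen(url) line.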
There's another cool feature that you may use across your projects, waiting for an element to finish (like a loading widget).
casper.waitWhileVisible( "the_element", function then(){
// your code
});
Thanks for sharing that, Alexandru.