I've been a happy Disqus user for a while now, but I noticed this week that the stats provided by the service are pretty poor. For example, you can't even determine the total number of comments for your web site. That seems... a bit crazy. It's not as if I'm asking for some exotic stat such as, "How many Europeans create comments on the weekend?" You can see how many comments you got this week:

Shot

When you go deeper though, you can no longer get an aggregate of your comments. All you can get is a day-by-day (or month-by-month) breakdown, which is limited to one year. Here is the most I can see from my site:

Shot

So obviously I could add those numbers up in my head, but one year of stats isn't nearly complete for my blog. Maybe Disqus offers more stats for paying customers (and to be clear, I'd totally understand that), but I can't find any links/details about that if it exists.

So screw it - let's use the API and build our own tools! I've set up a new repo where I'm going to start building my own stats. The API is rather simple and you get a somewhat generous usage allowance (1000 hits per hour) out of the box. Also, you don't have to use OAuth for the types of operations I need, and that makes it even easier to use.

For my first demo, I focused on what started this off - just figuring out how many damn comments my site has gotten. Not surprisingly, there is no direct API for this. There is an API to get details for a forum, but a total comment count isn't part of what it returns.

The best I could come up with was the API to get lists of threads. They do a great job of supporting pagination in all their APIs, so my code simply needs to paginate over all the threads and then do math. Let's take a look.

First, the front end view, which is rather simple.


<!DOCTYPE html>
<html>
	<head>
		<meta charset="utf-8">
		<title></title>
		<meta name="description" content="">
		<meta name="viewport" content="width=device-width">
	</head>
	<body>

		<h2>Comment Count</h2>
		
		<input type="text" id="forum" placeholder="Forum name">
		<input type="button" id="startCount" value="Get Comment Count">

		<div id="results"></div>

		<script type="text/javascript" src="https://code.jquery.com/jquery-3.0.0.min.js"></script>
		<script type="text/javascript" src="app.js"></script>
	</body>
</html>

I assume nothing here is weird, but let me know otherwise. The fun part is the code:


var $forum, $startCount, $results;
var key = 'XrkXWYcSYFsQC74AMA9J37tNXuWbw0PwRl2DZSx3LHfu3pMJMio8Gts9qUAMBAV5';

$(document).ready(function() {

	$forum = $('#forum');
	$startCount = $('#startCount');	
	$results = $('#results');
	$startCount.on('click', doCount);
});

function doCount() {
	var forum = $forum.val();
	if($.trim(forum) === '') return;
	console.log('search for '+forum);
	$startCount.attr('disabled','disabled');
	$results.html('<p><i>Working on getting your stuff.</i></p>');

	doProcess(function(data) {
		console.log('back from doProcess');
		console.dir(data);
		var total = data.reduce(function(a,b) {
			return a + b.posts;
		},0);

		console.log('we had '+data.length +' threads');
		console.log('total comments = '+total);
		// Guard against a forum with no threads so the average isn't NaN.
		var avg = data.length ? (total/data.length).toPrecision(3) : '0';
		$startCount.removeAttr('disabled');
		$results.html(
			'<p>Total number of threads: '+data.length+'<br/>Total number of comments: '+total+'<br/>Average # of comments per thread: '+avg+'</p>'
		);
	}, forum);
}

function doProcess(cb, forum, cursor, threads) {
	console.log('running doProcess');
	var url = 'https://disqus.com/api/3.0/forums/listThreads.json?forum='+encodeURIComponent(forum)+'&api_key='+key+'&limit=100';
	if(cursor) url += '&cursor='+encodeURIComponent(cursor);
	if(!threads) threads = [];
	// $.getJSON parses the response as JSON.
	$.getJSON(url).then(function(res) {
		res.response.forEach(function(t) {
			threads.push(t);
		});

		// Keep following the cursor until Disqus reports no more pages.
		if(res.cursor && res.cursor.hasNext) {
			doProcess(cb, forum, res.cursor.next, threads);
		} else {
			cb(threads);
		}
	});
}

The bulk of the work is in the doCount and doProcess calls. doCount is responsible for validating your input and firing off the call to doProcess. When done, it simply does some math and reports. (See my notes at the end for how I could go further with this.)
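The math in doCount can also be pulled out into a small standalone function, which makes it easy to try against sample data. This is only a sketch: summarizeThreads is a name made up for illustration, and the sample threads are invented. The only field it relies on is posts, which the code above reads from each thread.

```javascript
// Hypothetical helper (not part of the original demo): computes the same
// three numbers doCount reports, from an array of thread objects.
function summarizeThreads(threads) {
	var total = threads.reduce(function(sum, t) {
		return sum + t.posts;
	}, 0);
	// Avoid dividing by zero when a forum has no threads yet.
	var avg = threads.length ? total / threads.length : 0;
	return { threads: threads.length, comments: total, average: avg };
}

// Invented sample data; real thread objects carry many more fields.
var sampleSummaryInput = [{ posts: 4 }, { posts: 0 }, { posts: 11 }];
var summary = summarizeThreads(sampleSummaryInput);
console.log(summary.threads, summary.comments, summary.average); // logs: 3 15 5
```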

doProcess is a recursive function that gathers all the threads. It follows the cursor Disqus returns with each page of results, appending every page to one large array until no pages remain. Each thread record includes a post count (the posts field), which doCount sums to create the report.
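To see the cursor handling on its own, here is a sketch that swaps the network call for a fake, in-memory page source. The names collectAll, getPage, and fakePages are invented for this example, and the cursor strings are placeholders. The only parts taken from the code above are the response, cursor.hasNext, and cursor.next fields.

```javascript
// Hypothetical sketch: gather every item from a cursor-paginated source.
// getPage(cursor, done) stands in for the network call in doProcess.
function collectAll(getPage, cb, cursor, items) {
	items = items || [];
	getPage(cursor, function(res) {
		// Append this page's records to the running list.
		res.response.forEach(function(t) { items.push(t); });
		if (res.cursor && res.cursor.hasNext) {
			// More pages: ask again, passing along the next cursor.
			collectAll(getPage, cb, res.cursor.next, items);
		} else {
			cb(items);
		}
	});
}

// Fake pages keyed by cursor; 'start' stands for "no cursor supplied".
var fakePages = {
	start: { response: [{ posts: 2 }, { posts: 5 }], cursor: { hasNext: true, next: 'page2' } },
	page2: { response: [{ posts: 1 }], cursor: { hasNext: false, next: null } }
};
function getPage(cursor, done) {
	done(fakePages[cursor || 'start']);
}

var collected = [];
collectAll(getPage, function(all) { collected = all; });
console.log(collected.length); // logs: 3
```

The fake calls back synchronously, so the result is available right away. With a real network request each page arrives in its own callback later on, so the recursion never builds up a deep call stack.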

Here is the result for my own blog:

Shot

You can find the complete source code for this here - and note - I'm going to be adding more demos soon (again, see notes below): https://github.com/cfjedimaster/disqus-analytics/tree/master/commentcount

You can run the demo here - but note that I'm using my public API key. Most likely it will not work for you. If you want to use my tools, download the source, create your own key (at Disqus of course), and go to town.

https://cfjedimaster.github.io/disqus-analytics/commentcount/

Ok, now for some notes!

  • You may be curious about 'threads' - threads are simply unique locations for your Disqus embeds. So for a blog, it would be every page visited. That's an important thing to note. If you visit a new blog entry, Disqus will create a 'thread' for it even if no comments exist yet.
  • I like that I can see an average number of comments. What I really want to know though is how those comments appear over time. I'm curious both about my traffic per month/year, as well as my traffic in terms of the age of my content. What I mean by that is - let's say my comment traffic is pretty much consistent. What may not be consistent is that people are commenting on older blog posts versus new. Technically Disqus can't help with that. A thread is created when someone visits the blog post. So I may have a very old blog post that no one visited. When someone visits it today, the thread is new, but the content is old. Since my content has date information in the URL, I can use that to perform analytics based on my content. This all comes down to one question - is my content more engaging now than it was ten years ago?
  • Maybe I'm wrong about Disqus and I just didn't find the right link for deeper stats, or the upsell to a paid account for more stats. Cool! Tell me where I'm wrong and I'll be fine with that. I had fun writing the code and that's all that matters. :)
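
The content-age idea from the notes above could start with something like the following. It's a rough sketch under a few assumptions: that each thread object has a link field holding the post URL, and that the URL contains the date as /YYYY/MM/DD/. The function name commentsByYear and the sample URLs are made up for illustration.

```javascript
// Hypothetical sketch: total comments per content year, reading the year
// from a /YYYY/MM/DD/ segment in each thread's URL (an assumed URL format).
function commentsByYear(threads) {
	var totals = {};
	threads.forEach(function(t) {
		var match = /\/(\d{4})\/\d{2}\/\d{2}\//.exec(t.link || '');
		if (!match) return; // skip threads whose URL has no date
		var year = match[1];
		totals[year] = (totals[year] || 0) + t.posts;
	});
	return totals;
}

// Invented sample threads.
var sampleThreads = [
	{ link: 'https://example.com/2006/03/14/old-post', posts: 12 },
	{ link: 'https://example.com/2016/06/20/new-post', posts: 3 },
	{ link: 'https://example.com/2016/07/01/newer-post', posts: 4 },
	{ link: 'https://example.com/about', posts: 1 }
];
var byYear = commentsByYear(sampleThreads);
console.log(byYear);
```

Grouping by the year in the URL rather than by the thread's creation date is the point here, since a thread can be brand new even when the post it belongs to is old.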