One of the more interesting services available on IBM Bluemix is Insights for Twitter. This service provides a deep look at real-time Twitter data. The API provides a basic "search" feature but can also include some incredibly detailed filters. So for example, you can look at data from users who are (possibly) married and have kids. As part of the analysis, you can also get a sentiment value: positive, negative, neutral, or ambivalent. I thought it would be interesting to build a tool that let me compare the "general" sentiment for a search term with that of more focused segments of the audience. Obviously this isn't 100% accurate, but it provides an interesting look at how different types of people view and discuss the same topic.
The API supports multiple filters:
- By language
- By location
- By country
- By the number of followers
- By the number of people the person follows
- By people with children
- By people who are married
- By verified users
- By people in a range of lists
- By people within a circular region around a longitude/latitude point
- By sentiment
- By date range
- And more
For my demo, I decided to focus on a set of filters that would apply to the widest range of terms. (I'm toying with the idea of building an "advanced form" later.) Given a term, I will return the sentiment for:
- The general dataset
- People with 5k+ followers
- Married people
- People with children
- Results over the last year (Jan 1 to Dec 31 of 2015 as of the time I'm writing this blog post)
The docs explain how the API works, including the query language. In general it is pretty simple, but the best news is that you can skip getting results and just use the count method instead. That means I can easily get sentiment analysis with 4 API calls (one for each sentiment). As an aside, I ended up not displaying all four sentiments in my charts, but the back end code still fetches them all.
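Since each call returns only a count, turning the four counts into chart-friendly percentages is simple arithmetic. Here is a minimal sketch of that step. The function name `toPercentages` and the sample numbers are made up for illustration and are not taken from the project's code.

```javascript
// Convert raw per-sentiment counts into percentages of the total.
// counts is an object such as { positive: 120, negative: 60, ... }.
function toPercentages(counts) {
  const names = Object.keys(counts);
  const total = names.reduce(function (sum, name) {
    return sum + counts[name];
  }, 0);
  const result = {};
  names.forEach(function (name) {
    // Guard against a term with no matches; round to one decimal place.
    result[name] = total === 0
      ? 0
      : Math.round((counts[name] / total) * 1000) / 10;
  });
  return result;
}

// Illustrative numbers only.
console.log(toPercentages({ positive: 120, negative: 60, neutral: 15, ambivalent: 5 }));
```

To chart only positive versus negative, you would pass in just those two counts so the percentages add up to 100.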
I'll include a link to the code base below, but here is a snippet showing how I fetch the general sentiment for a term.
sentiments.forEach(function(s) {
    var p = new Promise(function(fulfill, reject) {
        // Encode the whole query so the spaces and colon are escaped too.
        var query = term+' AND sentiment:'+s;
        request.get(getURL()+'messages/count?q='+encodeURIComponent(query), function(error, response, body) {
            // Reject on failure; body is undefined here, so JSON.parse would throw.
            if(error) return reject(error);
            fulfill({scope:'main',sentiment:s,data:JSON.parse(body).search.results});
        });
    });
    promises.push(p);
});
Here is the block that handles married (again, possibly married) people:
sentiments.forEach(function(s) {
    var p = new Promise(function(fulfill, reject) {
        // Same call as before, with the is:married filter added to the query.
        var query = term+' AND is:married AND sentiment:'+s;
        request.get(getURL()+'messages/count?q='+encodeURIComponent(query), function(error, response, body) {
            // Reject on failure; body is undefined here, so JSON.parse would throw.
            if(error) return reject(error);
            fulfill({scope:'married',sentiment:s,data:JSON.parse(body).search.results});
        });
    });
    promises.push(p);
});
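The two blocks above differ only in the filter clause and the scope label, so the full set of requests could be generated from a small lookup table. The sketch below is one way to do that, not how the repo does it. Only `is:married` and `sentiment:` appear in this post. The `followers_count:`, `has:children`, and `posted:` clauses are my assumptions about the query language and should be checked against the docs. `buildCountRequests` and `baseUrl` (standing in for whatever `getURL()` returns) are hypothetical names.

```javascript
// Sentiment values named in the post.
const SENTIMENTS = ['positive', 'negative', 'neutral', 'ambivalent'];

// Scope name -> extra filter clause. Only 'is:married' comes from the post;
// the other clauses are assumptions to verify against the query-language docs.
const SCOPE_FILTERS = {
  main: '',
  followers: 'followers_count:5000',    // assumed operator
  married: 'is:married',
  children: 'has:children',             // assumed operator
  year: 'posted:2015-01-01,2015-12-31'  // assumed operator
};

// Build one { scope, sentiment, url } descriptor per scope/sentiment pair.
function buildCountRequests(baseUrl, term) {
  const requests = [];
  Object.keys(SCOPE_FILTERS).forEach(function (scope) {
    SENTIMENTS.forEach(function (sentiment) {
      // Drop the empty clause used by the general ("main") scope.
      const query = [term, SCOPE_FILTERS[scope], 'sentiment:' + sentiment]
        .filter(Boolean)
        .join(' AND ');
      requests.push({
        scope: scope,
        sentiment: sentiment,
        url: baseUrl + 'messages/count?q=' + encodeURIComponent(query)
      });
    });
  });
  return requests;
}

// 5 scopes x 4 sentiments = 20 request descriptors.
console.log(buildCountRequests('https://example.test/api/v1/', 'star wars').length);
```

Each descriptor can then be handed to a single function that wraps `request.get` in a promise, which keeps the scope and sentiment labels next to the URL they describe.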
As you can see, I'm using promises to handle the fact that I'm firing a bunch of async processes. As I've got 5 charts and 4 sentiments, each query performs 20 different HTTP calls. Handling that isn't too hard:
Promise.all(promises).then(function(results) {
    console.log('deep query done');
    console.dir(results);
    var metaResult = {};
    results.forEach(function(r) {
        var append = '';
        if(r.scope !== 'main') {
            append = '_'+r.scope;
        }
        metaResult[r.sentiment+append] = r.data;
    });
    fulfill(metaResult);
});
The code in the forEach there applies each HTTP result to the proper location in my main result object. Promise.all does return results in the same order as the promises in the array, but as I'm filling that array from 5 forEach statements, I'd rather not depend on position. Instead, each result carries its own scope and sentiment, and those decide the key it is stored under. Kind of complex, but that's all nicely hidden away in my library. Back in the main Node.js route that handles all of this, it is rather simple.
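app.get('/analyze', function(req, res) {
    var term = req.query.term;
    if(!term) return res.end();
    twitterInsights.deepSentiment(term).then(function(result) {
        res.send(result);
    }).catch(function(err) {
        // Without this, a failed lookup would leave the request hanging.
        console.error(err);
        res.status(500).end();
    });
});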
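The object this route sends back is flat, with keys such as `positive` for the general data and `positive_married` for a scoped result. For charting, it can be handy to regroup those keys by scope. This is a minimal sketch of that regrouping. `groupByScope` is a hypothetical helper, and it assumes sentiment names never contain an underscore.

```javascript
// Regroup { positive: 10, positive_married: 3, ... } into
// { main: { positive: 10 }, married: { positive: 3 }, ... }.
function groupByScope(metaResult) {
  const grouped = {};
  Object.keys(metaResult).forEach(function (key) {
    // The text before the first underscore is the sentiment;
    // anything after it is the scope. No underscore means the general data.
    const parts = key.split('_');
    const sentiment = parts[0];
    const scope = parts.length > 1 ? parts.slice(1).join('_') : 'main';
    grouped[scope] = grouped[scope] || {};
    grouped[scope][sentiment] = metaResult[key];
  });
  return grouped;
}

// Illustrative counts only.
console.log(groupByScope({
  positive: 10,
  negative: 5,
  positive_married: 3,
  negative_married: 1
}));
```

Each nested object then holds everything needed to draw one chart.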
So how about the front end? I decided to use Bootstrap to create a simple UI for the term searching. For my charts, I decided on Highcharts. It is free for commercial use and I liked the animation of the charts. Here's a sample report.
You can see all the code for this project on the GitHub repo I just set up: https://github.com/cfjedimaster/twitterinsights.
And you can run the demo yourself here: http://twitterwall.mybluemix.net/. I had some good folks test this and they managed to crash it a few times, but hopefully I've covered up most of the issues. Let me know what you think. And remember, the sentiment analysis is not going to be perfect. I'm sure you can find examples that don't make sense. Be nice. :)
Archived Comments
As I mentioned in the CFML slack, I tested it with 'Scottish Independence', which brought back general Twitter user sentiment of 45.2% negative and 54.8% positive. The most interesting thing about that is that it almost exactly matched the result of the independence referendum, only in reverse: the actual result was 44.7% for and 55.3% against.
Again the interesting thing about that result is that during and after the whole campaign it seemed to almost everyone that the Yes campaign would win because of the strength of the online presence of the Yes campaign. Sadly reality didn't pan out like that but the data provided by this experiment is, as I said, interesting nonetheless!
I arrive 2 years late, but thanks for posting it!
You are most welcome.
http://twitterwall.mybluemi... is not working. Can you please fix it? I'd like to understand how this sentiment analysis really works, so I want to see a working example. Thanks
I no longer work at IBM so I don't have a free account there anymore. You have access to all the source code and you can get a free trial so it would be possible for you to set this up yourself.