Building the Serverless Superman

So yes - I built something stupid again. Recently I discovered the awesomeness that is @Big Data Batman. This is a twitter account that simply copies tweets with "Big Data" in them and replaces it with "Batman." It works as well as you may think - either lame or incredibly funny. (At least to me.) Here are a few choice samples.

That last one is a bit subtle.

I thought it would be fun to build something similar for serverless, and obviously, I had to name it Serverless Superman.

Alright, so how did I build this? My basic idea was this:
On a schedule, look for tweets about serverless from the last X minutes (X being the same as my schedule), pick a random one, replace the word "serverless" with "Superman", and tweet it.

As with everything complex I've done with OpenWhisk, my solution involved a sequence of actions. I began by plotting out my actions in text form.

seq1:

    action1:
    set search: serverless
    since: today

    twitter/getTweets

    action3:
    remove RTs or older than X minutes

    action4:
    pick one and use Superman

    twitter/sendTweet

I named it "seq1" as I thought maybe I'd end up with multiple sequences, but one was enough. Let's break this down action by action.
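If sequences are new to you, the gist is that each action's return value becomes the next action's args. Here's a rough local sketch of that idea, which can be handy for trying out a chain on your own machine. It isn't how OpenWhisk implements sequences, and `runSequence` and the two toy actions are names made up for illustration.

```javascript
// Rough local stand-in for an OpenWhisk sequence: the result of each
// action is passed as the args of the next one. runSequence is a
// hypothetical helper, not part of OpenWhisk.
function runSequence(actions, initialArgs) {
    return actions.reduce(
        // Promise chaining handles actions that return plain objects
        // as well as actions that return Promises.
        (chain, action) => chain.then((args) => action(args)),
        Promise.resolve(initialArgs || {})
    );
}

// Two toy actions standing in for the real ones.
function setup(args) {
    return { term: 'serverless' };
}

function shout(args) {
    return Promise.resolve({ status: args.term.toUpperCase() });
}

runSequence([setup, shout], {}).then((result) => {
    console.log(result); // { status: 'SERVERLESS' }
});
```

If any action in the chain rejects, the whole chain rejects, which mirrors how a failing action stops a sequence.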

Action 1: setupsearch

The purpose of this action was to serve as the input provider for the next one that will perform the Twitter search. Here is the code:

/*
I basically set up the args to pass to twitter/search
*/
function main(args) {

    let now = new Date();
    let datestr = now.getFullYear() + '-'+(now.getMonth()+1)+'-'+now.getDate();

    let result = {
        term:"serverless",
        since:datestr
    }

    return result;

}

The only real complex part here is since. The Twitter API lets you filter by date. Unfortunately you can't filter by time. That's going to be a problem later on but I'll address that in the third action. Notice I'm using two keys related to my Twitter account. I got these by logging into the developer portal with my Serverless Superman account.
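One small thing to watch: the date string built above isn't zero-padded (January 5th comes out as 2017-1-5) and it uses the server's local time zone. If you'd rather send a strict YYYY-MM-DD value in UTC, something like this sketch should do it. `formatSince` is just a name picked for the example.

```javascript
// Build a zero-padded YYYY-MM-DD string in UTC for the "since" argument.
// formatSince is a hypothetical helper, not part of the Twitter package.
function formatSince(date) {
    // toISOString() always returns UTC, e.g. "2017-01-05T08:30:00.000Z"
    return date.toISOString().slice(0, 10);
}

function main(args) {
    return {
        term: 'serverless',
        since: formatSince(new Date())
    };
}

console.log(formatSince(new Date(Date.UTC(2017, 0, 5, 8, 30)))); // 2017-01-05
```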

Action 2: twitter/getTweets

This is an action I built as part of a public package. You can find the complete source code here: https://github.com/cfjedimaster/twitter-openwhisk. I'm not going to share the code since I blogged on it a while back, but I did have to update the package action to support the "since" argument.

Action 3: filterresults

The purpose of this action is multifold. Its main role is to filter the Tweets, but I also flatten the data quite a bit as well. I filter out retweets, replies, and items older than X minutes, where X is 10.

Finally I return an array of results where I just carry over the id, text, created_at, and hashtags value of the tweets.

/*
given an array of tweets, remove ones older than X minutes, and RTs, and replies
also, we remove a shit-ton of stuff from each tweet
*/

//if a tweet is older than this in minutes, kill it
const TOO_OLD = 10;

// http://stackoverflow.com/a/7709819/52160
// returns the total difference in minutes (not just the minutes part
// of an hours/minutes breakdown, which would let a tweet that is
// 1 hour and 5 minutes old slip through as "5")
function diffInMinutes(d1,d2) {
    var diffMs = (d1 - d2);
    return Math.round(diffMs / 60000);
}

function main(args) {

    let now = new Date();

    let result = args.tweets.filter( (tweet) => {
        //no replies
        if(tweet.in_reply_to_status_id) return false;
        //no RTs
        if(tweet.retweeted_status) return false;

        // http://stackoverflow.com/a/2766516/52160
        let date = new Date(
            tweet.created_at.replace(/^\w+ (\w+) (\d+) ([\d:]+) \+0000 (\d+)$/,
            "$1 $2 $4 $3 UTC"));

        let age = diffInMinutes(now, date);
        if(age > TOO_OLD) return false;

        return true;
    });

    //now map it
    result = result.map( (tweet) => {
        return {
            id:tweet.id,
            text:tweet.text,
            created_at:tweet.created_at,
            hashtags:tweet.entities.hashtags
        };
    });

    return { tweets:result };
}

One thing that kind of bugs me is the TOO_OLD value. Right now I have to ensure it matches my cron job (more on that later), and if I forget then I'll have an issue with my data. It's not that bad of an issue, so I just got over it.
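One way around the hard-coded value would be to read the window from the action's parameters and fall back to 10 when nothing is passed. This is only a sketch: `maxAgeMinutes` is a made-up parameter name, and `filterRecent` takes the current time as an argument purely so it's easy to test.

```javascript
// Parse Twitter's created_at format, e.g. "Wed Aug 27 13:08:45 +0000 2008".
function parseTwitterDate(str) {
    return new Date(
        str.replace(/^\w+ (\w+) (\d+) ([\d:]+) \+0000 (\d+)$/, '$1 $2 $4 $3 UTC')
    );
}

// Keep tweets no older than maxAgeMinutes. maxAgeMinutes is a hypothetical
// parameter; it falls back to 10 to match the cron schedule.
function filterRecent(tweets, now, maxAgeMinutes) {
    const limit = Number(maxAgeMinutes) || 10;
    return tweets.filter((tweet) => {
        const ageMinutes = (now - parseTwitterDate(tweet.created_at)) / 60000;
        return ageMinutes <= limit;
    });
}

function main(args) {
    return { tweets: filterRecent(args.tweets, new Date(), args.maxAgeMinutes) };
}
```

With something like this, the schedule and the window could be changed together without touching the action's code.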

Action 4: makeresult

Yeah, that's a pretty dumb name. The idea for this action is: given an input of tweets, pick one at random and replace the word "serverless". Here is where things get a bit wonky. Sometimes I found tweets where "serverless" wasn't in the text. When I looked online, I saw it in the hashtags. OK, so I updated my code in action 3 to include the hashtags. This is where I then discovered that the Twitter API seemed to not include all the hashtags I could see in the Tweet.

So... I shrugged my shoulders and got over it. As you can see, I wrote a note that it would be good to not give up and select another Tweet, but I thought maybe Serverless Superman could just STFU for a bit and wait.

/*
so i have an array of tweets. i pick one by random and replace serverless w/ superman
*/

function main(args) {

    return new Promise( (resolve, reject) => {

        if(args.tweets.length === 0) return reject("No tweets.");

        let chosen = args.tweets[ Math.floor(Math.random() * (args.tweets.length))];
        console.log('i chose '+JSON.stringify(chosen));

        if(chosen.text.toLowerCase().indexOf('serverless') === -1) return reject("No serverless mention");

        //todo - maybe loop to find another one if first item found didn't have the keyword

        let newText = chosen.text.replace(/serverless/ig, "Superman");
        console.log('new text is: '+newText);

        /*
        ok, so the next step is to tweet, for that, i need to pass:
        status
        */
        resolve({
            status:newText
        });

    });

}

Note that if I don't have any Tweets or if I can't find the word "serverless", I reject the sequence. This is not the right thing to do. OpenWhisk does support conditional sequences but it's a bit... complex right now. There is an open issue to make it a bit simpler and when that happens, I'll consider updating the post then, but for now I dealt with it. It does mean, however, that my action is going to report errors when an error really didn't occur.
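As for that todo, one option is to throw out tweets that don't contain the keyword before picking, so a miss only happens when there are really no candidates. A minimal sketch, where `pickTweet` is a hypothetical helper and the random function is passed in so the behavior can be tested:

```javascript
// Pick a random tweet whose text actually contains "serverless" and
// return the Superman version, or null when nothing qualifies.
// pickTweet is a hypothetical helper; random defaults to Math.random.
function pickTweet(tweets, random) {
    const rand = random || Math.random;
    const candidates = tweets.filter(
        (tweet) => /serverless/i.test(tweet.text)
    );
    if (candidates.length === 0) return null;

    const chosen = candidates[Math.floor(rand() * candidates.length)];
    return chosen.text.replace(/serverless/ig, 'Superman');
}

function main(args) {
    const status = pickTweet(args.tweets || []);
    // Still rejects when there is nothing to tweet, same as the action above.
    if (status === null) return Promise.reject('No usable tweets.');
    return Promise.resolve({ status: status });
}
```

This doesn't fix the "errors that aren't really errors" problem, but it should cut down on how often the sequence bails out.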

Action 5: twitter/sendTweet

Finally - I send my Tweet. This is a new action in my Twitter package so I'll share the code here.


const Twitter = require('twitter');

/*
I send a tweet. i need:

args.status (the text)

and that's all I support for now! Note, unlike getTweets
which can get by with less access, for this you need user auth
as documented here: https://www.npmjs.com/package/twitter
*/

function main(args) {

    return new Promise( (resolve, reject) => {

        let client = new Twitter({
            consumer_key:args.consumer_key,
            consumer_secret:args.consumer_secret,
            access_token_key:args.access_token_key,
            access_token_secret:args.access_token_secret
        });

        client.post('statuses/update', {status:args.status}, function(err, tweet, response) {
            if(err) return reject(err);
            resolve({tweet:tweet});
        });

    });

}

exports.main = main;

Nothing real complex here, but note I'm only allowing for text based Tweets. The API supports a lot more than that.

Finally, you may have noticed that my sendTweet action requires multiple authentication tokens. How did I pass them? I didn't. I simply used the OpenWhisk "bind" feature and made a copy of my package with all my tokens attached to it. Bam - done.
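If the bind feature is new to you, the practical effect is that the bound values show up in args alongside whatever the previous action returned. The sketch below only imitates that merge locally so the idea is visible. It isn't OpenWhisk code, `withBoundParams` and `fakeSendTweet` are made-up names, and the token values are placeholders.

```javascript
// Imitate package binding: bound parameters act as defaults that are
// merged with the args passed at invocation time.
// withBoundParams is a hypothetical helper, not an OpenWhisk API.
function withBoundParams(action, boundParams) {
    return (args) => action(Object.assign({}, boundParams, args));
}

// Stand-in for sendTweet that just reports what it received.
function fakeSendTweet(args) {
    return {
        hasCredentials: Boolean(args.consumer_key && args.access_token_key),
        status: args.status
    };
}

const boundSendTweet = withBoundParams(fakeSendTweet, {
    consumer_key: 'PLACEHOLDER',
    consumer_secret: 'PLACEHOLDER',
    access_token_key: 'PLACEHOLDER',
    access_token_secret: 'PLACEHOLDER'
});

console.log(boundSendTweet({ status: 'Superman is the future' }));
```

The nice part of doing it this way is that the action code never has to contain the tokens themselves.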

Putting it Together

The final bits included actually setting up the scheduled task. The first part required making a CRON based Trigger. Here's the command I used for that:

wsk trigger create serverless_superman_trigger --feed /whisk.system/alarms/alarm -p cron "*/10 * * * * *"

I used https://crontab.guru to help me build the cron value.

Then I made a rule that associated the trigger with the sequence I created of the actions above.

wsk rule create serverless_superman_rule serverless_superman_trigger serverless_superman/dotweet

And honestly, that was it. I opened up the OpenWhisk dashboard on Bluemix and kept watch of it and it just plain worked. (After some help from Carlos in Slack!)

Here is an example:

And my current favorite:

You can find the code for this demo (excluding the Twitter actions, which have their own repo) here: https://github.com/cfjedimaster/Serverless-Examples/tree/master/serverless_superman
