Serverless and Persistence

This post is more than 2 years old.

Today's post isn't going to be anything really deep, more of an "A-ha" moment I had while talking with my coworker Carlos Santana. No, not that a-Ha...

Take on this!

More an "adjustment of a misunderstanding" of the serverless platform. One of the things I knew right away about serverless is that my code acted as a single, atomic unit, in an ephemeral form. So yes, a server is still involved with serverless, but it's created on the fly, used to run my code, and then disappears when my code is done running. The OpenWhisk docs have this to say:

Statelessness

Action implementations should be stateless, or idempotent. While the system does not enforce this property, there is no guarantee that any state maintained by an action will be available across invocations.

Moreover, multiple instantiations of an action might exist, with each instantiation having its own state. An action invocation might be dispatched to any of these instantiations.

So the practical side effect of this is that you can't use non-database persistence. No file system and no memory variables. Obviously you could use a cloud-based file system like S3 and obviously you can persist in a database, but you can't persist anything within the action itself.

Except you can.

First off - when your action is fired up, OpenWhisk does not kill it immediately when done. Rather, it keeps the container used for your code around to see if the action will be fired again soon. You can see this yourself with a simple action:


var x = 0;

function main(args) {
	x++;
	return {x:x};
}

exports.main = main;

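If you run this a few times in quick succession, the variable persists and increments: the first call returns {x:1}, the next {x:2}, and so on, for as long as the same warm container handles the requests.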

The practical use for this would be storing the result of a database call in memory. For example, imagine your action needs data from Cloudant in order to do its work. You could store that value in a global variable and check to see if it is undefined before you talk to Cloudant again. You can only do this if the data you need isn't dependent on arguments - unless you want to start caching with keys related to function parameters.
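Here is a rough sketch of that caching idea, covering both the simple global variable and the keyed version. It assumes a Node.js runtime that supports async/await, and fetchFromDatabase is a made-up stand-in for a real Cloudant call, not part of any library. Since OpenWhisk makes no promise that the container sticks around, the code has to work when the cache is empty too.

```javascript
// Module-level state survives only while the container stays warm.
// A cold start resets it, so treat it as a cache, never as storage.
let cachedConfig;              // data that does not depend on arguments
const cachedByKey = new Map(); // data that does, keyed by the argument

// Hypothetical stand-in for a real Cloudant query. It counts calls so
// the effect of the cache is visible in the action's result.
let lookups = 0;
async function fetchFromDatabase(key) {
  lookups++;
  return { key: key, loadedAt: Date.now() };
}

async function main(args) {
  // Argument-independent data: fetch once per warm container.
  if (cachedConfig === undefined) {
    cachedConfig = await fetchFromDatabase('config');
  }

  // Argument-dependent data: one cache entry per distinct key.
  const id = String(args.id);
  if (!cachedByKey.has(id)) {
    cachedByKey.set(id, await fetchFromDatabase(id));
  }

  return { config: cachedConfig, item: cachedByKey.get(id), lookups: lookups };
}

exports.main = main;
```

Calling this twice with the same id on a warm container should leave the lookups count unchanged on the second call, while a new id adds one more lookup.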

The second "a-Ha" moment is that you absolutely do have a file system. That means your code can create, read, update, and delete files locally to work with binary data. You could also simply leave files there and see if they exist on the next invocation. I'd probably treat this carefully, but if you are using one, or a few, particular files, you could consider leaving them in place and checking for them before fetching them again.

So as a practical example, if you need to work with a file from S3, you could copy it locally, and keep it there. When the action is run next time, see if it exists before copying from S3.
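A minimal sketch of that check-then-copy pattern might look like the following. The downloadFromStorage function is a hypothetical placeholder for whatever S3 client call you would really make, the default file name is invented, and I'm assuming the temp directory reported by os.tmpdir() is writable inside the action's container.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for copying an object from S3. A real action
// would call its storage client here. The counter shows cache hits.
let downloads = 0;
async function downloadFromStorage(name) {
  downloads++;
  return Buffer.from('contents of ' + name);
}

async function getLocalCopy(name) {
  // Assumption: the temp directory is writable in the action container.
  const localPath = path.join(os.tmpdir(), path.basename(name));
  if (!fs.existsSync(localPath)) {
    // Cold container or first use: fetch the file and leave it behind.
    fs.writeFileSync(localPath, await downloadFromStorage(name));
  }
  return localPath;
}

async function main(args) {
  // 'template.bin' is an invented default, used only for illustration.
  const localPath = await getLocalCopy(args.file || 'template.bin');
  const bytes = fs.statSync(localPath).size;
  return { path: localPath, bytes: bytes, downloads: downloads };
}

exports.main = main;
```

The file only lives as long as the container does, so the download path still has to work every time. If the remote file can change, you would also need some way to notice a stale local copy, which this sketch does not attempt.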

A real example of this is the incredible demo Daniel Krook made called openchecks. It's an OpenWhisk sample app demonstrating how to build a serverless check scanning/depositing system. As I said, it's really, really cool and you should check it out.

Anyway, I hope this helps. As I said, this is something that I assume most people working daily with serverless already grasp, but it was eye-opening to me and just adds another interesting level to what you can do with the platform!


About Raymond Camden

Raymond is a developer advocate for HERE Technologies. He focuses on JavaScript, serverless and enterprise cat demos. If you like this article, please consider visiting my Amazon Wishlist or donating via PayPal to show your support. You can even buy me a coffee!

Lafayette, LA https://www.raymondcamden.com

Archived Comments

Comment 1 by Jesse Monroy posted on 2/10/2017 at 4:49 AM

Raymond in review of your comments, I find them all correct - as stated.
However, there is a misnomer that should be addressed.

Namely, that it is "serverless". In theory this is true, but in practice it is not.
Depending on the implementation, it is an on-demand (or in memory) instance of
a set of routines (functions, procedures, whatever makes you comfortable)
that run as a cohesive unit. They act and behave much like a server,
except that it is much more akin to SaaS - except that the service is
very private and at the direction of the connecting app.

The major advantage is no configuration and no tinkering on the
edges. OpenWhisk belongs to a class of servers that previously
ran as inetd(8) services.

https://www.freebsd.org/doc...

This class of servers (no longer in vogue) was the basis for
college students learning TCP/IP programming. The services
it provided are listed in the link above. While these "services"
were very limited, the concept was the same. As a matter
of fact, some programmers I know were once challenged, as a contest,
to create the smallest inetd(8) program (in terms of lines of code).
The winner created a 0 (zero) size program - by using
the command-line program 'cat' to echo back whatever it
received from a network connection.

No doubt OpenWhisk is much more powerful and much more
useful.

More?

All the Best
Jesse

Comment 2 (In reply to #1) by Raymond Camden posted on 2/10/2017 at 12:11 PM

I'm confused - are you saying the term "serverless" doesn't make sense here compared to other serverless products, like Amazon's? Or that it doesn't make sense *in general*? If so, I'd argue it makes as much sense as 'cloud'. ;) The name bugs me too, but it seems to be the marketing term we've decided to use for this class of service.

Comment 3 (In reply to #2) by Jesse Monroy posted on 2/10/2017 at 11:04 PM

No. I am not saying the name "does not make sense". I am saying in polite terms this is a zero-configuration server - M$ uses the term zero-configuration. And in some sense I like the term "server-less", but the ground is difficult to defend. Hence the explanation. And I agree this is wholly a marketing invention - a bit too clever ALSO.

To be clear, the "explanation" is to forewarn other developers. Working on "new" technology is always looking back and saying "yes it's like that, but this is better for XYZ reasons."

All the Best
Jesse