Time To Rethink Cattle vs. Pets (serverless)
AWS’s re:Invent conference always involves a large set of announcements, followed by weeks of experts, myself included, trying to figure out what it all means. Large debates often arise in the process, and this year is no exception.
One of the most interesting debates is about the definition of serverless. It used to be that serverless meant Lambda … after all, Lambdas don’t have servers (we’re going to ignore the reductionist argument that eventually everything runs on something). Over the years the definition of serverless expanded to include things like S3 or Aurora, where we knew there were computers someplace but we didn’t have to pay attention to them. This year AWS announced that almost all of their services had at least the option to run without you, the user, owning responsibility for the underlying hardware, and the debate expanded to ask whether “serverless” has any meaning at all.
This whole discussion has been colored by the metaphor of “Cattle vs. Pets”, and I now think that base metaphor has to change.
Pets vs. cattle was based on the notion that we care about our pets: we love them, name them, and grieve for them when they die. Conversely, so the metaphor goes, we don’t name individual cattle and we don’t care when they die; we just replace them. For the past ten years or so we’ve just accepted this as both true and as a great metaphor for dedicated EC2 instances vs. Auto Scaling Groups (ASGs) / Containers / Serverless.
I’m pretty sure that the metaphor wasn’t created by someone who had spent any time on a ranch or a farm. I have, and I’m here to tell you that the death of an individual animal is not a non-event. You have to determine what caused the death and find out if more deaths should be expected, and you incur a variety of costs for the lost individual.
By the same token, losing a member of an ASG is also not a non-event. Pretty much the same analysis goes on to see if the failure was a one-off or the sign of a cascading failure.
I think it’s time to change the base metaphor. A more accurate one is “Pets vs. Cattle vs. Fast Food”. (I’m serious here; hear me out.)
In this metaphor pets continue to map to dedicated EC2 instances, cattle maps to ASGs, and fast-food maps to serverless.
Think about a burger at a fast food restaurant. That burger has a whole lifecycle before it gets to your tray. It began as cattle, got processed into hamburger, got shipped to a restaurant, got put on a grill, got assembled into a burger and then placed on your tray. The key point is that it didn’t become your burger until the very last step. Not only were you unaware of which patty was to become your burger, you in fact could not know. It was just a resource (spare cycles?) until it landed on your tray.
The analogy with serverless is that when you run Aurora or Lambda or S3 you fundamentally do not have access to the resources you will be given until they are in fact assigned to you. Even then, you have no way of knowing which underlying hardware is servicing your system. To me, that should be the new definition of serverless: “a system is serverless to the degree that you do not have knowledge of the hardware/instance running your system.” If you’re running a Lambda you have certain CPU/memory based on what you requested, but that’s it. If you’re running ECS/EKS with Fargate you have access to your container but not to the host instance.
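To make the Lambda case concrete, here is a minimal Python sketch of what a function can see about itself. The attribute names read in `handler` (`function_name`, `memory_limit_in_mb`, `aws_request_id`, `get_remaining_time_in_millis()`) are part of the context object the Lambda Python runtime passes to a handler; `FakeContext` and its values are hypothetical stand-ins so the sketch runs locally, and the real runtime may expose some of these values with different types.

```python
from dataclasses import dataclass


@dataclass
class FakeContext:
    """Hypothetical local stand-in for the context object Lambda passes in."""
    function_name: str = "demo-function"      # assumed name, illustration only
    memory_limit_in_mb: int = 512             # the memory you requested
    aws_request_id: str = "local-test-request"

    def get_remaining_time_in_millis(self) -> int:
        # Fixed value so the sketch is deterministic when run locally.
        return 3000


def handler(event, context):
    # Every field read here describes the function's configuration or the
    # current invocation. None of them names the machine doing the work.
    return {
        "function_name": context.function_name,
        "memory_limit_in_mb": context.memory_limit_in_mb,
        "request_id": context.aws_request_id,
        "remaining_ms": context.get_remaining_time_in_millis(),
    }


if __name__ == "__main__":
    print(handler({}, FakeContext()))
```

In burger terms, the function can read its own order ticket (how much memory, how much time is left) but has no way to ask which grill it was cooked on.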
The shift in thinking is that for cattle/ASGs we don’t care about the particular individual, but we care very much about the collection made up by the individuals. And that caring translates into responsibility for observability, monitoring, troubleshooting and remediation of the set of individuals. If an availability zone runs out of an instance type needed by our ASG, it’s up to us to take action. By contrast, if an AZ ran out of instances for a Lambda (which I’ve never even heard of happening), there really wouldn’t be anything we could do.
In other words, cattle isn’t the end of the server/serverless spectrum; it’s actually a midpoint.
Once we understand that, I think we’ll be in a better position to discuss just what serverless is now and what it might be in the future.