When the shoulders of giants are offered, you’d do well to stand on them

Quite often, I read and hear from people who are reluctant to embrace the AWS Serverless ecosystem. Some of their reasons are valid, some are rooted in uncertainty, and in some cases their arguments are very specific to their context. I have personally built many small, large, and extremely large Serverless applications. In this article I will share my view on the most common arguments leveled against it.

The four topics I will cover are:

  1. Serverless locks you in
  2. Serverless is too expensive
  3. Serverless is too difficult
  4. Serverless is too opinionated

#1: Serverless locks you in

This is likely the most commonly heard argument against Serverless: it locks you into the AWS ecosystem. Often, the person voicing this opinion prefers container-based ecosystems like Kubernetes instead. They will explain how your workloads are portable if you just embrace k8s. This article will not debate whether workloads on Kubernetes are actually portable. Instead I will ask: is lock-in an objectively bad thing? Or is it a trade-off?

Portability is the ability to lift your workload from one hosting provider or cloud to another. Inherent to this ability is a restriction: you can’t rely on cloud-specific services. For example, you can’t use Amazon S3, DynamoDB, or SQS, because there are no like-for-like replacements in other clouds. There might be distant relatives, but the differences in their APIs and characteristics make a drop-in replacement impossible. Instead, you have only two options to achieve portability:

  1. Use simpler, more common solutions like NFS, MySQL, or Redis
  2. Write middleware to allow your applications to swap between services

The downside of simpler solutions is that they are, well, simpler. You can’t use all the features and integrations that cloud-native solutions have to offer. The downside of middleware is that it needs to be built and maintained, which is no small task and will never equal a native integration.
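
To make the middleware option concrete, below is a minimal sketch of what such an abstraction might look like. The ObjectStore interface and both backends are hypothetical, purely for illustration:

    from abc import ABC, abstractmethod
    from pathlib import Path

    import boto3


    class ObjectStore(ABC):
        """Hypothetical middleware interface: the application only sees this."""

        @abstractmethod
        def put(self, key: str, data: bytes) -> None: ...

        @abstractmethod
        def get(self, key: str) -> bytes: ...


    class S3Store(ObjectStore):
        """Cloud-native backend: Amazon S3 via boto3."""

        def __init__(self, bucket: str) -> None:
            self._s3 = boto3.client("s3")
            self._bucket = bucket

        def put(self, key: str, data: bytes) -> None:
            self._s3.put_object(Bucket=self._bucket, Key=key, Body=data)

        def get(self, key: str) -> bytes:
            return self._s3.get_object(Bucket=self._bucket, Key=key)["Body"].read()


    class FileStore(ObjectStore):
        """Portable backend: a local filesystem or NFS mount."""

        def __init__(self, root: str) -> None:
            self._root = Path(root)

        def put(self, key: str, data: bytes) -> None:
            path = self._root / key
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(data)

        def get(self, key: str) -> bytes:
            return (self._root / key).read_bytes()

Note how the interface is already reduced to the lowest common denominator: S3 features like versioning, lifecycle rules, presigned URLs, and event notifications have no place in it. That is the functionality cost in action.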

And so a clear trade-off emerges: you gain portability, at the cost of functionality or developer resources.

AWS Serverless inverts the argument. Instead of the ability to move away from the cloud, you embrace it. You use all the features, functions, and services AWS has to offer, allowing you to move faster, build more scalable systems, and spend less time on the operation and maintenance of your infrastructure. The further you progress on the Serverless scale, the more you delegate to AWS. Don’t want to be responsible for managing compute nodes? Use Fargate. Don’t want to maintain containers either? Use Lambda. Don’t want to write business logic in custom code? Use Step Functions and Pipes.
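
To illustrate how far that delegation goes: the snippet below is a complete, deployable unit of compute on Lambda. It is a hypothetical hello-world handler, shown only to make the point that there is no server, container, or web framework left to manage:

    # A complete Lambda function. Provisioning, patching, scaling, and
    # high availability of whatever runs this code are AWS's problem.
    def handler(event, context):
        name = event.get("name", "world")
        return {"message": f"Hello, {name}!"}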

Of course these choices lock you in – migrating an extremely Serverless application away from AWS is difficult. But in return for your commitment, you get to delegate the hardest problems of running infrastructure, such as networking, redundancy, high availability, patch management, and scalability, to AWS.

In reality, the decision is not binary. There is no portability-versus-nativeness on/off switch. Instead you choose your position on the Serverless scale based on your business context and requirements. You try to estimate the real need for portability. How likely are you to actually migrate to another cloud? If the answer is “not very”, you might be better off committing to one cloud and embracing all it has to offer.

#2: Serverless is too expensive

The next commonly heard argument against Serverless is that it is too expensive at scale. When people argue the cost of Serverless is too high, they generally compare it against a well-known alternative. A simplified example: a given EC2 instance costs $50 per month and can handle 10 requests per second. The same volume handled by API Gateway and Lambda would cost about $150 per month, so it’s a bad deal.

But comparisons like these aren’t realistic. First of all, your users are unlikely to generate a uniform, sustained flow of requests. Instead, your traffic might peak at 10 requests per second but drop to close to 0 at night. If average utilization is around 50%, the workload served by the $50 EC2 instance would cost about $75 in Serverless.

However, we’re still not comparing apples to apples. In the example above, the Serverless solution would be highly available and autoscaling. The EC2 instance is a single machine. To achieve comparable levels of fault tolerance it would require at least two instances in an autoscaling group and a load balancer. The load balancer easily adds another $50 to your bill, and now Serverless is more cost-effective than the EC2 example.
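
A back-of-envelope calculation, using only the illustrative numbers from this article (not actual AWS prices), makes the comparison concrete:

    # Illustrative figures from the example above; not actual AWS prices.
    ec2_instance = 50         # $/month for one instance sized for 10 req/s
    load_balancer = 50        # $/month for the load balancer
    serverless_at_peak = 150  # $/month if traffic ran at peak 24/7
    avg_utilization = 0.5     # traffic averages 50% of peak

    serverless = serverless_at_peak * avg_utilization  # pay per request
    ec2_ha = 2 * ec2_instance + load_balancer          # 2 instances + LB

    print(f"Serverless: ${serverless:.0f}/month")         # $75/month
    print(f"EC2, highly available: ${ec2_ha:.0f}/month")  # $150/month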

And still this is not a fair comparison, because the EC2 instance needs maintenance. For example, its operating system and libraries need to be patched regularly. This will cost hours of engineering time, which is far more expensive than the EC2 instances themselves will ever be.

And finally, your EC2 instance doesn’t do anything by itself. Your application does all the work, which means you need to write code, or configure libraries, for common functionality like authentication, queuing, failure handling, throttling, monitoring, and so on. All AWS Serverless services offer a plethora of secondary features, often at no additional cost. These features are fully managed by AWS and are built to the highest operational standards. If you take on the responsibility to build and maintain these features yourself, be prepared to add a few zeroes to your development costs.
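
As one small example of such a built-in feature, SQS ships with failure handling in the form of dead-letter queues. The sketch below (queue names are hypothetical) configures redrive in a few lines; building the equivalent yourself would mean designing, testing, and operating your own retry and quarantine logic:

    import json

    import boto3

    sqs = boto3.client("sqs")

    # Create a dead-letter queue and look up its ARN.
    dlq = sqs.create_queue(QueueName="orders-dlq")
    dlq_arn = sqs.get_queue_attributes(
        QueueUrl=dlq["QueueUrl"], AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]

    # Messages that fail processing 3 times move to the DLQ automatically.
    sqs.create_queue(
        QueueName="orders",
        Attributes={
            "RedrivePolicy": json.dumps(
                {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "3"}
            )
        },
    )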

So looking at the cost per event is too simplistic. We used EC2 as an extreme example, but the same principles apply to orchestrator-based solutions like Kubernetes. The only valid comparison is total cost of ownership (TCO). TCO takes engineering hours, running costs, maintenance, operations, incident response, and everything else into account. In an honest and thorough TCO analysis, you’ll find that the price of Serverless is hard to beat.

Some will argue that at a given scale, level of stability, or maturity, Serverless is no longer the most economical choice. There are very public examples of this, including Amazon Prime Video. If you find you’re thinking in this direction, ask yourself: do you expect to run an application at their scale? And if so, when? Are you willing to reject all the benefits Serverless has to offer, just in case your application hits truly massive scale?

Instead, I suggest you start with Serverless. You hit the ground running, minimize your time to market, and find your product-market fit, while avoiding much of the undifferentiated heavy lifting like setting up networking and container infrastructure. And keep in mind: choosing Serverless is not a one-way door. If your application becomes massively popular, you can always choose to rearchitect all or part of it. That’s also what Prime Video did.

#3: Serverless is too difficult

The third common complaint about Serverless is that it is too difficult. Every service has dozens of options, features, and billing modes. They all have their edge cases, failure modes, and limitations. Their APIs and resource definitions are not quite consistent, and there are thousands of smaller and larger releases every year. It is a lot.

But not every engineer and architect needs to know every detail of every service. There are some tools – like the CDK, Amplify, and Powertools – that mitigate some of the rough edges through abstraction, sensible defaults, and embedded best practices. These can help a team be productive and efficient by hiding the most difficult parts of Serverless development. Whether hiding complexity is the right solution, however, is debatable.
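
As an illustration of what these abstractions buy you, the CDK sketch below (construct names and the asset path are hypothetical) defines a deployable Lambda function in a handful of lines, while the execution role and code packaging fall back to the construct’s sensible defaults:

    from aws_cdk import App, Stack, aws_lambda as _lambda
    from constructs import Construct


    class ApiStack(Stack):
        def __init__(self, scope: Construct, id: str) -> None:
            super().__init__(scope, id)
            # One construct; the IAM execution role and asset upload
            # are handled by CDK defaults.
            _lambda.Function(
                self, "Handler",
                runtime=_lambda.Runtime.PYTHON_3_12,
                handler="app.handler",
                code=_lambda.Code.from_asset("src"),  # hypothetical path
            )


    app = App()
    ApiStack(app, "ApiStack")
    app.synth()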

In fact, one of the core reasons Serverless is considered difficult is that its complexity is out in the open, impossible to ignore. The solution design of a simple web application might involve API Gateway, Lambda, S3, CloudFront, and DynamoDB. A more advanced application adds SQS, EventBridge, Step Functions, Cognito, and so on.

However, a non-Serverless application with the same functionality and business value also has all these components. In a monolith hosted on EC2, for example, the API Gateway might be hidden in the application’s routing logic, user management might be a local component, and the queues might live in memory, but they are still there. In this case the complexity of the application is hidden behind the castle walls of the monolith.

I believe the forced visibility of complexity in Serverless is good. It might be uncomfortable at first, but it enables engineers and architects to have an open conversation about their applications. It makes writing documentation easier and more consistent. It eases onboarding new engineers. And it allows cross-team collaboration on common patterns, building blocks, and reference architectures. If your teams are reluctant to embrace Serverless because it is too complex, you should wonder where that complexity is currently hidden, and what the implications of the answer are.

It’s also possible that a current non-Serverless application is actually less complex than its Serverless counterpart would be. For example, the current solution might not involve a load balancer, so the developers never need to think about parallelism or shared storage. Or the current design does not include a queue, because the developers did not envision request surges in their design. If this is the case, ask yourself if simpler is indeed better. The article “When Taylor Swift crashed Ticketmaster: A lesson on scaling for spikes” might help shape your opinion.

A Serverless application is inherently distributed. Among other things, this powers the scalability, high availability, and concurrency of high-volume applications. But distributed systems are hard. You have to take into account transient failures, duplication, congestion, throttling, buffering, poison pills, sharding, scaling ramps, and many, many more complex topics. The thing is: if you’re building a non-Serverless application for scale, you should be thinking about the same things. But when the platform doesn’t force you to, many teams don’t. And you only find your application’s limits when it’s too late.
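
To give just one of these topics a face: queues like SQS may deliver a message more than once, so handlers need to be idempotent. A minimal sketch, assuming a hypothetical DynamoDB table called processed-events, might look like this:

    import boto3
    from botocore.exceptions import ClientError

    # Hypothetical table that records which event ids were already handled.
    table = boto3.resource("dynamodb").Table("processed-events")


    def process(payload: dict) -> None:
        ...  # your actual business logic goes here


    def handle_event(event_id: str, payload: dict) -> None:
        """Process an event at most once, tolerating duplicate deliveries."""
        try:
            # Claim the event id; the condition makes a second delivery fail fast.
            table.put_item(
                Item={"pk": event_id},
                ConditionExpression="attribute_not_exists(pk)",
            )
        except ClientError as err:
            if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
                return  # duplicate delivery: already handled
            raise
        process(payload)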

In summary: to build Serverless solutions your software developers also need to be architects and distributed systems engineers. This may be uncomfortable, and might require training and difficult conversations. But if you succeed, you’re setting up your teams and applications for a bright future of more business value, fewer incidents, and higher scalability. Remember the adage of the CFO who asks: what happens if we educate our people and they leave? To which the CTO responds: what if we don’t and they stay?

#4: Serverless is too opinionated

The last topic I want to address is the claim that Serverless is too opinionated. That Serverless is opinionated is beyond dispute, so let’s focus on the “too” part.

For context, we need to talk about the mind-blowing scale of Serverless services. Amazon S3 processes over 100 million requests per second, DynamoDB peaked at 126 million requests per second during Prime Day 2023, and Lambda processes more than 10 trillion invocations per month. Running infrastructure at those scales requires very strict architectural principles.

One of these principles is that processes and requests need to be distributable across many servers, racks, data centers and availability zones. This allows any given workload to be hosted on a large and dynamic fleet of instances, which prevents hotspots and congestion.

Another principle is multi-tenancy, where many different customers share the same underlying resources. Multi-tenancy allows the system to provision capacity for its collective user base. This dampens any surges caused by individual customers; the spikes they generate are simply lost in an ocean of capacity.

One possible negative effect of multi-tenancy is what is known as “noisy neighbors”. The noisy neighbor problem occurs when one customer executes some large operation and it affects other customers on the same physical infrastructure. This problem can be prevented by imposing quotas and throttling; by artificially limiting the scale at which individual customers can operate, AWS protects the shared infrastructure. These limits are not set in stone though; they can often be increased if your use case requires it.
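
In practice, living with these quotas means treating throttling as a normal response rather than an error. The AWS SDKs support this out of the box; for example, boto3 can be configured to retry throttled calls and adapt its client-side request rate:

    import boto3
    from botocore.config import Config

    # Retry throttled requests automatically and adapt the client-side
    # request rate to stay within the service's quota.
    config = Config(retries={"max_attempts": 10, "mode": "adaptive"})
    dynamodb = boto3.client("dynamodb", config=config)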

The scale at which AWS operates allows them to develop services no smaller company could build. For example, (almost) no company could build a service like S3 or Lambda. Maybe something like it, with the same type of functionality and APIs. But nothing offering the same scale, stability, reliability, continued evolution, and cost-effectiveness AWS provides. To be able to provide these services, AWS needs them to be opinionated. The limits and restrictions are simply a reflection of the internal architecture that powers them.

However, there is another reason Serverless is opinionated. AWS has decades of experience designing, building, and maintaining the largest applications and services in the world. They have experienced, firsthand, what works and what hurts. Their learnings are widely shared and documented, for example in the Well-Architected Framework. The services in the Serverless ecosystem are the embodiment of that experience. By using Serverless services to their fullest extent, your applications automatically achieve a base level of maturity typically only gained through blood, sweat, tears, and many stressful nights of on-call duties.

So instead of approaching perceived limitations or peculiarities with dread and frustration, I suggest you view them as an opportunity to ask: why can’t I do this? What are the possible implications of this approach, and are there alternatives? You might find that architecting your applications within the given boundaries leads to more reliable and scalable solutions.

Conclusion

AWS has the ultimate economy of scale. The services they offer, like Lambda, DynamoDB, or S3, are used by millions of customers. This allows them to invest in the most mature architectures, most powerful networks, highest standards of operational excellence, and smartest CI/CD processes. The resulting products are more stable, reliable, performant, scalable, and feature-rich than anything you can build yourself. They are available to you with the click of a button, and the hardest problems – actually running these services – are entirely AWS’s responsibility.

Of course, Serverless isn’t free – well, it might be at small scale, but not for any significant workload. However, its price is completely predictable and elastic; you pay for what you use. Baked into the price is the research and development of the services. The physical infrastructure it runs on. The operational responsibility. Acquisition of additional capacity. Maintenance of operating systems and runtimes. Continuous improvement of the service and the developer experience. Maintaining security of the cloud. The list is almost endless. If you calculate the cost of all of these responsibilities, the price of Serverless is a bargain.

Serverless services are built for any and every scale. Amazon has an enormous amount of experience running the largest workloads in the world. They know how to build applications that scale. This knowledge is codified into Serverless and its limitations. By embracing Serverless, you’re setting yourself up to work within proven constraints – the kind of restrictions that allow you to scale.

In summary, AWS is a giant. The biggest cloud provider by any measure. They know how to build, operate, and scale services like literally no other. AWS Serverless is the most opinionated, most explicit expression of their insights and experiences. And it is ready and waiting for your applications.

When the shoulders of giants are offered, you’d do well to stand on them.

