Lambda Cold Starts and Bootstrap Code

AWS Lambda has a typical serverless pay-per-use pricing model. You can deploy a Lambda function and it will incur no costs if you never invoke it. To support this pricing model, AWS has to make sure the Lambda function consumes almost no resources when it’s idle – and the only way to achieve that is to deploy the function to compute nodes only when it needs to run. This introduces a short delay before your function runs: the code needs to be copied and an execution environment needs to be spun up, which takes time. This is called a cold start.

Once a function has been run, its execution environment will stay available for a while. If the function is invoked again while the execution environment is still available, Lambda will reuse the existing environment. This time, deployment and initialization code do not need to run again, and the function will execute much faster. This process is called a warm start.

There are two main causes for cold starts:

  1. The function hasn’t been invoked for a while and all its execution environments have been terminated.
  2. The function is invoked more than once at the same time. The first environment is still working on the first request, and a new, second environment needs to be spun up to serve the second request. This is called concurrency.

Init and handler code

When a Lambda function goes through a cold start, AWS first downloads the function’s code to a compute environment. It then starts an execution environment, which is a micro virtual machine (or microVM) based on Firecracker. Finally, it runs the initialization part of your code, also known as the bootstrap. When these steps have completed, your Lambda function is ready to handle its request.

The actual request is processed by the Lambda handler – the function in your code that you’ve designated as the entry point for your Lambda function. In the screenshot below you can see that the function lambda_handler() in the file index.py has been configured as the handler function.
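
Expressed in code rather than a screenshot, a minimal index.py with that handler could look like the sketch below; the handler setting would then be index.lambda_handler. The response body is just a placeholder.

"""index.py - minimal Lambda entry point (illustrative sketch)."""


def lambda_handler(event, context):
    """Entry point configured as the function's handler (index.lambda_handler)."""
    # event carries the request payload; context exposes runtime metadata such as memory size.
    return {"statusCode": 200, "body": "Hello from lambda_handler"}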

When an execution environment is available and idle, a second invocation will reuse the environment and go through a warm start. In this scenario, the compute environment and microVM already exist, and your initialization code will be skipped. Instead, only your handler code will be executed, which results in a much faster execution.
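
One simple way to observe this reuse is to keep a counter at module level: it resets on every cold start but survives warm starts. The counter name and messages below are purely illustrative; this is a minimal sketch, not production code.

"""Sketch: observing execution environment reuse across warm starts."""

# Init code: runs once per execution environment, so the counter starts at zero on each cold start.
invocation_count = 0


def lambda_handler(event, context):
    """Report whether this invocation hit a fresh or a reused execution environment."""
    global invocation_count
    invocation_count += 1
    if invocation_count == 1:
        print("Cold start: new execution environment")
    else:
        print(f"Warm start: environment reused for invocation #{invocation_count}")
    return {"invocation": invocation_count}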

Cold starts and warm starts can be distinguished in the REPORT lines of the Lambda function’s CloudWatch Logs. A cold start will include an Init Duration field, like this:

REPORT RequestId: d7af5927-cd7c-4e54-a4cb-db004bd2419e
Duration: 1.40 ms
Billed Duration: 2 ms
Memory Size: 128 MB
Max Memory Used: 63 MB
Init Duration: 165.87 ms

While a warm start will not:

REPORT RequestId: fa6ce55a-7d6b-4dba-a670-6f7cb17f1472
Duration: 1.24 ms
Billed Duration: 2 ms
Memory Size: 128 MB
Max Memory Used: 63 MB
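
The presence or absence of the Init Duration field also makes cold starts easy to flag programmatically. Below is a minimal sketch that parses a REPORT line with a regular expression; the parse_report() helper and the reconstructed single-line format are assumptions for illustration, reusing the values from the cold start example above.

"""Sketch: flag a cold start by checking a REPORT line for an Init Duration field."""

import re


def parse_report(line):
    """Extract the duration fields (in ms) from a REPORT line and flag cold starts."""
    fields = {
        name: float(value)
        for name, value in re.findall(r"(Init Duration|Billed Duration|Duration): ([\d.]+) ms", line)
    }
    fields["cold_start"] = "Init Duration" in fields
    return fields


report = (
    "REPORT RequestId: d7af5927-cd7c-4e54-a4cb-db004bd2419e "
    "Duration: 1.40 ms Billed Duration: 2 ms "
    "Memory Size: 128 MB Max Memory Used: 63 MB Init Duration: 165.87 ms"
)
print(parse_report(report))
# {'Duration': 1.4, 'Billed Duration': 2.0, 'Init Duration': 165.87, 'cold_start': True}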

From the REPORT lines above we can see that the handler code itself took less than 2 ms to execute, while the init code added a whopping 166 ms. Yet the Billed Duration is only 2 ms: in other words, the init phase is free. Later in this article we will see how we can use this to our advantage. But first, let’s look at the actual init and handler code.
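
Here is a minimal sketch of such a function, with the init code at module level and the handler below it. The greeting text, the environment variable and the response format are illustrative assumptions rather than the original example.

"""Sketch: init code above the handler, handler code below."""

import json
import os

# Init code: runs only on a cold start; same result for every invocation, regardless of input.
GREETING = os.environ.get("GREETING", "Hello")
RESPONSE_HEADERS = {"Content-Type": "application/json"}


def lambda_handler(event, context):
    """Handler code: runs on every invocation; the result depends on user-provided input."""
    name = event.get("name", "world")
    body = {"message": f"{GREETING}, {name}!"}
    return {"statusCode": 200, "headers": RESPONSE_HEADERS, "body": json.dumps(body)}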

In this example, the import statements and the module-level assignments above lambda_handler() are the init code, which will only execute on cold starts. The lambda_handler() function itself will execute on every invocation. The logic behind what goes where is simple: the init section contains code that produces exactly the same result for every execution, regardless of input, while the Lambda handler contains code whose result differs per execution, based on user-provided input.

Bootstrap code gets more CPU power

We now know that init code only runs once per execution environment, while the handler code might be executed hundreds of times within the same environment. But there is another important difference between init code and handler code. It cannot be found in the official documentation, but it can have a big impact on how you structure your code: if your function is configured with less than 3 GB of memory, the init code will execute faster than your handler code.

With AWS Lambda, you can only configure the amount of memory available for a function. The CPU performance (and even the number of vCPUs) scales with the memory configuration. In the lower band of memory configurations (e.g. 128 MB, 256 MB, 512 MB or 1 GB), the CPU is throttled: you get less than one vCPU worth of processor power. But this throttling is not applied to bootstrap code. The idea is that throttling bootstrap code would lead to very slow cold starts, which would result in a bad user experience. So you get the full, unthrottled two vCPUs during the init phase.

The difference is most pronounced at the lowest memory configuration: 128 MB. To demonstrate it, let’s take a look at a simple CPU-intensive benchmark:

"""Function to benchmark the Lambda bootstrap and handler."""

import time
from math import sin, cos, radians


def lambda_handler(_event, context):
    """Run the main lambda function."""
    print("# In handler")
    print(f"** Lambda configured with {context.memory_limit_in_mb} MB of memory")
    benchmark(phase="Handler")
    print("# Handler done")


def benchmark(phase):
    """Time the CPU benchmark and report which phase (bootstrap or handler) ran it."""
    start_time = time.time()
    bench_cpu()
    print(f"## {phase} - CPU benchmark duration: {time.time() - start_time}")


def bench_cpu():
    """Burn CPU by repeatedly evaluating sin(x)**2 + cos(x)**2 (always 1.0) over many angles."""
    product = 1.0
    for _ in range(1, 10000, 1):
        for dex in list(range(1, 360, 1)):
            angle = radians(dex)
            product *= sin(angle) ** 2 + cos(angle) ** 2
    return product


# Init code: runs once, during the bootstrap phase of a cold start.
print("# In bootstrap")
benchmark(phase="Bootstrap")

Running this code in a 128 MB function yields the following output, which tells us the bootstrap was 13.6 times as fast as the handler.

Function Logs
START RequestId: 532b980c-2049-446e-8f78-1ad381b392b0 Version: $LATEST
# In bootstrap
## Bootstrap - CPU benchmark duration: 1.4617431163787842
# In handler
** Lambda configured with 128 MB of memory
## Handler - CPU benchmark duration: 19.891417741775513
# Handler done

When we configure the Lambda function at 3008 MB (the value at which the Lambda function handler gets two unthrottled vCPUs), we can see that the handler and bootstrap are equally fast.

Function Logs
START RequestId: 8f0b6798-21c6-4f69-b774-bdadb6d046db Version: $LATEST
# In bootstrap
## Bootstrap - CPU benchmark duration: 1.337852954864502
# In handler
** Lambda configured with 3008 MB of memory
## Handler - CPU benchmark duration: 1.3489363193511963
# Handler done

At 3008 MB and above, the bootstrap code runs with the same CPU configuration as your handler. For example, at 10 GB, both the bootstrap and the handler have six unthrottled vCPUs available to them:

Function Logs
START RequestId: f6d4bbce-0bca-49e9-aed0-80705f999288 Version: $LATEST
# In bootstrap
## Bootstrap - CPU benchmark duration: 1.3585302829742432
# In handler
** Lambda configured with 10240 MB of memory
## Handler - CPU benchmark duration: 1.3792297840118408
# Handler done

Conclusion

Many Lambda functions are configured with a small amount of memory. By following best practices and moving as much code as possible into the function’s bootstrap, these functions will run more efficiently (and thus more cheaply) in two ways:

  1. The bootstrap code will only execute once per execution environment, which will hopefully serve many Lambda invocations.
  2. When the bootstrap code is executed, it will run faster than it would in the handler (for Lambda memory configurations below 3 GB).

When designing and writing your Lambda functions, try to limit your handler function to the bare necessities needed to perform its task. Importing modules and classes, creating class instances, initializing configuration values and environment variables, and any other non-core functionality should all be moved to the bootstrap instead.
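
Applied to a typical function, that guidance looks something like the sketch below: the boto3 import, the S3 client and the bucket name (all assumptions chosen for illustration) live in the bootstrap, and the handler only performs the per-request work.

"""Sketch: setup in the bootstrap, only per-request work in the handler."""

import json
import os

import boto3

# Bootstrap: imports, client creation and configuration run once per execution environment.
s3 = boto3.client("s3")
BUCKET = os.environ.get("BUCKET_NAME", "example-bucket")


def lambda_handler(event, context):
    """Fetch and return the object requested in the event; nothing else happens here."""
    obj = s3.get_object(Bucket=BUCKET, Key=event["key"])
    return json.loads(obj["Body"].read())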

