
AWS Lambda Best Practices 2026: What We Actually Use in Production

Engineering Team 2026-03-19

We run Lambda functions in production across multiple client accounts. Some handle 50 requests per day. Others handle 50 million. The best practices that matter depend heavily on scale, but the fundamentals apply everywhere.

This is not a rehash of the AWS documentation. These are the practices we actually follow, the mistakes we have fixed, and the patterns that work in 2026.

Cold Start Optimisation

Cold starts are Lambda’s biggest practical limitation. A cold start happens when Lambda creates a new execution environment — loading your code, initialising the runtime, and running your handler for the first time.

Actual Cold Start Times (2026 Benchmarks)

Runtime              | Cold Start (p50) | Cold Start (p99) | Notes
---------------------|------------------|------------------|----------------------------------
Python 3.13          | 200-400ms        | 800ms-1.2s       | Fastest scripting runtime
Node.js 22           | 200-350ms        | 600ms-1s         | Good general choice
Go                   | 50-100ms         | 150-250ms        | Near-zero cold starts
Rust                 | 50-80ms          | 100-200ms        | Fastest overall
Java 21              | 2-5s             | 6-10s            | Without SnapStart
Java 21 + SnapStart  | 90-140ms         | 200-400ms        | Dramatically better
.NET 8 (Native AOT)  | 200-400ms        | 500-800ms        | AOT required for good performance

Fix 1: Use ARM64 (Graviton2)

Switch every function to ARM64. It is a one-line change that improves both performance and cost:

# SAM template
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.13
      Architectures:
        - arm64    # 20% cheaper, 15-40% faster
      Handler: app.handler

ARM64 functions are 20% cheaper per GB-second and run 15-40% faster than x86 equivalents. There is no reason to use x86 for new Lambda functions in 2026 unless you have a compiled dependency that does not support ARM.
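A quick back-of-the-envelope on what that 20% means at scale. The per-GB-second prices below are assumed us-east-1 rates; verify against the current Lambda pricing page before relying on them:

```python
# On-demand duration prices per GB-second (assumed us-east-1 rates;
# verify against the current Lambda pricing page).
X86_PRICE_PER_GB_S = 0.0000166667
ARM_PRICE_PER_GB_S = 0.0000133334

def monthly_duration_cost(price_per_gb_s, memory_gb, avg_duration_s, invocations):
    # Duration cost only; the per-request charge is identical on both architectures
    return price_per_gb_s * memory_gb * avg_duration_s * invocations

# Example: 10M invocations/month, 1GB memory, 250ms average duration
x86 = monthly_duration_cost(X86_PRICE_PER_GB_S, 1.0, 0.25, 10_000_000)
arm = monthly_duration_cost(ARM_PRICE_PER_GB_S, 1.0, 0.25, 10_000_000)
print(f"x86: ${x86:.2f}/mo  arm64: ${arm:.2f}/mo  "
      f"({1 - arm / x86:.0%} saved before any speedup)")
```

And that is before the 15-40% faster execution, which reduces billed duration on top of the lower rate.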

Fix 2: Minimise Package Size

Every megabyte of deployment package adds to cold start time. The runtime has to download, decompress, and load your code:

# Python: Use Lambda layers for large dependencies
# Bad: 50MB deployment package
pip install pandas numpy scipy -t ./package/

# Good: Split into layer (loaded once, cached)
pip install pandas numpy scipy -t ./layer/python/
zip -r layer.zip layer/
aws lambda publish-layer-version \
  --layer-name data-deps \
  --zip-file fileb://layer.zip \
  --compatible-runtimes python3.13
// Node.js: Tree-shake with esbuild
// Bad: node_modules with 200MB of unused code
// Good: Bundled to a single file
// esbuild config
import { build } from 'esbuild';

await build({
  entryPoints: ['src/handler.ts'],
  bundle: true,
  minify: true,
  platform: 'node',
  target: 'node22',
  outfile: 'dist/handler.js',
  external: ['@aws-sdk/*'],  // AWS SDK v3 is included in runtime
});

Fix 3: Initialise Outside the Handler

Code outside the handler function runs once during cold start and is reused for subsequent invocations:

# Good: Connection created once, reused across invocations
import json
import os

import boto3
import psycopg2

# These run ONCE during cold start
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])

# Database connection (reused across warm invocations)
conn = psycopg2.connect(
    host=os.environ['DB_HOST'],
    dbname=os.environ['DB_NAME'],
    user=os.environ['DB_USER'],
    password=os.environ['DB_PASSWORD']
)

def handler(event, context):
    # This runs on EVERY invocation
    # conn and table are already initialised
    result = table.get_item(Key={'id': event['id']})
    # API Gateway expects body to be a string, not a dict
    return {'statusCode': 200, 'body': json.dumps(result.get('Item'))}

Fix 4: Use SnapStart for Java

If you run Java on Lambda, SnapStart is mandatory. It creates a snapshot of the initialised execution environment, reducing cold starts from several seconds to roughly 100-400ms:

Resources:
  JavaFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: java21
      Architectures: [arm64]
      SnapStart:
        ApplyOn: PublishedVersions

Fix 5: Provisioned Concurrency for Latency-Critical Paths

For user-facing APIs where cold starts are unacceptable, pre-warm execution environments:

Resources:
  ApiFunction:
    Type: AWS::Serverless::Function
    Properties:
      AutoPublishAlias: live
      ProvisionedConcurrencyConfig:
        ProvisionedConcurrentExecutions: 10

Cost warning: Provisioned Concurrency charges whether the functions are invoked or not. At 10 concurrent instances, you pay approximately $80-120/month. Only use this for latency-critical, high-traffic endpoints. For everything else, accept the occasional cold start.
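To see where a figure like that comes from, the arithmetic is simple. The Provisioned Concurrency rate below is an assumed us-east-1 price; check the current pricing page:

```python
# Assumed us-east-1 Provisioned Concurrency rate per GB-second
PC_PRICE_PER_GB_S = 0.0000041667
SECONDS_PER_MONTH = 730 * 3600  # ~2.63M seconds

def monthly_pc_cost(instances, memory_gb):
    # Charged for the configured capacity whether or not it is invoked
    return instances * memory_gb * PC_PRICE_PER_GB_S * SECONDS_PER_MONTH

print(f"${monthly_pc_cost(10, 1.0):.2f}/month for 10 x 1GB instances")
```

Invocation duration is billed on top of this (at a reduced rate), so real bills land a little higher.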

Memory and Performance Tuning

Lambda allocates CPU proportionally to memory. More memory = more CPU = faster execution. Sometimes increasing memory reduces both execution time AND cost:

The Memory-Cost Experiment

Run your function at different memory settings and measure:

# Use AWS Lambda Power Tuning (open-source tool)
# Deploy: https://github.com/alexcasalboni/aws-lambda-power-tuning

# It runs your function at 128MB, 256MB, 512MB, 1024MB, etc.
# and charts execution time vs cost

# Common finding (duration cost only, x86 pricing):
# 128MB:  3000ms execution, ~$0.00000625 per invocation
# 512MB:  800ms execution,  ~$0.00000667 per invocation  (7% more expensive, ~4x faster)
# 1024MB: 400ms execution,  ~$0.00000667 per invocation  (same cost, 7.5x faster!)

The sweet spot: For most functions, 512MB-1024MB provides the best cost-performance ratio. Going below 256MB rarely saves money because execution time increases proportionally.
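The duration cost per invocation is easy to recompute yourself from the public per-GB-second price. The rate below is an assumed x86 us-east-1 price, and the per-request charge is omitted:

```python
# Recompute duration cost per invocation from the per-GB-second price
# (assumed x86 us-east-1 rate; request charges omitted for simplicity).
PRICE_PER_GB_S = 0.0000166667

def cost_per_invocation(memory_mb, duration_ms):
    # Billed GB-seconds = allocated memory (GB) x billed duration (s)
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000)
    return gb_seconds * PRICE_PER_GB_S

for memory_mb, duration_ms in [(128, 3000), (512, 800), (1024, 400)]:
    cost = cost_per_invocation(memory_mb, duration_ms)
    print(f"{memory_mb:>5}MB, {duration_ms:>4}ms -> ${cost:.8f} per invocation")
```

Note that 512MB at 800ms and 1024MB at 400ms bill exactly the same GB-seconds, which is why doubling memory can halve latency for free.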

Right-Size with AWS Compute Optimizer

Enable Compute Optimizer for Lambda — it analyses your functions and recommends memory settings based on actual usage patterns. It is free and often identifies functions that are either over-provisioned or under-provisioned.

Architecture Patterns

Pattern 1: API Gateway + Lambda (Synchronous)

The standard serverless API pattern:

Client → API Gateway → Lambda → DynamoDB/RDS
                              → Return response

Best practices:

  • Use API Gateway HTTP APIs (not REST APIs) — up to 71% cheaper, lower latency
  • Enable response caching for read-heavy endpoints
  • Use Lambda Proxy integration for simpler code
  • Set appropriate timeouts (API Gateway: 29s max, Lambda: match or lower)
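With proxy integration, API Gateway passes the raw request in `event` and expects `statusCode`, `headers`, and a string `body` back. A minimal handler, sketched for a hypothetical GET /orders/{id} route:

```python
import json

def handler(event, context):
    """Proxy-integration handler for a hypothetical GET /orders/{id} route."""
    order_id = (event.get("pathParameters") or {}).get("id")
    if order_id is None:
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": "missing order id"}),
        }
    # A real handler would look the order up; a static payload keeps this self-contained
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"id": order_id, "status": "processed"}),
    }
```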

Pattern 2: Event-Driven Processing (Asynchronous)

S3 Upload → SQS Queue → Lambda → Process → Store result
SNS Topic → Lambda → Send notification
EventBridge → Lambda → Scheduled task

Best practices:

  • Always use SQS between event source and Lambda for buffering and retry
  • Configure Dead Letter Queues (DLQ) for failed messages
  • Set maxBatchingWindow to batch events and reduce invocations
  • Use ReservedConcurrentExecutions to prevent downstream overload
Resources:
  ProcessorFunction:
    Type: AWS::Serverless::Function
    Properties:
      Events:
        SQSTrigger:
          Type: SQS
          Properties:
            Queue: !GetAtt ProcessingQueue.Arn
            BatchSize: 10
            MaximumBatchingWindowInSeconds: 5
      ReservedConcurrentExecutions: 50  # Protect downstream services
      # Note: the function-level DeadLetterQueue applies to asynchronous
      # invocations only. For SQS event sources, set a RedrivePolicy with
      # a DLQ on the queue itself.
      DeadLetterQueue:
        Type: SQS
        TargetArn: !GetAtt DLQ.Arn
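One refinement worth pairing with batching: report per-message failures so one bad record does not force a retry of the entire batch. This sketch assumes the event source mapping also sets FunctionResponseTypes: [ReportBatchItemFailures]; process_message is a hypothetical stand-in for your business logic:

```python
def process_message(body):
    """Hypothetical stand-in for real business logic."""
    if body == "bad":
        raise ValueError("cannot process message")

def handler(event, context):
    # Collect failed message IDs; Lambda retries only these, not the whole batch
    failures = []
    for record in event["Records"]:
        try:
            process_message(record["body"])
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```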

Pattern 3: Fan-Out (Parallel Processing)

Input → Lambda (coordinator) → SQS → N × Lambda (workers) → Aggregate

Use Step Functions for complex orchestration with error handling, retries, and parallel execution branches. Lambda alone cannot coordinate multi-step workflows reliably.
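For the simple SQS fan-out, the coordinator mostly needs to respect the 10-entries-per-call limit on send_message_batch. A sketch — queue_url and the work-item shape are assumptions about your setup:

```python
import json

def chunk(items, size=10):
    """SQS send_message_batch accepts at most 10 entries per call."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def fan_out(sqs_client, queue_url, work_items):
    """Coordinator sketch: push work items onto the workers' SQS queue.

    sqs_client is a boto3 SQS client; queue_url and the work-item dicts
    are assumptions about your setup.
    """
    for batch in chunk(work_items):
        entries = [
            {"Id": str(i), "MessageBody": json.dumps(item)}
            for i, item in enumerate(batch)
        ]
        sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
```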

Observability

Lambda functions are black boxes without proper observability. Here is our standard setup:

Structured Logging

import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    # Structured JSON logging
    logger.info(json.dumps({
        "message": "Processing request",
        "request_id": context.aws_request_id,
        "function_name": context.function_name,
        "memory_limit": context.memory_limit_in_mb,
        "event_source": event.get("source", "unknown"),
    }))

    # Your logic here
    result = process(event)

    logger.info(json.dumps({
        "message": "Request completed",
        "request_id": context.aws_request_id,
        "items_processed": len(result),
    }))

Metrics with CloudWatch Embedded Metrics Format

import time

from aws_embedded_metrics import metric_scope

@metric_scope
def handler(event, context, metrics):
    metrics.set_namespace("MyApp")
    metrics.put_dimensions({"Service": "OrderProcessor"})

    start = time.time()
    result = process_order(event)
    duration = time.time() - start

    metrics.put_metric("ProcessingDuration", duration, "Seconds")
    metrics.put_metric("OrdersProcessed", 1, "Count")

    if result.get("error"):
        metrics.put_metric("ProcessingErrors", 1, "Count")

Distributed Tracing

Enable X-Ray tracing for every function. It adds 1-2ms overhead but gives you end-to-end visibility:

Globals:
  Function:
    Tracing: Active

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Policies:
        - AWSXRayDaemonWriteAccess

For more advanced observability, integrate with Prometheus and Grafana via CloudWatch metric streams.

Security Best Practices

Least-Privilege IAM

Every Lambda function gets its own IAM role with only the permissions it needs:

# Bad: Wildcard permissions
Policies:
  - Statement:
    - Effect: Allow
      Action: "dynamodb:*"
      Resource: "*"

# Good: Specific actions on specific resources
Policies:
  - Statement:
    - Effect: Allow
      Action:
        - dynamodb:GetItem
        - dynamodb:PutItem
      Resource: !GetAtt OrdersTable.Arn

Secrets Management

Never put secrets in environment variables as plaintext. Use AWS Secrets Manager or Parameter Store with the Lambda extension:

import json
import boto3
from functools import lru_cache

@lru_cache(maxsize=1)
def get_secret(secret_name):
    client = boto3.client('secretsmanager')
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response['SecretString'])

# Called once, cached for the lifetime of the execution environment
db_credentials = get_secret('prod/db-credentials')
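The snippet above goes through boto3. If you attach the AWS Parameters and Secrets Lambda Extension layer instead, the function can fetch secrets over a local HTTP endpoint that the extension caches for you; this sketch assumes the extension's default port and auth header:

```python
import json
import os
import urllib.request
from functools import lru_cache
from urllib.parse import quote

# Default port of the AWS Parameters and Secrets Lambda Extension
# (configurable via PARAMETERS_SECRETS_EXTENSION_HTTP_PORT).
PORT = os.environ.get("PARAMETERS_SECRETS_EXTENSION_HTTP_PORT", "2773")

def secret_url(secret_name):
    # Secret names may contain "/", so URL-encode the query value
    return (f"http://localhost:{PORT}/secretsmanager/get"
            f"?secretId={quote(secret_name, safe='')}")

@lru_cache(maxsize=8)
def get_secret(secret_name):
    # The extension authenticates local callers via the session token header
    req = urllib.request.Request(
        secret_url(secret_name),
        headers={"X-Aws-Parameters-Secrets-Token": os.environ["AWS_SESSION_TOKEN"]},
    )
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)  # same shape as GetSecretValue's response
    return json.loads(payload["SecretString"])
```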

VPC Considerations

Only put Lambda in a VPC if it needs to access VPC resources (RDS, ElastiCache, internal services). VPC-attached functions used to suffer multi-second cold starts, but AWS Hyperplane ENIs have largely eliminated that penalty. Still, avoid the VPC unless necessary.

Cost Optimisation

1. Right-Size Memory

As covered above, more memory is often cheaper because execution is faster. Use AWS Lambda Power Tuning to find the optimal setting.

2. Use ARM64 Everywhere

20% price reduction, no code changes for most runtimes. This is free money.

3. Batch Events

Process multiple records per invocation instead of one. SQS batching with a batch size of 10 can cut invocation count by up to 10x:

Events:
  SQSTrigger:
    Type: SQS
    Properties:
      BatchSize: 10                        # Process 10 messages per invocation
      MaximumBatchingWindowInSeconds: 30    # Wait up to 30s to fill the batch

4. Avoid Lambda for Steady-State

If a function runs continuously (invoked every second, 24/7), consider moving it to Kubernetes or Fargate. Lambda’s per-invocation pricing loses to always-on compute at sustained high throughput.
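A rough break-even sketch for that "invoked every second" case. Every price below is an assumed us-east-1 on-demand rate, and the smallest Fargate task size is used for comparison; plug in your own numbers before deciding:

```python
# All prices are assumed us-east-1 on-demand rates; verify before deciding.
LAMBDA_PRICE_PER_GB_S = 0.0000166667
LAMBDA_PRICE_PER_REQUEST = 0.0000002
FARGATE_PRICE_PER_VCPU_HR = 0.04048
FARGATE_PRICE_PER_GB_HR = 0.004445
HOURS_PER_MONTH = 730

def lambda_monthly(invocations_per_s, memory_gb, avg_duration_s):
    invocations = invocations_per_s * HOURS_PER_MONTH * 3600
    per_invocation = (memory_gb * avg_duration_s * LAMBDA_PRICE_PER_GB_S
                      + LAMBDA_PRICE_PER_REQUEST)
    return invocations * per_invocation

def fargate_monthly(vcpu, memory_gb):
    return HOURS_PER_MONTH * (vcpu * FARGATE_PRICE_PER_VCPU_HR
                              + memory_gb * FARGATE_PRICE_PER_GB_HR)

# Steady 1 invocation/s, 1GB, 500ms vs the smallest Fargate task (0.25 vCPU/0.5GB)
print(f"Lambda:  ${lambda_monthly(1, 1.0, 0.5):.2f}/month")
print(f"Fargate: ${fargate_monthly(0.25, 0.5):.2f}/month")
```

At this steady load the always-on task wins, though the comparison ignores Fargate's need for redundancy, deployment tooling, and idle headroom.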

5. Monitor with Cost Allocation Tags

Tag every function with team, project, and environment:

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Tags:
        Team: payments
        Project: order-processing
        Environment: production

Then track costs per tag in AWS Cost Explorer.


Building Serverless on AWS?

We design and deploy serverless architectures on AWS — from single-function APIs to complex event-driven systems processing millions of events daily.

Our AWS managed services cover:

  • Serverless architecture design — API Gateway, Lambda, DynamoDB, SQS, Step Functions
  • Performance optimisation — cold start reduction, memory tuning, provisioned concurrency
  • Observability setup — structured logging, X-Ray tracing, custom CloudWatch dashboards
  • Cost optimisation — right-sizing, ARM64 migration, batch processing
  • Hybrid architecture — combine Lambda with EKS for optimal cost and performance

Talk to our AWS serverless experts →
