
05 Scaling Serverless Architectures

Thinking Serverless at Scale

Successful, growing systems often see an increase in demand over time. Scalability means that a system can adapt to meet this new level of demand.

Build today with tomorrow in mind

Build your architecture from parts that you can update separately, so you can iterate and learn from each iteration. This modularity breaks complex components or solutions into smaller parts that are less complicated and easier for you to scale, secure, and manage.

Scaling best practices

  • Separate your application and database
  • Take advantage of the AWS Global Cloud Infrastructure
  • Identify and avoid heavy lifting
  • Monitor percentiles (see the sketch after this list)
  • Refactor as you go
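
Average latency can look healthy while your slowest requests suffer, which is why percentiles such as p99 matter. Here is a minimal sketch, assuming boto3 with credentials configured and a hypothetical Lambda function named my-function, that pulls p50 and p99 Duration statistics from CloudWatch:

```python
# A minimal sketch of monitoring percentiles, not just averages, with boto3.
# "my-function" is a placeholder; any Lambda metric works the same way.
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

now = datetime.now(timezone.utc)
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/Lambda",
    MetricName="Duration",
    Dimensions=[{"Name": "FunctionName", "Value": "my-function"}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=300,  # 5-minute buckets
    ExtendedStatistics=["p50", "p99"],  # percentiles instead of Average
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    stats = point["ExtendedStatistics"]
    print(point["Timestamp"], f"p50={stats['p50']:.0f} ms", f"p99={stats['p99']:.0f} ms")
```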

Scaling considerations for serverless services

Scaling considerations

  • Timeouts
  • Retry behavior (see the sketch after this list)
  • Throughput
  • Payload size
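
These considerations often surface as SDK client settings. The sketch below, assuming boto3 and a hypothetical function named my-function, makes timeouts and retry behavior explicit rather than relying on defaults; the values are illustrative only:

```python
# A minimal sketch of making timeouts and retry behavior explicit on an SDK client.
import boto3
from botocore.config import Config

config = Config(
    connect_timeout=2,  # seconds to wait for a connection
    read_timeout=5,     # seconds to wait for a response
    retries={"max_attempts": 3, "mode": "standard"},  # bounded, jittered retries
)

lambda_client = boto3.client("lambda", config=config)

# Keep payload size in mind as well: synchronous Lambda invocations cap the
# request payload, so large inputs are better passed by reference (for example,
# an S3 object key) than by value.
response = lambda_client.invoke(
    FunctionName="my-function",  # placeholder name
    Payload=b'{"order_id": "12345"}',
)
print(response["StatusCode"])
```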

Scaling considerations for API Gateway

  1. Trade-offs and optimizations are key.
  2. Do production-like load testing end to end (a minimal sketch follows this list).
  3. Stay abreast of updates to the services and take advantage of improvements.
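
Dedicated load-testing tools exist, but even a small script can show how an endpoint behaves under concurrency. Below is a minimal standard-library sketch against a hypothetical endpoint URL; a real test should exercise the full production-like path, including downstream services:

```python
# A minimal load-test sketch: fire concurrent requests at an endpoint and
# report client-side latency percentiles. The URL is a placeholder.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "https://example.execute-api.us-east-1.amazonaws.com/prod/orders"  # placeholder

def timed_request(_):
    start = time.perf_counter()
    with urllib.request.urlopen(URL, timeout=10) as response:
        response.read()
    return (time.perf_counter() - start) * 1000  # milliseconds

with ThreadPoolExecutor(max_workers=50) as pool:
    latencies = list(pool.map(timed_request, range(500)))

print(f"p50={statistics.median(latencies):.0f} ms")
print(f"p99={statistics.quantiles(latencies, n=100)[98]:.0f} ms")
```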

API Gateway features help you manage access patterns

API Gateway is your front door, and you have configuration options for each API that can help you manage the access pattern that you’re expecting.

  1. Edge-optimized endpoints

Edge-optimized endpoints have a built-in Amazon CloudFront distribution to serve content quickly for geographically dispersed clients.

  2. Throttling options

Set throttling limits by method. Set up API keys and usage plans to throttle request volume client by client, as in the sketch after this list.

  3. Optional cache

An optional API Gateway cache can reduce the number of requests that reach your backend.
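
As one illustration of the throttling options above, here is a minimal sketch, assuming boto3 and placeholder API and client identifiers, that creates a usage plan with rate and burst limits and attaches an API key to throttle one client:

```python
# A minimal sketch of per-client throttling with API keys and a usage plan.
# IDs, names, and limits are placeholders.
import boto3

apigateway = boto3.client("apigateway")

# Create a usage plan with rate and burst limits plus a monthly quota.
plan = apigateway.create_usage_plan(
    name="standard-tier",
    throttle={"rateLimit": 100.0, "burstLimit": 200},  # requests/second, burst
    quota={"limit": 100000, "period": "MONTH"},        # total requests per period
    apiStages=[{"apiId": "a1b2c3d4e5", "stage": "prod"}],  # placeholder API/stage
)

# Create an API key for one client and attach it to the plan.
key = apigateway.create_api_key(name="client-a", enabled=True)
apigateway.create_usage_plan_key(
    usagePlanId=plan["id"],
    keyId=key["id"],
    keyType="API_KEY",
)
```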


Scaling considerations for Amazon SQS

Characteristics of an SQS queue as a Lambda event source

| Parameter | Value or Limit | How the Parameter Is Set or Changed |
| --- | --- | --- |
| Number of messages that can be in a batch | 1 to 10 | Configured with the event source on the Lambda function |
| Number of default pollers (batches returned at one time) | 5 | Managed by the Lambda service |
| Rate at which Lambda increases the number of parallel pollers | Up to 60 per minute | Managed by the Lambda service |
| Number of batches that Lambda manages simultaneously | Up to 1,000 | Managed by the Lambda service |
| Number of Lambda functions that can be running simultaneously | The lesser of 1,000 functions and the account limit | Configured by setting a limit (reserved concurrency) on the function |
| Messages per queue | No limit | N/A |
| Visibility timeout | 0 seconds to 12 hours | Configured on the queue |
| Number of retries | 1 to 1,000 | Configured on the queue (maxReceiveCount) |
| Function timeout | 0 seconds to 15 minutes | Configured on the function |
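
Several of the parameters in this table map directly to API calls. The sketch below, assuming boto3 and placeholder queue and function names, configures the queue-side settings (visibility timeout and maxReceiveCount), the batch size on the event source mapping, and reserved concurrency on the function:

```python
# A minimal sketch of wiring the table's parameters with boto3.
# ARNs, URLs, names, and values are placeholders.
import json

import boto3

lambda_client = boto3.client("lambda")
sqs = boto3.client("sqs")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"  # placeholder
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:orders"                # placeholder

# Visibility timeout and retries (maxReceiveCount) are configured on the queue.
# A common rule of thumb is a visibility timeout several times the function timeout.
sqs.set_queue_attributes(
    QueueUrl=QUEUE_URL,
    Attributes={
        "VisibilityTimeout": "180",
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": "arn:aws:sqs:us-east-1:123456789012:orders-dlq",
            "maxReceiveCount": "5",
        }),
    },
)

# Batch size (1 to 10 here) is configured on the event source mapping.
lambda_client.create_event_source_mapping(
    EventSourceArn=QUEUE_ARN,
    FunctionName="process-orders",  # placeholder
    BatchSize=10,
)

# Reserved concurrency caps how many copies of the function run simultaneously.
lambda_client.put_function_concurrency(
    FunctionName="process-orders",
    ReservedConcurrentExecutions=100,
)
```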