ThrottlingException

AWS Lambda Throttling: Fix Reserved Concurrency Limits Fast

Server & Cloud Intermediate 📅 May 29, 2026

Lambda throttles when reserved concurrency caps max executions. Here's how to fix the most common causes and get your functions running again.

1. Reserved Concurrency Set Too Low (The Most Common Cause)

Had a client last month whose order processing Lambda kept getting throttled. They'd set reserved concurrency to 5 because they thought it'd save money. Their traffic spikes hit 50 concurrent invocations during peak hours. Result? About 90% of peak requests failed with 429 errors (the Lambda Invoke API reports these as TooManyRequestsException).

Reserved concurrency is a hard cap. Once your Lambda hits that number, every additional invocation is throttled, even if the account-level concurrency limit is 1000 and most of it is sitting idle. The fix is dead simple: increase the reserved concurrency value, or remove it if you don't need strict limits.
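To put a number on that, here's a back-of-the-envelope sketch. It assumes a deliberately simple model (every request beyond the cap is rejected, no retries), and the function name is made up for illustration:

```python
def throttled_fraction(peak_concurrency: int, reserved_concurrency: int) -> float:
    """Share of simultaneous invocations rejected at the peak.

    Simplified model: everything beyond the cap is throttled, no retries.
    """
    if peak_concurrency <= 0:
        return 0.0
    rejected = max(0, peak_concurrency - reserved_concurrency)
    return rejected / peak_concurrency


# Numbers from the story above: cap of 5, peak of 50 concurrent invocations.
print(f"{throttled_fraction(50, 5):.0%} throttled")
```

Plug in your own peak from the ConcurrentExecutions metric to see how much headroom a given cap really leaves.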

How to check and fix it

Open the AWS Console, go to Lambda, select your function, then click "Configuration" > "Concurrency". Look at the "Reserved concurrency" setting. If it's anything less than your peak load, that's your problem.

To fix:

  1. Set reserved concurrency to something higher—say 100 or 200 depending on your traffic. Or just check "Use unreserved account concurrency" to remove the cap entirely.
  2. Monitor with CloudWatch metrics: ConcurrentExecutions and Throttles. If Throttles spikes, you're hitting that ceiling.
  3. Set a CloudWatch alarm that fires when Throttles stays above 0 for 5 minutes, and point it at an SNS topic with an email subscription. It'll email you before customers complain.
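If you'd rather script these steps than click through the console, something like the sketch below could work. It assumes boto3, and the clients are passed in so nothing here touches AWS on its own. The function names, the "order-processor" function name, and the alarm name are placeholders, not anything from a real setup:

```python
def set_reserved_concurrency(lambda_client, function_name, limit=None):
    """Raise the cap to `limit`, or remove the cap when `limit` is None."""
    if limit is None:
        # Same effect as "Use unreserved account concurrency" in the console.
        lambda_client.delete_function_concurrency(FunctionName=function_name)
    else:
        lambda_client.put_function_concurrency(
            FunctionName=function_name,
            ReservedConcurrentExecutions=limit,
        )


def throttle_alarm_params(function_name, sns_topic_arn):
    """Keyword arguments for the CloudWatch client's put_metric_alarm call."""
    return {
        "AlarmName": f"{function_name}-throttles",  # placeholder naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Throttles",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 60,             # one-minute buckets
        "EvaluationPeriods": 5,   # five breaching minutes in a row
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],  # SNS topic with an email subscriber
    }


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   set_reserved_concurrency(boto3.client("lambda"), "order-processor", 100)
#   boto3.client("cloudwatch").put_metric_alarm(
#       **throttle_alarm_params("order-processor", "<your SNS topic ARN>"))
```

Passing the client in keeps the helpers easy to dry-run against a stub before you point them at a production account.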

Real talk: Reserved concurrency is useful for prioritizing critical functions, but don't set it arbitrarily low. That's how you break things.

2. Unreserved Concurrency Exhausted by Other Functions

Here's a fun one. You've checked your function's reserved concurrency—it's fine (or not set). Yet you're still getting throttled. I've seen this when someone's CI/CD pipeline Lambda grabs all the account-level concurrency, starving your function.

Lambda has a default quota of 1000 concurrent executions per account, per Region (new accounts can start lower). It's a soft limit, and it's shared: if all your functions combined hit it, every new invocation gets throttled. No function without its own reserved concurrency is safe.

How to diagnose

Go to CloudWatch and check the ConcurrentExecutions metric across all functions. If the total is near 1000 (or your account limit), there's your bottleneck. Also check the UnreservedConcurrentExecutions metric. It shows how much concurrency is being used by functions without reserved concurrency, so compare it against your unreserved pool (the account limit minus everything you've reserved).
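Here's a rough way to turn that check into a number. The sketch assumes you've already pulled two things: the response of the Lambda client's get_account_settings() call (only the AccountLimit part is used) and the peak value of the UnreservedConcurrentExecutions metric from CloudWatch. The helper name and the sample figures are invented:

```python
def unreserved_headroom(account_settings, peak_unreserved_in_use):
    """Compare peak unreserved usage against the unreserved pool."""
    limits = account_settings["AccountLimit"]
    pool = limits["UnreservedConcurrentExecutions"]
    return {
        "account_limit": limits["ConcurrentExecutions"],
        "unreserved_pool": pool,
        "headroom": pool - peak_unreserved_in_use,
        "exhausted": peak_unreserved_in_use >= pool,
    }


# Trimmed-down shape of a get_account_settings() response (sample values).
sample = {
    "AccountLimit": {
        "ConcurrentExecutions": 1000,
        "UnreservedConcurrentExecutions": 800,
    }
}
print(unreserved_headroom(sample, peak_unreserved_in_use=790))
```

A headroom close to zero at peak suggests other functions are eating the shared pool, which is the scenario this section is about.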

Fix it

  1. Request a quota increase through the Service Quotas console or AWS Support. I've had one approved in 24 hours. Ask for 5000 or 10000 if you're scaling hard.
  2. Set reserved concurrency on the greedy function that's hogging resources. For example, cap that CI/CD Lambda at 200 so it doesn't eat everything.
  3. Put an SQS queue in front of bursty functions (or route events through EventBridge) to smooth out traffic. Slow functions can't monopolize concurrency if their work is queued.
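One thing to keep in mind with step 2: whatever you reserve for the greedy function comes out of the shared pool, and Lambda always keeps at least 100 executions unreserved. A small sketch of that arithmetic, with made-up helper names:

```python
MIN_UNRESERVED = 100  # Lambda keeps at least this much of the pool unreserved


def max_reservable(account_limit: int, already_reserved: int) -> int:
    """Largest reserved-concurrency value you can still hand out."""
    return max(0, account_limit - already_reserved - MIN_UNRESERVED)


def plan_cap(account_limit: int, already_reserved: int, cap: int) -> int:
    """Return the unreserved pool left for everyone else after applying `cap`."""
    allowed = max_reservable(account_limit, already_reserved)
    if cap > allowed:
        raise ValueError(f"cap {cap} exceeds the reservable maximum of {allowed}")
    return account_limit - already_reserved - cap


# Capping the CI/CD Lambda at 200 on a 1000-limit account with nothing else reserved.
print(plan_cap(account_limit=1000, already_reserved=0, cap=200))
```

So the cap protects your other functions from the greedy one, but it also shrinks what they share. Check both numbers before picking a value.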

Pro tip: Don't request a limit increase unless you've actually hit it. AWS might ask for your use case. Be ready to explain.

3. Burst Concurrency Limits (Short-Lived Spikes)

This one's sneaky. Your function runs fine 99% of the time, but every day at 3 PM, it throttles for about 30 seconds. Reserved concurrency is fine, account-level is fine—what gives?

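It's the burst concurrency limit. Historically, Lambda allowed an initial burst of 500 to 3000 concurrent invocations depending on the Region (function memory doesn't factor in), then scaled at 500 per minute. Under the scaling model AWS introduced in late 2023, each function can add up to 1000 concurrent executions every 10 seconds. Either way, if your traffic spikes faster than that, you get throttled during the scaling ramp-up.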

I had a client running a flash sale site. First 5 seconds: 4000 invocations. Burst limit: 1000. Throttled for 45 seconds until the scaling caught up.
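You can estimate the length of that ramp-up with a simple step model. The sketch below is a simplification: it treats scaling as discrete jumps, and the function name and parameters are mine. It's run with two rates: the 500-per-minute figure, and the 1000-per-10-seconds rate AWS documents for current per-function scaling.

```python
import math


def ramp_seconds(demand: int, burst: int, step: int, interval_s: int) -> int:
    """Seconds until available concurrency reaches `demand`.

    Simplified model: `burst` is available immediately, then capacity
    grows by `step` at the end of every `interval_s` seconds.
    """
    shortfall = max(0, demand - burst)
    return math.ceil(shortfall / step) * interval_s


# 4000 needed, 1000 available up front (figures from the flash sale story).
print(ramp_seconds(4000, 1000, step=500, interval_s=60))    # +500 per minute
print(ramp_seconds(4000, 1000, step=1000, interval_s=10))   # +1000 per 10 seconds
```

Real scaling is smoother than this and retries muddy the picture, so treat the result as an order-of-magnitude estimate for how long a spike will hurt.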

How to fix it

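  1. Pre-warm your Lambda with an EventBridge (formerly CloudWatch Events) rule that invokes your function every 5 minutes, starting about 10 minutes before the spike. This nudges AWS to keep execution environments ready, but each ping only warms the environments that actually serve it, so treat it as best effort.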
  2. Use Provisioned Concurrency to keep a set number of environments warm. Set it to your expected burst (say 500). You'll pay for idle time, but it beats throttling.
  3. Add a buffer with SQS. Send invocations to a queue, and have Lambda poll it at a rate it can handle. This spreads the load and avoids the burst limit entirely.

Choose Provisioned Concurrency if latency matters. Choose SQS if you can handle delays.
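For options 2 and 3, here's a hedged sketch of the parameters involved. It assumes boto3's Lambda client, where put_provisioned_concurrency_config and create_event_source_mapping take these keyword arguments. The helper names, the "live" alias, and the default batch size are illustrative choices, not recommendations:

```python
def provisioned_concurrency_params(function_name, qualifier, environments):
    """Keyword arguments for put_provisioned_concurrency_config."""
    return {
        "FunctionName": function_name,
        # Must be a published version or an alias; $LATEST is not accepted.
        "Qualifier": qualifier,
        "ProvisionedConcurrentExecutions": environments,
    }


def sqs_mapping_params(function_name, queue_arn, max_concurrency, batch_size=10):
    """Keyword arguments for create_event_source_mapping with an SQS queue."""
    # The API accepts a MaximumConcurrency between 2 and 1000.
    if not 2 <= max_concurrency <= 1000:
        raise ValueError("max_concurrency must be between 2 and 1000")
    return {
        "FunctionName": function_name,
        "EventSourceArn": queue_arn,
        "BatchSize": batch_size,
        # Limits how many concurrent invocations this queue can drive.
        "ScalingConfig": {"MaximumConcurrency": max_concurrency},
    }


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   client = boto3.client("lambda")
#   client.put_provisioned_concurrency_config(
#       **provisioned_concurrency_params("order-processor", "live", 500))
#   client.create_event_source_mapping(
#       **sqs_mapping_params("order-processor", "<your queue ARN>", 200))
```

The MaximumConcurrency setting is what makes the queue act as a throttle-proof buffer: messages wait in SQS instead of failing when the function is at its limit.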

Quick-Reference Summary Table

Cause | What Happens | Fix
Reserved concurrency too low | Function hits hard cap, throttles immediately | Increase reserved concurrency or remove it
Unreserved concurrency exhausted | Account-level limit hit by other functions | Request quota increase or cap greedy functions
Burst concurrency limit | Short spike exceeds Regional burst limit | Pre-warm, use Provisioned Concurrency, or buffer with SQS

Your move: check your function's reserved concurrency first. That's the fix in 80% of cases. If it's not that, look at account-level limits and burst behavior. You'll have it sorted in 20 minutes.
