AWS Lambda VPC Timeout: Fix Network Latency
Your Lambda function in a VPC times out because network calls get stuck. We'll fix the most common causes first: NAT gateway routing, then security group and network ACL rules, then missing VPC endpoints.
1. Missing or Misconfigured NAT Gateway
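This is the #1 cause of Lambda timeouts in a VPC. Here's what's actually happening: when you attach a Lambda function to a VPC, it loses internet access unless you route outbound traffic through a NAT gateway or NAT instance. Placing the function in a public subnet doesn't help, because Lambda's network interfaces don't get public IP addresses. Without a NAT route, any call to a service that isn't reachable inside the VPC (S3, DynamoDB, a third-party API) hangs until the function's timeout, which is 3 seconds by default.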
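Before checking routes, it helps to confirm which subnets and security groups the function is actually attached to, along with its configured timeout. The following is a minimal sketch, assuming AWS CLI credentials are already configured; the function name `my-function` and the helper name `lambda_vpc_summary` are placeholders, not anything from your account.

```shell
# Print the timeout, subnets, and security groups for a VPC-attached function.
# The function name is a placeholder supplied by the caller.
lambda_vpc_summary() {
  fn_name="$1"
  aws lambda get-function-configuration \
    --function-name "$fn_name" \
    --query '{Timeout: Timeout, Subnets: VpcConfig.SubnetIds, SecurityGroups: VpcConfig.SecurityGroupIds}' \
    --output json
}

# Usage: lambda_vpc_summary my-function
```

The subnet and security group IDs it prints are the ones to plug into the route table and security group checks that follow.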
The fix: Check your VPC route tables. The private subnet where your Lambda runs must have a default route (0.0.0.0/0) pointing to a NAT gateway in a public subnet. Don't just verify that the route exists; verify that the NAT gateway itself is alive. I've seen cases where the NAT gateway was deleted but the route still referenced it. In that case the route's state shows as blackhole and every outbound packet is dropped.
# Check route table association for your Lambda's subnet
# (an empty result means the subnet uses the VPC's main route table)
aws ec2 describe-route-tables --filters Name=association.subnet-id,Values=subnet-xxx
# Look for a route: 0.0.0.0/0 -> nat-xxx with State "active" (not "blackhole")
# Then check the NAT gateway state
aws ec2 describe-nat-gateways --nat-gateway-ids nat-xxx
# State must be 'available', not 'pending', 'failed', 'deleting', or 'deleted'
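One gap in the check above is a subnet with no explicit route table association, which silently inherits the VPC's main route table. The sketch below looks up the default route for a subnet and falls back to the main route table when the first lookup comes back empty. The helper name `default_route_for_subnet` and both IDs are hypothetical, and the empty-result handling is an assumption about how the CLI prints an empty list in text mode.

```shell
# Print "NatGatewayId GatewayId State" for the default route a subnet uses.
# Falls back to the VPC's main route table when the subnet has no explicit
# association. Both IDs are placeholders supplied by the caller.
default_route_for_subnet() {
  subnet_id="$1"
  vpc_id="$2"
  query="RouteTables[].Routes[?DestinationCidrBlock=='0.0.0.0/0'][].[NatGatewayId,GatewayId,State]"

  result="$(aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=$subnet_id" \
    --query "$query" --output text)"

  if [ -z "$result" ] || [ "$result" = "None" ]; then
    # No explicit association (or no default route there): try the main table.
    result="$(aws ec2 describe-route-tables \
      --filters "Name=vpc-id,Values=$vpc_id" "Name=association.main,Values=true" \
      --query "$query" --output text)"
  fi

  if [ -z "$result" ] || [ "$result" = "None" ]; then
    echo "no default route"
  else
    echo "$result"
  fi
}

# Usage: default_route_for_subnet subnet-xxx vpc-xxx
```

A line with a `nat-` ID and state `active` is what you want to see. A state of `blackhole`, or an `igw-` ID in the gateway column, points back to the routing problem described in this section.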
If you're using a NAT instance instead of a NAT gateway, also check that the instance's source/destination check is disabled. If it's enabled, the instance drops traffic not destined for itself, and your Lambda hangs until timeout.
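For the NAT instance case, the source/destination check can be read and cleared from the CLI. This is a small sketch with hypothetical helper names; the instance ID is a placeholder for your NAT instance.

```shell
# Report whether source/destination checking is enabled on an instance.
# The instance ID is a placeholder supplied by the caller.
nat_instance_src_dst_check() {
  instance_id="$1"
  aws ec2 describe-instance-attribute \
    --instance-id "$instance_id" \
    --attribute sourceDestCheck \
    --query 'SourceDestCheck.Value' --output text
}

# Disable the check so the instance forwards traffic not addressed to itself.
disable_src_dst_check() {
  instance_id="$1"
  aws ec2 modify-instance-attribute \
    --instance-id "$instance_id" \
    --no-source-dest-check
}

# Usage:
#   nat_instance_src_dst_check i-xxx
#   disable_src_dst_check i-xxx
```

If the first helper reports that the check is still enabled, running the second should be enough for forwarded traffic to start flowing, assuming the instance's own NAT configuration is otherwise correct.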
2. Security Group or NACL Blocking Traffic
Even with a working NAT gateway, security group rules or network ACLs can silently drop traffic. Security groups are stateful: if an outbound rule allows a connection, the response is allowed back automatically, so you never need a security group rule for return traffic on ephemeral ports. What the Lambda function's security group does need is an outbound rule covering the destination address and port of every call it makes, for example TCP 443 to 0.0.0.0/0 for HTTPS APIs. Network ACLs are the opposite. They're stateless, so the subnet's NACL must allow both the outbound request and the inbound response, and that response arrives on an ephemeral port (1024-65535).
The real-world trigger: You reuse a locked-down security group whose outbound rules only cover what the original workload needed, say TCP 5432 to a database subnet. That worked for the original workload, but your Lambda function calls an HTTPS API on port 443. No outbound rule matches, the packets are dropped without any error, and the client waits until the function times out. The NACL version of the same problem: the outbound rule for 443 exists, but nothing allows the inbound response on, say, port 54000.
The fix: Add an outbound rule that covers your destinations. The simplest is all traffic to 0.0.0.0/0, which is the default for a new security group. If your compliance team doesn't allow that, allow only the ports you actually call, typically TCP 443. Also check the network ACL at the subnet level: you need an outbound rule for the destination port and an inbound rule for ephemeral ports 1024-65535.
# Lambda security group outbound rules (AWS CLI)
# Broad: allow all outbound TCP
aws ec2 authorize-security-group-egress --group-id sg-xxx --protocol tcp --port 0-65535 --cidr 0.0.0.0/0
# Or more restrictive: allow only outbound HTTPS
aws ec2 authorize-security-group-egress --group-id sg-xxx --protocol tcp --port 443 --cidr 0.0.0.0/0
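To see whether an existing security group already covers a given destination port, such as 443 for HTTPS, you can list its outbound rules and test the port against them. The sketch below is a simplified check under stated assumptions: the helper names are hypothetical, it assumes the CLI's text output prints one rule per line as protocol, from-port, to-port, and it ignores destination CIDRs entirely.

```shell
# Read egress rules as "protocol from_port to_port" lines on stdin and report
# whether TCP to the given destination port is allowed by any rule.
# Destination CIDRs are ignored, so treat "yes" as necessary, not sufficient.
egress_allows_tcp_port() {
  want_port="$1"
  while read -r proto from_port to_port; do
    # Protocol -1 means all traffic; its port fields are not numeric.
    if [ "$proto" = "-1" ]; then
      echo "yes"
      return 0
    fi
    if [ "$proto" = "tcp" ] && [ "$from_port" -le "$want_port" ] \
       && [ "$to_port" -ge "$want_port" ]; then
      echo "yes"
      return 0
    fi
  done
  echo "no"
  return 1
}

# Fetch the egress rules for a security group in the line format above.
# The group ID is a placeholder supplied by the caller.
list_egress_rules() {
  aws ec2 describe-security-groups --group-ids "$1" \
    --query 'SecurityGroups[0].IpPermissionsEgress[].[IpProtocol,FromPort,ToPort]' \
    --output text
}

# Usage: list_egress_rules sg-xxx | egress_allows_tcp_port 443
```

A "no" here means no outbound rule matches that port at all. A "yes" still leaves the rule's destination range and the subnet's network ACL to verify by hand.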
3. Subnet Placement Without VPC Endpoints
If your Lambda function only needs to access AWS services (S3, DynamoDB, SQS) and doesn't need full internet access, you can skip the NAT gateway entirely by using VPC endpoints. The problem many people hit: they put Lambda in a private subnet without internet access and without VPC endpoints, then wonder why calls to S3 time out.
The nuance: S3 and DynamoDB both support gateway endpoints, which are free and don't need a NAT gateway. For services like SQS, SNS, or API Gateway, you need interface endpoints, which are billed per hour plus a charge per GB of data processed. Whether interface endpoints or a NAT gateway is cheaper depends on how many services you call and how much traffic you send.
The fix: Create gateway endpoints for S3 and DynamoDB and associate them with your private subnet's route table. AWS then adds routes to each service's prefix list for you. If your Lambda's security group restricts outbound traffic, allow TCP 443 to that prefix list. For interface endpoints, create them in the subnets your Lambda uses, with private DNS enabled so the SDK's default hostnames resolve to the endpoint.
# Create gateway endpoint for S3 and associate it with the private route table
aws ec2 create-vpc-endpoint --vpc-id vpc-xxx --vpc-endpoint-type Gateway --service-name com.amazonaws.us-east-1.s3 --route-table-ids rtb-xxx
# Get the prefix list ID for a gateway endpoint service (for security group egress rules)
aws ec2 describe-prefix-lists --filters Name=prefix-list-name,Values=com.amazonaws.us-east-1.s3
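One more thing: interface endpoints have their own security group, so the rules have to match on both sides. Your Lambda's security group must allow outbound HTTPS (443) to the endpoint, and the endpoint's security group must allow inbound 443 from your Lambda's security group. I've debugged cases where the endpoint was created but the security group rules didn't match, and the symptom was the same timeout.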
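The interface endpoint setup can be sketched with two CLI calls: one to create the endpoint and one to open the endpoint's security group to the Lambda function. Everything here is illustrative. The helper names are hypothetical, all IDs are placeholders, SQS in us-east-1 is only an example service, and private DNS additionally assumes the VPC has DNS support and DNS hostnames enabled.

```shell
# Create an interface endpoint for SQS. All IDs are placeholders supplied by
# the caller; the region in the service name is assumed to be us-east-1.
create_sqs_interface_endpoint() {
  vpc_id="$1"
  subnet_id="$2"
  endpoint_sg="$3"
  aws ec2 create-vpc-endpoint \
    --vpc-id "$vpc_id" \
    --vpc-endpoint-type Interface \
    --service-name com.amazonaws.us-east-1.sqs \
    --subnet-ids "$subnet_id" \
    --security-group-ids "$endpoint_sg" \
    --private-dns-enabled
}

# Allow the Lambda security group to reach the endpoint on HTTPS.
allow_lambda_to_endpoint() {
  endpoint_sg="$1"
  lambda_sg="$2"
  aws ec2 authorize-security-group-ingress \
    --group-id "$endpoint_sg" \
    --protocol tcp --port 443 \
    --source-group "$lambda_sg"
}

# Usage:
#   create_sqs_interface_endpoint vpc-xxx subnet-xxx sg-endpoint
#   allow_lambda_to_endpoint sg-endpoint sg-lambda
```

Referencing the Lambda security group as the source, rather than a CIDR range, keeps the rule valid even when the function's network interfaces change addresses.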
Quick-Reference Summary Table
| Cause | Symptom in CloudWatch Logs | Fix |
|---|---|---|
| Missing NAT gateway route | connect() timed out on external IPs | Add 0.0.0.0/0 route to NAT gateway in private subnet |
| Security group or NACL blocks traffic | connect() timed out on ports the rules don't cover | Allow outbound TCP to the destination port (for example 443); in NACLs, also allow ephemeral ports 1024-65535 for return traffic |
| No VPC endpoint for AWS service | connect() timed out on AWS service endpoints | Create gateway/interface endpoint for the service |
Start with cause #1, which accounts for about 70% of the VPC Lambda timeouts I've seen. Then #2, then #3. If you've checked all three and still get timeouts, look at ENI creation lag (rare, usually only at cold start with functions that haven't run in days) or at the Lambda execution role, which needs ec2:CreateNetworkInterface along with ec2:DescribeNetworkInterfaces and ec2:DeleteNetworkInterface.
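Both of those last-resort checks can be started from the CLI. The sketch below reads the function's lifecycle state and looks for the AWS managed policy `AWSLambdaVPCAccessExecutionRole` on the execution role. The helper names, function name, and role name are placeholders, and the policy check is deliberately narrow: inline or custom policies can grant the same permissions.

```shell
# Show the function's lifecycle state. A state other than Active can mean its
# network interfaces are still being set up or were reclaimed after idling.
function_state() {
  aws lambda get-function-configuration --function-name "$1" \
    --query '[State, StateReasonCode, LastUpdateStatus]' --output text
}

# Check whether the AWS managed VPC access policy is attached to a role.
# A "no" is a prompt to inspect the role's other policies, not proof that
# the permissions are missing.
role_has_vpc_access_policy() {
  names="$(aws iam list-attached-role-policies --role-name "$1" \
    --query 'AttachedPolicies[].PolicyName' --output text)"
  case "$names" in
    *AWSLambdaVPCAccessExecutionRole*) echo "yes" ;;
    *) echo "no" ;;
  esac
}

# Usage:
#   function_state my-function
#   role_has_vpc_access_policy my-function-role
```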