The Fastest Way to Deploy Your AI Agent to AWS Lambda
You’ve built an AI agent. Now you need to get it to the cloud in 10 minutes.
Why Lambda for AI Agents
AI agents calling external LLM APIs (OpenAI, Bedrock, Anthropic) share a key trait: they’re computationally lightweight. The LLM provider does the heavy lifting. Your agent just orchestrates.
This makes Lambda ideal:
- Zero idle cost — pay only for actual requests
- Auto-scaling — handles spikes without capacity planning
- No operations — no servers to maintain
The Problem
A deployment artifact comes from three inputs:
Artifact = Build(SourceCode, Dependencies, Environment)
| Input | Description |
|---|---|
| SourceCode | Your Python files |
| Dependencies | Packages from lockfile |
| Environment | OS where pip install runs |
When you pip install on macOS, pip downloads macOS binaries. Packages with C extensions (pydantic-core, numpy, orjson) contain platform-specific code.
Copy these to Lambda’s Amazon Linux → crash.
Root cause: Environment is treated as implicit, assumed identical between dev and prod.
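You can see the mismatch without deploying anything. On a macOS machine, ask pip which wheel it would fetch for a package with C extensions; the platform tag in the filename is the tell (the output shown in comments is illustrative, the exact version and tag vary by machine):

```bash
# On the macOS dev machine: download a wheel without installing it
pip download pydantic-core --no-deps -d /tmp/wheels
ls /tmp/wheels
# pydantic_core-<version>-cp312-cp312-macosx_11_0_arm64.whl
# The "macosx" platform tag is what breaks on Lambda's Amazon Linux,
# which needs a manylinux build of the same package.
```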
The Solution: Clean Room Pattern
Make Environment a constant by building inside Lambda’s environment:
Artifact = Build(SourceCode, Dependencies, Lambda_Linux)
AWS publishes official Lambda images. Build inside them:
public.ecr.aws/lambda/python:3.12
If it builds in this container, it runs on Lambda. Guaranteed.
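You can poke around the clean room yourself. The image's default entrypoint is the Lambda bootstrap, so override it to get a plain shell, a quick sanity check:

```bash
# Open a shell inside the official Lambda Python image
docker run --rm -it --entrypoint /bin/bash public.ecr.aws/lambda/python:3.12

# Inside the container, for example:
#   cat /etc/os-release   -> the Amazon Linux release Lambda runs on
#   python --version      -> the exact Python build Lambda provides
```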
Architecture
```
┌───────────────────────────────────┐
│            AWS Lambda             │
├───────────────────────────────────┤
│      Mangum (ASGI → Lambda)       │
├───────────────────────────────────┤
│     FastAPI (HTTP Interface)      │
├───────────────────────────────────┤
│        Agent + OpenAI SDK         │
└───────────────────────────────────┘
```
| Component | Role |
|---|---|
| FastAPI | HTTP interface, request validation, OpenAPI docs |
| Mangum | Adapts ASGI to Lambda event format |
| OpenAI SDK | Unified LLM interface (supports OpenAI + Bedrock) |
Project Structure
```
my-agent/
├── app/
│   ├── __init__.py
│   ├── main.py          # FastAPI + Lambda handler
│   ├── agent.py         # Agent logic
│   └── config.py        # Configuration
├── pyproject.toml
├── uv.lock
├── build.sh
└── terraform/
    ├── main.tf
    ├── variables.tf
    ├── outputs.tf
    └── terraform.tfvars # Secrets (git-ignored)
```
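Since terraform.tfvars holds secrets and the build produces throwaway artifacts, keep both out of version control from the start; a minimal .gitignore sketch (adjust paths to taste):

```bash
cat >> .gitignore <<'EOF'
# Secrets and local Terraform state
terraform/terraform.tfvars
terraform/.terraform/
terraform/*.tfstate*
# Build artifacts
deployment.zip
package/
requirements.txt
EOF
```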
Implementation
Configuration
```python
# app/config.py
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    openai_api_key: str
    openai_base_url: str | None = None  # Set for Bedrock
    model_name: str = "gpt-4o-mini"


settings = Settings()
```
Agent
A minimal RAG pattern: replace the in-memory store with a vector database in production.
```python
# app/agent.py
from openai import OpenAI

from .config import settings

client = OpenAI(
    api_key=settings.openai_api_key,
    base_url=settings.openai_base_url or None,  # empty string falls back to the default endpoint
)

KNOWLEDGE_BASE = [
    "Founded in 2020.",
    "Pricing: Free, Pro ($29/mo), Enterprise.",
    "Support: 9am-5pm EST, Mon-Fri.",
]


def retrieve(query: str, top_k: int = 2) -> list[str]:
    return KNOWLEDGE_BASE[:top_k]


def answer(question: str) -> str:
    context = "\n".join(f"- {c}" for c in retrieve(question))
    response = client.chat.completions.create(
        model=settings.model_name,
        messages=[
            {"role": "system", "content": f"Answer based on:\n{context}"},
            {"role": "user", "content": question},
        ],
        max_tokens=500,
    )
    return response.choices[0].message.content
```
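To sanity-check the agent before wiring up HTTP, you can call answer() directly from the project root (assumes OPENAI_API_KEY is exported in your shell):

```bash
export OPENAI_API_KEY="sk-..."
uv run python -c "from app.agent import answer; print(answer('What does the Pro plan cost?'))"
```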
API
```python
# app/main.py
from fastapi import FastAPI
from mangum import Mangum
from pydantic import BaseModel

from .agent import answer

app = FastAPI(title="AI Agent")


class Query(BaseModel):
    question: str


class Response(BaseModel):
    answer: str


@app.post("/query", response_model=Response)
def query(q: Query) -> Response:
    return Response(answer=answer(q.question))


@app.get("/health")
def health():
    return {"status": "healthy"}


# Lambda entry point
handler = Mangum(app, lifespan="off")
```
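Because the handler is just a FastAPI app wrapped in Mangum, you can smoke-test it locally with uvicorn before packaging (uvicorn as a dev dependency is assumed here):

```bash
uv run uvicorn app.main:app --port 8000 &

curl -s http://localhost:8000/health
# {"status":"healthy"}

curl -s -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "What are your support hours?"}'
```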
Build Script
Implements the Clean Room Pattern:
```bash
#!/bin/bash
# build.sh
set -e

rm -rf package deployment.zip requirements.txt

# Export pinned dependencies from the lockfile
uv export --frozen --no-dev --no-editable -o requirements.txt

# Build in the Lambda environment (Clean Room).
# --platform matches Lambda's default x86_64 architecture;
# --entrypoint overrides the image's Lambda bootstrap so pip can run directly.
docker run --rm \
  --platform linux/amd64 \
  --entrypoint /bin/bash \
  -v "$(pwd)":/var/task \
  -w /var/task \
  public.ecr.aws/lambda/python:3.12 \
  -c "pip install -r requirements.txt -t package/ -q"

# Package dependencies, then add application code
cd package && zip -rq ../deployment.zip . && cd ..
zip -rq deployment.zip app/

rm -rf package requirements.txt
echo "Created deployment.zip ($(du -h deployment.zip | cut -f1))"
```
Terraform
Main configuration:
```hcl
# terraform/main.tf
terraform {
  required_providers {
    aws = { source = "hashicorp/aws", version = "~> 5.0" }
  }
}

provider "aws" {
  region = var.aws_region
}

# IAM Role
resource "aws_iam_role" "lambda" {
  name = "${var.function_name}-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action    = "sts:AssumeRole"
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "basic" {
  role       = aws_iam_role.lambda.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

# Lambda Function
resource "aws_lambda_function" "agent" {
  filename         = "${path.module}/../deployment.zip"
  function_name    = var.function_name
  role             = aws_iam_role.lambda.arn
  handler          = "app.main.handler"
  runtime          = "python3.12"
  timeout          = 30
  memory_size      = 256
  source_code_hash = filebase64sha256("${path.module}/../deployment.zip")

  environment {
    variables = {
      OPENAI_API_KEY  = var.openai_api_key
      OPENAI_BASE_URL = var.openai_base_url
      MODEL_NAME      = var.model_name
    }
  }
}

# Public URL
resource "aws_lambda_function_url" "agent" {
  function_name      = aws_lambda_function.agent.function_name
  authorization_type = "NONE"
}
```
Variables:
```hcl
# terraform/variables.tf
variable "aws_region" { default = "us-west-2" }
variable "function_name" { default = "ai-agent" }
variable "openai_api_key" { sensitive = true }
variable "openai_base_url" { default = "" }
variable "model_name" { default = "gpt-4o-mini" }
```
Outputs:
```hcl
# terraform/outputs.tf
output "endpoint" {
  value = aws_lambda_function_url.agent.function_url
}
```
Secrets (git-ignored):
```hcl
# terraform/terraform.tfvars
openai_api_key = "sk-..."
```
Deploy:
```bash
cd terraform
terraform init
terraform apply
```
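Once apply finishes, grab the Function URL from the output and hit the live endpoint (Function URLs typically end with a trailing slash, hence the ${ENDPOINT%/} trim):

```bash
ENDPOINT=$(terraform output -raw endpoint)

curl -s "${ENDPOINT%/}/health"
curl -s -X POST "${ENDPOINT%/}/query" \
  -H "Content-Type: application/json" \
  -d '{"question": "How much does the Pro plan cost?"}'
```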
Switching to Bedrock
Amazon Bedrock exposes an OpenAI-compatible endpoint, so the same OpenAI SDK works against it with a Bedrock API key and base URL.
Update terraform.tfvars:
openai_api_key = "your-bedrock-api-key"
openai_base_url = "https://bedrock-runtime.us-west-2.amazonaws.com/openai/v1"
model_name = "anthropic.claude-3-haiku-20240307-v1:0"
Add Bedrock permissions to main.tf:
resource "aws_iam_role_policy" "bedrock" {
name = "${var.function_name}-bedrock"
role = aws_iam_role.lambda.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = ["bedrock:InvokeModel"]
Resource = "*"
}]
})
}
No code changes required — same agent, different provider.
Zip vs Container
| Approach | Size Limit | Cold Start | Use When |
|---|---|---|---|
| Zip | 250 MB | Fast (~100-500ms) | API-calling agents (most cases) |
| Container | 10 GB | Slower (~500ms-2s) | Bundled ML models, heavy deps |
Typical AI agent deployment: <20 MB. Zip is the right choice.
Summary
| Step | Action |
|---|---|
| 1 | Structure code: app/ for logic, root for infra |
| 2 | Build in Clean Room: docker run with Lambda image |
| 3 | Deploy with Terraform: terraform apply |
| 4 | Switch providers: update terraform.tfvars |