Deploying an AI Agent to AWS: OpenAI Agents SDK + FastAPI + Lambda
Deploy a production-ready AI agent to AWS Lambda using OpenAI Agents SDK, FastAPI, and Terraform.
This post is a short, focused implementation summary of Pattern E (Single Agent) from my AI orchestration series.
- Full conceptual background: https://mossgreen.github.io/Booking-system-ai-orchestration/
- Full implementation: https://github.com/mossgreen/ai-orchestration-patterns/tree/main/pattern-e-single-agent
- Terraform deployment: https://github.com/mossgreen/ai-orchestration-patterns/tree/main/terraform/pattern_e
Architecture Overview
Here’s what we’re building:
┌──────────┐       ┌─────────────────┐       ┌──────────────┐
│   User   │──────▶│   API Gateway   │──────▶│    Lambda    │
└──────────┘       └─────────────────┘       │              │
                                             │  ┌────────┐  │
                                             │  │FastAPI │  │
                                             │  └────┬───┘  │
                                             │       │      │
                                             │  ┌────▼───┐  │
                                             │  │ Agent  │  │
                                             │  │  SDK   │  │
                                             │  └────┬───┘  │
                                             │       │      │
                                             │  ┌────▼─────┐│
                                             │  │ Booking  ││
                                             │  │ Service  ││
                                             │  └──────────┘│
                                             └──────────────┘
Flow:
- User sends message to API Gateway
- Gateway invokes the Lambda function (Mangum translates the event into an ASGI request)
- FastAPI routes to agent
- Agent autonomously:
- Calls check_availability if needed
- Calls book_slot if ready
- Asks clarifying questions
- Returns final response
The Code
We’ll build the agent in four layers:
- Tools - Functions the agent can call (check_availability, book_slot)
- Agent - OpenAI Agents SDK instance with tools and instructions
- FastAPI - REST API wrapper around the agent
- Lambda Handler - Mangum adapter to run FastAPI on AWS Lambda
1. Define Tools with @function_tool
The @function_tool decorator tells the agent what functions it can call:
from typing import Optional

from agents import function_tool
from shared import create_booking_service
booking_service = create_booking_service()
@function_tool
def check_availability(date: str, time: Optional[str] = None) -> str:
    """
    Check available tennis court slots for a given date.

    Args:
        date: Date in YYYY-MM-DD format (e.g., "2024-12-15")
        time: Optional specific time in HH:MM format (e.g., "14:00")

    Returns:
        Available slots or a message if none found
    """
    slots = booking_service.check_availability(date, time)
    if not slots:
        return f"No available slots found for {date}"
    result = f"Available slots for {date}:\n"
    for slot in slots:
        result += f"  - {slot.court} at {slot.time} (ID: {slot.slot_id})\n"
    return result
@function_tool
def book_slot(slot_id: str) -> str:
    """
    Book a specific tennis court slot.

    Args:
        slot_id: The slot ID from check_availability results

    Returns:
        Booking confirmation or error message
    """
    try:
        booking = booking_service.book(slot_id)
        return (
            f"Booking confirmed!\n"
            f"  Booking ID: {booking.booking_id}\n"
            f"  Court: {booking.court}\n"
            f"  Date: {booking.date}\n"
            f"  Time: {booking.time}"
        )
    except Exception as e:
        return f"Booking failed: {e}"
Key points:
- Docstrings become the agent’s understanding of what each tool does
- Return strings (agents work best with text, not complex objects)
- Type hints help the agent understand parameters
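Note that create_booking_service comes from the repo's shared package. If you want to run the tools standalone, a minimal in-memory stand-in is enough. The sketch below is hypothetical: the class, fields, and sample slot IDs are illustrative, not the repo's actual implementation; it only mirrors the attributes the tools read (slot_id, court, date, time, booking_id).

# Hypothetical in-memory stand-in for the repo's shared booking service.
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class Slot:
    slot_id: str
    court: str
    date: str
    time: str


class InMemoryBookingService:
    def __init__(self) -> None:
        # Two sample slots; IDs follow the format seen in the demo output
        self._slots = {
            s.slot_id: s
            for s in [
                Slot("2024-12-16_CourtA_1500", "Court A", "2024-12-16", "15:00"),
                Slot("2024-12-16_CourtB_1500", "Court B", "2024-12-16", "15:00"),
            ]
        }
        self._booked: set[str] = set()

    def check_availability(self, date: str, time: Optional[str] = None) -> list[Slot]:
        return [
            s
            for s in self._slots.values()
            if s.date == date
            and s.slot_id not in self._booked
            and (time is None or s.time == time)
        ]

    def book(self, slot_id: str) -> SimpleNamespace:
        if slot_id not in self._slots or slot_id in self._booked:
            raise ValueError(f"Slot {slot_id} is not available")
        self._booked.add(slot_id)
        s = self._slots[slot_id]
        return SimpleNamespace(
            booking_id=f"BK-{slot_id}", court=s.court, date=s.date, time=s.time
        )


def create_booking_service() -> InMemoryBookingService:
    return InMemoryBookingService()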
2. Create the Agent
from agents import Agent, Runner
from datetime import datetime


def get_instructions(context, agent) -> str:
    """Generate dynamic instructions with current datetime."""
    now = datetime.now()
    current_datetime = now.strftime("%Y-%m-%d %H:%M (%A)")
    return f"""You are a helpful tennis court booking assistant.

CURRENT DATETIME: {current_datetime}

WORKFLOW:
- When a user wants to book, FIRST check availability for their preferred date/time
- Present the available options clearly
- If they confirm a slot, book it using the slot_id
- Always confirm the booking details

GUIDELINES:
- Convert relative dates ("tomorrow", "next Monday") to YYYY-MM-DD format
- If no time is specified, show all available slots for that day
- Be concise but friendly

IMPORTANT: You control the conversation flow. Decide autonomously when to check availability vs when to book."""


# Create the agent
booking_agent = Agent(
    name="Tennis Court Booking Agent",
    instructions=get_instructions,
    tools=[check_availability, book_slot],
)
Why dynamic instructions? The agent needs to know the current date to convert “tomorrow” to “2024-12-16”. Using a function instead of a string keeps this fresh.
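Before wiring up an API, you can smoke-test the agent from a script. Runner.run_sync is the SDK's synchronous wrapper around Runner.run, and it needs OPENAI_API_KEY in the environment:

# Local smoke test (requires OPENAI_API_KEY to be set)
result = Runner.run_sync(booking_agent, "Is a court free tomorrow at 3pm?")
print(result.final_output)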
3. Wrap with FastAPI
from agents import Runner
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from .agent import booking_agent

app = FastAPI(title="Pattern E: Single Agent")


class ChatRequest(BaseModel):
    message: str


class ChatResponse(BaseModel):
    response: str


@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest) -> ChatResponse:
    """Send a message to the booking agent."""
    try:
        result = await Runner.run(booking_agent, request.message)
        return ChatResponse(response=result.final_output)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


@app.get("/health")
async def health() -> dict:
    return {"status": "healthy", "pattern": "E"}
Why FastAPI?
- Async-native (matches OpenAI Agents SDK)
- Auto-generates OpenAPI docs
- Works seamlessly with Mangum for Lambda
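It's worth sanity-checking the API locally before touching Lambda. Assuming src/ is importable as a package and OPENAI_API_KEY is set:

# Terminal 1: run the API locally
uvicorn src.api:app --reload --port 8000

# Terminal 2: talk to the agent
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What courts are available tomorrow?"}'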
4. Lambda Adapter
# lambda_handler.py
from mangum import Mangum
from .api import app
handler = Mangum(app, lifespan="off")
That’s it: three lines to run FastAPI on Lambda. Passing lifespan="off" disables ASGI startup/shutdown events, which don’t map onto Lambda’s per-invocation lifecycle.
AWS Deployment
Prerequisites
Required tools:
- Python 3.12+
- uv (package manager)
- Docker (for Lambda builds)
- AWS CLI configured
- Terraform 1.5+
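A quick way to confirm everything is in place:

python3 --version             # 3.12+
uv --version
docker --version
aws sts get-caller-identity   # verifies AWS credentials
terraform version             # 1.5+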
Step 1: Project Structure
pattern-e-single-agent/
├── src/
│   ├── agent.py           # Agent definition + tools
│   ├── api.py             # FastAPI wrapper
│   ├── lambda_handler.py  # Mangum adapter
│   ├── models.py          # Pydantic models
│   └── settings.py        # Configuration
├── pyproject.toml         # Dependencies
└── sequence.puml          # Architecture diagram
Step 2: Define Dependencies
pyproject.toml:
[project]
name = "pattern-e-single-agent"
requires-python = ">=3.11"
dependencies = [
    "openai-agents>=0.0.3",
    "fastapi>=0.115.0",
    "uvicorn>=0.32.0",
    "mangum>=0.19.0",
    "pydantic>=2.0.0",
    "pydantic-settings>=2.0.0",
]
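With uv, one command resolves and installs these into a local virtual environment:

uv sync   # creates .venv/ and installs everything from pyproject.toml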
Step 3: Build Lambda Package
# Build with Docker (ensures Linux compatibility)
python scripts/package_lambda.py pattern-e-single-agent
# Output: pattern-e-single-agent/dist/lambda.zip (~79MB)
Why Docker? Python packages with compiled extensions (pydantic’s Rust core, for example) must be built for Linux x86_64, Lambda’s runtime, not for macOS.
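The packaging script isn’t reproduced here, but its core idea is roughly the following; the image name and paths are illustrative, not the script’s exact contents:

# Build dependencies inside a Lambda-compatible Linux image
docker run --rm -v "$PWD:/work" -w /work \
  public.ecr.aws/sam/build-python3.12 \
  pip install . --target dist/package

# Add the application code and zip everything up
cp -r src dist/package/
(cd dist/package && zip -r ../lambda.zip .)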
Step 4: Deploy with Terraform
terraform/pattern_e/main.tf:
resource "aws_lambda_function" "main" {
function_name = "ai-patterns-pattern-e"
handler = "src.lambda_handler.handler"
runtime = "python3.12"
filename = "../../pattern-e-single-agent/dist/lambda.zip"
timeout = 60
memory_size = 512
environment {
variables = {
OPENAI_API_KEY = var.openai_api_key
}
}
}
resource "aws_apigatewayv2_api" "api" {
name = "ai-patterns-pattern-e"
protocol_type = "HTTP"
}
resource "aws_apigatewayv2_integration" "lambda" {
api_id = aws_apigatewayv2_api.api.id
integration_type = "AWS_PROXY"
integration_uri = aws_lambda_function.main.invoke_arn
}
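This excerpt omits the route, stage, and invoke permission that actually connect the API to the function. The repo’s full config includes pieces along these lines (resource names here are illustrative):

resource "aws_apigatewayv2_route" "default" {
  api_id    = aws_apigatewayv2_api.api.id
  route_key = "$default"  # catch-all route: FastAPI handles the actual routing
  target    = "integrations/${aws_apigatewayv2_integration.lambda.id}"
}

resource "aws_apigatewayv2_stage" "default" {
  api_id      = aws_apigatewayv2_api.api.id
  name        = "$default"
  auto_deploy = true
}

resource "aws_lambda_permission" "apigw" {
  statement_id  = "AllowAPIGatewayInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.main.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${aws_apigatewayv2_api.api.execution_arn}/*/*"
}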
Deploy:
cd terraform/pattern_e
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars: add your OpenAI API key
terraform init
terraform apply
Output:
api_endpoint = "https://abc123.execute-api.us-east-1.amazonaws.com"
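The test commands below use a literal example URL; you can capture your own endpoint instead:

API=$(terraform output -raw api_endpoint)
curl "$API/health"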
Step 5: Test
# Health check
curl https://abc123.execute-api.us-east-1.amazonaws.com/health

# Chat
curl -X POST https://abc123.execute-api.us-east-1.amazonaws.com/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What courts are available tomorrow at 3pm?"}'
Response:
{
  "response": "Here are the available courts for tomorrow at 3pm:\n- Court A (ID: 2024-12-16_CourtA_1500)\n- Court B (ID: 2024-12-16_CourtB_1500)\n- Court C (ID: 2024-12-16_CourtC_1500)\n\nWould you like to book one of these?"
}
When to Use This Pattern
| Use Case | Recommended? |
|---|---|
| Customer support bot (unpredictable questions) | ✅ Perfect fit |
| Booking system (check → book workflow) | ✅ Good (if users ask questions) |
| Data extraction (fixed schema) | ❌ Use function calling instead |
| Multi-step research (needs reasoning) | ✅ Perfect fit |
| Simple Q&A (no tools needed) | ❌ Overkill, use basic chat |
Rule of thumb: If you can’t write the workflow as a flowchart, use agents.
Trade-offs
Pros
| Benefit | Why It Matters |
|---|---|
| Less code | No manual loop management |
| Better UX | Agent adapts to user’s conversational style |
| Easier to extend | Add tools with @function_tool, done |
| Natural reasoning | LLM decides when to call what |
Cons
| Drawback | Impact |
|---|---|
| Less control | Can’t enforce “always check before booking” |
| Higher latency | Multiple LLM calls (reasoning loops) |
| Higher cost | More tokens per request than function calling |
| Debugging harder | Agent’s internal reasoning is opaque |
Cost Comparison
Function calling (Pattern D):
- Average: 2-3 LLM calls per booking
- ~$0.002 per request (GPT-4o-mini)
Agent (Pattern E):
- Average: 3-5 LLM calls per booking
- ~$0.004 per request (GPT-4o-mini)
When it’s worth it: User asks clarifying questions → agent’s natural flow saves engineering time.
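A back-of-envelope check of the agent figure, using GPT-4o-mini’s list prices (about $0.15 per million input tokens and $0.60 per million output tokens); the token counts are illustrative assumptions, not measurements:

# Illustrative estimate only: token counts are assumptions
calls = 4                       # reason -> tool -> reason -> final answer
input_tokens_per_call = 5_000   # instructions + history + tool schemas
output_tokens_per_call = 200

cost = calls * (
    input_tokens_per_call * 0.15 / 1e6
    + output_tokens_per_call * 0.60 / 1e6
)
print(f"${cost:.4f} per request")  # ~$0.0035, in line with the ~$0.004 figure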
Next Steps
- Try the live demo: https://ok1ro2wdf1.execute-api.us-east-1.amazonaws.com/health
- Clone the repo: https://github.com/mossgreen/ai-orchestration-patterns
- Read the blog series: https://mossgreen.github.io/Booking-system-ai-orchestration/
What’s next?
- Pattern F: Multi-Agent (Manager routes to specialists)
- Pattern G: Multi-Agent Multi-Process (Each agent = separate Lambda)
- Pattern H: AWS Bedrock Agents (Fully managed)
Conclusion
Deploying an AI agent to AWS doesn’t require complex orchestration frameworks. With OpenAI Agents SDK + FastAPI + Lambda, you get:
- Production-ready API in ~150 lines of code
- Serverless scaling (0 → 1000s RPS)
- Cold starts effectively eliminated with provisioned concurrency
The key insight: Agents aren’t magic. They’re just LLMs with autonomy over their reasoning loop. Use them when the workflow is conversational, not deterministic.
Remember: No magic. Start simple, add complexity only when needed.