Deploying an AI Agent to AWS: OpenAI Agents SDK + FastAPI + Lambda
Deploy a production-ready AI agent to AWS Lambda using OpenAI Agents SDK, FastAPI, and Terraform.
This post is a short, focused implementation summary of Pattern E (Single Agent) from my AI orchestration series.
- Full conceptual background: https://mossgreen.github.io/Booking-system-ai-orchestration/
- Full implementation: https://github.com/mossgreen/ai-orchestration-patterns/tree/main/pattern-e-single-agent
- Terraform deployment: https://github.com/mossgreen/ai-orchestration-patterns/tree/main/terraform/pattern_e
Architecture Overview
Here’s what we’re building:
┌──────────┐       ┌─────────────────┐       ┌──────────────┐
│   User   │──────▶│   API Gateway   │──────▶│    Lambda    │
└──────────┘       └─────────────────┘       │              │
                                             │  ┌────────┐  │
                                             │  │FastAPI │  │
                                             │  └────┬───┘  │
                                             │       │      │
                                             │  ┌────▼───┐  │
                                             │  │ Agent  │  │
                                             │  │  SDK   │  │
                                             │  └────┬───┘  │
                                             │       │      │
                                             │  ┌────▼─────┐│
                                             │  │ Booking  ││
                                             │  │ Service  ││
                                             │  └──────────┘│
                                             └──────────────┘
Flow:
- User sends message to API Gateway
- Gateway invokes the Lambda function (Mangum translates the event into an ASGI request)
- FastAPI routes to agent
- Agent autonomously:
- Calls check_availability if needed
- Calls book_slot if ready
- Asks clarifying questions
- Returns final response
The Code
We’ll build the agent in four layers:
- Tools - Functions the agent can call (check_availability, book_slot)
- Agent - OpenAI Agents SDK instance with tools and instructions
- FastAPI - REST API wrapper around the agent
- Lambda Handler - Mangum adapter to run FastAPI on AWS Lambda
1. Define Tools with @function_tool
The @function_tool decorator tells the agent what functions it can call:
from typing import Optional

from agents import function_tool
from shared import create_booking_service
booking_service = create_booking_service()
@function_tool
def check_availability(date: str, time: Optional[str] = None) -> str:
    """
    Check available tennis court slots for a given date.

    Args:
        date: Date in YYYY-MM-DD format (e.g., "2024-12-15")
        time: Optional specific time in HH:MM format (e.g., "14:00")

    Returns:
        Available slots or a message if none found
    """
    slots = booking_service.check_availability(date, time)
    if not slots:
        return f"No available slots found for {date}"
    result = f"Available slots for {date}:\n"
    for slot in slots:
        result += f"  - {slot.court} at {slot.time} (ID: {slot.slot_id})\n"
    return result
@function_tool
def book_slot(slot_id: str) -> str:
    """
    Book a specific tennis court slot.

    Args:
        slot_id: The slot ID from check_availability results

    Returns:
        Booking confirmation or error message
    """
    try:
        booking = booking_service.book(slot_id)
        return (
            f"Booking confirmed!\n"
            f"  Booking ID: {booking.booking_id}\n"
            f"  Court: {booking.court}\n"
            f"  Date: {booking.date}\n"
            f"  Time: {booking.time}"
        )
    except Exception as e:
        return f"Booking failed: {e}"
Key points:
- Docstrings become the agent’s understanding of what each tool does
- Return strings (agents work best with text, not complex objects)
- Type hints help the agent understand parameters
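Note that create_booking_service comes from the repo's shared package. If you want to run the tools standalone, a minimal in-memory stand-in is enough. The sketch below is hypothetical: the class, fields, and sample slot IDs are illustrative, not the repo's actual implementation; it only mirrors the attributes the tools read (slot_id, court, date, time, booking_id).

# Hypothetical in-memory stand-in for the repo's shared booking service.
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class Slot:
    slot_id: str
    court: str
    date: str
    time: str


class InMemoryBookingService:
    def __init__(self) -> None:
        # Two sample slots; IDs follow the format seen in the demo output
        self._slots = {
            s.slot_id: s
            for s in [
                Slot("2024-12-16_CourtA_1500", "Court A", "2024-12-16", "15:00"),
                Slot("2024-12-16_CourtB_1500", "Court B", "2024-12-16", "15:00"),
            ]
        }
        self._booked: set[str] = set()

    def check_availability(self, date: str, time: Optional[str] = None) -> list[Slot]:
        return [
            s
            for s in self._slots.values()
            if s.date == date
            and s.slot_id not in self._booked
            and (time is None or s.time == time)
        ]

    def book(self, slot_id: str) -> SimpleNamespace:
        if slot_id not in self._slots or slot_id in self._booked:
            raise ValueError(f"Slot {slot_id} is not available")
        self._booked.add(slot_id)
        s = self._slots[slot_id]
        return SimpleNamespace(
            booking_id=f"BK-{slot_id}", court=s.court, date=s.date, time=s.time
        )


def create_booking_service() -> InMemoryBookingService:
    return InMemoryBookingService()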
2. Create the Agent
from agents import Agent, Runner
from datetime import datetime


def get_instructions(context, agent) -> str:
    """Generate dynamic instructions with current datetime."""
    now = datetime.now()
    current_datetime = now.strftime("%Y-%m-%d %H:%M (%A)")
    return f"""You are a helpful tennis court booking assistant.

CURRENT DATETIME: {current_datetime}

WORKFLOW:
- When a user wants to book, FIRST check availability for their preferred date/time
- Present the available options clearly
- If they confirm a slot, book it using the slot_id
- Always confirm the booking details

GUIDELINES:
- Convert relative dates ("tomorrow", "next Monday") to YYYY-MM-DD format
- If no time is specified, show all available slots for that day
- Be concise but friendly

IMPORTANT: You control the conversation flow. Decide autonomously when to check availability vs when to book."""


# Create the agent
booking_agent = Agent(
    name="Tennis Court Booking Agent",
    instructions=get_instructions,
    tools=[check_availability, book_slot],
)
Why dynamic instructions? The agent needs to know the current date to convert “tomorrow” to “2024-12-16”. Using a function instead of a string keeps this fresh.
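Before wiring up an API, you can smoke-test the agent from a script. Runner.run_sync is the SDK's synchronous wrapper around Runner.run, and it needs OPENAI_API_KEY in the environment:

# Local smoke test (requires OPENAI_API_KEY to be set)
result = Runner.run_sync(booking_agent, "Is a court free tomorrow at 3pm?")
print(result.final_output)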
3. Wrap with FastAPI
from agents import Runner
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from .agent import booking_agent

app = FastAPI(title="Pattern E: Single Agent")


class ChatRequest(BaseModel):
    message: str


class ChatResponse(BaseModel):
    response: str


@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest) -> ChatResponse:
    """Send a message to the booking agent."""
    try:
        result = await Runner.run(booking_agent, request.message)
        return ChatResponse(response=result.final_output)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


@app.get("/health")
async def health() -> dict:
    return {"status": "healthy", "pattern": "E"}
Why FastAPI?
- Async-native (matches OpenAI Agents SDK)
- Auto-generates OpenAPI docs
- Works seamlessly with Mangum for Lambda
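It's worth sanity-checking the API locally before touching Lambda. Assuming src/ is importable as a package and OPENAI_API_KEY is set:

# Terminal 1: run the API locally
uvicorn src.api:app --reload --port 8000

# Terminal 2: talk to the agent
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What courts are available tomorrow?"}'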
4. Lambda Adapter
# lambda_handler.py
from mangum import Mangum
from .api import app
handler = Mangum(app, lifespan="off")
That’s it: three lines to run FastAPI on Lambda. Passing lifespan="off" disables ASGI startup/shutdown events, which don’t map onto Lambda’s per-invocation lifecycle.
AWS Deployment
Prerequisites
Required tools:
- Python 3.12+
- uv (package manager)
- Docker (for Lambda builds)
- AWS CLI configured
- Terraform 1.5+
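A quick way to confirm everything is in place:

python3 --version             # 3.12+
uv --version
docker --version
aws sts get-caller-identity   # verifies AWS credentials
terraform version             # 1.5+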
Step 1: Project Structure
pattern-e-single-agent/
├── src/
│   ├── agent.py           # Agent definition + tools
│   ├── api.py             # FastAPI wrapper
│   ├── lambda_handler.py  # Mangum adapter
│   ├── models.py          # Pydantic models
│   └── settings.py        # Configuration
├── pyproject.toml         # Dependencies
└── sequence.puml          # Architecture diagram
Step 2: Define Dependencies
pyproject.toml:
[project]
name = "pattern-e-single-agent"
requires-python = ">=3.11"
dependencies = [
    "openai-agents>=0.0.3",
    "fastapi>=0.115.0",
    "uvicorn>=0.32.0",
    "mangum>=0.19.0",
    "pydantic>=2.0.0",
    "pydantic-settings>=2.0.0",
]
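With uv, one command resolves and installs these into a local virtual environment:

uv sync   # creates .venv/ and installs everything from pyproject.toml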
Step 3: Build Lambda Package
# Build with Docker (ensures Linux compatibility)
python scripts/package_lambda.py pattern-e-single-agent
# Output: pattern-e-single-agent/dist/lambda.zip (~79MB)
Why Docker? Python packages with compiled extensions (pydantic’s Rust core, for example) must be built for Linux x86_64, Lambda’s runtime, not for macOS.
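The packaging script isn’t reproduced here, but its core idea is roughly the following; the image name and paths are illustrative, not the script’s exact contents:

# Build dependencies inside a Lambda-compatible Linux image
docker run --rm -v "$PWD:/work" -w /work \
  public.ecr.aws/sam/build-python3.12 \
  pip install . --target dist/package

# Add the application code and zip everything up
cp -r src dist/package/
(cd dist/package && zip -r ../lambda.zip .)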
Step 4: Deploy with Terraform
terraform/pattern_e/main.tf:
resource "aws_lambda_function" "main" {
function_name = "ai-patterns-pattern-e"
handler = "src.lambda_handler.handler"
runtime = "python3.12"
filename = "../../pattern-e-single-agent/dist/lambda.zip"
timeout = 60
memory_size = 512
environment {
variables = {
OPENAI_API_KEY = var.openai_api_key
}
}
}
resource "aws_apigatewayv2_api" "api" {
name = "ai-patterns-pattern-e"
protocol_type = "HTTP"
}
resource "aws_apigatewayv2_integration" "lambda" {
api_id = aws_apigatewayv2_api.api.id
integration_type = "AWS_PROXY"
integration_uri = aws_lambda_function.main.invoke_arn
}
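This excerpt omits the route, stage, and invoke permission that actually connect the API to the function. The repo’s full config includes pieces along these lines (resource names here are illustrative):

resource "aws_apigatewayv2_route" "default" {
  api_id    = aws_apigatewayv2_api.api.id
  route_key = "$default"  # catch-all route: FastAPI handles the actual routing
  target    = "integrations/${aws_apigatewayv2_integration.lambda.id}"
}

resource "aws_apigatewayv2_stage" "default" {
  api_id      = aws_apigatewayv2_api.api.id
  name        = "$default"
  auto_deploy = true
}

resource "aws_lambda_permission" "apigw" {
  statement_id  = "AllowAPIGatewayInvoke"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.main.function_name
  principal     = "apigateway.amazonaws.com"
  source_arn    = "${aws_apigatewayv2_api.api.execution_arn}/*/*"
}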
Deploy:
cd terraform/pattern_e
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars: add your OpenAI API key
terraform init
terraform apply
Output:
api_endpoint = "https://abc123.execute-api.us-east-1.amazonaws.com"
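The test commands below use a literal example URL; you can capture your own endpoint instead:

API=$(terraform output -raw api_endpoint)
curl "$API/health"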
Step 5: Test
# Health check
curl https://abc123.execute-api.us-east-1.amazonaws.com/health

# Chat
curl -X POST https://abc123.execute-api.us-east-1.amazonaws.com/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What courts are available tomorrow at 3pm?"}'
Response:
{
  "response": "Here are the available courts for tomorrow at 3pm:\n- Court A (ID: 2024-12-16_CourtA_1500)\n- Court B (ID: 2024-12-16_CourtB_1500)\n- Court C (ID: 2024-12-16_CourtC_1500)\n\nWould you like to book one of these?"
}
When to Use This Pattern
| Use Case | Recommended? |
|---|---|
| Customer support bot (unpredictable questions) | ✅ Perfect fit |
| Booking system (check → book workflow) | ✅ Good (if users ask questions) |
| Data extraction (fixed schema) | ❌ Use function calling instead |
| Multi-step research (needs reasoning) | ✅ Perfect fit |
| Simple Q&A (no tools needed) | ❌ Overkill, use basic chat |
Rule of thumb: If you can’t write the workflow as a flowchart, use agents.
Trade-offs
Pros
| Benefit | Why It Matters |
|---|---|
| Less code | No manual loop management |
| Better UX | Agent adapts to user’s conversational style |
| Easier to extend | Add tools with @function_tool, done |
| Natural reasoning | LLM decides when to call what |
Cons
| Drawback | Impact |
|---|---|
| Less control | Can’t enforce “always check before booking” |
| Higher latency | Multiple LLM calls (reasoning loops) |
| Higher cost | More tokens per request than function calling |
| Debugging harder | Agent’s internal reasoning is opaque |
Cost Comparison
Function calling (Pattern D):
- Average: 2-3 LLM calls per booking
- ~$0.002 per request (GPT-4o-mini)
Agent (Pattern E):
- Average: 3-5 LLM calls per booking
- ~$0.004 per request (GPT-4o-mini)
When it’s worth it: User asks clarifying questions → agent’s natural flow saves engineering time.
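A back-of-envelope check of the agent figure, using GPT-4o-mini’s list prices (about $0.15 per million input tokens and $0.60 per million output tokens); the token counts are illustrative assumptions, not measurements:

# Illustrative estimate only: token counts are assumptions
calls = 4                       # reason -> tool -> reason -> final answer
input_tokens_per_call = 5_000   # instructions + history + tool schemas
output_tokens_per_call = 200

cost = calls * (
    input_tokens_per_call * 0.15 / 1e6
    + output_tokens_per_call * 0.60 / 1e6
)
print(f"${cost:.4f} per request")  # ~$0.0035, in line with the ~$0.004 figure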
Next Steps
- Try the live demo: https://ok1ro2wdf1.execute-api.us-east-1.amazonaws.com/health
- Clone the repo: https://github.com/mossgreen/ai-orchestration-patterns
- Read the blog series: https://mossgreen.github.io/Booking-system-ai-orchestration/
What’s next?
- Pattern F: Multi-Agent (Manager routes to specialists)
- Pattern G: Multi-Agent Multi-Process (Each agent = separate Lambda)
- Pattern H: AWS Bedrock Agents (Fully managed)
Conclusion
Deploying an AI agent to AWS doesn’t require complex orchestration frameworks. With OpenAI Agents SDK + FastAPI + Lambda, you get:
- Production-ready API in ~150 lines of code
- Serverless scaling (0 → 1000s RPS)
- Cold starts effectively eliminated with provisioned concurrency
The key insight: Agents aren’t magic. They’re just LLMs with autonomy over their reasoning loop. Use them when the workflow is conversational, not deterministic.
Remember: No magic. Start simple, add complexity only when needed.