The Control Spectrum: 8 AI Orchestration Patterns from Full Control to Full Autonomy
AI architecture isn’t binary. It’s a spectrum.
The Control Spectrum: A New Mental Model
Most teams treat AI architecture as a binary choice: “use agents or don’t.” After implementing 8 production-ready patterns—from “AI as a service” to multi-agent orchestration—I found a better mental model: the Control Spectrum.
CONTROL ←─────────────────────────────────────→ AUTONOMY

   A          B          C          D          E         F          G
   No         Workflow   Workflow   Function   Single    Multi      Multi
   Agent      (Shared)   (Indep.)   Calling    Agent     Agent      Agent
                                                         (Shared)   (Indep.)
The trade-off: Moving right increases AI capability but decreases predictability, debuggability, and control. This post maps the entire spectrum so you can position your system correctly.
What’s inside: All 8 patterns implement the same booking system (check_availability, book) with identical OpenAI/Claude/Bedrock integrations. The difference: who decides which function to call and when.
Business impact: 40% of multi-agent projects fail due to insufficient state management and over-engineering. Choosing the right position on the spectrum means shipping faster, debugging easier, and scaling reliably.
The Use Case
A tennis court booking system with two functions:
- check_availability — Given date/time, return open slots
- book — Reserve the selected slot, return confirmation
All 8 patterns implement these same 2 functions. The difference: who decides which function to call and when.
Pattern A: AI as Service (No Agent)
Style: None — AI just generates/responds
Runtime: Shared
Architecture
User → API Gateway → Lambda → LLM → Lambda → DB → User
You control everything. The LLM is just a text utility—no decision-making. It performs discriminative tasks only: parsing, classifying, extracting. The reasoning happens in your code.
Pseudo Code
import json

from openai import OpenAI

client = OpenAI()
# Two functions: check_availability, book
def check_availability(date, time):
return db.query_available_slots(date, time)
def book(slot_id, user_id):
return db.reserve_slot(slot_id, user_id)
# Lambda handler - YOU control all logic
def handler(event):
user_input = event["body"]
session = get_session(event) # your state store
# Use LLM to parse natural language
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "Extract intent and params. Return JSON: {intent, date, time, slot_id}"},
{"role": "user", "content": user_input}
]
)
parsed = json.loads(response.choices[0].message.content)
# e.g., {intent: "check", date: "2025-12-04", time: "15:00"}
# YOU decide which function to call
if parsed["intent"] == "check":
slots = check_availability(parsed["date"], parsed["time"])
session["available_slots"] = slots
return f"Available slots: {slots}"
elif parsed["intent"] == "book":
result = book(parsed["slot_id"], session["user_id"])
return f"Booked! Confirmation: {result}"
else:
return "Please tell me if you want to check availability or book."
Key point: LLM parses text. Your code decides which function to call.
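One practical hardening step: json.loads on raw model output breaks whenever the model wraps its JSON in prose. If you're on the OpenAI API, JSON mode constrains the output. A minimal sketch of the same parse call, assuming JSON-mode support on the chosen model:

# Hedged variant of the parse call: JSON mode forces syntactically valid JSON,
# so the json.loads call below can't choke on prose-wrapped output.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Extract intent and params. Return JSON: {intent, date, time, slot_id}"},
        {"role": "user", "content": user_input},
    ],
)
parsed = json.loads(response.choices[0].message.content)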
Handling Multi-Turn Conversations
What if booking requires multiple inputs: date, time, slot?
You manage the state:
User: "Book a court for tomorrow"
↓
Lambda
├──→ LLM parse → {date: "2025-12-04", time: ?, slot: ?}
├──→ Check: missing time, slot
↓
System: "What time would you like?"
User: "3pm"
↓
Lambda
├──→ LLM parse → {time: "15:00"}
├──→ Merge state → {date: "2025-12-04", time: "15:00", slot: ?}
├──→ DB: get available slots
↓
System: "Slot A and B are available. Which one?"
User: "Slot A"
↓
Lambda
├──→ Merge state → {date: "2025-12-04", time: "15:00", slot: "A"}
├──→ All fields complete → DB book
↓
System: "Booked! Court A, Dec 4 at 3pm"
You need to:
- Store conversation state (DynamoDB, session, etc.)
- Check what’s missing after each parse
- Prompt user for missing fields
- Merge new input into existing state
This is where Pattern A gets painful — you’re coding a state machine manually, as the sketch below shows.
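A minimal sketch of that merge-check-prompt state machine (the helper and the REQUIRED_FIELDS list are illustrative, not part of the demo code):

# Hedged sketch: one turn of the manual state machine.
REQUIRED_FIELDS = ["date", "time", "slot"]

def advance_booking(session: dict, parsed: dict) -> str:
    # Merge newly parsed fields into the stored conversation state.
    session.update({k: v for k, v in parsed.items() if v is not None})

    # Find what's still missing and prompt for exactly that.
    missing = [f for f in REQUIRED_FIELDS if not session.get(f)]
    if "date" in missing or "time" in missing:
        return f"What {' and '.join(m for m in missing if m != 'slot')} would you like?"
    if "slot" in missing:
        slots = db.query_available_slots(session["date"], session["time"])
        session["available_slots"] = slots
        return f"Available slots: {slots}. Which one?"

    # All fields present: book and reset for the next conversation.
    confirmation = db.reserve_slot(session["slot"], session["user_id"])
    session.clear()
    return f"Booked! Confirmation: {confirmation}"

Every branch here is code you write and maintain; the patterns below push that bookkeeping into a workflow engine or an agent loop.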
Patterns B–G handle this more naturally.
Pros
- Full control
- Predictable behavior
- Easy to debug
Cons
- Rigid — every flow must be coded
- No reasoning capability
- Multi-turn conversations require manual state management
When to Use
- Fixed, predictable workflows
- AI only needed for text parsing/formatting
- You want full control over logic
- Single-turn or simple interactions
Pattern B: Workflow (Shared Runtime)
Pattern B introduces a workflow engine that explicitly controls step sequencing and state transitions. The application predefines the steps, while the workflow engine manages how they execute within a shared runtime.
Style: Workflow — Predefined sequence of steps
Runtime: Shared — all steps run in one process
Architecture
User → Step 1 → Step 2 → Step 3 → Response
          │        │        │
          ↓        ↓        ↓
         LLM      LLM      LLM
        (any)    (any)    (any)
Steps execute in a predefined order. No dynamic routing — the sequence is fixed. Each step can use any LLM vendor for its specific task.
What Can a Step Be?
A “step” isn’t just an LLM call. Steps can be anything:
| Type | What It Does | Example |
|---|---|---|
| LLM call | Reasoning, parsing, generation | Parse intent, summarize, classify |
| API call | External service | Payment gateway, weather API |
| Database op | Read/write data | Check availability, save booking |
| Validation | Check rules | Is date in future? Is slot valid? |
| Transformation | Convert format | JSON → XML, normalize data |
| Notification | Alert someone | Send email, SMS, Slack |
| Human-in-the-loop | Wait for approval | Manager approval for large bookings |
A more complex booking workflow might look like:
Parse (LLM) → Validate (code) → Check (DB) → Select (LLM) → Book (DB) → Notify (API)
Not every step needs AI. Many are pure code, database queries, or API calls. The power of workflows is mixing AI and traditional code in a predictable sequence. For this demo, we keep the workflow compact (five steps in the pseudo code below); a sketch of a pure-code step follows.
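To make the “not every step needs AI” point concrete, here’s what a pure-code validation step might look like (a sketch; the helper name is ours, not part of the demo):

from datetime import date, datetime

# A pure-code workflow step: no LLM call, just a business rule.
def validate_request(parsed: dict) -> dict:
    requested = datetime.strptime(parsed["date"], "%Y-%m-%d").date()
    if requested < date.today():
        raise ValueError("Booking date must be today or in the future")
    return parsed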
Difference from AI as Service (Pattern A)
| Pattern A (AI as Service) | Pattern B (Workflow) |
|---|---|
| Single LLM call for parsing | Multiple steps, each can use LLM |
| You code the state machine | Steps are clearly separated |
| All logic intertwined | Each step is isolated and testable |
Pseudo Code
import json

from openai import OpenAI
import anthropic
openai_client = OpenAI()
claude_client = anthropic.Anthropic()
# Step 1: Parse input (using OpenAI)
def parse_input(user_input: str) -> dict:
response = openai_client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": """
Extract booking details from user input.
Return JSON: {date, time, preferences}
If information is missing, set as null.
"""},
{"role": "user", "content": user_input}
]
)
return json.loads(response.choices[0].message.content)
# Step 2: Check availability (direct DB call)
def get_availability(parsed: dict) -> list:
slots = db.query_slots(parsed["date"], parsed.get("time"))
return slots
# Step 3: Select best slot (using Claude)
def select_slot(slots: list, preferences: dict) -> dict:
response = claude_client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
messages=[{
"role": "user",
"content": f"Select the best slot based on preferences. Slots: {slots}, Preferences: {preferences}. Return JSON: "
}]
)
return json.loads(response.content[0].text)
# Step 4: Make booking (direct DB call)
def make_booking(slot_id: str, user_id: str) -> dict:
return db.reserve(slot_id, user_id)
# Step 5: Generate confirmation (using OpenAI)
def generate_confirmation(booking: dict) -> str:
response = openai_client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "Generate a friendly booking confirmation message."},
{"role": "user", "content": f"Booking details: {booking}"}
]
)
return response.choices[0].message.content
# Workflow: Fixed sequence
def booking_workflow(user_input: str, user_id: str) -> str:
# Step 1: Parse (OpenAI)
parsed = parse_input(user_input)
# Step 2: Check availability (DB)
slots = get_availability(parsed)
if not slots:
return "Sorry, no slots available for that date/time."
# Step 3: Select best slot (Claude)
selection = select_slot(slots, parsed.get("preferences", {}))
# Step 4: Book (DB)
booking = make_booking(selection["slot_id"], user_id)
# Step 5: Confirm (OpenAI)
return generate_confirmation(booking)
# Run
response = booking_workflow("Book me a court for tomorrow at 3pm", "user-123")
Pros
- Predictable execution flow
- Easy to debug (fixed sequence)
- Each step is isolated and testable
- Simple to understand
- Custom logic between steps
- Can use multiple AI vendors in same workflow
Cons
- Inflexible — can’t skip steps
- May be inefficient for simple queries
- Must handle all cases in predefined flow
When to Use
- Well-defined, sequential processes
- Compliance/audit requirements (need to know exact flow)
- Each step has clear input/output
- Predictability over flexibility
Pattern C: Workflow (Independent Runtime)
Style: Workflow — Predefined sequence of steps
Runtime: Independent — each step runs in its own service
Architecture
User → Service 1 → Service 2 → Service 3 → Response
           │           │           │
           ↓           ↓           ↓
        Agent A     Agent B     Agent C
               (any vendor)
Same predefined sequence as Pattern B, but each step runs in its own service (Lambda, container, etc.). Enables independent deployment and scaling.
Difference from Pattern B
| Pattern B (Shared Runtime) | Pattern C (Independent Runtime) |
|---|---|
| All steps in one process | Each step in its own service |
| Deploy together | Deploy independently |
| Shared memory | Pass data via events/API |
| Fast | Network latency |
| Single failure point | Step failure is isolated |
Pseudo Code
import json

import anthropic
import boto3
from openai import OpenAI

# Service 1: Parse Input (using OpenAI)
# Deployed as Lambda, container, or separate service
def parse_service_handler(event):
user_input = event["input"]
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "Extract booking details. Return JSON: {date, time, preferences}"},
{"role": "user", "content": user_input}
]
)
return {"parsed": json.loads(response.choices[0].message.content)}
# Service 2: Check Availability (using Claude)
# Deployed separately
def availability_service_handler(event):
parsed = event["parsed"]
client = anthropic.Anthropic()
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,  # required parameter in the Anthropic Messages API
messages=[{
"role": "user",
"content": f"Check availability for: {parsed}"
}],
tools=[{
"name": "check_availability",
"description": "Check available slots for date/time",
"input_schema": {
"type": "object",
"properties": {
"date": {"type": "string"},
"time": {"type": "string"}
},
"required": ["date"]
}
}]
)
    # Execute tool and return (find the tool_use block rather than assuming its index)
    if response.stop_reason == "tool_use":
        tool_use = next(b for b in response.content if b.type == "tool_use")
        slots = db.query_slots(tool_use.input["date"], tool_use.input.get("time"))
        return {"available_slots": slots}
    return {"available_slots": []}
# Service 3: Book Slot (using Bedrock)
# Deployed separately
def booking_service_handler(event):
slots = event["available_slots"]
user_id = event["user_id"]
if not slots:
return {"error": "No slots available"}
# Use Bedrock to select best slot
bedrock = boto3.client("bedrock-runtime")
response = bedrock.invoke_model(
modelId="anthropic.claude-3-sonnet-20240229-v1:0",
body=json.dumps({
"anthropic_version": "bedrock-2023-05-31",
"max_tokens": 1024,
"messages": [{
"role": "user",
"content": f"Select the best slot from: {slots}. Return JSON: "
}]
})
)
result = json.loads(response["body"].read())
selected = json.loads(result["content"][0]["text"])
# Execute booking
booking = db.reserve(selected["slot_id"], user_id)
return {"confirmation": booking}
# Orchestrator (Step Functions, or simple coordinator service)
def workflow_orchestrator(user_input: str, user_id: str) -> str:
# Step 1: Call parse service
parsed = invoke_service("parse-service", {"input": user_input})
# Step 2: Call availability service
availability = invoke_service("availability-service", parsed)
# Step 3: Call booking service
result = invoke_service("booking-service", {**availability, "user_id": user_id})
return result["confirmation"]
Pros
- Step failure doesn’t crash the whole flow
- Can deploy/update steps independently
- Mix AI vendors freely per step
- Better for large teams (each team owns a step)
- Custom pre/post processing per step
- Easier to debug (isolate which step failed)
Cons
- More infrastructure to manage
- Network latency between steps
- Data passing overhead
- More complex deployment and monitoring
When to Use
- Steps have different scaling requirements
- Want independent deployment per step
- Large team with ownership boundaries
- Compliance requires step-level isolation
Pattern D: Function Calling (You Control the Loop)
Style: Function Call — LLM suggests, YOU execute and control loop
Runtime: Shared
Architecture
User → Your Code → OpenAI SDK → [suggests function] → Your Code → DB
↑______________________ you decide next step ___________|
OpenAI SDK suggests which function to call. You execute it and decide what happens next.
Difference from Workflow (Pattern B/C)
| Workflow (B, C) | Function Calling (D) |
|---|---|
| You define the sequence | LLM suggests which function |
| Fixed steps, always same order | Dynamic based on context |
| Predictable | More flexible |
| No loop | Loop until LLM says “done” |
How the Loop Works
User: "Book a court for tomorrow at 3pm"
Loop 1:
┌─────────────────────────────────────────────────────────────┐
│ messages = [{role: "user", content: "Book a court..."}]     │
│            ↓                                                 │
│ OpenAI SDK (with tools defined)                              │
│            ↓                                                 │
│ Response: tool_calls = [{name: "check_availability",         │
│                          args: {date: "2025-12-04"}}]        │
│            ↓                                                 │
│ Has tool_calls? YES → YOU execute check_availability()       │
│            ↓                                                 │
│ Append to messages:                                          │
│   - assistant msg (with tool_call)                           │
│   - tool result: [{slot_id: "A", time: "3pm"}, ...]          │
│            ↓                                                 │
│ Continue loop                                                │
└─────────────────────────────────────────────────────────────┘

Loop 2:
┌─────────────────────────────────────────────────────────────┐
│ messages = [user msg, assistant tool_call, tool result]      │
│            ↓                                                 │
│ OpenAI SDK (sees availability result)                        │
│            ↓                                                 │
│ Response: tool_calls = [{name: "book_slot",                  │
│                          args: {slot_id: "A"}}]              │
│            ↓                                                 │
│ Has tool_calls? YES → YOU execute book_slot()                │
│            ↓                                                 │
│ Append to messages:                                          │
│   - assistant msg (with tool_call)                           │
│   - tool result: {confirmation: "Booked!"}                   │
│            ↓                                                 │
│ Continue loop                                                │
└─────────────────────────────────────────────────────────────┘

Loop 3:
┌─────────────────────────────────────────────────────────────┐
│ messages = [user, tool_call, result, tool_call, result]      │
│            ↓                                                 │
│ OpenAI SDK (sees booking confirmed)                          │
│            ↓                                                 │
│ Response: tool_calls = None                                  │
│           content = "Your court is booked for..."            │
│            ↓                                                 │
│ Has tool_calls? NO → return content → EXIT LOOP              │
└─────────────────────────────────────────────────────────────┘
Who controls what:
| What | Who |
|---|---|
| Which function to call | LLM suggests |
| Actually calling the function | You |
| Continue or stop loop | You |
| What to do with result | You |
Pseudo Code
import json

from openai import OpenAI

client = OpenAI()
# Your functions - direct DB calls
def check_availability(date: str, time: str = None) -> dict:
return db.query_slots(date, time)
def book_slot(slot_id: str, user_id: str) -> dict:
return db.reserve(slot_id, user_id)
# YOU control the loop
def handle_booking_request(user_input: str, user_id: str) -> str:
messages = [{"role": "user", "content": user_input}]
tools = [
{
"type": "function",
"function": {
"name": "check_availability",
"description": "Check available slots",
"parameters": {
"type": "object",
"properties": {
"date": {"type": "string"},
"time": {"type": "string"}
},
"required": ["date"]
}
}
},
{
"type": "function",
"function": {
"name": "book_slot",
"description": "Book a slot",
"parameters": {
"type": "object",
"properties": {
"slot_id": {"type": "string"}
},
"required": ["slot_id"]
}
}
}
]
# Loop controlled by YOU
while True:
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=messages,
tools=tools
)
msg = response.choices[0].message
# No function call? Done.
if not msg.tool_calls:
return msg.content
# Process each tool call
messages.append(msg)
for tool_call in msg.tool_calls:
fn_name = tool_call.function.name
args = json.loads(tool_call.function.arguments)
# YOU execute the function directly
if fn_name == "check_availability":
result = check_availability(args["date"], args.get("time"))
elif fn_name == "book_slot":
result = book_slot(args["slot_id"], user_id)
else:
result = {"error": "Unknown function"}
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(result)
})
# Loop continues until LLM returns no tool_calls
Pros
- More flexible than fixed workflows
- LLM can adapt to different user intents
- Can add validation/logging between steps (see the sketch at the end of this pattern)
- You still control execution
Cons
- Less predictable than workflows
- More code to write
- You manage the loop logic
When to Use
- User intents vary and can’t be fixed to one sequence
- Need flexibility but want to keep control
- Want to add custom validation/logic per step
- Building vendor-agnostic solution
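Owning the loop is the whole point of Pattern D: you can veto or gate a suggested call before executing it. A hedged sketch of such a guard inside the loop (the session-based confirmation flow is illustrative, not part of the demo code):

# Inside the tool-call loop, before executing book_slot:
# gate destructive actions behind an explicit user confirmation.
if fn_name == "book_slot" and not session.get("confirmed"):
    session["pending_booking"] = args["slot_id"]  # stash the suggested call
    return f"Confirm booking slot {args['slot_id']}? (yes/no)"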
Pattern E: Single Agent
Style: Agent — Autonomous reasoning + execution
Runtime: Shared
Architecture
User → Agent → [Reasons + Acts autonomously] → DB
↑_____________ loops until done _______|
The agent manages the loop autonomously. You define tools and instructions; it decides what to do and when to stop.
This pattern uses the OpenAI Agents SDK (not the basic OpenAI SDK used in Patterns A–D).
Difference from Function Calling (Pattern D)
| Pattern D (Function Calling) | Pattern E (Single Agent) |
|---|---|
| You control the loop | Agent controls the loop |
| You decide when to stop | Agent decides when done |
| More control | More autonomous |
| openai library | openai-agents library |
Difference from Workflow (Pattern B/C)
| Workflow (B, C) | Single Agent (E) |
|---|---|
| Fixed step sequence | Agent decides order |
| Always runs all steps | May skip steps |
| Predictable | Flexible |
| You define flow | Agent reasons about flow |
Pseudo Code
from agents import Agent, Runner, function_tool
# Define tools using decorators
@function_tool
def check_availability(date: str, time: str = None) -> dict:
"""Check available tennis court slots for a given date and optional time."""
return db.query_slots(date, time)
@function_tool
def book_slot(slot_id: str, user_id: str) -> dict:
"""Book a specific tennis court slot."""
return db.reserve(slot_id, user_id)
# Create agent with tools
agent = Agent(
name="BookingAgent",
instructions="""
You help users book tennis courts.
When a user wants to book:
1. First check availability for their requested date/time
2. Present available options
3. Book their chosen slot
4. Confirm the booking
Always be helpful and confirm details before booking.
""",
tools=[check_availability, book_slot]
)
# Run - Agent handles the loop autonomously
result = Runner.run_sync(agent, "Book me a court for tomorrow at 3pm")  # run_sync: blocking variant (Runner.run is async)
print(result.final_output)
# Agent autonomously: reasons → calls tools → loops → responds
How it works internally
User: "Book me a court for tomorrow at 3pm"
↓
Agent receives input
↓
┌──────────────────────────────────┐
│ Agent Loop │
│ ┌─────────────────────────────┐ │
│ │ 1. Reason: "Need to check │ │
│ │ availability first" │ │
│ │ 2. Call: check_availability │ │
│ │ 3. Observe: slots A, B, C │ │
│ │ 4. Reason: "Should book │ │
│ │ slot A at 3pm" │ │
│ │ 5. Call: book_slot │ │
│ │ 6. Observe: confirmed │ │
│ │ 7. Reason: "Done, respond" │ │
│ └─────────────────────────────┘ │
└──────────────────────────────────┘
↓
"Your court is booked! Court A,
tomorrow at 3pm. Confirmation #123"
Pros
- Clean, minimal code
- Agent handles complexity
- Good balance of power and simplicity
- Handles multi-turn naturally (see the sketch at the end of this pattern)
Cons
- Less control than Pattern D
- Depends on agent framework behavior
- Less predictable execution path
When to Use
- Want agent capabilities without managing loops
- Trust the agent framework to handle execution
- Rapid prototyping
- Simple to moderately complex tasks
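For multi-turn use, the Agents SDK can carry conversation state between runs. A sketch, assuming the openai-agents Runner.run_sync and result.to_input_list() helpers behave as documented:

# Turn 1
result = Runner.run_sync(agent, "What courts are free tomorrow at 3pm?")
print(result.final_output)

# Turn 2: replay the prior messages plus the new user input.
next_input = result.to_input_list() + [{"role": "user", "content": "Book slot A"}]
result = Runner.run_sync(agent, next_input)
print(result.final_output)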
Pattern F: Multi-Agent (Shared Runtime)
Style: Multi-Agent — Manager routes dynamically to specialists
Runtime: Shared — all agents run in one process
Architecture
User → Manager Agent → [Decides which specialist]
                 ↓
     ┌───────────┴───────────┐
     ↓                       ↓
Availability Agent     Booking Agent
     ↓                       ↓
     DB                      DB
Manager dynamically decides which specialist to call based on user input. All agents run in the same process.
Difference from Single Agent (Pattern E)
| Single Agent (E) | Multi-Agent (F) |
|---|---|
| One agent, multiple tools | Multiple specialized agents |
| Agent does everything | Agents have focused domains |
| Simpler | Better separation of concerns |
Difference from Workflow (Pattern B/C)
| Workflow (B, C) | Multi-Agent (F, G) |
|---|---|
| Fixed: Step 1 → 2 → 3 | Dynamic: Manager decides |
| Always runs all steps | May skip agents |
| Predictable | Flexible |
Pseudo Code
from agents import Agent, Runner, function_tool
# --- Tool definitions ---
@function_tool
def check_availability(date: str, time: str = None) -> dict:
"""Check available tennis court slots."""
return db.query_slots(date, time)
@function_tool
def book_slot(slot_id: str, user_id: str) -> dict:
"""Book a specific slot."""
return db.reserve(slot_id, user_id)
# --- Specialist Agents ---
availability_agent = Agent(
name="AvailabilityAgent",
instructions="""
You are a specialist in checking tennis court availability.
Use the check_availability tool to find open slots.
Return a clear summary of available options.
""",
tools=[check_availability]
)
booking_agent = Agent(
name="BookingAgent",
instructions="""
You are a specialist in booking tennis courts.
Use the book_slot tool to reserve courts.
Always confirm the booking details.
""",
tools=[book_slot]
)
# --- Handoff functions ---
@function_tool
def handoff_to_availability(task: str) -> str:
"""Delegate to availability specialist for checking open slots."""
    result = Runner.run_sync(availability_agent, task)  # sync variant; Runner.run is async
return result.final_output
@function_tool
def handoff_to_booking(task: str) -> str:
"""Delegate to booking specialist for reserving a slot."""
    result = Runner.run_sync(booking_agent, task)
return result.final_output
# --- Manager Agent ---
manager_agent = Agent(
name="ManagerAgent",
instructions="""
You are a manager that routes user requests to specialists.
Available specialists:
- Availability specialist: for checking open slots
- Booking specialist: for reserving slots
For a complete booking:
1. First handoff to availability specialist
2. Then handoff to booking specialist
Synthesize responses before returning to user.
""",
tools=[handoff_to_availability, handoff_to_booking]
)
# --- Run ---
result = Runner.run_sync(manager_agent, "Book me a court for tomorrow at 3pm")
print(result.final_output)
# Manager: analyzes → hands off to availability → hands off to booking → responds
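The manager above wraps its specialists as tools, which keeps the mechanics explicit. The openai-agents SDK also ships a native handoff mechanism; a hedged sketch, assuming the documented handoffs parameter:

# Alternative wiring using the SDK's built-in handoffs.
manager_agent = Agent(
    name="ManagerAgent",
    instructions="Route availability questions and booking requests to the right specialist.",
    handoffs=[availability_agent, booking_agent],
)

result = Runner.run_sync(manager_agent, "Book me a court for tomorrow at 3pm")

Note the semantics differ: a native handoff transfers the conversation to the specialist, while the tool-wrapping approach always returns control to the manager for synthesis.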
Pros
- Flexible routing based on user intent
- Specialists can be optimized per domain
- Manager handles complex multi-step requests
- Single codebase, easy debugging
Cons
- Less predictable than workflow
- One crash affects all agents
- Single process limits
When to Use
- User requests vary significantly
- Need dynamic decision-making
- Want simple deployment
- Moderate complexity
Pattern G: Multi-Agent (Independent Runtime)
Style: Multi-Agent — Manager routes dynamically to specialists
Runtime: Independent — each agent runs in its own service
Architecture
User → Service C (Manager Agent) → [Routes dynamically]
                 ↓
     ┌───────────┴───────────┐
     ↓                       ↓
 Service A               Service B
 (Availability Agent)    (Booking Agent)
     ↓                       ↓
 Agent logic             Agent logic
 (any vendor)            (any vendor)
     ↓                       ↓
     DB                      DB
Three independent services: Manager receives user requests and routes to specialists. Each service wraps its own agent with full isolation.
Difference from Pattern F
| Pattern F (Shared Runtime) | Pattern G (Independent Runtime) |
|---|---|
| All agents in one process | Each agent in its own service |
| Single vendor typically | Mix vendors freely |
| Shared memory | Pass data via API |
| Fast | Network latency |
| One crash affects all | Failures are isolated |
Pseudo Code
import anthropic
from agents import Agent, Runner, function_tool
# --- Service A: Availability Agent (uses OpenAI) ---
# Deployed as separate service
def availability_service_handler(event):
task = event["task"]
# Pre-processing (custom logic)
task = sanitize_input(task)
@function_tool
def check_availability(date: str, time: str = None) -> dict:
"""Check available slots."""
return db.query_slots(date, time)
agent = Agent(
name="AvailabilityAgent",
instructions="Check tennis court availability. Return available slots.",
tools=[check_availability]
)
    result = Runner.run_sync(agent, task)
# Post-processing (custom logic)
return format_response(result.final_output)
# --- Service B: Booking Agent (uses Claude) ---
# Deployed as separate service
def booking_service_handler(event):
task = event["task"]
user_id = event["user_id"]
client = anthropic.Anthropic()
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=1024,
system="You book tennis court slots. Extract slot_id and confirm booking.",
messages=[{"role": "user", "content": task}],
tools=[{
"name": "book_slot",
"description": "Reserve a tennis court slot",
"input_schema": {
"type": "object",
"properties": {
"slot_id": {"type": "string"}
},
"required": ["slot_id"]
}
}]
)
    # Execute tool if called (find the tool_use block rather than assuming its index)
    if response.stop_reason == "tool_use":
        tool_use = next(b for b in response.content if b.type == "tool_use")
        booking = db.reserve(tool_use.input["slot_id"], user_id)
        return {"confirmation": booking}
    return {"message": response.content[0].text}
# --- Service C: Manager Agent (Entry Point) ---
# Deployed as Lambda - receives user requests and routes to specialists
def manager_service_handler(event):
user_input = event["input"]
user_id = event["user_id"]
@function_tool
def invoke_availability_agent(task: str) -> str:
"""Delegate to availability service for checking slots."""
response = invoke_service("availability-service", {"task": task})
return response
@function_tool
def invoke_booking_agent(task: str) -> str:
"""Delegate to booking service for reserving a slot."""
# user_id captured from handler scope
response = invoke_service("booking-service", {"task": task, "user_id": user_id})
return response
manager = Agent(
name="ManagerAgent",
instructions="""
Route user requests to specialist services:
- Checking availability → invoke_availability_agent
- Making a reservation → invoke_booking_agent
For a complete booking:
1. First call availability agent
2. Then call booking agent with the chosen slot
Synthesize responses before returning to user.
""",
tools=[invoke_availability_agent, invoke_booking_agent]
)
    result = Runner.run_sync(manager, user_input)
return {"response": result.final_output}
# Invocation: API Gateway → Manager Lambda → Specialist Lambdas
# invoke_service("manager-service", {"input": "Book me a court for tomorrow", "user_id": "user-123"})
Pros
- Mix AI vendors per agent (OpenAI, Claude, Bedrock, Mistral)
- Full isolation (one agent fails independently)
- Custom pre/post processing per agent
- Independent deployment and scaling
Cons
- Most complex to build
- Network latency
- More infrastructure to manage
- Higher operational overhead
When to Use
- Need to mix AI vendors per domain
- Strict isolation required (compliance, security)
- Different agents need different resources
- Enterprise / production systems
Pattern H: Bedrock Agent (AWS Managed)
Style: Agent — AWS-managed reasoning + action loop
Runtime: Managed — AWS handles everything
This pattern is an AWS-native alternative to Pattern E (Single Agent). Instead of managing the agent yourself, AWS Bedrock handles everything.
Architecture
User → Bedrock Agent → [Decides] → Lambda (Action Group) → DB
↑___________ observes result ___________|
Bedrock Agent reasons about what to do, picks actions, executes, and loops until done.
Pseudo Code
import boto3
# Agent definition (configured in Bedrock console or via API)
agent_config = {
"agentName": "TennisBookingAgent",
"instruction": """
You help users book tennis courts.
When a user wants to book:
1. Check availability for their requested date/time
2. Present options
3. Book their chosen slot
4. Confirm the booking
""",
"foundationModel": "anthropic.claude-3-sonnet-20240229-v1:0",
"actionGroups": [
{
"actionGroupName": "BookingActions",
"actionGroupExecutor": {
"lambda": "arn:aws:lambda:...:booking-handler"
},
"apiSchema": {
"actions": [
{
"name": "check_availability",
"description": "Check available tennis court slots",
"parameters": {
"date": {"type": "string", "required": True},
"time": {"type": "string", "required": False}
}
},
{
"name": "book_slot",
"description": "Book a specific slot",
"parameters": {
"slot_id": {"type": "string", "required": True},
"user_id": {"type": "string", "required": True}
}
}
]
}
}
]
}
# Lambda handles the actual DB work.
# (Event shape simplified for illustration; a real Bedrock action-group event
# identifies the action via an OpenAPI path and passes parameters as a list.)
def booking_handler(event):
action = event["actionGroup"]["name"]
params = event["parameters"]
if action == "check_availability":
return db.query_slots(params["date"], params.get("time"))
elif action == "book_slot":
return db.reserve(params["slot_id"], params["user_id"])
# Invocation - agent handles the rest
bedrock_agent = boto3.client("bedrock-agent-runtime")
response = bedrock_agent.invoke_agent(
agentId="your-agent-id",
agentAliasId="your-alias-id",
sessionId="user-session-123",
inputText="Book me a court for tomorrow at 3pm"
)
# Agent autonomously: checks availability → picks slot → books → confirms
for event in response["completion"]:
if "chunk" in event:
print(event["chunk"]["bytes"].decode())
Pros
- Fully managed — AWS handles scaling, reasoning loop
- Built-in session management
- Integrates with AWS ecosystem (CloudWatch, IAM, etc.)
- Knowledge bases and guardrails available
- No agent framework code to maintain
Cons
- AWS vendor lock-in
- Less control over agent behavior
- Debugging through AWS console
- Latency can be higher
- Limited customization of agent loop
When to Use
- Already AWS-native infrastructure
- Want fully managed solution
- Need built-in AWS integrations
- Team familiar with AWS services
Comparison with Pattern E
| Aspect | Pattern E (Single Agent) | Pattern H (Bedrock) |
|---|---|---|
| Control | You own the code | AWS manages |
| Vendor | Any (OpenAI, Claude SDK, etc.) | AWS only |
| Debugging | Your logs, your tools | AWS Console/CloudWatch |
| Scaling | You manage | AWS manages |
| Cost model | Pay per API call | Pay per agent invocation |
| Customization | Full control | Limited to Bedrock features |
Side-by-Side Comparison
| Pattern | Style | Who Decides Flow | Runtime | Complexity |
|---|---|---|---|---|
| A | No Agent | You | Shared | Low |
| B | Workflow | You (fixed steps) | Shared | Medium |
| C | Workflow | You (fixed steps) | Independent | Medium-High |
| D | Function Call | LLM suggests, you execute | Shared | Medium |
| E | Single Agent | Agent | Shared | Low |
| F | Multi-Agent | Manager Agent | Shared | Medium |
| G | Multi-Agent | Manager Agent | Independent | High |
| H | Bedrock Agent | AWS | Managed | Low-Medium |
Runtime explained:
- Shared — All runs together in one process
- Independent — Each step/agent runs in its own service
- Managed — Cloud provider handles it
Decision Guide
Do you need AI to make decisions (not just parse)?
│
No → Pattern A (AI as Service)
│
Yes
│
Do you want AWS to manage everything? → Yes → Pattern H (Bedrock Agent)
│
No
│
Is the flow predictable (fixed sequence)?
│
Yes → Need independent scaling/deployment? → No → Pattern B (Workflow, Shared)
│                                          → Yes → Pattern C (Workflow, Independent)
│
No (dynamic flow needed)
│
Do you want to control the loop yourself?
│
Yes → Pattern D (Function Calling)
│
No (let agent handle it)
│
Do you need multiple specialized agents?
│
No → Pattern E (Single Agent)
│
Yes → Need independent scaling/deployment? → No → Pattern F (Multi-Agent, Shared)
                                           → Yes → Pattern G (Multi-Agent, Independent)
Quick Reference
| If you need… | Use Pattern |
|---|---|
| Full control, AI just parses | A |
| Fixed steps, shared runtime | B |
| Fixed steps, independent runtime | C |
| LLM suggests functions, you control loop | D |
| Autonomous agent, minimal code | E |
| Dynamic routing, shared runtime | F |
| Dynamic routing, independent runtime | G |
| AWS-managed agent | H |
The Spectrum
Control ←————————————————————————————————→ Autonomy

   A          B          C          D          E          F          G
   │          │          │          │          │          │          │
   No         Workflow   Workflow   Function   Single     Multi      Multi
   Agent      (Shared)   (Indep.)   Calling    Agent      Agent      Agent
   │          │          │          │          │          │          │
   You        Fixed      Fixed      LLM        Agent      Manager    Manager
   control    steps      steps      suggests,  controls   routes     routes
   all        (shared)   (indep.)   you        loop       (shared)   (indep.)
                                    control
                                    loop

                               H
                               │
                            Bedrock
                         (AWS Managed)
Conclusion
There’s no silver bullet. The right pattern depends on:
- How much control do you need?
- Is the flow predictable or dynamic?
- Do you need independent scaling/deployment?
- How complex is your system?
- What’s your tolerance for unpredictability?
The Workflow Sweet Spot
Patterns B and C (Workflow) occupy a unique middle ground:
| What you get | Comparable to |
|---|---|
| Deterministic step order | Like A (AI as Service) |
| AI reasoning within each step | Like E (Single Agent) |
| Custom logic between steps | Unique to Workflow |
When you need predictable sequences but still want AI flexibility within each step, Workflow patterns are your answer.
This is why many production systems start with Workflow (B/C) rather than jumping straight to autonomous agents (D/E/F/G) — you get AI power with predictable behavior.
How AI’s Role Evolves
Notice how AI’s job changes across patterns:
| Pattern | AI Task |
|---|---|
| A | Discriminative only — parse, classify, extract |
| B–H | Discriminative + Generative — reason, plan, respond |
In Pattern A, you could theoretically replace the LLM with a simpler NLU tool (though multilingual input makes an LLM worthwhile). The AI just converts messy input into structured data.
In Patterns B–H, the AI must think:
- “What’s missing? I should ask.”
- “Two slots available. I should present options.”
- “Booking failed. I should explain and suggest alternatives.”
This shift from parsing to reasoning is why agent patterns feel more powerful — but also less predictable.
Progression Path
Start simple, evolve as needed:
A (No Agent)
↓ need multi-step with AI
B (Workflow, Shared) — fixed steps, simple deployment
C (Workflow, Independent) — fixed steps, need scaling/isolation
↓ need dynamic flow
D (Function Calling) — LLM suggests, you control loop
E (Single Agent) — agent controls the loop
↓ need specialized agents
F (Multi-Agent, Shared) — manager routes, simple deployment
G (Multi-Agent, Independent) — enterprise scale, full isolation
H (Bedrock) — AWS alternative to E
Start simple, add complexity only when the problem demands it.