
Subagent Failure Modes: Loops, Drift, and Recovery Patterns

Five ways subagents fail in production and the eval suite to catch them before clients do. Concrete remediation playbooks for Sydney AI teams.

The PADISO Team · 2026-05-09


Table of Contents

  1. Introduction: Why Subagent Failures Matter
  2. Failure Mode 1: Infinite Loops and Runaway Execution
  3. Failure Mode 2: Prompt Drift and Goal Misalignment
  4. Failure Mode 3: Hallucinated Tools and Invalid Actions
  5. Failure Mode 4: Cost Explosion and Resource Exhaustion
  6. Failure Mode 5: State Corruption and Memory Degradation
  7. The PADISO Eval Suite: Catching Failures Before Production
  8. Remediation Playbooks: From Detection to Recovery
  9. Implementing Guard Rails and Circuit Breakers
  10. Monitoring and Observability Patterns
  11. Next Steps: Building Resilient Subagent Systems

Introduction: Why Subagent Failures Matter

Subagents are the future of AI automation. They’re smaller, focused, and designed to handle specific tasks within larger agentic systems. But in production, they fail in predictable, catastrophic ways—and most teams don’t see it coming until the bill arrives or the customer complains.

We’ve watched this pattern repeat across 50+ clients at PADISO. A startup deploys an autonomous agent to handle customer support escalations. Within hours, it’s caught in a loop, asking the same clarifying question 47 times. An enterprise team launches an AI-driven procurement workflow. By week two, the subagent is “hallucinating” tool calls to systems that don’t exist, creating orphaned purchase orders. A fintech firm’s risk assessment agent drifts so far from its original intent that it’s approving transactions it should flag.

These aren’t edge cases. They’re the norm when teams skip evaluation and don’t build observability into their deployment pipeline.

This guide covers the five failure modes we see most often, the evaluation suite we run to catch them before clients do, and the concrete remediation playbooks that turn a crisis into a learning moment.


Failure Mode 1: Infinite Loops and Runaway Execution

How Loops Form in Production

A subagent gets stuck in a loop when it can’t progress toward its goal. The most common trigger is a tool that returns an error or ambiguous result. Instead of escalating or failing gracefully, the agent retries the same action—sometimes with slight variations, sometimes identically—until it hits a hard limit (token budget, timeout, max iterations).

We saw this with a Sydney-based logistics startup. Their subagent was tasked with booking delivery slots. When the booking API returned a “slot unavailable” error, the agent interpreted this as a transient failure and retried. It retried 200 times in 8 minutes, each attempt consuming tokens and API quota. The loop was only broken when the API rate limiter kicked in, causing the entire workflow to fail.

The root cause wasn’t the agent’s intelligence—it was the absence of a loop detection mechanism and a clear failure contract between the subagent and the tool.

Why Standard Iteration Limits Aren’t Enough

Most teams implement a max-iterations parameter. This is necessary but insufficient. Here’s why:

Iteration limits hide the real problem. When a subagent hits max iterations, it fails silently or returns a generic error. The human operator doesn’t know if the agent was one step away from success or locked in a futile pattern.

Loops can be semantic, not syntactic. A subagent might be making forward progress (calling different tools, getting different responses) but still be trapped in a logical loop. Example: it’s trying to validate a user’s identity by asking the same verification question in different formats, never accepting the answer because it’s looking for a specific phrasing.

Cost accumulates fast. Each iteration consumes tokens. In a loop of 100 iterations with a complex reasoning step, you might burn $50–$500 depending on the model. Multiply that across 1,000 concurrent subagents and you’re looking at a five-figure hourly cost.

Detection Patterns

To catch loops before they spiral, monitor these signals:

Repeated tool calls with identical inputs. If a subagent calls the same tool with the same parameters more than twice in a row, flag it. This is almost always a loop.

Identical or near-identical responses from the same tool. If the tool keeps returning the same error or state, the subagent should escalate, not retry.

Increasing iteration count with no change in state. Track the agent’s working memory or context. If it’s repeating the same reasoning after iteration N, you’re in a loop.

Tool call patterns that match known failure signatures. Build a library of loop signatures from your production incidents. A subagent calling validate_email → get_user_by_email → validate_email → … is a known pattern; catch it immediately.

Remediation Playbook: Breaking the Loop

Step 1: Implement a loop detector. Add a middleware layer that tracks tool calls and detects repetition. Use a rolling window (last 5 calls) and flag if >60% are identical or near-identical.

# Rolling window over the most recent tool calls (tool name plus normalised arguments)
if last_5_calls.count(current_call) >= 3:
    trigger_loop_alert()    # emit a structured alert with the repeated call attached
    escalate_to_human()     # halt the subagent and hand the request to a person

Step 2: Add a “give up” instruction to the subagent’s system prompt. Tell it explicitly: “If you’ve tried the same approach 3 times without progress, stop and ask for help. Do not retry.”

Step 3: Implement exponential backoff with jitter. If a tool is returning transient errors, retry with increasing delays: 100ms, 500ms, 2s, 10s. After 3 retries, escalate.
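
Here is a minimal sketch of that retry policy. The call_tool callable and the TransientToolError exception are placeholders for whatever your tool client exposes:

import random
import time

class TransientToolError(Exception):
    """Placeholder for the transient error your tool client raises."""

def call_with_backoff(call_tool, *args, max_retries=3, base_delay=0.1, **kwargs):
    """Retry a tool call with exponential backoff and full jitter, then escalate."""
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            return call_tool(*args, **kwargs)
        except TransientToolError:
            if attempt == max_retries:
                raise  # after the final retry, surface the error so the agent can escalate
            time.sleep(random.uniform(0, delay))  # full jitter up to the current ceiling
            delay *= 5  # roughly tracks the 100ms, 500ms, 2s, 10s schedule above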

Step 4: Create a circuit breaker for failing tools. If a tool fails 5 times in a row, stop calling it. Return an error to the subagent and let it choose an alternative path.

Step 5: Log the loop state and replay it in testing. When a loop is detected in production, capture the full context (prompt, tool responses, iteration history) and replay it in your evaluation environment. This becomes a regression test.

For more context on how production agentic systems fail and recover, read our detailed analysis of Agentic AI Production Horror Stories (And What We Learned), which covers runaway loops and other common failure patterns we’ve seen across 50+ clients.


Failure Mode 2: Prompt Drift and Goal Misalignment

What Is Prompt Drift?

Prompt drift occurs when a subagent’s behaviour gradually deviates from its original intent, even though the prompt hasn’t changed. It’s insidious because it’s slow—you might not notice until the agent has processed 10,000 requests and started approving transactions it shouldn’t.

The causes are multiple:

Context window degradation. As a subagent processes more requests in a single session, its context window fills up. Older instructions get pushed out or deprioritised. The agent “forgets” constraints and starts making decisions based on recent examples rather than the original intent.

Few-shot example drift. If you include examples in the prompt (“Here’s how you should approve a loan”), those examples set an implicit baseline. Over time, the agent generalises from the examples rather than the rule. If your examples are slightly too permissive, the agent becomes more permissive.

Reinforcement from user feedback. If users praise the agent for a decision that’s slightly off-spec, the agent internalises this as positive feedback. On the next similar case, it leans harder into that off-spec behaviour.

Model updates and fine-tuning. If you’re using an API-based model that gets updated by the provider, the model’s behaviour might shift subtly. A model trained on data up to April 2024 might handle edge cases differently than one trained to September 2024.

Real-World Example: The Approvals Drift

A Sydney fintech client asked us to audit their loan approval subagent. The original intent: approve loans under $50,000 with a debt-to-income ratio below 40%. Over 6 months, the agent had drifted to approving loans up to $75,000 with a 50% DTI. The prompt hadn’t changed. The model hadn’t been updated.

What happened: The agent’s prompt included approval examples in which borderline cases were often approved (because the human loan officer approved them to build customer goodwill). The agent learned the implicit pattern and generalised it. By month 6, it was approving 15% more loans than intended, and the portfolio risk had increased significantly.

Detection Patterns

Divergence in decision distribution. Track the percentage of approvals, rejections, and escalations over time. If approvals increase by >5% month-over-month without a change in policy, investigate.

Deviation from expected output format. If the subagent starts returning longer explanations or different reasoning structures, it might be drifting. Compare output samples from month 1 vs. month 6.

Unexpected edge case handling. Monitor for decisions on edge cases that should be escalated. If the agent suddenly starts handling them autonomously, that’s drift.

User complaint patterns. If complaints shift from “the agent rejected me incorrectly” to “the agent approved me but I shouldn’t have qualified,” that’s a sign of permissive drift.

Remediation Playbook: Re-anchoring the Agent

Step 1: Define success metrics that are independent of the subagent’s output. Don’t just measure “approval rate.” Measure downstream outcomes: default rate, customer satisfaction, portfolio risk. These are your ground truth.

Step 2: Implement a “golden test set” that never changes. Create 50–100 test cases that represent the full spectrum of decisions (clear approvals, clear rejections, borderline cases). Run these tests weekly. If the subagent’s decisions on these cases change, you have drift.
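
A minimal sketch of what that weekly check can look like. The run_subagent callable and the example cases are placeholders for your own harness and domain:

# Frozen golden test set; never edit existing cases, only append new ones.
GOLDEN_CASES = [
    {"id": "clear_approve_01", "input": {"loan_amount": 30000, "dti": 0.30}, "expected": "approve"},
    {"id": "clear_reject_02", "input": {"loan_amount": 80000, "dti": 0.55}, "expected": "reject"},
    {"id": "borderline_17", "input": {"loan_amount": 49000, "dti": 0.39}, "expected": "escalate"},
]

def weekly_drift_check(run_subagent, cases=GOLDEN_CASES, min_agreement=0.95):
    """Fail loudly if the subagent's decisions on the golden set have drifted."""
    mismatches = [c["id"] for c in cases if run_subagent(c["input"]) != c["expected"]]
    agreement = 1 - len(mismatches) / len(cases)
    if agreement < min_agreement:
        raise AssertionError(f"Drift suspected: agreement {agreement:.0%}, mismatches {mismatches}")
    return agreement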

Step 3: Version your prompts and track changes explicitly. Use a git-like system for prompts. Every change gets logged, tested, and reviewed. This makes drift visible because you can compare current behaviour to previous versions.

Step 4: Implement a “decision explainability” layer. Require the subagent to cite which part of the prompt guided its decision. Example: “I approved this loan because the DTI (35%) is below the 40% threshold stated in instruction #3.” If the agent can’t cite a specific instruction, flag it.

Step 5: Run monthly recalibration sessions. Sample 100 recent decisions and have a human expert review them. Compare the expert’s decisions to the agent’s. If agreement drops below 95%, trigger a prompt review.

Step 6: Use a reference model as a control. Keep a frozen version of the subagent (same prompt, same model version) running in parallel on a small percentage of requests. Compare its behaviour to the current version. If they diverge, you have drift.

For deeper guidance on how to track and optimise subagent performance, check out our guides on AI Agency Performance Tracking and AI Agency Metrics Sydney, which cover the specific KPIs and monitoring patterns that catch drift before it becomes a crisis.


Failure Mode 3: Hallucinated Tools and Invalid Actions

What Are Hallucinated Tools?

A subagent “hallucinates” a tool when it invokes a function that doesn’t exist or calls a real function with parameters that don’t match the spec. The agent’s reasoning sounds plausible (“I’ll call transfer_funds_to_external_account to move money”), but the tool doesn’t exist or it’s calling it wrong.

This is particularly dangerous in financial, healthcare, or supply chain contexts where invalid actions can create orphaned records, failed transactions, or compliance violations.

Why Hallucinations Happen

Incomplete tool definitions. If you give the subagent a list of tools but the descriptions are vague or incomplete, the agent will fill in the gaps with its own assumptions. Example: you say “use the create_order function,” but you don’t specify the required parameters. The agent invents parameters.

Training data leakage. The underlying LLM was trained on code and documentation that includes tools that don’t exist in your system. The agent “remembers” these tools from training and tries to use them.

Ambiguous tool names. If you have both approve_transaction and approve_transaction_v2, and the agent isn’t sure which to use, it might try to call approve_transaction_v3 (which doesn’t exist) thinking it’s the newer version.

Missing error handling. If the subagent calls a tool and gets an error (“function not found”), it doesn’t know how to recover. It might try again with a slightly different name, or it might try a completely different tool that sounds similar.

Detection Patterns

Tool calls that don’t match the function signature. Monitor the subagent’s action logs. If it calls transfer_funds(amount=1000, currency="USD") but the actual function signature is transfer_funds(recipient_id, amount, currency), that’s a hallucination.

Tool calls to functions that don’t exist. This is the most obvious signal. Log every tool call attempt and validate it against your tool registry. If a tool isn’t in the registry, it’s a hallucination.

Cascading failures from invalid tool calls. When a subagent calls a non-existent tool, the system returns an error. If the subagent then tries to call a similar-sounding tool (or the same tool with different parameters), you’re seeing a hallucination cascade.

Orphaned records or failed transactions. In systems with side effects, look for incomplete operations. A subagent might successfully call create_order but fail to call the (non-existent) confirm_order_with_warehouse function, leaving the order in a limbo state.

Remediation Playbook: Constraining the Tool Space

Step 1: Use strict tool definitions with JSON Schema. Don’t describe tools in natural language. Use a formal schema that specifies every parameter, its type, constraints, and whether it’s optional.

{
  "name": "transfer_funds",
  "description": "Transfer funds between internal accounts",
  "parameters": {
    "type": "object",
    "properties": {
      "from_account_id": {"type": "string", "pattern": "^ACC[0-9]{6}$"},
      "to_account_id": {"type": "string", "pattern": "^ACC[0-9]{6}$"},
      "amount_cents": {"type": "integer", "minimum": 1, "maximum": 10000000}
    },
    "required": ["from_account_id", "to_account_id", "amount_cents"]
  }
}

Step 2: Implement tool validation before execution. Before the subagent’s tool call is executed, validate it against the schema. If it doesn’t match, reject it and return a specific error message telling the subagent what went wrong.
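
Here is a sketch of that validation step, assuming the open-source jsonschema package and a registry keyed by tool name (the transfer_funds schema mirrors the definition above):

from jsonschema import ValidationError, validate

TOOL_REGISTRY = {
    "transfer_funds": {
        "type": "object",
        "properties": {
            "from_account_id": {"type": "string", "pattern": "^ACC[0-9]{6}$"},
            "to_account_id": {"type": "string", "pattern": "^ACC[0-9]{6}$"},
            "amount_cents": {"type": "integer", "minimum": 1, "maximum": 10000000},
        },
        "required": ["from_account_id", "to_account_id", "amount_cents"],
        "additionalProperties": False,
    }
}

def validate_tool_call(name: str, arguments: dict) -> None:
    """Reject hallucinated tools and malformed arguments before anything executes."""
    if name not in TOOL_REGISTRY:
        raise ValueError(f"Unknown tool '{name}'; available tools: {sorted(TOOL_REGISTRY)}")
    try:
        validate(instance=arguments, schema=TOOL_REGISTRY[name])
    except ValidationError as exc:
        # Return the specific reason so the subagent can correct itself on the next turn
        raise ValueError(f"Invalid arguments for '{name}': {exc.message}") from exc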

Step 3: Use a tool registry with versioning. Maintain a canonical list of available tools. Version it explicitly. When you deprecate a tool, mark it as deprecated and suggest alternatives. This prevents the agent from calling old versions.

Step 4: Add a “tool availability check” step to the subagent’s reasoning. Before calling a tool, have the subagent query the tool registry: “Is the transfer_funds function available?” This adds latency but prevents hallucinations.

Step 5: Implement a “tool call sandbox.” Run tool calls in a dry-run mode first. The subagent calls the tool, but the actual side effect doesn’t happen. The system returns a simulated response. If the tool call is invalid, you catch it before any damage is done.
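
A rough sketch of that dry-run wrapper, building on the validate_tool_call helper from Step 2; the executor and the simulated responses here are illustrative, not a fixed interface:

# Canned responses returned in dry-run mode instead of touching real systems
SIMULATED_RESPONSES = {
    "transfer_funds": {"status": "simulated_ok", "transfer_id": "DRYRUN-0001"},
}

def execute_tool(name, arguments, *, dry_run=True, real_executor=None):
    validate_tool_call(name, arguments)           # schema and registry check from Step 2
    if dry_run:
        return SIMULATED_RESPONSES.get(name, {"status": "simulated_ok"})
    return real_executor(name, arguments)         # only live calls reach the real system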

Step 6: Log and alert on unknown tool calls. Any tool call that doesn’t match a known function should trigger an immediate alert. Escalate to a human before the subagent retries.

For a comprehensive look at how these patterns appear in production agentic systems and how to build resilience, see our detailed guide on Agentic AI vs Traditional Automation: Why Autonomous Agents Are the Future, which covers the architectural patterns that prevent hallucinations and other agent failures.


Failure Mode 4: Cost Explosion and Resource Exhaustion

How Costs Spiral Out of Control

A subagent that costs $0.10 per request seems cheap. Until you realise it’s running 100,000 requests per day because it’s retrying failed operations or looping through edge cases. Suddenly, you’re spending $10,000 per day on a single subagent.

We worked with an Australian e-commerce startup that deployed a product recommendation subagent. It was supposed to call an embedding model once per user session. Instead, it was calling the embedding model 5–10 times per session because it was trying to refine its recommendations. Within 2 weeks, their embedding costs had increased 800%, and they were on track to spend $200,000 per month on a feature that was supposed to cost $20,000.

Cost Explosion Vectors

Redundant API calls. A subagent calls an API to get user data, processes it, then calls the same API again to verify the data. This happens because the agent doesn’t trust the first response or doesn’t remember it.

Long reasoning chains. A subagent is given a complex task and decides to break it into 20 subtasks, each requiring a separate LLM call. If you’re using a large model (GPT-4, Claude 3.5 Sonnet), this gets expensive fast.

Inefficient tool orchestration. A subagent calls Tool A, then Tool B, then Tool A again because it didn’t plan its execution sequence. This is particularly common when the subagent is trying to gather information incrementally.

Unoptimised prompts. A prompt that’s 5,000 tokens long instead of 500 tokens will cost 10x more per request. If you’re running 100,000 requests per day, that’s a massive cost difference.

Expensive model choices. Using GPT-4 Turbo for a task that could be handled by GPT-4o mini. Or using Claude 3.5 Sonnet when Claude 3 Haiku would work. This is often a “better safe than sorry” decision that becomes expensive at scale.

Detection Patterns

Cost per request increasing over time. Track the average cost per request. If it’s trending upward without a corresponding increase in complexity or quality, something’s wrong.

Unusually high token consumption. Monitor the number of tokens consumed per request. If a subagent is consistently using 10x more tokens than expected, investigate.

Repeated API calls with identical parameters. Log every API call and its parameters. If the same call is made multiple times in the same request, that’s a sign of inefficiency.

Model usage shifting to more expensive models. If you’ve been using GPT-4o mini and suddenly your logs show GPT-4 Turbo calls, that’s a sign that someone changed the configuration or the subagent is choosing the wrong model.

Remediation Playbook: Cost Optimisation and Control

Step 1: Implement per-request cost budgets. Set a maximum cost for each subagent request. If the subagent exceeds the budget, stop execution and escalate. Example: “Maximum cost per request: $0.50. Current cost: $0.47. Remaining budget: $0.03.”
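
A minimal sketch of a per-request budget tracker; the $0.50 cap and the pricing helper mentioned in the usage comment are placeholders for your own numbers:

class BudgetExceeded(Exception):
    pass

class CostBudget:
    """Track spend for a single subagent request and stop before it overruns."""
    def __init__(self, max_usd=0.50):
        self.max_usd = max_usd
        self.spent_usd = 0.0

    def charge(self, usd):
        self.spent_usd += usd
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(
                f"Spent ${self.spent_usd:.2f} of ${self.max_usd:.2f} budget; escalating"
            )

# Usage inside the agent loop, after each model or API call:
# budget.charge(estimate_llm_cost(prompt_tokens, completion_tokens))
# (estimate_llm_cost is a placeholder for your own pricing helper)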

Step 2: Use cheaper models for initial reasoning, expensive models for refinement. Start with GPT-4o mini or Claude 3 Haiku for initial analysis. Only escalate to GPT-4 Turbo or Claude 3.5 Sonnet if the task requires it.

Step 3: Implement request caching and deduplication. If a subagent has already called an API for a specific piece of data, cache the result. If it tries to call the same API again within the same request, return the cached result.
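
One way to implement request-scoped deduplication is a memoising decorator around the fetch function; this sketch assumes the call arguments are JSON-serialisable:

import functools
import hashlib
import json

def request_scoped_cache(fetch_fn):
    """Memoise identical API calls within one subagent request."""
    cache = {}

    @functools.wraps(fetch_fn)
    def wrapper(*args, **kwargs):
        # Key on the normalised arguments so identical calls hit the cache
        key = hashlib.sha256(
            json.dumps([args, kwargs], sort_keys=True, default=str).encode()
        ).hexdigest()
        if key not in cache:
            cache[key] = fetch_fn(*args, **kwargs)
        return cache[key]

    return wrapper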

Step 4: Optimise prompts for token efficiency. Remove unnecessary instructions, examples, and context. Use shorter variable names in the prompt. Use bullet points instead of prose. Aim to reduce prompt size by 30–50%.

Step 5: Implement tool call planning before execution. Have the subagent plan its tool calls before executing them. Example: “I need to call API-A once, then API-B twice, then process the results. Total estimated cost: $0.08.” If the estimated cost exceeds the budget, the subagent should simplify its plan.

Step 6: Set up cost alerts and dashboards. Track cost per subagent, cost per request, and total daily cost. Set alerts for cost anomalies (e.g., “cost per request increased by 50% in the last hour”). This gives you early warning before costs spiral.

Step 7: Run A/B tests on model choices. Test GPT-4o mini vs. GPT-4 Turbo on the same task. Measure quality (accuracy, user satisfaction) and cost. Often, the cheaper model is just as good.

For practical guidance on how to set up monitoring and reporting for cost control, see our guide on AI Agency Reporting Sydney, which covers the dashboards and alerts you need to catch cost explosions before they happen.


Failure Mode 5: State Corruption and Memory Degradation

What Is State Corruption?

State corruption occurs when a subagent’s internal state (its working memory, context, or understanding of the world) becomes inconsistent with reality. The agent “remembers” something that’s no longer true, or it forgets a constraint that should still apply.

This is particularly dangerous in multi-step workflows where the subagent needs to maintain consistency across many operations.

Common State Corruption Patterns

Stale context. A subagent fetches a user’s account balance at the start of a workflow. It uses this balance to make decisions throughout the workflow. But the balance changes mid-workflow (another process updates it), and the subagent is still using the stale value. It might approve a transaction that would overdraw the account.

Accumulated assumptions. As a subagent works through a problem, it makes assumptions (“The user is in Australia,” “The transaction is in USD”). If these assumptions are wrong, they propagate through the entire workflow. By the time the subagent finishes, it’s made 10 decisions based on a single false assumption.

Memory overwrite. In a multi-step workflow, a subagent might overwrite a variable it shouldn’t. Example: it sets current_user = user_A early on, then later sets current_user = user_B while processing a different part of the workflow. Now it’s applying user_B’s constraints to user_A’s transaction.

Incomplete state snapshots. A subagent saves its state to memory at checkpoint A. It continues processing. At checkpoint B, it restores the state from checkpoint A, losing all progress between A and B. This can cause it to repeat operations or lose critical information.

Detection Patterns

Contradictions in the subagent’s reasoning. If the subagent says “The user is in Australia” in step 1 and “The user is in the UK” in step 10, that’s state corruption.

Operations that violate constraints. If the subagent approves a transaction that violates a constraint it stated earlier (“I will not approve transactions over $10,000” followed by “Approving $15,000 transaction”), that’s state corruption.

Divergence between stated state and actual state. Ask the subagent to report its current understanding of the world (“What is the user’s account balance?”). Compare this to the ground truth from the database. If they diverge, you have state corruption.

Inconsistent outputs for identical inputs. If you give the subagent the same input twice (in different requests), and it produces different outputs, that’s a sign of state corruption in the system or model.

Remediation Playbook: State Integrity and Recovery

Step 1: Implement explicit state management. Don’t rely on the subagent’s implicit context. Maintain a structured state object that tracks all critical variables.

{
  "user_id": "user_12345",
  "account_balance": 5000.00,
  "transaction_amount": 1000.00,
  "transaction_approved": false,
  "constraints": {
    "max_transaction_amount": 10000,
    "user_region": "AU",
    "requires_2fa": true
  }
}

Step 2: Implement state validation at every checkpoint. After each major step, validate the state. Check that all values are within expected ranges and that constraints are still satisfied.
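
A sketch of that checkpoint validation for the state object above; the specific checks are examples, not an exhaustive rule set:

def validate_state(state: dict) -> list[str]:
    """Return constraint violations for the state object shown above; empty means healthy."""
    violations = []
    constraints = state["constraints"]
    if state["transaction_amount"] > constraints["max_transaction_amount"]:
        violations.append("transaction_amount exceeds max_transaction_amount")
    if state["transaction_amount"] > state["account_balance"]:
        violations.append("transaction would overdraw the account")
    return violations

# Run after every major step; a non-empty result should halt the workflow and escalate.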

Step 3: Use immutable state updates. Don’t allow the subagent to mutate state directly. Instead, it requests state changes, which are validated and applied atomically. This prevents accidental overwrites.

Step 4: Implement state versioning and rollback. Save a snapshot of the state before each major operation. If something goes wrong, you can roll back to a previous state.
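
A minimal snapshot-and-rollback wrapper, reusing the validate_state helper from Step 2; adapt the storage to your own persistence layer:

import copy

class VersionedState:
    """Keep a snapshot before each major operation so the workflow can roll back."""
    def __init__(self, initial: dict):
        self._snapshots = [copy.deepcopy(initial)]

    @property
    def current(self) -> dict:
        return copy.deepcopy(self._snapshots[-1])

    def commit(self, new_state: dict) -> None:
        if validate_state(new_state):  # refuse to commit a state that violates constraints
            raise ValueError("state violates constraints; rejecting commit")
        self._snapshots.append(copy.deepcopy(new_state))

    def rollback(self, steps: int = 1) -> dict:
        keep = max(1, len(self._snapshots) - steps)  # never delete the initial snapshot
        self._snapshots = self._snapshots[:keep]
        return self.current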

Step 5: Refresh critical data frequently. Don’t rely on data fetched at the start of the workflow. Refresh critical data (account balance, user status, inventory levels) before making decisions that depend on them.

Step 6: Use a state audit log. Log every state change: what changed, when, and why. This gives you a complete history for debugging and compliance.

Step 7: Implement a “state sanity check” step. Before the subagent makes a critical decision, have it state its assumptions and current understanding. Example: “I am about to approve this transaction. My current understanding: User is in Australia, account balance is $5,000, transaction amount is $1,000, no 2FA required. Is this correct?” If any of these are wrong, the subagent should refresh the data.

For deeper insight into how to maintain state integrity and observability across complex agentic workflows, see our guide on AI Agency Maintenance Sydney, which covers the operational patterns that prevent state corruption and ensure consistent performance.


The PADISO Eval Suite: Catching Failures Before Production

What We Test

At PADISO, we run a comprehensive evaluation suite on every subagent before it goes to production. This suite is designed to catch all five failure modes described above.

Loop detection tests. We feed the subagent scenarios that are designed to trigger loops. Example: a tool that always returns “please retry.” We measure how many iterations it takes before the subagent gives up. We expect it to give up in <5 iterations. If it goes beyond 10, we flag it.
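
As a sketch, here is what one of those tests can look like in a pytest-style harness; the run_agent_with_tools fixture and its transcript object are placeholders for your own evaluation runner:

def always_retry_tool(*_args, **_kwargs):
    """Stub tool used only in evaluation: it never succeeds."""
    return {"status": "error", "message": "please retry"}

def test_agent_gives_up_quickly(run_agent_with_tools):
    # The harness returns a transcript with one entry per iteration
    transcript = run_agent_with_tools(
        task="Book a delivery slot for the customer's order",
        tools={"book_slot": always_retry_tool},
        max_iterations=20,
    )
    assert len(transcript.iterations) < 5, (
        f"Agent looped for {len(transcript.iterations)} iterations before giving up"
    )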

Drift detection tests. We run the subagent on a golden test set of 100 cases. We measure its decision distribution and compare it to the expected distribution. We also measure agreement with human experts on edge cases.

Hallucination tests. We give the subagent a list of tools, then ask it to perform tasks that require calling tools that don’t exist. We measure the percentage of invalid tool calls. We expect this to be 0%.

Cost efficiency tests. We measure the cost per request under various conditions (simple requests, complex requests, requests with errors). We identify cost outliers and investigate.

State integrity tests. We run multi-step workflows and validate that the subagent’s state remains consistent. We also test state recovery after errors.

The Test Scenarios

Our eval suite includes 500+ test scenarios across different domains:

Financial services: Loan approvals, transaction validations, fraud detection, compliance checks.

E-commerce: Product recommendations, inventory management, order processing, customer support escalations.

Supply chain: Demand forecasting, inventory optimisation, supplier selection, logistics planning.

Healthcare: Patient triage, appointment scheduling, medication recommendations, compliance checks.

Insurance: Claims processing, risk assessment, fraud detection, policy issuance.

For each scenario, we measure:

  • Correctness: Did the subagent make the right decision?
  • Efficiency: How many API calls and LLM calls did it make?
  • Cost: How much did it cost?
  • Latency: How long did it take?
  • State consistency: Did it maintain correct state throughout?
  • Error recovery: How did it handle errors?

Benchmarking Against Competitors

We compare our eval results against other AI agencies and automation platforms. Here’s what we typically see:

Loop detection: Most platforms have basic iteration limits but no semantic loop detection. Our loop detection catches issues that others miss.

Drift detection: We’re the only platform we know of that runs monthly recalibration tests. Most competitors don’t measure drift at all.

Hallucination prevention: We use strict schema validation and tool registries. Competitors often rely on the model’s implicit understanding, which leads to hallucinations.

Cost optimisation: We actively optimise model choices and prompt efficiency. Competitors often use the same expensive model for all tasks.

State management: We implement explicit state management with validation and rollback. Competitors rely on implicit context, which leads to corruption.

For more detail on the specific evaluation patterns and production readiness checks we use, check out the AWS Prescriptive Guidance on Agentic AI Patterns and Workflows, which aligns with many of our testing methodologies.


Remediation Playbooks: From Detection to Recovery

The Generic Remediation Framework

Whenever a failure is detected, follow this framework:

1. Detect. Use the patterns described above to identify the failure.

2. Isolate. Stop the subagent from processing new requests. Don’t let it continue making decisions based on corrupted state or invalid logic.

3. Diagnose. Understand what went wrong. Replay the failure in a test environment. Look at the logs, the state, the tool calls, the reasoning.

4. Remediate. Apply the specific remediation playbook for that failure mode (see sections 1–5 above).

5. Test. Verify that the remediation worked. Run the subagent on the same inputs that caused the failure. Confirm that it now behaves correctly.

6. Deploy. Roll out the fix to production, either immediately (if it’s a critical fix) or as part of the next release.

7. Monitor. Watch for the same failure pattern in subsequent requests. If it reappears, escalate to the engineering team.

Playbook: Recovering from a Loop

Detection: You notice that a subagent has exceeded max iterations on 5% of requests.

Diagnosis: You replay the failed requests and see that the subagent is calling the same tool repeatedly with the same parameters.

Root cause: The tool is returning a transient error, and the subagent doesn’t have a backoff strategy.

Remediation:

  1. Add exponential backoff to the subagent’s tool calling logic.
  2. Update the system prompt to include explicit “give up” instructions.
  3. Implement a circuit breaker for the failing tool.
  4. Re-run the failed requests. Confirm that they now succeed or escalate gracefully.

Monitoring: Track the percentage of requests that trigger the circuit breaker. If this percentage increases, it indicates a systemic issue with the tool, not just transient failures.

Playbook: Recovering from Drift

Detection: Your monthly recalibration test shows that the subagent’s agreement with human experts has dropped from 98% to 92%.

Diagnosis: You compare the subagent’s decisions on edge cases from month 1 vs. month 3. You see that it’s approving borderline cases in month 3 that it rejected in month 1.

Root cause: The subagent has learned from user feedback that borderline approvals are often appreciated, and it’s generalised this into a more permissive policy.

Remediation:

  1. Review the system prompt and add explicit constraints for edge cases: “If the DTI is between 40% and 45%, escalate to a human. Do not approve.”
  2. Run the golden test set again. Confirm that the subagent now makes the correct decisions.
  3. Analyse the feedback data to understand which user feedback led to the drift. Consider filtering out this feedback or reweighting it.
  4. Set up weekly recalibration tests instead of monthly.

Monitoring: Track the decision distribution on edge cases. If approvals on borderline cases increase again, you have drift.

Playbook: Recovering from Hallucinated Tools

Detection: You notice that 2% of requests are failing with “tool not found” errors.

Diagnosis: You examine the logs and see that the subagent is trying to call functions like transfer_funds_v2 and approve_transaction_with_2fa, which don’t exist in your tool registry.

Root cause: Your tool descriptions are vague, and the subagent is inferring that newer versions of tools exist.

Remediation:

  1. Implement strict JSON Schema validation for all tools.
  2. Update the system prompt with a canonical list of available tools and their exact signatures.
  3. Add a tool availability check step before the subagent calls any tool.
  4. Implement a tool call sandbox to catch invalid calls before they’re executed.
  5. Re-run the failed requests. Confirm that the subagent now makes valid tool calls or escalates.

Monitoring: Track the percentage of invalid tool calls. This should be 0%. If it’s >0%, investigate immediately.

Playbook: Recovering from Cost Explosion

Detection: Your daily cost for a subagent has increased from $500 to $5,000 in one week.

Diagnosis: You examine the logs and see that the subagent is making 10x more API calls than usual. Specifically, it’s calling the embedding API 10 times per request instead of once.

Root cause: Someone changed the subagent’s logic to refine recommendations by calling the embedding API multiple times.

Remediation:

  1. Implement per-request cost budgets. Set the budget to the historical average (e.g., $0.50).
  2. Revert the change that introduced the redundant API calls.
  3. Optimise the subagent’s logic to call the embedding API once and cache the result.
  4. Run A/B tests to confirm that the optimised version produces similar quality recommendations at 1/10 the cost.
  5. Deploy the optimised version.

Monitoring: Track cost per request daily. Set alerts for cost increases >20% in a single day.

Playbook: Recovering from State Corruption

Detection: A subagent approves a transaction that violates a stated constraint. Example: it says “I will not approve transactions over $10,000” but then approves a $15,000 transaction.

Diagnosis: You examine the logs and see that the subagent’s internal state shows max_transaction_amount = 10000, but it approved a transaction for $15,000. This indicates that the constraint in the state is not being enforced.

Root cause: The subagent is checking the constraint in its reasoning but not enforcing it in the actual decision logic. There’s a disconnect between the stated state and the actual decision.

Remediation:

  1. Implement explicit state validation before every decision. The subagent must check the state object, not rely on implicit context.
  2. Add a “state sanity check” step before critical decisions.
  3. Implement state versioning and rollback so that if a transaction is approved incorrectly, you can roll back the state and re-process.
  4. Run the failed transaction through a replay test. Confirm that it’s now rejected or escalated.

Monitoring: Track the percentage of decisions that violate stated constraints. This should be 0%. If it’s >0%, investigate immediately.

For comprehensive guidance on implementing these remediation patterns at scale, see the research on A Comprehensive Drift-Adaptive Framework for Sustaining Model Performance, which covers adaptive strategies for maintaining agent stability and performance over time.


Implementing Guard Rails and Circuit Breakers

Guard Rails: What They Are and Why They Matter

Guard rails are constraints and checks that prevent a subagent from making decisions it shouldn’t. They’re the safety mechanisms that keep the agent within bounds.

Guard rails operate at multiple levels:

Prompt-level guard rails. These are instructions in the system prompt: “Do not approve transactions over $10,000,” “Do not call tools that aren’t in the approved list,” “If you’re unsure, escalate.”

Logic-level guard rails. These are checks in the code that validate the subagent’s decisions before they’re executed. Example: before approving a transaction, check that the amount is within bounds.

System-level guard rails. These are infrastructure-level controls: rate limits, cost budgets, timeout limits, circuit breakers.

Implementing Guard Rails

Step 1: Define hard constraints. For each subagent, list the constraints that must never be violated. Example: “Loan approval subagent must not approve loans with DTI > 50%.” These should be non-negotiable.

Step 2: Encode constraints in code, not just in prompts. Don’t rely on the subagent to follow a constraint stated in the prompt. Implement it as a check in the code.

class ConstraintViolation(Exception): ...

if loan_dti > 0.50:
    raise ConstraintViolation(f"DTI {loan_dti} exceeds maximum 0.50")

Step 3: Implement guard rails at decision time. Before the subagent’s decision is executed, validate it against all guard rails. If it violates a guard rail, reject it and escalate.

Step 4: Log every guard rail violation. Track which guard rails are being violated and how often. This gives you insight into whether the subagent is pushing against its constraints.

Step 5: Test guard rails explicitly. In your eval suite, include tests that try to violate each guard rail. Confirm that the guard rail prevents the violation.

Circuit Breakers: What They Are and Why They Matter

A circuit breaker is a pattern that stops a subagent from continuing to call a failing tool or service. It works like an electrical circuit breaker: when the failure rate exceeds a threshold, the breaker “trips” and stops the flow.

Circuit breakers prevent cascading failures. If a tool is failing 50% of the time, a subagent might retry it dozens of times, wasting tokens and API quota. A circuit breaker stops the retries after a few failures and escalates instead.

Implementing Circuit Breakers

Step 1: Define failure thresholds. For each tool, define what counts as a failure and at what point the circuit breaker should trip. Example: “If a tool fails 5 times in a row, or 50% of the last 10 calls, trip the breaker.”

Step 2: Implement the breaker state machine. A circuit breaker has three states:

  • Closed: Calls are passing through normally.
  • Open: Calls are blocked. The breaker is “tripped.”
  • Half-open: A small percentage of calls are allowed through to test if the service has recovered.

Step 3: Implement backoff and recovery. When a breaker is open, don’t keep it open forever. After a timeout (e.g., 60 seconds), transition to half-open. If calls succeed in half-open, transition back to closed.
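
Putting Steps 1–3 together, a minimal in-process breaker might look like the sketch below; the thresholds and cooldown are illustrative:

import time

class CircuitBreaker:
    """Closed → open after repeated failures; half-open after a cooldown to probe recovery."""
    def __init__(self, failure_threshold=5, cooldown_seconds=60):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.consecutive_failures = 0
        self.opened_at = None  # None means the breaker is closed

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if time.monotonic() - self.opened_at >= self.cooldown_seconds:
            return "half-open"
        return "open"

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            raise RuntimeError("circuit open: use the fallback path and alert ops")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        # A success closes the breaker again (including recovery from half-open)
        self.consecutive_failures = 0
        self.opened_at = None
        return result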

Step 4: Alert on circuit breaker trips. When a circuit breaker trips, it’s a sign that a tool is failing. Alert the ops team so they can investigate.

Step 5: Implement fallback strategies. When a circuit breaker is open, the subagent should have a fallback. Example: “The payment service is failing. Escalate this transaction to a human.”


Monitoring and Observability Patterns

The Observability Stack

To catch failures early, you need comprehensive observability. This includes:

Logging. Log every action the subagent takes: tool calls, decisions, state changes, errors. Logs should be structured (JSON format) so they can be queried and analysed.
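
A minimal example of what one structured log line can look like; the field names are suggestions, not a fixed schema:

import json
import logging
import time
import uuid

logger = logging.getLogger("subagent")

def log_tool_call(subagent_id, tool_name, arguments, outcome, cost_usd):
    """Emit one structured JSON line per tool call so logs can be queried later."""
    logger.info(json.dumps({
        "event": "tool_call",
        "timestamp": time.time(),
        "trace_id": str(uuid.uuid4()),  # replace with your tracing context if you have one
        "subagent_id": subagent_id,
        "tool": tool_name,
        "arguments": arguments,
        "outcome": outcome,             # e.g. "ok", "error", "circuit_open"
        "cost_usd": cost_usd,
    }, default=str))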

Metrics. Track key metrics: requests per second, errors per second, cost per request, latency, decision distribution. Use time-series databases (Prometheus, InfluxDB) to store metrics.

Tracing. Implement distributed tracing to follow a request through the entire system. When a subagent calls a tool, when that tool calls another service, etc. This helps you understand where failures originate.

Alerting. Set up alerts for anomalies: cost spikes, error rate increases, latency increases, loop detection, constraint violations. Alerts should be actionable (not noisy) and should route to the right team.

Key Metrics to Track

Request metrics:

  • Requests per second
  • Error rate (% of requests that fail)
  • Latency (p50, p95, p99)
  • Cost per request

Decision metrics:

  • Approval rate (for approval-based subagents)
  • Escalation rate (% of requests escalated to humans)
  • Agreement with human experts (on a sample of decisions)

Failure metrics:

  • Loop detection rate (% of requests where a loop was detected)
  • Hallucination rate (% of requests with invalid tool calls)
  • Constraint violation rate (% of requests that violate guard rails)
  • Circuit breaker trip rate (% of time breakers are open)

Cost metrics:

  • Total cost per day
  • Cost per request
  • Cost per decision
  • Cost by model (GPT-4 vs. GPT-4o mini, etc.)

Dashboards and Alerts

Set up dashboards that show these metrics in real-time. Include:

  • A top-level dashboard showing all subagents and their health status (green, yellow, red).
  • Per-subagent dashboards showing request volume, error rate, latency, cost, and decision distribution.
  • Failure dashboards showing loop detection, hallucinations, constraint violations, and circuit breaker trips.
  • Cost dashboards showing cost trends, cost per request, and cost by model.

For each metric, define alert thresholds:

  • Error rate > 5% → Page the on-call engineer.
  • Cost per request > 2x historical average → Alert the product team.
  • Loop detection rate > 1% → Alert the engineering team.
  • Constraint violation rate > 0% → Page the security team immediately.

For detailed guidance on setting up the right monitoring and KPIs for your AI agents, check out our guides on AI Agency KPIs Sydney and AI Agency SLA Sydney, which cover the specific metrics and service levels you need to maintain production reliability.


Next Steps: Building Resilient Subagent Systems

Immediate Actions

1. Audit your current subagents. Run them through the eval suite described in this guide. Measure their loop detection, drift, hallucination, cost efficiency, and state integrity. You’ll likely find issues.

2. Implement guard rails and circuit breakers. These are low-effort, high-impact changes that prevent most failures.

3. Set up observability. Implement logging, metrics, tracing, and alerting. You can’t fix what you can’t see.

4. Run monthly recalibration tests. Test your subagents against a golden test set to catch drift early.

5. Document your remediation playbooks. When a failure happens, you need a clear process for diagnosing and fixing it. Document this.

Medium-Term Investments

1. Build an evaluation framework. Create a comprehensive test suite that covers all five failure modes. Automate it so it runs on every deployment.

2. Implement cost optimisation. Analyse your subagents’ cost profiles. Identify opportunities to use cheaper models, optimise prompts, or reduce API calls.

3. Implement state management infrastructure. Build explicit state management with validation, versioning, and rollback. Don’t rely on implicit context.

4. Set up continuous monitoring. Implement dashboards and alerts that give you real-time visibility into subagent health.

Long-Term Vision

1. Build a platform for subagent orchestration. As you add more subagents, you need a platform to manage them: deployment, monitoring, updates, rollback.

2. Implement automated remediation. When certain failures are detected, automatically apply the remediation playbook without human intervention.

3. Build a library of subagent patterns. Document the patterns that work (approval workflows, classification workflows, orchestration workflows) so you can reuse them.

4. Invest in agent observability research. The field is still evolving. Stay ahead of the curve by researching new failure modes and prevention techniques.

Why Partner with PADISO

Building resilient subagent systems is hard. It requires expertise in AI, software engineering, operations, and security. At PADISO, we’ve built this expertise across 50+ production deployments in Sydney and Australia.

We offer CTO as a Service and AI & Agents Automation services that include:

  • AI Strategy & Readiness: We assess your current state and build a roadmap for agentic AI adoption.
  • Co-build and fractional CTO support: We embed with your team and build subagent systems alongside you.
  • Evaluation and testing: We run the eval suite described in this guide on your subagents before they go to production.
  • Monitoring and observability: We set up the dashboards and alerts you need to catch failures early.
  • Remediation and incident response: When things go wrong, we help diagnose and fix them.

We also specialise in Security Audit (SOC 2 / ISO 27001) compliance, which is critical for subagent systems that handle sensitive data or critical operations.

If you’re building subagent systems and want to avoid the failures described in this guide, let’s talk. We can run an eval on your current subagents and show you where the risks are.


Conclusion

Subagent failures are predictable and preventable. The five failure modes described in this guide—loops, drift, hallucinations, cost explosion, and state corruption—follow clear patterns. When you understand these patterns and implement the evaluation, monitoring, and remediation strategies outlined here, you can build subagent systems that are reliable, cost-efficient, and compliant.

The key is to invest in observability and evaluation before you deploy to production. Don’t learn about failures from your users. Learn about them in your test environment, fix them, and deploy with confidence.

If you’re building subagent systems in Sydney or Australia and want to avoid these pitfalls, we’re here to help. Reach out to the team at PADISO for a consultation on your AI automation strategy and subagent architecture.