PADISO.ai: AI Agent Orchestration Platform - Launching May 2026
Guide · 25 min read

Claude Code on Cloudflare Workers: Edge-Compute Agentic Builds

Master Claude Code on Cloudflare Workers for edge-native agentic AI. Build stateful agents with KV, Hyperdrive, and R2. Complete guide for Sydney teams.

The PADISO Team · 2026-05-03

Table of Contents

  1. Why Edge-Compute Agentic Builds Matter
  2. Claude Code Fundamentals for Cloudflare Workers
  3. Architecture: KV-Backed Memory and Stateful Agents
  4. Building Claude Code Loops on Workers
  5. R2 File Context and Agent Knowledge
  6. Hyperdrive-Fronted Postgres for Persistent State
  7. Observability, Cost Control, and Production Hardening
  8. Real-World Implementation Patterns
  9. Common Pitfalls and Remediation
  10. Next Steps: Shipping Your First Agent

Why Edge-Compute Agentic Builds Matter

Agentic AI is no longer a research novelty. Teams across Sydney and beyond are deploying autonomous agents into production to handle customer support, supply-chain optimisation, financial reconciliation, and operational workflows. But most deployments still rely on centralised cloud infrastructure—Lambda, Cloud Run, or containerised services—which introduces latency, cost overhead, and operational complexity.

Edge-compute agentic builds flip this paradigm. By running Claude Code–powered agents directly on Cloudflare Workers, you achieve three things simultaneously:

Sub-100ms edge latency from user to agent orchestration. Your agents execute at the edge, geographically close to your users, so routing, state reads, and decision logic add minimal overhead on top of the model call itself. No round-trip to a distant region. For time-sensitive workflows—customer support, real-time data validation, fraud detection—this matters.

Stateful, memory-efficient agent loops. Traditional serverless functions are ephemeral. Cloudflare’s KV store, R2 object storage, and Hyperdrive database connector let you build agents with persistent memory, context awareness, and multi-turn conversations without managing sessions or Redis clusters.

Cost predictability and horizontal scale. Workers charges by CPU milliseconds and request count, not by reserved capacity. Agents that spend 90% of their time waiting for external APIs don’t burn compute. You scale horizontally across Cloudflare’s global network without provisioning.

For Sydney-based founders and operators building AI-native products, this architecture unlocks a new class of product: real-time, intelligent, and globally distributed from day one.


Claude Code Fundamentals for Cloudflare Workers

What Is Claude Code?

Claude Code Official Documentation describes Claude Code as a terminal-based coding agent that understands your entire codebase, edits files, runs commands, and integrates with cloud platforms. Unlike traditional code-generation models, Claude Code maintains context across multiple files, understands your project structure, and can scaffold entire features end-to-end.

For Cloudflare Workers specifically, Claude Code acts as your deployment and iteration partner. You describe your agent’s requirements—“Build a customer support agent that reads tickets from our Postgres database, drafts responses using Claude, and stores them back”—and Claude Code:

  • Scaffolds the Worker project structure
  • Writes type-safe TypeScript bindings for KV, R2, and Hyperdrive
  • Implements the agent loop (receive input → call Claude → store output → return result)
  • Configures wrangler.toml with the correct bindings
  • Tests locally and deploys to your Cloudflare account

This end-to-end capability is critical because edge agents are architecturally different from traditional backends. They’re stateless by default, have no persistent file system, and must delegate all storage to external services. Claude Code understands these constraints and generates code that works.

The Cloudflare Skills Plugin

Claude Code + Cloudflare Agent Setup Documentation outlines the Cloudflare Skills plugin, which extends Claude Code with native bindings for:

  • Wrangler CLI: Deploy and manage Workers without leaving Claude’s interface
  • KV API: Read/write key-value pairs for agent memory
  • R2 API: Upload, download, and query files for knowledge bases
  • Hyperdrive API: Execute SQL against Postgres, MySQL, or other databases
  • D1 API: SQLite queries for lightweight state
  • Durable Objects API: Coordinate state across Workers instances

When you install the Cloudflare Skills plugin in your Claude Code environment, you can ask Claude to “Deploy this agent to production” and it will invoke Wrangler, validate your bindings, and push your code live—all without you touching the CLI.

Why Claude 3.5 Sonnet Matters

Claude 3.5 Sonnet Announcement highlights the model’s coding capabilities: 92% on HumanEval, strong instruction-following, and reliable code generation. For agentic loops on Workers, Sonnet’s instruction-following is critical. Your agent prompt must be precise—“If the user asks for a refund, check the order status in Postgres, then draft a response”—and Sonnet reliably follows multi-step instructions without hallucinating.

Sonnet also has a 200K token context window, meaning your agent can ingest entire codebases, database schemas, or customer conversation histories as context without token exhaustion.


Architecture: KV-Backed Memory and Stateful Agents

The Agent Memory Problem

Traditional agents running on centralised servers maintain conversation state in memory or a session store. The agent receives a user message, recalls previous turns from Redis or a database, generates a response, and stores the updated state.

On serverless Workers, there is no persistent memory between invocations. Each request is a fresh process. If your agent receives a second message from the same user, it has no knowledge of the first conversation.

This is where Cloudflare KV becomes essential. KV is a globally distributed, eventually consistent key-value store. You can store agent state—conversation history, user preferences, intermediate results—as JSON objects keyed by user ID or session ID. When your agent receives a new message, it:

  1. Retrieves the conversation history from KV
  2. Appends the new user message
  3. Calls Claude with the full conversation context
  4. Stores the updated history back to KV
  5. Returns the response

KV offers low-latency reads at the edge (typically a few milliseconds for frequently read keys) and costs $0.50 per million read operations. For an agent handling 10,000 conversations per day, with 5 turns per conversation, that’s 50,000 reads per day—roughly $0.025 per day in KV costs.

Implementing KV State

Here’s a minimal example. Your Worker receives a POST request with a user message:

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== 'POST') return new Response('Method Not Allowed', { status: 405 });

    const { userId, message } = await request.json() as { userId: string; message: string };

    // Retrieve conversation history from KV
    const historyKey = `conversation:${userId}`;
    const historyRaw = await env.KV.get(historyKey);
    const history = historyRaw ? JSON.parse(historyRaw) : [];

    // Append user message
    history.push({ role: 'user', content: message });

    // Call Claude API
    const response = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': env.ANTHROPIC_API_KEY,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: 'claude-3-5-sonnet-20241022',
        max_tokens: 1024,
        messages: history,
        system: 'You are a helpful customer support agent.',
      }),
    });

    const { content } = (await response.json()) as { content: Array<{ text: string }> };
    const assistantMessage = content[0].text;

    // Store updated history
    history.push({ role: 'assistant', content: assistantMessage });
    await env.KV.put(historyKey, JSON.stringify(history), { expirationTtl: 86400 * 30 }); // 30 days

    return new Response(JSON.stringify({ message: assistantMessage }), {
      headers: { 'content-type': 'application/json' },
    });
  },
};

This pattern—retrieve → append → call Claude → store—is the foundation of KV-backed agents. You can extend it with:

  • Conversation pruning: Keep only the last 10 turns to stay within token limits
  • Metadata: Store user preferences, intent tags, or routing flags alongside history
  • Expiration: Use KV’s expirationTtl to auto-clean old conversations
  • Versioning: Store a version field to migrate conversation formats over time
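The pruning extension above can be a small pure helper called just before the Claude request. A minimal sketch, assuming history entries shaped like the KV example and an illustrative maxTurns limit:

```typescript
// Hypothetical helper: keep only the most recent turns so the prompt
// stays within token limits. A "turn" here is one user/assistant pair,
// i.e. two messages.
interface Turn { role: 'user' | 'assistant'; content: string }

function pruneHistory(history: Turn[], maxTurns = 10): Turn[] {
  const maxMessages = maxTurns * 2;
  // slice(-n) keeps the n most recent messages, dropping the oldest.
  return history.length <= maxMessages ? history : history.slice(-maxMessages);
}
```

Store the pruned array back to KV as well, so the persisted state stays bounded along with the prompt.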

For teams building customer support agents, this pattern reduces latency compared to querying a central database. For internal automation—approval workflows, data validation—KV state keeps agents fast and cost-effective.


Building Claude Code Loops on Workers

The Agent Loop Pattern

A Claude Code loop on Workers follows this structure:

  1. Receive input (HTTP request, webhook, scheduled trigger)
  2. Fetch context (from KV, R2, Hyperdrive, or external APIs)
  3. Call Claude with the context and a system prompt defining the agent’s role
  4. Parse the response (extract action, decision, or generated content)
  5. Execute side effects (write to database, call external API, store files)
  6. Store state (update KV, log to Hyperdrive)
  7. Return result to caller

Here’s a more realistic example: an agent that processes customer support tickets.

interface Ticket {
  id: string;
  customerId: string;
  subject: string;
  body: string;
  status: 'open' | 'in_progress' | 'resolved';
  createdAt: string;
}

interface AgentState {
  ticketId: string;
  draftResponse: string;
  sentiment: 'positive' | 'neutral' | 'negative';
  confidence: number;
  timestamp: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    if (request.method !== 'POST') return new Response('Method Not Allowed', { status: 405 });

    const { ticketId } = await request.json() as { ticketId: string };

    // Step 1: Fetch ticket from Hyperdrive-fronted Postgres
    const ticketResult = await env.HYPERDRIVE.query(
      'SELECT id, customer_id, subject, body, status, created_at FROM tickets WHERE id = $1',
      [ticketId]
    );
    if (ticketResult.results.length === 0) {
      return new Response(JSON.stringify({ error: 'Ticket not found' }), { status: 404 });
    }
    const ticket = ticketResult.results[0] as Ticket;

    // Step 2: Check if agent has already processed this ticket
    const stateKey = `ticket-state:${ticketId}`;
    const existingState = await env.KV.get(stateKey);
    if (existingState) {
      const state = JSON.parse(existingState) as AgentState;
      return new Response(JSON.stringify(state), { headers: { 'content-type': 'application/json' } });
    }

    // Step 3: Build context for Claude
    const context = `
You are a customer support agent. Analyse the following ticket and draft a response.

Ticket ID: ${ticket.id}
Subject: ${ticket.subject}
Body: ${ticket.body}

Provide:
1. A professional, empathetic response to the customer
2. The sentiment of the ticket (positive, neutral, negative)
3. A confidence score (0-1) on your response quality

Respond in JSON format:
{
  "draftResponse": "...",
  "sentiment": "...",
  "confidence": ...
}
    `;

    // Step 4: Call Claude
    const claudeResponse = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': env.ANTHROPIC_API_KEY,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: 'claude-3-5-sonnet-20241022',
        max_tokens: 1024,
        messages: [{ role: 'user', content: context }],
      }),
    });

    const { content } = (await claudeResponse.json()) as { content: Array<{ text: string }> };
    const responseText = content[0].text;

    // Step 5: Parse response
    let parsedResponse: { draftResponse: string; sentiment: string; confidence: number };
    try {
      parsedResponse = JSON.parse(responseText);
    } catch {
      return new Response(JSON.stringify({ error: 'Failed to parse Claude response' }), { status: 500 });
    }

    // Step 6: Store state in KV
    const state: AgentState = {
      ticketId,
      draftResponse: parsedResponse.draftResponse,
      sentiment: parsedResponse.sentiment as 'positive' | 'neutral' | 'negative',
      confidence: parsedResponse.confidence,
      timestamp: new Date().toISOString(),
    };
    await env.KV.put(stateKey, JSON.stringify(state), { expirationTtl: 86400 * 7 }); // 7 days

    // Step 7: Update ticket status in Postgres
    await env.HYPERDRIVE.query(
      'UPDATE tickets SET status = $1, updated_at = NOW() WHERE id = $2',
      ['in_progress', ticketId]
    );

    return new Response(JSON.stringify(state), { headers: { 'content-type': 'application/json' } });
  },
};

This loop demonstrates several key patterns:

  • Idempotency: Check KV before calling Claude. If the agent has already processed this ticket, return cached state.
  • Context assembly: Fetch all relevant data (ticket details, customer history, product info) before calling Claude.
  • Structured output: Ask Claude for JSON, then parse and validate it.
  • State persistence: Store the agent’s decision in KV for audit trails and replay.
  • Side effects: Update the database to reflect the agent’s action.

Scaling the Loop with Durable Objects

For agents that coordinate across multiple Workers (e.g., a ticket agent that needs to notify a human reviewer), Cloudflare’s Durable Objects provide a coordination primitive. A Durable Object is a long-lived, single-instance server that can maintain state and route requests.

You might use a Durable Object to:

  • Aggregate agent decisions: Multiple Workers process tickets in parallel; the Durable Object collects results and triggers downstream workflows.
  • Rate-limit agent calls: Ensure your agent doesn’t call Claude more than N times per minute.
  • Manage conversation state: For multi-turn agent interactions, the Durable Object keeps the conversation history and routes messages to the appropriate agent.

For most teams starting out, KV-backed agents are sufficient. Durable Objects add complexity and should be introduced only when you need true coordination.
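If you do reach for one, a Durable Object rate limiter can be sketched as below. Because a Durable Object is single-instance, a plain in-memory counter is safe here, unlike in a stateless Worker. The class name, the 60-per-minute limit, and the extra constructor parameter are illustrative; a real deployment also needs a [[durable_objects]] binding in wrangler.toml:

```typescript
// Hypothetical Durable Object capping Claude calls per minute. Workers
// construct it with (state, env); the third parameter exists only so the
// limit is configurable in this sketch.
export class ClaudeRateLimiter {
  private windowStart = 0;
  private count = 0;

  constructor(private state?: unknown, private env?: unknown, private limit = 60) {}

  async fetch(_request: Request): Promise<Response> {
    const now = Date.now();
    if (now - this.windowStart >= 60_000) {
      // Start a fresh one-minute window.
      this.windowStart = now;
      this.count = 0;
    }
    if (this.count >= this.limit) {
      return new Response('rate limited', { status: 429 });
    }
    this.count++;
    return new Response('ok', { status: 200 });
  }
}
```

A Worker would obtain a stub via env.RATE_LIMITER.get(env.RATE_LIMITER.idFromName('claude')) and call fetch on it before each Claude request, treating a 429 as "back off or queue".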


R2 File Context and Agent Knowledge

Why R2 for Agent Knowledge?

Many agents need access to external knowledge: product documentation, customer contracts, financial records, or codebase context. Storing this as text in your Claude prompt works for small documents, but quickly becomes impractical.

Cloudflare R2 is an S3-compatible object store that costs $0.015 per GB per month (versus roughly $0.023 for S3 Standard) and, unlike S3, charges no egress fees. You can store documents, PDFs, images, or structured data in R2, and your agent can fetch and ingest them during inference.

For example, a support agent might:

  1. Receive a customer ticket
  2. Fetch the customer’s contract (extracted to text from the original PDF) from R2
  3. Fetch the product FAQ (Markdown) from R2
  4. Pass both documents to Claude as context
  5. Claude generates a response tailored to the customer’s contract terms and product features

This pattern keeps your agent’s prompt concise (no massive inline documents) while giving Claude the context it needs to make informed decisions.

Implementing R2 Context Injection

Here’s how to fetch and inject R2 content into a Claude call:

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { customerId, question } = await request.json() as { customerId: string; question: string };

    // Fetch the customer's contract from R2. This assumes the contract text
    // was extracted and stored as plain text at upload time; calling .text()
    // on a raw PDF would return undecoded binary, not readable content.
    const contractKey = `contracts/${customerId}.txt`;
    const contractObject = await env.R2.get(contractKey);
    if (!contractObject) {
      return new Response(JSON.stringify({ error: 'Contract not found' }), { status: 404 });
    }
    const contractText = await contractObject.text();

    // Fetch FAQ from R2
    const faqObject = await env.R2.get('faq.md');
    const faqText = faqObject ? await faqObject.text() : 'No FAQ available.';

    // Build context
    const context = `
You are a customer support agent. Use the following information to answer the customer's question.

Customer Contract:
${contractText}

Product FAQ:
${faqText}

Customer Question: ${question}

Provide a helpful, accurate response based on the contract and FAQ.
    `;

    // Call Claude
    const response = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': env.ANTHROPIC_API_KEY,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: 'claude-3-5-sonnet-20241022',
        max_tokens: 1024,
        messages: [{ role: 'user', content: context }],
      }),
    });

    const { content } = (await response.json()) as { content: Array<{ text: string }> };

    return new Response(JSON.stringify({ answer: content[0].text }), {
      headers: { 'content-type': 'application/json' },
    });
  },
};

R2 + Claude Code Integration

Claude Code Official Documentation and Claude Code + Cloudflare Agent Setup Documentation both highlight R2 integration. When you use Claude Code to build your agent, you can ask Claude to:

  • “Upload the product documentation to R2”
  • “Create an R2 bucket for customer contracts”
  • “Generate code that fetches documents from R2 and injects them into Claude prompts”

Claude Code will scaffold the entire workflow: R2 client setup, file upload logic, and prompt injection patterns.

Caching and Versioning

For large documents (contracts, codebases), fetching from R2 on every agent invocation adds latency. Use KV to cache frequently accessed documents:

const cacheKey = `r2-cache:${documentId}`;
let documentText = await env.KV.get(cacheKey);
if (!documentText) {
  const object = await env.R2.get(documentId);
  documentText = await object.text();
  await env.KV.put(cacheKey, documentText, { expirationTtl: 3600 }); // 1 hour
}

This two-tier approach—R2 for authoritative storage, KV for hot caching—balances cost and latency.


Hyperdrive-Fronted Postgres for Persistent State

The Database Problem on Serverless

Serverless functions have no persistent file system, so all durable state must live in external databases. Traditional approaches put a connection pooler such as PgBouncer in front of the database, or use hosted Postgres providers like Supabase and Neon that bundle one, to avoid exhausting Postgres’ connection limit.

Cloudflare Hyperdrive solves this differently. Hyperdrive is a managed connection pool that sits between your Workers and your Postgres database. It maintains a persistent connection pool, multiplexes requests, and caches frequently accessed data. For Workers calling Postgres, Hyperdrive reduces connection overhead and latency.

Setting Up Hyperdrive

To use Hyperdrive, you:

  1. Create a Hyperdrive configuration pointing to your Postgres database
  2. Bind it to your Worker in wrangler.toml:
[[hyperdrive]]
binding = "HYPERDRIVE"
id = "your-hyperdrive-id"
  3. Connect through it in your Worker code. The binding exposes a connection string (not a query method), which you hand to a standard Postgres driver such as postgres.js:
import postgres from 'postgres';

const sql = postgres(env.HYPERDRIVE.connectionString);
const result = await sql`SELECT * FROM users WHERE id = ${userId}`;

Hyperdrive abstracts the connection pooling and caching, so your Worker code is simple and fast. For brevity, the remaining examples in this guide call env.HYPERDRIVE.query(text, params) as shorthand for a thin helper wrapping such a driver.

Agent State in Hyperdrive

While KV is great for ephemeral agent memory (conversation history, draft responses), Hyperdrive is better for permanent records: audit logs, decision history, and structured data.

For example, a financial reconciliation agent might:

  1. Query Hyperdrive to fetch unreconciled transactions
  2. Call Claude to classify and match transactions
  3. Store the matched pairs back in Hyperdrive
  4. Update the reconciliation status
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Fetch unreconciled transactions
    const txnResult = await env.HYPERDRIVE.query(
      'SELECT id, amount, date, description FROM transactions WHERE reconciled = false LIMIT 10'
    );

    const transactions = txnResult.results as Array<{
      id: string;
      amount: number;
      date: string;
      description: string;
    }>;

    // Prepare context for Claude
    const context = `
You are a financial reconciliation agent. Match the following transactions to the appropriate account categories.

Transactions:
${transactions.map((t) => `- ${t.id}: ${t.amount} on ${t.date} (${t.description})`).join('\n')}

Respond with JSON:
{
  "matches": [
    { "transactionId": "...", "category": "...", "confidence": ... }
  ]
}
    `;

    // Call Claude
    const claudeResponse = await fetch('https://api.anthropic.com/v1/messages', {
      method: 'POST',
      headers: {
        'x-api-key': env.ANTHROPIC_API_KEY,
        'anthropic-version': '2023-06-01',
        'content-type': 'application/json',
      },
      body: JSON.stringify({
        model: 'claude-3-5-sonnet-20241022',
        max_tokens: 2048,
        messages: [{ role: 'user', content: context }],
      }),
    });

    const { content } = (await claudeResponse.json()) as { content: Array<{ text: string }> };
    // As with the ticket agent earlier, wrap this parse in try/catch in production.
    const matches = JSON.parse(content[0].text).matches as Array<{ transactionId: string; category: string; confidence: number }>;

    // Store matches in Hyperdrive
    for (const match of matches) {
      await env.HYPERDRIVE.query(
        'UPDATE transactions SET reconciled = true, category = $1, confidence = $2, reconciled_at = NOW() WHERE id = $3',
        [match.category, match.confidence, match.transactionId]
      );
    }

    return new Response(JSON.stringify({ reconciled: matches.length }), {
      headers: { 'content-type': 'application/json' },
    });
  },
};

This pattern—fetch data, call Claude, store results—is the core of data-processing agents. Hyperdrive keeps the database connection fast and pooled, so your agent can process hundreds of records per second.

Avoiding N+1 Queries

A common pitfall: agents that loop over records and query the database for each one. For 1000 transactions, this becomes 1000 database calls, which is slow and expensive.

Instead, batch your queries:

// ❌ Bad: N+1 queries
for (const txn of transactions) {
  const account = await env.HYPERDRIVE.query('SELECT * FROM accounts WHERE id = $1', [txn.accountId]);
  // ...
}

// ✅ Good: Single batched query
const accountIds = transactions.map((t) => t.accountId);
const accounts = await env.HYPERDRIVE.query(
  'SELECT * FROM accounts WHERE id = ANY($1)',
  [accountIds]
);
const accountMap = new Map(accounts.results.map((a) => [a.id, a]));

For agents processing large datasets, batching is critical to performance and cost.


Observability, Cost Control, and Production Hardening

Logging and Tracing

Production agents fail silently without proper observability. You need to log:

  • Every Claude API call: prompt, response, latency, tokens used
  • Agent decisions: what action the agent took, why, confidence level
  • Errors and retries: when Claude fails, when external APIs timeout, how you recovered
  • Cost: tokens consumed per invocation, per agent, per day

Cloudflare Workers integrates with Logpush to send logs to external services (Datadog, Splunk, S3). For agentic builds, you should:

  1. Log to Workers Analytics Engine (free, real-time)
  2. Push detailed logs to Datadog or similar for long-term analysis
interface AgentLog {
  timestamp: string;
  ticketId: string;
  claudeTokens: number;
  claudeLatency: number;
  decision: string;
  confidence: number;
  error?: string;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const startTime = Date.now();
    const log: AgentLog = {
      timestamp: new Date().toISOString(),
      ticketId: '',
      claudeTokens: 0,
      claudeLatency: 0,
      decision: '',
      confidence: 0,
    };

    try {
      // ... agent logic ...
      const claudeStart = Date.now();
      const claudeResponse = await fetch('https://api.anthropic.com/v1/messages', {
        // ...
      });
      log.claudeLatency = Date.now() - claudeStart;

      const { usage } = (await claudeResponse.json()) as { usage: { input_tokens: number; output_tokens: number } };
      log.claudeTokens = usage.input_tokens + usage.output_tokens;

      // ... process response ...

      // Write to Analytics Engine
      env.ANALYTICS.writeDataPoint({
        indexes: [log.ticketId],
        blobs: [JSON.stringify(log)],
      });
    } catch (error) {
      log.error = error instanceof Error ? error.message : String(error);
      env.ANALYTICS.writeDataPoint({
        indexes: [log.ticketId],
        blobs: [JSON.stringify(log)],
      });
      throw error;
    }

    return new Response('OK');
  },
};

Cost Control: Prompt Caching and Token Budgets

Claude API charges by tokens. A large prompt (contract + FAQ + conversation history) might consume 5,000 tokens per call. For 10,000 agent invocations per day, that’s 50M input tokens per day—roughly $150 at Sonnet’s $3 per million input tokens, before output tokens are counted.

To control costs:

  1. Use prompt caching: For documents that don’t change (FAQ, product specs), use Anthropic’s prompt caching to reduce token cost by 90%.
  2. Compress context: Use summaries instead of full documents. If your customer has a 100-page contract, use Claude to generate a 1-page summary and cache that.
  3. Batch agent calls: Instead of calling Claude for every record, batch 10 records per call.
  4. Monitor token spend: Log tokens per invocation and set alerts if spend exceeds budget.
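Batching (point 3) is mostly a chunking exercise: group records, then build one prompt per group. A sketch, with a hypothetical chunk helper and an illustrative record shape and JSON response format:

```typescript
// Hypothetical batching helpers: split records into groups so each
// Claude call classifies several at once instead of one per call.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

interface BatchRecord { id: string; text: string }

// Build a single prompt asking Claude to classify a whole batch,
// returning JSON keyed by record id.
function buildBatchPrompt(batch: BatchRecord[]): string {
  const lines = batch.map((r) => `- ${r.id}: ${r.text}`).join('\n');
  return `Classify each record. Respond with JSON: { "results": [{ "id": "...", "label": "..." }] }\n\nRecords:\n${lines}`;
}
```

Looping `for (const batch of chunk(records, 10))` with one Claude call per batch turns 1,000 records into 100 calls instead of 1,000, amortising the fixed prompt overhead across the batch.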

Reliability: Retries and Fallbacks

Edge agents are distributed systems. Claude API might timeout, your Postgres connection might fail, R2 might be slow. You need retries and fallbacks:

async function callClaudeWithRetry(
  env: Env,
  messages: Array<{ role: string; content: string }>,
  maxRetries = 3
): Promise<string> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch('https://api.anthropic.com/v1/messages', {
        method: 'POST',
        headers: {
          'x-api-key': env.ANTHROPIC_API_KEY,
          'anthropic-version': '2023-06-01',
          'content-type': 'application/json',
        },
        body: JSON.stringify({
          model: 'claude-3-5-sonnet-20241022',
          max_tokens: 1024,
          messages,
        }),
      });

      if (!response.ok) {
        throw new Error(`Claude API error: ${response.status}`);
      }

      const { content } = (await response.json()) as { content: Array<{ text: string }> };
      return content[0].text;
    } catch (error) {
      if (attempt === maxRetries) throw error;
      // Exponential backoff: 100ms, 200ms, 400ms
      await new Promise((resolve) => setTimeout(resolve, 100 * Math.pow(2, attempt - 1)));
    }
  }
  throw new Error('Retries exhausted');
}

For fallbacks, if Claude fails, you might:

  • Return a cached previous response
  • Use a simpler rule-based decision
  • Queue the request for manual review

Choose based on your SLA. For customer support, a fallback response is better than no response.
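A small wrapper can make that fallback policy explicit and auditable. This sketch is generic: the primary would typically be the callClaudeWithRetry helper above, and the fallback a canned response or rule-based decision of your choosing:

```typescript
// Hypothetical wrapper: try the primary (retried) Claude call, and if it
// still fails, fall back to a caller-supplied default instead of erroring.
// The degraded flag lets callers log the event or queue for human review.
async function withFallback(
  primary: () => Promise<string>,
  fallback: () => Promise<string>
): Promise<{ text: string; degraded: boolean }> {
  try {
    return { text: await primary(), degraded: false };
  } catch {
    return { text: await fallback(), degraded: true };
  }
}
```

Usage might look like withFallback(() => callClaudeWithRetry(env, messages), async () => 'Thanks for reaching out! A team member will reply shortly.'), with degraded results logged for follow-up.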

Security: API Key Management and Rate Limiting

Your Worker needs access to the Claude API key. Never hardcode secrets. Use Cloudflare’s Secrets to store sensitive values:

wrangler secret put ANTHROPIC_API_KEY

Then access via env.ANTHROPIC_API_KEY in your Worker code.

Also implement rate limiting to prevent abuse:

async function isRateLimited(env: Env, userId: string): Promise<boolean> {
  // Note: KV is eventually consistent, so concurrent requests can race
  // between the get and the put below; treat this limiter as approximate.
  // Use a Durable Object when you need a strict limit.
  const key = `ratelimit:${userId}`;
  const count = await env.KV.get(key);
  if (!count) {
    await env.KV.put(key, '1', { expirationTtl: 3600 }); // 1-hour window
    return false;
  }
  const currentCount = parseInt(count, 10);
  if (currentCount >= 100) {
    return true; // cap: 100 requests per hour
  }
  await env.KV.put(key, String(currentCount + 1), { expirationTtl: 3600 });
  return false;
}

Real-World Implementation Patterns

Pattern 1: Customer Support Triage Agent

Incoming tickets are routed to Claude, which:

  1. Extracts intent (billing, technical, feature request)
  2. Fetches relevant documentation from R2
  3. Drafts a response
  4. Assigns a priority (urgent, high, normal)
  5. Stores the decision in Hyperdrive
  6. Returns the draft for human review

This is ideal for high-volume support teams. Agents handle 80% of routine questions; humans review the remaining 20%.

Pattern 2: Data Validation and Reconciliation

Your data pipeline has gaps: missing values, inconsistent formats, duplicate records. An agent can:

  1. Fetch a batch of records from Hyperdrive
  2. Call Claude to identify issues and suggest fixes
  3. Apply fixes automatically (with confidence > 0.9) or queue for manual review
  4. Log all decisions for audit

For financial or compliance-sensitive data, this agent runs on a schedule (nightly, weekly) and flags anomalies.

Pattern 3: Real-Time Workflow Automation

A customer submits a form requesting a refund. Your agent:

  1. Fetches the order from Hyperdrive
  2. Checks the refund policy (R2)
  3. Validates the refund eligibility (no refunds > 30 days old)
  4. Drafts an approval or rejection
  5. Calls a webhook to trigger the refund or send a rejection email

Aside from the Claude call itself, this orchestration completes in tens of milliseconds, at the edge, globally.

Pattern 4: Agentic Automation with Existing Tools

Many teams already have automation tools: Zapier, Make, Retool. An agent can orchestrate these:

  1. Receive a request
  2. Call Claude to decide which tools to invoke
  3. Call those tools via webhooks
  4. Aggregate results
  5. Return the final output

This pattern lets you upgrade existing automations to agentic, without replacing them entirely. For more details on how agentic AI compares to traditional automation, see Agentic AI vs Traditional Automation: Why Autonomous Agents Are the Future and Agentic AI vs Traditional Automation: Which AI Strategy Actually Delivers ROI for Your Startup.


Common Pitfalls and Remediation

Pitfall 1: Runaway Claude Loops

If your agent prompt is poorly written, Claude might call itself recursively or loop indefinitely. For example:

Prompt: "If the user asks for a refund, decide whether to approve. If you're unsure, ask the user for more info and call yourself again."

This can spiral: Claude asks for info, user responds, Claude asks again, etc. You’ll blow through your token budget and hit timeouts.

Remediation:

  1. Set a maximum loop depth: Track how many times the agent has called Claude in a single request. After 3-5 calls, stop and escalate to a human.
  2. Use structured outputs: Instead of asking Claude to “call itself”, ask it to output a JSON decision with a single call.
  3. Test with small datasets: Before deploying to production, run your agent on 10 test cases and verify it terminates cleanly.
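The loop-depth cap (point 1) can be a simple counter threaded through the request. A minimal sketch, with an illustrative limit of 3 calls:

```typescript
// Hypothetical guard: track how many Claude calls this request has made
// and force an escalation instead of looping forever.
class LoopGuard {
  private calls = 0;
  constructor(private readonly maxCalls = 3) {}

  // Returns true if another Claude call is allowed; false means stop
  // and hand off to a human.
  tryCall(): boolean {
    if (this.calls >= this.maxCalls) return false;
    this.calls++;
    return true;
  }
}
```

The agent loop then becomes: while guard.tryCall() is true, call Claude and break on a final decision; otherwise escalate the request for manual review.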

For more on production failures, read Agentic AI Production Horror Stories (And What We Learned).

Pitfall 2: Hallucinated Tools and APIs

Claude might generate code that calls non-existent APIs or tools. For example:

Claude: "I'll call env.PAYMENT_API.refund(customerId, amount) to process the refund."

But env.PAYMENT_API doesn’t exist in your Worker. Your code crashes.

Remediation:

  1. Define available tools explicitly: In your system prompt, list exactly which APIs and tools are available.
  2. Validate before executing: If Claude generates a function call, check that it exists before invoking it.
  3. Use tool use: Anthropic’s tool use API lets Claude request tools, and you decide whether to grant them. This is safer than free-form code generation.
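Validation (point 2) can be as simple as dispatching through an explicit registry, so only tools you registered can ever run, no matter what Claude asks for. A sketch with a hypothetical lookup_order tool:

```typescript
// Hypothetical dispatcher: tool names requested by the model are checked
// against a registry before anything executes.
type Tool = (input: unknown) => Promise<unknown>;

const TOOLS: Map<string, Tool> = new Map([
  ['lookup_order', async (input) => ({ ok: true, input })], // illustrative tool
]);

async function dispatchTool(name: string, input: unknown): Promise<unknown> {
  const tool = TOOLS.get(name);
  if (!tool) {
    // The model hallucinated a tool; fail loudly instead of crashing later.
    throw new Error(`Unknown tool requested: ${name}`);
  }
  return tool(input);
}
```

This pairs naturally with Anthropic's tool use API: the model requests a tool by name, and dispatchTool is the single choke point where the request is granted or rejected.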

Pitfall 3: Token Exhaustion

You start with a small prompt (1,000 tokens), but over time add more context: customer history, FAQ, product docs. Now your prompt is 10,000 tokens. At 10,000 invocations per day, that’s 100M tokens—expensive.

Remediation:

  1. Monitor token spend: Log tokens per invocation and track trends.
  2. Use prompt caching: For static content (FAQ, product specs), use Anthropic’s prompt caching to reduce cost by 90%.
  3. Compress context: Summarise long documents before passing to Claude.
  4. Batch invocations: Process 10 records per Claude call instead of 1 record per call.
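Prompt caching (point 2) is opt-in per content block: you mark the large, static part of the system prompt as cacheable and leave the dynamic parts uncached. A sketch of the request body, assuming the Anthropic Messages API's cache_control blocks; the FAQ text and system wording are illustrative:

```typescript
// Sketch: build a Messages API body where the static FAQ block is marked
// cacheable, so repeat invocations pay the cheaper cache-read rate for it.
function buildCachedRequestBody(faqText: string, userMessage: string) {
  return {
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    system: [
      { type: 'text', text: 'You are a helpful support agent.' },
      {
        type: 'text',
        text: faqText, // static content: the ideal cache candidate
        cache_control: { type: 'ephemeral' },
      },
    ],
    messages: [{ role: 'user', content: userMessage }],
  };
}
```

Keep cacheable blocks byte-for-byte identical across calls; any change to the cached prefix invalidates the cache entry.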

Pitfall 4: Inconsistent Agent Decisions

You run the same request through your agent twice and get different answers. This breaks trust and creates audit problems.

Remediation:

  1. Use deterministic prompts: Avoid phrases like “be creative” or “surprise me”. Use specific instructions.
  2. Set temperature to 0: Use "temperature": 0 in your Claude API call. Outputs become far more consistent, though sampling at temperature 0 is not strictly guaranteed to be deterministic.
  3. Cache decisions: If you’ve already decided on a request, return the cached decision instead of re-running the agent.
  4. Log all decisions: Store every agent decision in Hyperdrive with timestamps and user IDs, so you can audit and replay.
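Decision caching (point 3) can be sketched with a normalised key and a lookup before the agent runs. A `Map` stands in for Workers KV here; in a Worker you would use the KV binding's `get`/`put` instead, and `decide` is a hypothetical stand-in for your full agent call:

```typescript
// Cache agent decisions keyed by the normalised request. A Map stands
// in for Workers KV; `decide` is a placeholder for the real agent call.
const decisionCache = new Map<string, string>();

async function decideOnce(
  request: string,
  decide: (req: string) => Promise<string>,
): Promise<{ decision: string; cached: boolean }> {
  const key = request.trim().toLowerCase(); // normalise so near-identical requests hit
  const hit = decisionCache.get(key);
  if (hit !== undefined) {
    return { decision: hit, cached: true };
  }
  const decision = await decide(request);
  decisionCache.set(key, decision);
  return { decision, cached: false };
}
```

Besides consistency, this also cuts cost: repeated requests never reach the model at all.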

Next Steps: Shipping Your First Agent

Step 1: Set Up Your Development Environment

  1. Install Wrangler: npm install -g wrangler (the older @cloudflare/wrangler package is deprecated)
  2. Create a new Worker project: wrangler init my-agent
  3. Install the Anthropic SDK: npm install @anthropic-ai/sdk
  4. Put your Anthropic API key in a .dev.vars file for local development, and use wrangler secret put for production

Step 2: Choose Your First Use Case

Start small. Don’t build a complex multi-turn agent with 5 integrations. Instead, pick one of these:

  • Customer support ticket triage: Classify tickets by intent and priority
  • Data validation: Identify missing or inconsistent fields in a dataset
  • Email classification: Categorise incoming emails (spam, urgent, etc.)
  • Content moderation: Flag potentially problematic user-generated content

These are bounded problems with clear success metrics.

Step 3: Implement KV-Backed State

Add KV bindings to your wrangler.toml:

[[kv_namespaces]]
binding = "KV"
id = "your-kv-namespace-id"

Then implement the conversation pattern from earlier: fetch state, append input, call Claude, store state.
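That fetch–append–store cycle can be sketched against a minimal KV-shaped interface. `KVLike` mirrors only the subset of the Workers KV binding used here, and the Claude call itself is elided as a comment:

```typescript
// Conversation pattern: fetch state, append input, (call Claude), store state.
// KVLike mirrors the subset of the Workers KV binding used by this sketch;
// in a Worker you would pass env.KV directly.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

interface Turn {
  role: "user" | "assistant";
  content: string;
}

async function appendTurn(kv: KVLike, sessionId: string, turn: Turn): Promise<Turn[]> {
  const raw = await kv.get(`session:${sessionId}`);
  const history: Turn[] = raw ? JSON.parse(raw) : [];
  history.push(turn);
  // ...call Claude with `history` here and push its reply as another Turn...
  await kv.put(`session:${sessionId}`, JSON.stringify(history));
  return history;
}
```

Keying by `session:${sessionId}` keeps each conversation isolated, and because the whole history is re-read on every request, the Worker itself stays stateless.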

Step 4: Integrate with Your Data

Connect your agent to a real data source:

  • Hyperdrive: For Postgres databases
  • R2: For documents and knowledge bases
  • External APIs: For third-party data (Stripe, Slack, etc.)

Start with read-only access. Once you’re confident, add write access.

Step 5: Add Observability

Log every Claude call, every decision, every error. Use Cloudflare’s Analytics Engine or push to Datadog. You won’t know what’s broken until you can see it.

Step 6: Test and Deploy

  1. Local testing: Use wrangler dev to test locally
  2. Staging: Deploy to a staging Worker and test with real data
  3. Gradual rollout: Deploy to production with a small percentage of traffic (5%, 10%, 25%) and monitor
  4. Monitoring: Set alerts for errors, latency, and token spend
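The gradual rollout in step 3 works best when a given user always lands in the same bucket. One common approach, sketched here with a simple multiplicative string hash (any stable hash works):

```typescript
// Deterministic percentage rollout: hash a stable user ID into 0-99
// and compare against the rollout percentage, so each user stays in
// the same bucket as the percentage grows from 5% to 10% to 25%.
function rolloutBucket(userId: string): number {
  let hash = 0;
  for (const ch of userId) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple multiplicative hash
  }
  return hash % 100;
}

function inRollout(userId: string, percent: number): boolean {
  return rolloutBucket(userId) < percent;
}
```

Because buckets below the threshold stay included as the threshold rises, users already on the new agent never flip back when you widen the rollout.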

Step 7: Iterate and Optimise

Once live:

  1. Collect feedback from users and systems
  2. Identify failing cases and add examples to your prompt
  3. Measure accuracy, latency, and cost
  4. Optimise the prompt, context, and architecture based on data

For teams new to agentic AI, this iterative approach beats trying to build the “perfect” agent upfront. Real data reveals the gaps in your design.

Resources and Further Reading

For deeper dives into Claude Code and Cloudflare Workers, refer to Claude Code Official Documentation and Claude Code + Cloudflare Agent Setup Documentation.

For broader context on agentic AI in production, PADISO has published extensive guides. Explore Agentic AI + Apache Superset: Letting Claude Query Your Dashboards for integrating agents with analytics tools, and The $2 Trillion Renaissance: Enterprise IT’s Agentic Reinvention for the broader market context.

For teams building AI-native products, AI Agency Sydney: Everything Sydney Business Owners Need to Know in 2026 and AI Automation Agency Sydney: The Complete Guide for Sydney Businesses in 2026 offer practical guidance on partnering with specialists.

If you’re automating specific domains, PADISO has published industry-specific guides: AI Automation for Retail: Inventory Management and Customer Experience, AI Automation for Supply Chain: Demand Forecasting and Inventory Management, AI Automation for Insurance: Claims Processing and Risk Assessment, AI Automation for Construction: Project Management and Safety Monitoring, and AI Automation for Agriculture: Precision Farming and Crop Management.


Summary

Claude Code on Cloudflare Workers unlocks a new class of AI application: edge-native, stateful, globally distributed agents that respond with sub-100ms latency.

The architecture—KV-backed memory, R2 file context, Hyperdrive-fronted Postgres, and Claude Code loops—is production-ready today. Teams at PADISO and across the industry are shipping agents that handle customer support, data validation, workflow automation, and complex decision-making.

Start small. Pick a bounded use case. Implement KV state, add observability, and iterate based on real data. Within weeks, you’ll have an agent that reduces manual work, improves consistency, and scales globally.

The future of software is agentic. The infrastructure is ready. Build.


Getting Help

If you’re building agents on Cloudflare Workers and need guidance on architecture, security, or scaling, PADISO offers CTO as a Service and fractional engineering leadership for seed-to-Series-B startups. We’ve shipped agentic AI systems for customer support, financial reconciliation, and operational automation. Reach out to discuss your project.