Using Opus 4.6 for Structured Output Extraction: Patterns and Pitfalls

Production-grade patterns for Opus 4.6 structured output extraction. Prompt design, validation, cost optimisation, and failure modes engineering teams hit.

The PADISO Team ·2026-06-15

Table of Contents

  1. Why Opus 4.6 for Structured Extraction
  2. Core Patterns for Reliable Extraction
  3. Prompt Design for Structured Outputs
  4. Output Validation and Error Handling
  5. Cost Optimisation Strategies
  6. Common Failure Modes and Fixes
  7. Production Deployment Checklist
  8. Real-World Implementation Examples
  9. When to Use Opus 4.6 vs. Smaller Models
  10. Next Steps and Resources

Why Opus 4.6 for Structured Extraction

Structured output extraction—pulling data from unstructured text, PDFs, images, or API responses into a defined schema—is one of the highest-ROI AI workloads in production systems. When you need to reliably extract invoice line items, compliance metadata, customer attributes, or entity relationships at scale, hallucination and schema violations kill adoption fast.

Opus 4.6 sits at the sweet spot: strong reasoning and instruction-following at a lower cost than earlier flagship models, with native support for constrained outputs via structured outputs in the Claude API. Unlike older approaches that relied on prompt-hacking or post-hoc validation, strict schema enforcement happens at generation time, which cuts latency and reduces the need for expensive retry loops.

This matters especially for teams building agentic workflows, compliance automation, and data pipelines. If you’re running extraction at scale—50+ documents per day, or mission-critical workflows where a single extraction error cascades downstream—Opus 4.6’s combination of accuracy, cost, and speed beats smaller models and older patterns.

For Sydney-based teams and Australian enterprises modernising with AI, this is a practical choice: you can deploy extraction pipelines that handle regulatory data (financial, insurance, healthcare) with audit-ready confidence. The AI Advisory Services Sydney team at PADISO has shipped extraction systems for banks, insurers, and fintechs using these exact patterns, reducing manual data entry by 70–90% whilst maintaining compliance.


Core Patterns for Reliable Extraction

Pattern 1: Strict Tool Use with JSON Schema

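The most reliable pattern is strict tool use: you define a tool with a JSON schema, force the tool call, and enable strict schema enforcement, so Opus 4.6 must respond with a payload that conforms to your schema. The model cannot add fields outside the schema or return raw text, although it can still put a wrong value in a valid field, which is why the validation steps later in this guide still matter.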

{
  "name": "extract_invoice_data",
  "description": "Extract structured data from an invoice document.",
  "input_schema": {
    "type": "object",
    "properties": {
      "invoice_number": {
        "type": "string",
        "description": "The unique invoice identifier"
      },
      "invoice_date": {
        "type": "string",
        "format": "date",
        "description": "ISO 8601 date"
      },
      "line_items": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "description": { "type": "string" },
            "quantity": { "type": "number" },
            "unit_price": { "type": "number" },
            "total": { "type": "number" }
          },
          "required": ["description", "quantity", "unit_price", "total"]
        }
      },
      "total_amount": {
        "type": "number",
        "description": "Total invoice amount in AUD or currency stated"
      }
    },
    "required": ["invoice_number", "invoice_date", "line_items", "total_amount"]
  }
}

When you send this schema with a message to Opus 4.6 using the tool-use API, the model will respond with a tool call that matches the schema exactly. No parsing, no regex cleanup, no null fields—just valid JSON.

This pattern is supported across the major platforms: Anthropic Claude models on Vertex AI (Google Cloud), Amazon Bedrock (AWS), and the native Anthropic Claude API.
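As a minimal sketch of how the tool definition turns into a request, the helper below builds the keyword arguments for a Messages API call and forces the tool with `tool_choice`. The helper names (`build_extraction_request`, `first_tool_input`) and the abbreviated `INVOICE_TOOL` are hypothetical, and the response blocks are assumed to have been converted to plain dicts before inspection.

```python
# Hypothetical tool definition, abbreviated from the schema above.
INVOICE_TOOL = {
    "name": "extract_invoice_data",
    "description": "Extract structured data from an invoice document.",
    "input_schema": {
        "type": "object",
        "properties": {"invoice_number": {"type": "string"}},
        "required": ["invoice_number"],
    },
}


def build_extraction_request(document_text, tool, model="claude-opus-4-6"):
    """Return keyword arguments for a Messages API call that forces `tool`."""
    return {
        "model": model,
        "max_tokens": 2048,
        "tools": [tool],
        # Without tool_choice the model may answer in prose instead.
        "tool_choice": {"type": "tool", "name": tool["name"]},
        "messages": [
            {"role": "user", "content": f"Extract invoice data:\n\n{document_text}"}
        ],
    }


def first_tool_input(content_blocks, tool_name):
    """Return the input payload of the first matching tool_use block, else None."""
    for block in content_blocks:
        if block.get("type") == "tool_use" and block.get("name") == tool_name:
            return block.get("input")
    return None
```

The returned dict would be passed as `client.messages.create(**request)`; keeping request construction separate from the network call makes it easy to unit test.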

Pattern 2: Direct JSON Mode (When Tool Use Isn’t Available)

If you’re integrating with a platform that doesn’t yet support strict tool use, or you’re running a quick prototype, you can use JSON mode: you instruct Opus 4.6 to respond with only valid JSON, and you validate the output client-side.

This is weaker than strict tool use—the model can hallucinate or produce invalid JSON—but with a strong prompt and validation, it works well for 80% of extraction tasks.

You are an expert data extraction system. Extract the following data from the provided document and respond ONLY with valid JSON, no additional text.

Document:
[INVOICE TEXT]

Respond with this exact JSON structure:
{
  "invoice_number": "...",
  "invoice_date": "YYYY-MM-DD",
  "line_items": [
    {
      "description": "...",
      "quantity": 0,
      "unit_price": 0,
      "total": 0
    }
  ],
  "total_amount": 0
}

The key here is being explicit: “respond ONLY with valid JSON”, show the exact structure, and include type hints (strings in quotes, numbers without quotes).
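Because JSON mode offers no guarantee, the client-side parse needs to tolerate the most common deviations. The sketch below is one hedged way to do that with the standard library only; `parse_json_reply` is a hypothetical helper name, and it assumes the reply should contain a single JSON object.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker


def parse_json_reply(text):
    """Parse a model reply that should contain exactly one JSON object.

    Returns (data, error): data is a dict on success, otherwise None with a
    short error string suitable for logging or a retry prompt.
    """
    cleaned = text.strip()
    # Models sometimes wrap JSON in a Markdown fence despite instructions.
    if cleaned.startswith(FENCE):
        lines = cleaned.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        cleaned = "\n".join(lines).strip()
    try:
        data = json.loads(cleaned)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg} at position {exc.pos}"
    if not isinstance(data, dict):
        return None, f"expected a JSON object, got {type(data).__name__}"
    return data, None
```

Returning the error as a string rather than raising makes it straightforward to feed the message back into a retry prompt.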

Pattern 3: Multi-Step Extraction with Chains

For complex documents (30+ pages, nested hierarchies, ambiguous sections), a single extraction call often fails. Instead, break the task into steps:

  1. Classify: What type of document is this? (Invoice, contract, compliance report, etc.)
  2. Locate: Where are the key sections? (Header, body, signature block, appendices.)
  3. Extract: Pull data from each section using a targeted schema.
  4. Validate: Check for consistency (dates in order, amounts add up, required fields present).
  5. Reconcile: If conflicts arise, re-extract with additional context.

This prompt-chaining approach reduces hallucination and gives you visibility into where extraction fails. It costs more in API calls, but for high-stakes data (regulatory filings, contracts, medical records), the accuracy gain justifies it.
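The five steps can be wired together as a small orchestration function. This is a sketch under stated assumptions: `run_extraction_chain` is a hypothetical name, and the `classify`, `locate`, `extract`, and `validate` callables are supplied by the caller (typically thin wrappers around model calls). Reconciliation is modelled as a second extraction pass that receives the previous validation errors.

```python
def run_extraction_chain(document, classify, locate, extract, validate, max_passes=2):
    """Run classify -> locate -> extract -> validate, re-extracting on conflicts.

    Returns a dict with the final data, remaining errors, and a step trace.
    """
    trace = []
    doc_type = classify(document)
    trace.append(("classify", doc_type))

    sections = locate(document, doc_type)  # e.g. {"header": "...", "body": "..."}
    trace.append(("locate", sorted(sections)))

    errors, data = [], {}
    for attempt in range(1, max_passes + 1):
        # On later passes, the previous errors are passed in as extra context.
        data = {name: extract(name, text, errors) for name, text in sections.items()}
        errors = validate(data)
        trace.append((f"extract+validate pass {attempt}", list(errors)))
        if not errors:
            break
    return {"doc_type": doc_type, "data": data, "errors": errors, "trace": trace}
```

The trace is what provides the visibility mentioned above: when a document fails, it shows which step produced the problem and what the validator reported on each pass.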


Prompt Design for Structured Outputs

Be Explicit About the Output Schema

Opus 4.6 is instruction-following, but it’s not a mind reader. If you want nested objects, arrays, or conditional fields, spell it out.

Bad prompt:

Extract the key information from this invoice.

Good prompt:

Extract structured data from this invoice. Return a JSON object with these fields:

  • invoice_number (string, required)
  • invoice_date (string, ISO 8601 format, required)
  • vendor (object with name, address, ABN, required)
  • line_items (array of objects, each with description, quantity, unit_price, total, required)
  • total_amount (number, required)
  • payment_terms (string, optional)
  • notes (string, optional)

The second version is longer, but it eliminates ambiguity. The model knows exactly what you expect and in what format.

Include Examples (Few-Shot Prompting)

Show Opus 4.6 one or two examples of correct extraction, especially for edge cases.

Example 1:
Input: "Invoice #INV-2024-001 dated 15 January 2024. Widget A x 5 @ $10 each = $50. Widget B x 2 @ $25 each = $50. Total: $100 AUD."
Output:
{
  "invoice_number": "INV-2024-001",
  "invoice_date": "2024-01-15",
  "line_items": [
    { "description": "Widget A", "quantity": 5, "unit_price": 10, "total": 50 },
    { "description": "Widget B", "quantity": 2, "unit_price": 25, "total": 50 }
  ],
  "total_amount": 100
}

Example 2:
Input: "Invoice 2024-ABC. Date: 20 Feb. Item: Software licence (annual) qty 1 @ $5000 = $5000. GST 10% = $500. Total: $5500."
Output:
{
  "invoice_number": "2024-ABC",
  "invoice_date": "2024-02-20",
  "line_items": [
    { "description": "Software licence (annual)", "quantity": 1, "unit_price": 5000, "total": 5000 }
  ],
  "total_amount": 5500
}

Few-shot examples boost accuracy by 5–15%, especially for non-standard formats or domain-specific data.
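One way to supply such examples is as alternating user and assistant turns ahead of the real document. The sketch below assumes that layout; `build_few_shot_messages` is a hypothetical helper, and the example data is taken from Example 1 above.

```python
import json


def build_few_shot_messages(examples, document):
    """Turn (input_text, expected_output) pairs into alternating chat turns.

    Each example becomes a user turn with the raw text and an assistant turn
    with the expected JSON; the real document goes in the final user turn.
    """
    messages = []
    for input_text, expected in examples:
        messages.append({"role": "user", "content": f"Input: {input_text}"})
        messages.append({"role": "assistant", "content": json.dumps(expected)})
    messages.append({"role": "user", "content": f"Input: {document}"})
    return messages


examples = [
    (
        "Invoice #INV-2024-001 dated 15 January 2024. Total: $100 AUD.",
        {"invoice_number": "INV-2024-001", "invoice_date": "2024-01-15", "total_amount": 100},
    )
]
messages = build_few_shot_messages(examples, "Invoice 2024-ABC. Total: $5500.")
```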

Specify Handling for Missing or Ambiguous Data

Tell Opus 4.6 what to do when data is missing or unclear:

If a field is not present in the document:
- If required: return null and note the missing field in a "warnings" array.
- If optional: omit the field from the JSON.

If a field is ambiguous (e.g., two possible dates):
- Return the most likely value based on context.
- Add a note in the "warnings" array explaining the ambiguity.

If you cannot extract a field with confidence, do not guess. Return null and warn.

This prevents hallucination and gives you a clear signal that manual review is needed.
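On the receiving side, that signal can be turned into a routing decision. The following is a minimal sketch, assuming the extraction carries the "warnings" array described above; `review_reasons` is a hypothetical helper name.

```python
def review_reasons(extraction, required_fields):
    """Return a list of reasons this extraction should go to manual review."""
    reasons = []
    for field in required_fields:
        if extraction.get(field) is None:
            reasons.append(f"required field '{field}' is missing or null")
    for warning in extraction.get("warnings") or []:
        reasons.append(f"model warning: {warning}")
    return reasons


extraction = {
    "invoice_number": None,
    "invoice_date": "2024-02-20",
    "warnings": ["year inferred from invoice number"],
}
for reason in review_reasons(extraction, ["invoice_number", "invoice_date"]):
    print(reason)
```

An empty list means the record can continue through the pipeline; anything else is queued for a human.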


Output Validation and Error Handling

Schema Validation

Even with strict tool use, validate the output before using it downstream. Use a JSON schema validator (Python’s jsonschema, Node’s ajv, or similar).

import json
from jsonschema import validate, ValidationError

schema = {
    "type": "object",
    "properties": {
        "invoice_number": { "type": "string" },
        "invoice_date": { "type": "string", "format": "date" },
        "total_amount": { "type": "number", "minimum": 0 }
    },
    "required": ["invoice_number", "invoice_date", "total_amount"]
}

try:
    validate(instance=extracted_data, schema=schema)
except ValidationError as e:
    print(f"Validation failed: {e.message}")
    # Log, alert, or retry

Business Logic Validation

Schema validation catches type errors. Business logic validation catches semantic errors:

  • Consistency: Do line-item totals sum to the invoice total?
  • Plausibility: Is the date in the future? Is the amount negative?
  • Completeness: Are all required fields present and non-empty?
  • Format: Is the invoice number in the expected format? Is the date a valid date?

from datetime import datetime

def validate_invoice_extraction(data):
    errors = []

    # Check that line items sum to total
    # (add any separately tracked tax, such as GST, before comparing)
    line_total = sum(item["total"] for item in data.get("line_items", []))
    total_amount = data.get("total_amount")
    if total_amount is None:
        errors.append("Missing total_amount")
    elif abs(line_total - total_amount) > 0.01:  # Allow 1 cent rounding
        errors.append(f"Line items sum to {line_total}, but total is {total_amount}")

    # Check that invoice date is valid and not in the future
    try:
        inv_date = datetime.strptime(data.get("invoice_date") or "", "%Y-%m-%d")
        if inv_date > datetime.now():
            errors.append(f"Invoice date {data['invoice_date']} is in the future")
    except ValueError:
        errors.append(f"Invalid invoice date: {data.get('invoice_date')}")

    # Check that all line items have positive quantities and non-negative prices
    for i, item in enumerate(data.get("line_items", [])):
        if item["quantity"] <= 0:
            errors.append(f"Line item {i} has non-positive quantity: {item['quantity']}")
        if item["unit_price"] < 0:
            errors.append(f"Line item {i} has negative unit price: {item['unit_price']}")

    return errors

Retry Logic

When validation fails, retry with more context or a tighter prompt:

def extract_with_retry(document, schema, max_retries=3):
    # call_opus_46, parse_response, validate_extraction and
    # log_extraction_failure are placeholders for your own helpers.
    prompt = document
    errors = []
    for attempt in range(max_retries):
        response = call_opus_46(prompt, schema)
        extracted = parse_response(response)

        errors = validate_extraction(extracted, schema)
        if not errors:
            return extracted

        # Retry with additional guidance, keeping the original document intact
        prompt = f"{document}\n\nPrevious extraction failed with errors: {errors}. Please re-extract carefully."

    # If all retries fail, return None and log
    log_extraction_failure(document, errors)
    return None

For Australian financial services teams, this is critical: if you’re extracting APRA-regulated data or ASIC-reportable information, validation must be airtight. The AI for Financial Services Sydney team at PADISO enforces multi-layer validation for all extraction pipelines, ensuring compliance and auditability.


Cost Optimisation Strategies

Right-Sizing Your Model

Opus 4.6 is powerful, but it’s not the cheapest model. For simple extraction tasks (straightforward invoices, basic entity extraction, single-field lookups), consider starting with a smaller model in the Sonnet or Haiku tier, then upgrade to Opus 4.6 only if accuracy drops below your threshold.

Benchmark on a sample of 50–100 real documents:

  • Sonnet: Fast, cheap, ~95% accuracy on simple tasks.
  • Opus 4.6: Slower, more expensive, ~99%+ accuracy on complex tasks.

If Sonnet hits 98% accuracy on your task, use Sonnet. If you need 99.5%, use Opus 4.6.
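To make that comparison concrete, accuracy has to be measured the same way for both models. The sketch below computes per-field accuracy and the share of documents with every field correct against hand-labelled data; `field_accuracy` is a hypothetical helper, and exact-match comparison is an assumption that may be too strict for free-text fields.

```python
def field_accuracy(predictions, gold, fields):
    """Return (per-field accuracy, share of documents with all fields correct).

    predictions and gold are parallel lists of dicts; a failed extraction can
    be represented as None and counts as wrong for every field.
    """
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and the same length")
    preds = [p or {} for p in predictions]
    per_field = {}
    for field in fields:
        correct = sum(1 for p, g in zip(preds, gold) if p.get(field) == g.get(field))
        per_field[field] = correct / len(gold)
    all_correct = sum(
        1 for p, g in zip(preds, gold) if all(p.get(f) == g.get(f) for f in fields)
    )
    return per_field, all_correct / len(gold)
```

Run it once per candidate model on the same labelled sample and compare the document-level figure against your threshold.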

Batch Processing

If you’re extracting from 1000+ documents, use the Anthropic Batch API. Batches are 50% cheaper than real-time API calls and process overnight. For non-urgent extraction (daily compliance reports, weekly data imports), this is a no-brainer.

import anthropic

client = anthropic.Anthropic()

requests = [
    {
        "custom_id": f"doc-{i}",
        "params": {
            "model": "claude-opus-4-6",
            "max_tokens": 1024,
            "messages": [
                {
                    "role": "user",
                    "content": f"Extract data from this document: {document}"
                }
            ]
        }
    }
    for i, document in enumerate(documents)
]

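# Message Batches is available on the stable client; older SDK versions exposed it under client.beta.
batch = client.messages.batches.create(
    requests=requests
)

print(f"Batch {batch.id} submitted. Most batches finish within 24 hours; poll for status before fetching results.")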

Prompt Compression

Large prompts cost more. If you’re including 10 examples, 50 lines of instructions, and a 5000-word document, you’re burning tokens on overhead.

  • Reuse system prompts: Keep the instructions in one fixed system message shared across calls. Reuse alone does not reduce the token count, but a stable prefix is what makes prompt caching possible.
  • Compress examples: Use 1–2 examples instead of 10.
  • Summarise context: Instead of including the full document, include a summary + key excerpts.

system_prompt = """You are an invoice extraction system. Extract structured data and respond with valid JSON."""

user_message = f"""Extract data from this invoice:

{document_excerpt}

Use this schema: {json.dumps(schema)}"""

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    system=system_prompt,  # Reused across many calls
    messages=[{"role": "user", "content": user_message}]
)

Token Counting

Use the Anthropic token counter to estimate costs before you scale:

from anthropic import Anthropic

client = Anthropic()

response = client.messages.count_tokens(
    model="claude-opus-4-6",
    messages=[
        {"role": "user", "content": "Extract invoice data from: ..."}
    ]
)

print(f"Input tokens: {response.input_tokens}")
print(f"Estimated cost: ${response.input_tokens * 0.003 / 1000}")

At the rates assumed above (~$3 per million input tokens, ~$15 per million output tokens; confirm against the current pricing page, as Opus-tier rates may be higher), a typical extraction call with a 1000-token document and 200-token output costs about $0.006: $0.003 for input plus $0.003 for output. At scale (1000 documents/day), that’s about $6/day, or ~$180/month, which is still reasonable for high-accuracy extraction.
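The arithmetic is easy to get wrong by a factor, so it helps to keep it in one small function. This sketch uses the per-million-token rates quoted above as inputs; they are assumptions to replace with the figures from the current pricing page, and `estimate_cost_usd` is a hypothetical helper.

```python
def estimate_cost_usd(input_tokens, output_tokens, input_rate, output_rate, batch_discount=0.0):
    """Cost of one call in USD; rates are USD per million tokens."""
    cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    return cost * (1.0 - batch_discount)


# Assumed rates: $3 per million input tokens, $15 per million output tokens.
per_call = estimate_cost_usd(1000, 200, input_rate=3.0, output_rate=15.0)
per_day = per_call * 1000   # 1000 documents per day
per_month = per_day * 30
batched_month = estimate_cost_usd(1000, 200, 3.0, 15.0, batch_discount=0.5) * 1000 * 30

print(f"per call ${per_call:.4f}, per day ${per_day:.2f}, per month ${per_month:.2f}")
print(f"per month via batch ${batched_month:.2f}")
```

At these assumed rates, input and output each contribute $0.003 to a call, so output tokens matter as much as the document itself even though there are five times fewer of them.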


Common Failure Modes and Fixes

Failure Mode 1: Hallucinated Fields

Symptom: Opus 4.6 returns fields that don’t exist in the document, or invents values.

Example: A document has no ABN, but the extraction returns "abn": "12345678901".

Cause: The model is trying to be helpful. If you ask for an ABN and the document doesn’t have one, it might guess rather than return null.

Fix:

  1. Make ABN optional in your schema, not required.
  2. In your prompt, explicitly say: “If ABN is not present, omit the field or return null. Do not guess.”
  3. Add a validation step: if ABN is present, check its format (11 digits, valid checksum).

def validate_abn(abn):
    if not abn:
        return True  # Optional field
    if not isinstance(abn, str):
        return False
    abn = abn.replace(" ", "")  # ABNs are often printed as "51 824 753 556"
    if len(abn) != 11 or not abn.isdigit():
        return False
    # ABN checksum: subtract 1 from the first digit, apply the weights,
    # and check that the weighted sum is divisible by 89
    weights = [10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
    digits = [int(d) for d in abn]
    digits[0] -= 1
    checksum = sum(d * w for d, w in zip(digits, weights)) % 89
    return checksum == 0

Failure Mode 2: Inconsistent Field Types

Symptom: Opus 4.6 returns a field as a string sometimes, a number other times.

Example: "total_amount": "1000" (string) in one extraction, "total_amount": 1000 (number) in another.

Cause: The model isn’t being strict about types, or the prompt is ambiguous.

Fix:

  1. Use strict tool use (not JSON mode) to enforce types at generation time.
  2. In your prompt, be explicit: “total_amount must be a number, not a string.”
  3. In validation, coerce types:

def coerce_types(data, schema):
    for field, field_schema in schema["properties"].items():
        value = data.get(field)
        if value is None:
            continue  # Missing or null: leave for validation to flag

        field_type = field_schema.get("type")
        try:
            if field_type == "number":
                data[field] = float(value)
            elif field_type == "integer":
                data[field] = int(value)
            elif field_type == "string":
                data[field] = str(value)
        except (TypeError, ValueError):
            raise ValueError(f"Cannot coerce {field}={value!r} to {field_type}")

    return data

Failure Mode 3: Off-by-One Errors in Arrays

Symptom: Opus 4.6 misses a line item, duplicates one, or reorders them.

Example: A 5-line invoice returns only 4 line items, or line items are in a different order.

Cause: The model is sampling, not deterministic. Complex arrays are harder to track.

Fix:

  1. Use a multi-step approach: first count line items, then extract each one.
  2. Include a line-item index in your schema:

{
  "line_items": [
    {
      "index": 1,
      "description": "...",
      "quantity": 0,
      "unit_price": 0,
      "total": 0
    }
  ]
}

  3. Validate that indices are sequential and no items are missing.
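The index check can be sketched as follows. `check_line_item_indices` is a hypothetical helper; it assumes indices start at 1 as in the schema fragment above, and `expected_count` stands for the number returned by the separate counting step.

```python
def check_line_item_indices(line_items, expected_count=None):
    """Return a list of problems with line-item ordering or count."""
    errors = []
    indices = [item.get("index") for item in line_items]
    expected = list(range(1, len(line_items) + 1))
    if indices != expected:
        errors.append(f"indices {indices} are not the sequence {expected}")
    if expected_count is not None and len(line_items) != expected_count:
        errors.append(f"expected {expected_count} line items, got {len(line_items)}")
    return errors
```

Comparing against the full expected sequence catches gaps, duplicates, and reordering in a single test.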

Failure Mode 4: Date Format Confusion

Symptom: Dates are parsed incorrectly (03/04/2024 is read as 4 March instead of 3 April, or 31/12/2024 is rejected because 31 is treated as a month).

Cause: Opus 4.6 may not recognise regional date formats (Australian DD/MM/YYYY vs. US MM/DD/YYYY).

Fix:

  1. In your prompt, specify: “All dates are in DD/MM/YYYY format (Australian). Convert to ISO 8601 (YYYY-MM-DD).”
  2. Validate dates with a strict parser:

from datetime import datetime

def parse_date_strict(date_str, fmt="%Y-%m-%d"):
    # Raises ValueError if date_str does not match fmt exactly
    try:
        return datetime.strptime(date_str, fmt)
    except ValueError:
        raise ValueError(f"Invalid date: {date_str}")
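Where the source documents are known to be Australian, the conversion can also be done deterministically in code instead of being left to the model. A minimal sketch, assuming the raw value is a DD/MM/YYYY string; `australian_to_iso` is a hypothetical helper.

```python
from datetime import datetime


def australian_to_iso(date_str):
    """Convert a DD/MM/YYYY string to ISO 8601 (YYYY-MM-DD).

    Raises ValueError if the string is not a valid day-first date.
    """
    return datetime.strptime(date_str.strip(), "%d/%m/%Y").date().isoformat()


print(australian_to_iso("31/12/2024"))
print(australian_to_iso("03/04/2024"))
```

A month-first string such as 12/31/2024 raises ValueError here, which is a useful signal that the document does not follow the assumed format.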

Failure Mode 5: Nested Object Extraction

Symptom: Nested objects (vendor details, addresses) are flattened or lost.

Example: Vendor name, address, and ABN should be nested under a vendor object, but they’re returned as top-level fields.

Cause: The model isn’t respecting the schema hierarchy.

Fix:

  1. Use strict tool use to enforce nesting.
  2. In your prompt, show the exact nesting structure:

Vendor information must be nested under a "vendor" object:
{
  "vendor": {
    "name": "...",
    "address": "...",
    "abn": "..."
  }
}
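If flattened output still slips through (for example in JSON mode), a small repair step can move the stray fields back under "vendor" before validation. This is a sketch only: `nest_vendor_fields` and the flat key names in the default mapping (`vendor_name`, `vendor_address`, `vendor_abn`) are hypothetical and should be replaced with whatever your own failures actually look like.

```python
def nest_vendor_fields(data, mapping=None):
    """Move flattened vendor fields under a "vendor" object.

    Returns (repaired copy, list of keys that were moved). Values already
    present under "vendor" are kept in preference to flat duplicates.
    """
    mapping = mapping or {
        "vendor_name": "name",
        "vendor_address": "address",
        "vendor_abn": "abn",
    }
    result = dict(data)
    vendor = dict(result.get("vendor") or {})
    moved = []
    for flat_key, nested_key in mapping.items():
        if flat_key in result:
            vendor.setdefault(nested_key, result.pop(flat_key))
            moved.append(flat_key)
    if vendor:
        result["vendor"] = vendor
    return result, moved
```

Logging the list of moved keys shows how often the repair fires, which indicates whether the prompt or schema needs tightening.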

Production Deployment Checklist

Before deploying structured extraction to production, work through this checklist:

Preparation

  • Define your schema in JSON Schema format and validate it.
  • Create 5–10 representative test documents (mix of simple, complex, edge cases).
  • Write detailed prompts with examples.
  • Run extraction on test documents and manually verify results.
  • Measure baseline accuracy (target: 95%+ for simple tasks, 98%+ for complex).

Implementation

  • Implement schema validation (jsonschema, ajv, or equivalent).
  • Implement business logic validation (consistency, plausibility, format checks).
  • Implement retry logic with exponential backoff.
  • Implement logging and alerting for failed extractions.
  • Set up monitoring for cost per extraction and latency.
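For the retry item, a minimal sketch of exponential backoff with jitter is shown below. `call_with_backoff` is a hypothetical helper; which exceptions count as retryable is an assumption to adapt to your client library, and the `sleep` parameter exists so the function can be tested without waiting.

```python
import random
import time


def call_with_backoff(fn, max_attempts=4, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(), retrying on failure with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from many workers failing at once
            sleep(delay + random.uniform(0, delay / 2))
```

This handles transient transport errors such as timeouts and rate limits; validation failures are better handled by the prompt-level retry shown earlier, since waiting does not change the model's answer.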

Testing

  • Unit test validation functions.
  • Integration test end-to-end extraction pipeline.
  • Load test with 100+ documents to check latency and cost.
  • Chaos test: inject malformed documents, missing fields, edge cases.

Deployment

  • Use the Claude API Migration Guide to ensure you’re on the latest Opus 4.6 version.
  • Deploy to staging environment first.
  • Run extraction on 100 staging documents, validate manually.
  • Set up canary deployment: extract 1% of production documents with Opus 4.6, compare to baseline.
  • Monitor for 1 week before full rollout.
  • Document the extraction pipeline, schema, and validation rules.
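The 1% canary can be selected deterministically so that the same document always lands in the same group, which keeps comparisons against the baseline reproducible. A sketch, assuming each document has a stable string identifier; `in_canary` is a hypothetical helper.

```python
import hashlib


def in_canary(document_id, percent=1.0):
    """Return True if this document falls in the canary group.

    Hashing the ID gives a stable, roughly uniform bucket in 0..9999,
    so percent=1.0 selects about 1 document in 100.
    """
    digest = hashlib.sha256(document_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10_000
    return bucket < percent * 100
```

Raising `percent` only adds documents to the group and never removes any, so widening the rollout does not reshuffle earlier assignments.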

Compliance (For Regulated Industries)

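  • If extracting financial data, confirm the pipeline meets the relevant regulatory obligations (APRA, ASIC, AUSTRAC); see AI for Financial Services Sydney.
  • If extracting insurance data, confirm the pipeline meets the relevant regulatory obligations (APRA, LIF); see AI for Insurance Sydney.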
  • Document audit trail: what was extracted, when, by which model version, validation results.
  • Implement data retention and deletion policies.
  • Conduct security audit if handling sensitive data. PADISO’s Fractional CTO & CTO Advisory in Sydney team can help with architecture and security sign-off.
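For the audit-trail item, one possible record shape is sketched below. The field names and the `build_audit_record` helper are hypothetical; storing a hash of the document rather than the document itself is a design assumption that suits cases where the raw file is retained elsewhere under its own policy.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(document_bytes, model, extracted, validation_errors):
    """Build a JSON-serialisable record of one extraction for the audit log."""
    return {
        "document_sha256": hashlib.sha256(document_bytes).hexdigest(),
        "model": model,
        "extracted_at": datetime.now(timezone.utc).isoformat(),
        "extracted": extracted,
        "validation_errors": list(validation_errors),
        "passed_validation": not validation_errors,
    }


record = build_audit_record(b"Invoice #INV-2024-001 ...", "claude-opus-4-6",
                            {"invoice_number": "INV-2024-001"}, [])
print(json.dumps(record, indent=2))
```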

Real-World Implementation Examples

Example 1: Invoice Extraction for Expense Management

Scenario: A scale-up is building an expense management platform. They need to extract invoices (PDF, email attachments, scanned images) and populate a database.

Schema:

{
  "type": "object",
  "properties": {
    "invoice_number": { "type": "string" },
    "invoice_date": { "type": "string", "format": "date" },
    "vendor": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "address": { "type": "string" },
        "abn": { "type": "string" },
        "email": { "type": "string" }
      },
      "required": ["name"]
    },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "quantity": { "type": "number" },
          "unit_price": { "type": "number" },
          "total": { "type": "number" }
        },
        "required": ["description", "quantity", "unit_price", "total"]
      }
    },
    "subtotal": { "type": "number" },
    "gst": { "type": "number" },
    "total_amount": { "type": "number" },
    "payment_terms": { "type": "string" },
    "due_date": { "type": "string", "format": "date" }
  },
  "required": ["invoice_number", "invoice_date", "vendor", "line_items", "total_amount"]
}

Prompt:

You are an expert invoice extraction system. Extract structured data from the provided invoice document.

The invoice may be in various formats: PDF, email, scanned image, or plain text.

Extract the following fields:
1. invoice_number: The unique identifier for this invoice (required).
2. invoice_date: The date the invoice was issued, in ISO 8601 format (YYYY-MM-DD) (required).
3. vendor: An object containing vendor/supplier details (required):
   - name: Vendor name (required)
   - address: Vendor address (optional)
   - abn: Australian Business Number, 11 digits (optional)
   - email: Vendor email (optional)
4. line_items: An array of items/services (required):
   - description: What was sold/provided
   - quantity: How many units
   - unit_price: Price per unit
   - total: Quantity × unit_price
5. subtotal: Total before GST (optional)
6. gst: GST amount (optional)
7. total_amount: Total invoice amount including GST (required)
8. payment_terms: Payment terms (e.g., "Net 30") (optional)
9. due_date: When payment is due, ISO 8601 format (optional)

If a field is not present in the document:
- If required: Return null and add a warning.
- If optional: Omit the field.

If a field is ambiguous, return the most likely value and add a warning.

Validation rules:
- invoice_number must be non-empty.
- invoice_date must be a valid date, not in the future.
- Line items must have positive quantities and prices.
- Line item totals must equal quantity × unit_price (within 1 cent).
- total_amount must equal sum of line items + GST (within 1 cent).
- ABN, if present, must be 11 digits and valid.

Respond with valid JSON only.

Implementation:

import anthropic
import json
from datetime import datetime

client = anthropic.Anthropic()

def extract_invoice(document_text):
    # INVOICE_SCHEMA is assumed to hold the JSON Schema shown above
    schema = INVOICE_SCHEMA

    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=2048,
        messages=[
            {
                "role": "user",
                "content": f"Extract invoice data from this document:\n\n{document_text}"
            }
        ],
        tools=[
            {
                "name": "extract_invoice_data",
                "description": "Extract structured invoice data",
                "input_schema": schema
            }
        ],
        # Force the model to call the extraction tool rather than reply in prose
        tool_choice={"type": "tool", "name": "extract_invoice_data"}
    )
    
    # Parse tool use response
    for block in response.content:
        if block.type == "tool_use":
            extracted = block.input
            
            # Validate
            errors = validate_invoice_extraction(extracted)
            if errors:
                print(f"Validation errors: {errors}")
                return None
            
            return extracted
    
    return None

def validate_invoice_extraction(data):
    errors = []
    
    # Check required fields
    if not data.get("invoice_number"):
        errors.append("Missing invoice_number")
    
    if not data.get("invoice_date"):
        errors.append("Missing invoice_date")
    else:
        try:
            inv_date = datetime.strptime(data["invoice_date"], "%Y-%m-%d")
            if inv_date > datetime.now():
                errors.append(f"Invoice date {data['invoice_date']} is in the future")
        except ValueError:
            errors.append(f"Invalid date format: {data['invoice_date']}")
    
    # Check line items
    line_items = data.get("line_items", [])
    if not line_items:
        errors.append("No line items found")
    
    line_total = 0
    for i, item in enumerate(line_items):
        if item["quantity"] <= 0:
            errors.append(f"Line item {i}: non-positive quantity")
        if item["unit_price"] < 0:
            errors.append(f"Line item {i}: negative unit price")
        
        expected_total = item["quantity"] * item["unit_price"]
        if abs(expected_total - item["total"]) > 0.01:
            errors.append(f"Line item {i}: total mismatch (expected {expected_total}, got {item['total']})")
        
        line_total += item["total"]
    
    # Check total
    gst = data.get("gst", 0)
    expected_total = line_total + gst
    if abs(expected_total - data.get("total_amount", 0)) > 0.01:
        errors.append(f"Total mismatch (expected {expected_total}, got {data.get('total_amount')})")
    
    return errors

# Usage
document = "Invoice #INV-2024-001 dated 15 January 2024..."  # PDF text or image OCR
result = extract_invoice(document)
if result:
    print(json.dumps(result, indent=2))

Example 2: Contract Clause Extraction for Legal Review

Scenario: A law firm is automating contract review. They need to extract key clauses (liability, termination, confidentiality, payment terms) from diverse contracts.

Schema:

{
  "type": "object",
  "properties": {
    "contract_type": { "type": "string", "enum": ["NDA", "Service Agreement", "Licence", "Employment", "Other"] },
    "parties": {
      "type": "array",
      "items": { "type": "string" }
    },
    "effective_date": { "type": "string", "format": "date" },
    "termination_clause": {
      "type": "object",
      "properties": {
        "notice_period_days": { "type": "integer" },
        "termination_for_cause": { "type": "boolean" },
        "termination_for_convenience": { "type": "boolean" },
        "text_excerpt": { "type": "string" }
      }
    },
    "liability_clause": {
      "type": "object",
      "properties": {
        "liability_cap": { "type": "string" },
        "excludes_indirect_damages": { "type": "boolean" },
        "text_excerpt": { "type": "string" }
      }
    },
    "confidentiality_clause": {
      "type": "object",
      "properties": {
        "duration_years": { "type": "integer" },
        "exceptions": { "type": "array", "items": { "type": "string" } },
        "text_excerpt": { "type": "string" }
      }
    },
    "payment_terms": {
      "type": "object",
      "properties": {
        "payment_amount": { "type": "string" },
        "payment_schedule": { "type": "string" },
        "currency": { "type": "string" }
      }
    },
    "risk_flags": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "flag": { "type": "string" },
          "severity": { "type": "string", "enum": ["low", "medium", "high"] },
          "explanation": { "type": "string" }
        }
      }
    }
  },
  "required": ["contract_type", "parties"]
}

Prompt:

You are an expert contract analyst. Extract key clauses and risk factors from the provided contract.

Analyse the contract and extract:
1. contract_type: Classify the contract (NDA, Service Agreement, Licence, Employment, or Other).
2. parties: List all parties to the contract.
3. effective_date: When does the contract take effect? (ISO 8601 format)
4. termination_clause: Extract termination details:
   - notice_period_days: How many days' notice is required to terminate?
   - termination_for_cause: Can either party terminate for cause?
   - termination_for_convenience: Can either party terminate without cause?
   - text_excerpt: The actual termination clause text (first 500 characters).
5. liability_clause: Extract liability details:
   - liability_cap: Is there a cap on liability? (e.g., "Limited to fees paid" or "$1M USD").
   - excludes_indirect_damages: Does the clause exclude indirect/consequential damages?
   - text_excerpt: The actual liability clause text (first 500 characters).
6. confidentiality_clause: Extract confidentiality details:
   - duration_years: How long does confidentiality last?
   - exceptions: What information is not confidential? (e.g., "Public information", "Independently developed").
   - text_excerpt: The actual confidentiality clause text (first 500 characters).
7. payment_terms: Extract payment details:
   - payment_amount: How much is being paid? (e.g., "$50,000 AUD" or "Quarterly fees of $10k").
   - payment_schedule: When are payments due? (e.g., "Net 30", "Upon execution").
   - currency: What currency? (e.g., "AUD", "USD").
8. risk_flags: Identify any risky or unusual clauses:
   - flag: What is the risk? (e.g., "Unlimited liability", "Unreasonable IP assignment").
   - severity: How serious? (low, medium, high).
   - explanation: Why is this risky?

If a clause is not present, omit it or return null for optional fields.

Respond with valid JSON only.

This extraction is more complex than invoices—it requires reasoning about legal language, identifying risks, and synthesising information across multiple clauses. Opus 4.6’s reasoning capability shines here.
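Because this output is partly interpretive, the checks differ from the invoice case. The sketch below validates the enumerated values and excerpt lengths from the schema and prompt above, and optionally confirms that each quoted excerpt actually appears in the source text. `validate_contract_extraction` is a hypothetical helper, and the whitespace-normalised substring match is an assumption that will not survive heavy OCR noise.

```python
ALLOWED_CONTRACT_TYPES = {"NDA", "Service Agreement", "Licence", "Employment", "Other"}
ALLOWED_SEVERITIES = {"low", "medium", "high"}
CLAUSE_KEYS = ("termination_clause", "liability_clause", "confidentiality_clause")


def _normalise(text):
    """Collapse runs of whitespace so line breaks do not defeat matching."""
    return " ".join(text.split())


def validate_contract_extraction(data, source_text=None, max_excerpt_chars=500):
    """Return a list of problems found in a contract extraction."""
    errors = []
    if data.get("contract_type") not in ALLOWED_CONTRACT_TYPES:
        errors.append(f"unexpected contract_type: {data.get('contract_type')!r}")
    if not data.get("parties"):
        errors.append("no parties extracted")
    for key in CLAUSE_KEYS:
        excerpt = (data.get(key) or {}).get("text_excerpt")
        if not excerpt:
            continue
        if len(excerpt) > max_excerpt_chars:
            errors.append(f"{key}: excerpt longer than {max_excerpt_chars} characters")
        # Grounding check: a quoted excerpt should exist in the contract
        if source_text is not None and _normalise(excerpt) not in _normalise(source_text):
            errors.append(f"{key}: excerpt not found verbatim in the source")
    for i, flag in enumerate(data.get("risk_flags") or []):
        if flag.get("severity") not in ALLOWED_SEVERITIES:
            errors.append(f"risk_flags[{i}]: unexpected severity {flag.get('severity')!r}")
    return errors
```

The excerpt check is the most valuable of these for legal review: a clause summary can be debated, but a quotation that does not exist in the contract is an unambiguous failure.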


When to Use Opus 4.6 vs. Smaller Models

Use Opus 4.6 When:

  1. Accuracy is critical: Financial data, legal contracts, healthcare records, compliance documents. A 1% error rate costs more than the extra model cost.
  2. Documents are complex: 20+ pages, nested structures, ambiguous language, multiple sections.
  3. Reasoning is required: Extracting intent, identifying risks, synthesising information from multiple sections.
  4. Edge cases are common: Non-standard formats, handwritten notes, poor OCR, mixed languages.
  5. Volume is moderate: 50–500 documents/day. At higher volumes, cost becomes prohibitive.

Use Smaller Models (Sonnet, Haiku) When:

  1. Accuracy target is 95%+: Simple, well-formatted data. Invoices from known vendors, structured forms, standardised documents.
  2. Documents are simple: 1–5 pages, clear structure, standard format.
  3. Volume is high: 1000+ documents/day. Cost per extraction matters.
  4. Latency is critical: Real-time extraction (chatbots, live document processing). Smaller models are faster.
  5. You have budget constraints: Startups with tight burn rates, POCs, early-stage validation.

Hybrid Approach:

Start with Sonnet. Measure accuracy on your real documents. If you hit your target (95%+), ship Sonnet. If you miss (90%–94%), upgrade to Opus 4.6 or add a validation + retry layer with Sonnet. This balances cost and accuracy.

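def extract_with_fallback(document, schema):
    # extract_with_model and validate_extraction are placeholders for your own helpers;
    # validate_extraction returns a list of errors (empty when the result is valid).
    # Try Sonnet first (cheaper); substitute the Sonnet model ID currently available to you
    result = extract_with_model(document, schema, "claude-3-5-sonnet-20241022")

    if not validate_extraction(result, schema):
        return result  # Good enough

    # Validation failed, retry with Opus 4.6 (same model ID as the earlier examples)
    result = extract_with_model(document, schema, "claude-opus-4-6")
    return result  # Use Opus result, even if imperfect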

For teams building at scale, this hybrid approach is practical. For regulated industries (financial services, insurance, healthcare), the confidence and auditability of Opus 4.6 justifies the cost. PADISO’s Platform Development in Sydney team has implemented this pattern across multiple clients, reducing extraction costs by 20–30% whilst maintaining 99%+ accuracy.


Next Steps and Resources

Build a Prototype

  1. Pick a small dataset (10–20 documents) from your use case.
  2. Define a schema in JSON Schema format.
  3. Write a prompt with 2–3 examples.
  4. Call Opus 4.6 using strict tool use.
  5. Validate outputs and measure accuracy.
  6. Iterate on schema and prompt based on failures.

Measure and Iterate

  • Baseline: Extract 50 documents with your current process (manual, regex, older model). Measure accuracy and time.
  • Opus 4.6: Extract the same 50 documents with Opus 4.6. Measure accuracy, cost, and latency.
  • Compare: Is Opus 4.6 more accurate? Is the cost acceptable? Is latency acceptable?
  • Iterate: If accuracy is below target, refine the schema and prompt. If cost is too high, consider a smaller model or batch processing.

Deploy to Production

Once you’ve validated on 50+ documents, follow the Production Deployment Checklist above. For regulated industries, ensure you have security and compliance sign-off. If you’re in financial services, insurance, or healthcare in Australia, PADISO’s AI Advisory Services Sydney team can help architect a compliant extraction pipeline.


Summary

Opus 4.6 is a practical, production-ready choice for structured output extraction. It combines strong reasoning with native support for constrained outputs, cutting hallucination and validation overhead.

The key patterns are:

  1. Strict tool use: Define a JSON schema, let Opus 4.6 enforce it at generation time.
  2. Explicit prompts: Show the exact output format, include examples, specify error handling.
  3. Multi-layer validation: Schema validation (types), business logic validation (consistency), and format validation (dates, numbers).
  4. Cost optimisation: Right-size your model, use batching for high volume, compress prompts.
  5. Failure handling: Retry with more context, validate aggressively, log everything.

For teams building extraction pipelines—expense management, contract review, compliance automation, data onboarding—Opus 4.6 reduces manual work by 70–90% whilst maintaining 99%+ accuracy. At scale, that’s a 10–20x ROI in labour savings.

Start with a prototype on 10–20 real documents. Measure accuracy and cost. If you hit your targets, scale to production. If you’re in a regulated industry, get security and compliance sign-off before going live.

For Sydney-based teams and Australian enterprises, PADISO’s Platform Development in Sydney and AI Advisory Services Sydney teams have shipped extraction systems for banks, insurers, and fintechs. We can help you architect, validate, and deploy extraction pipelines that pass audits and scale to thousands of documents per day.
