Progressive Disclosure in Agent Skills: Reference Files That Don't Bloat Context
Master progressive disclosure in AI agent skills to keep context lean while maintaining full functionality. Learn reference file strategies that scale.
Table of Contents
- What Progressive Disclosure Means for Agent Skills
- Why Context Bloat Kills Agent Performance
- Three-Tier Skill Structure: Instructions, Reference, Examples
- Reference Files as Progressive Disclosure
- Practical Implementation Patterns
- Measuring Context Efficiency
- Real-World Scaling Scenarios
- Common Pitfalls and How to Avoid Them
- Next Steps: Building Your Skill Framework
What Progressive Disclosure Means for Agent Skills
Progressive disclosure is a design pattern that reveals complexity only when needed. In the context of AI agents, it means your agent doesn’t load every piece of documentation, every example, and every edge case into its context window upfront. Instead, it loads what’s essential to begin, then pulls in deeper details—reference files, examples, error handlers—only when the agent actually needs them.
This matters because context windows, even large ones, are finite resources. Claude’s 200K context window is substantial, but if you’re building production agents that handle multiple domains, manage workflows across systems, or need to reference extensive API documentation, you’ll run into practical limits. A single agent managing customer support, billing inquiries, and technical troubleshooting across three different platforms can easily consume 80K+ tokens just describing what it might do. Progressive disclosure lets you describe what it will do first, and defer the rest.
The pattern applies directly to agent skills—the discrete, reusable capabilities an agent can invoke. As the Agent Skills: Progressive Disclosure as a System Design Pattern resource explains, Agent Skills apply progressive disclosure to context management, keeping context lean while preserving configurability. Rather than embedding full skill documentation inline, you structure skills in tiers: a lightweight instruction layer that tells the agent which skills exist and when to use them, a reference layer it can request when needed, and an examples layer for learning by doing.
At PADISO, we’ve built this pattern into our AI & Agents Automation services because it’s the difference between an agent that works in development and one that scales in production. A startup we worked with had an agent managing 15 different integrations—Slack, Salesforce, Stripe, Jira, and others. Without progressive disclosure, the context was so bloated that the agent took 8–12 seconds to respond to simple queries. After restructuring with progressive disclosure, response times dropped to 1–2 seconds, and token costs fell by 40%.
Why Context Bloat Kills Agent Performance
Context bloat has three direct consequences: latency, cost, and hallucination.
Latency
AI models process tokens sequentially. A 150K token context takes longer to process than a 50K token context. If your agent’s context includes 50 skills, each with full documentation, examples, and error cases, you’re forcing the model to read through all of that before it can reason about the user’s request. In production systems, this translates to noticeable delays. Users expect sub-second responses; agents that take 5–10 seconds feel broken, even if the answer is correct.
Progressive disclosure fixes this by loading only the skill metadata upfront—a one-line description of what the skill does and when to invoke it. The full documentation loads only when the agent decides it needs it, which it signals through a tool call. That extra round trip is cheap compared with re-processing a bloated context on every single request.
Cost
Large context windows are expensive. If you’re using Claude or another commercial LLM, you pay per input token. A 150K context on every request adds up. We’ve seen teams accidentally spend 10x more on LLM costs than necessary because they never optimised their skill structure. With progressive disclosure, you’re paying for the context you actually use, not the context you might use.
One client we advised was running a customer support agent with 30KB of documentation embedded in every request. By moving to progressive disclosure—keeping only a 2KB skill index in the main context and loading documentation on demand—they cut their token consumption by 65% while improving response quality.
Hallucination
This is the most insidious problem. When context is bloated, the model has to sift through irrelevant information to find what it needs. This increases the likelihood of hallucination—the model confidently inventing details because it got confused by conflicting or tangential information in the context. A skill description buried on line 200 of your context might contradict one on line 50. The model might blend them together, or ignore one entirely, leading to incorrect tool calls or fabricated parameters.
Progressive disclosure reduces hallucination by keeping context focused. The agent sees only what it needs to reason about the current task. When it needs more detail, it explicitly requests it, which gives the model a clear signal that it’s entering a new context zone.
Three-Tier Skill Structure: Instructions, Reference, Examples
The most effective pattern we’ve implemented—and that’s documented in resources like Building an Internal Agent: Progressive Disclosure and Handling Large Files—uses three tiers:
Tier 1: Instructions (Always Loaded)
The instruction tier is the absolute minimum. It answers three questions:
- What does this skill do? (One sentence.)
- When should I use it? (Trigger conditions or use cases.)
- What parameters does it take? (Input schema only, no defaults or examples yet.)
For a “fetch customer profile” skill, the instruction might be:
Skill: fetch_customer_profile
Description: Retrieves the current customer's profile data including name, email, account status, and plan tier.
When to use: When the user asks about their account, billing, or subscription details.
Parameters: customer_id (string, required)
That’s it. No explanation of error codes, no list of optional fields, no examples. Just enough for the agent to know the skill exists and when to invoke it.
Tier 2: Reference (Loaded on Demand)
The reference tier is what the agent can request when it needs to understand the full contract of a skill. It includes:
- Full parameter documentation (required, optional, defaults, constraints)
- Response schema and field descriptions
- Error codes and recovery strategies
- Rate limits or quota information
- Integration notes (e.g., “This skill requires the customer to have enabled data sharing”)
The agent requests this tier by calling a get_skill_reference tool, which retrieves the documentation from a separate knowledge base. This keeps the main context lean but makes the information immediately available when needed.
For the same “fetch customer profile” skill, the reference might be:
{
  "skill": "fetch_customer_profile",
  "parameters": {
    "customer_id": {
      "type": "string",
      "required": true,
      "description": "The unique identifier for the customer. Can be email or UUID."
    },
    "include_billing_history": {
      "type": "boolean",
      "required": false,
      "default": false,
      "description": "If true, includes the last 12 months of billing records."
    }
  },
  "response": {
    "customer_id": "string",
    "name": "string",
    "email": "string",
    "account_status": "enum: active | suspended | cancelled",
    "plan_tier": "enum: free | pro | enterprise",
    "billing_history": "array of objects (if include_billing_history=true)"
  },
  "errors": {
    "CUSTOMER_NOT_FOUND": "The customer_id does not match any account.",
    "INSUFFICIENT_PERMISSIONS": "The requesting user does not have permission to view this customer's data."
  },
  "rate_limit": "100 requests per minute per API key"
}
This lives in a separate file or database, not in the agent’s main prompt. The agent knows how to request it, but doesn’t load it unless needed.
Tier 3: Examples (Loaded for Learning)
Examples are the third tier. They’re useful for teaching the agent how to use a skill correctly, but they’re verbose and context-heavy. Rather than embedding them in the main skill definition, you structure them separately and load them only when the agent is learning a new skill or when it’s made an error with that skill.
For example, if the agent tries to call fetch_customer_profile with an invalid parameter, you might inject a few worked examples showing correct usage:
Example 1: Fetch a profile by email
User: "What's the account status for alice@example.com?"
Agent reasoning: The user is asking about account status. I should use fetch_customer_profile with customer_id='alice@example.com'.
Agent call: fetch_customer_profile(customer_id="alice@example.com")
Result: {"customer_id": "alice@example.com", "account_status": "active", ...}
Agent response: "Alice's account is active and on the Pro plan."
Example 2: Fetch a profile with billing history
User: "Show me the billing history for customer 12345."
Agent reasoning: The user wants billing history. I should use fetch_customer_profile with customer_id='12345' and include_billing_history=true.
Agent call: fetch_customer_profile(customer_id="12345", include_billing_history=true)
Result: {"customer_id": "12345", ..., "billing_history": [...]}
Agent response: "Here's the billing history for customer 12345..." [lists transactions]
These examples are stored separately. The agent’s main prompt includes a note that examples are available, but doesn’t load them upfront. You load them conditionally—after an error, during initial skill onboarding, or when the agent explicitly requests them.
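That conditional loading can be sketched in a few lines. The layout here is hypothetical—an `examples/` directory with one markdown file per skill, and trigger labels like `after_error`—so adapt both to your own system:

```python
from pathlib import Path

# Hypothetical layout: examples/<skill_name>.md holds worked examples.
EXAMPLES_DIR = Path("examples")

def maybe_load_examples(skill_name, trigger):
    """Return example text only for the triggers that justify the context cost."""
    if trigger not in ("after_error", "onboarding", "agent_request"):
        return ""  # routine requests carry no example weight
    path = EXAMPLES_DIR / f"{skill_name}.md"
    if not path.exists():
        return ""
    return f"\n\nWorked examples for {skill_name}:\n{path.read_text()}"
```

The key design choice is that the default path returns an empty string: examples cost tokens only when one of the named triggers fires.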
This three-tier structure is the foundation of progressive disclosure. It ensures that the agent always has enough context to reason about what to do, but doesn’t carry unnecessary weight.
Reference Files as Progressive Disclosure
Reference files are the practical implementation of the reference tier. Instead of embedding documentation in your prompt, you structure it as separate, indexed files that the agent can retrieve on demand.
Why Reference Files Work
Reference files work because they separate concerns. Your agent’s main prompt stays focused on reasoning and decision-making. Documentation stays in a queryable knowledge base. This separation has several benefits:
1. Modularity. You can update a skill’s documentation without touching the agent’s core prompt. If an API changes, you update the reference file, and the agent automatically uses the new version on the next request.
2. Scalability. As documented in the Skills (Progressive Disclosure) - AgentScope Java documentation, agents load skill metadata and content on demand, making it possible to manage hundreds of skills without context explosion.
3. Clarity. The agent knows exactly where to find information. Instead of searching through a 100KB prompt for a detail, it calls get_skill_reference("fetch_customer_profile") and gets a structured response.
4. Versioning. You can maintain multiple versions of a reference file (e.g., for different API versions or feature flags) and load the correct one based on the agent’s context.
Structuring Reference Files
Reference files should be structured for quick lookup. A typical structure includes:
reference_files/
├── skills/
│ ├── fetch_customer_profile.json
│ ├── update_customer_email.json
│ ├── create_support_ticket.json
│ └── ...
├── integrations/
│ ├── salesforce_api.md
│ ├── stripe_api.md
│ └── ...
├── error_codes.json
└── rate_limits.json
Each file is self-contained and can be retrieved independently. The agent’s prompt includes instructions on how to request a reference file:
If you need detailed information about a skill, call the 'get_skill_reference' tool with the skill name.
Example: get_skill_reference("fetch_customer_profile")
If you need information about an integration or API, call 'get_integration_reference' with the integration name.
Example: get_integration_reference("salesforce")
These calls are low-cost and return structured information that helps you make better decisions.
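Behind those tool names sits a simple lookup. A minimal sketch, assuming the reference_files/ layout shown above (the function names mirror the prompt, but the storage details are illustrative):

```python
import json
from pathlib import Path

REFERENCE_ROOT = Path("reference_files")  # matches the directory tree above

def get_skill_reference(skill_name: str) -> dict:
    """Resolve a skill name to its reference file and return the parsed JSON."""
    path = REFERENCE_ROOT / "skills" / f"{skill_name}.json"
    if not path.exists():
        return {"error": f"No reference file for skill '{skill_name}'"}
    return json.loads(path.read_text())

def get_integration_reference(integration_name: str) -> str:
    """Integration docs are markdown in this layout, so return raw text."""
    path = REFERENCE_ROOT / "integrations" / f"{integration_name}_api.md"
    if not path.exists():
        return f"No reference file for integration '{integration_name}'"
    return path.read_text()
```

Returning a structured error for unknown names matters: the agent can recover from "no such skill" but not from an unhandled exception in the tool layer.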
This approach is aligned with what researchers have found about Progressive Disclosure of Agent Tools from the Perspective of CLI Tool Style: when tools are organised like a CLI (with progressive disclosure), agents become more flexible and scalable.
Indexing for Efficient Retrieval
To make reference file retrieval efficient, maintain an index. The index is a lightweight file that lists all available references:
{
  "skills": [
    {"name": "fetch_customer_profile", "category": "customer", "tags": ["read", "profile"]},
    {"name": "update_customer_email", "category": "customer", "tags": ["write", "email"]},
    {"name": "create_support_ticket", "category": "support", "tags": ["write", "ticket"]}
  ],
  "integrations": [
    {"name": "salesforce", "version": "v2.1", "docs_url": "reference_files/integrations/salesforce_api.md"},
    {"name": "stripe", "version": "v1.2", "docs_url": "reference_files/integrations/stripe_api.md"}
  ]
}
Include this index in the agent’s main context (it’s small, typically 2–5KB). The agent can search the index to find the right reference file, then request it by name. This two-step process (index lookup, then file retrieval) is faster than embedding everything upfront.
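The index-lookup half of that two-step process is a plain filter over the index JSON. A minimal sketch using the index structure above:

```python
def find_skills(index: dict, category=None, tag=None):
    """Filter the skill index before fetching any full reference file."""
    hits = index["skills"]
    if category:
        hits = [s for s in hits if s["category"] == category]
    if tag:
        hits = [s for s in hits if tag in s["tags"]]
    return [s["name"] for s in hits]

index = {
    "skills": [
        {"name": "fetch_customer_profile", "category": "customer", "tags": ["read", "profile"]},
        {"name": "update_customer_email", "category": "customer", "tags": ["write", "email"]},
        {"name": "create_support_ticket", "category": "support", "tags": ["write", "ticket"]},
    ]
}

# Narrow to customer skills that write, then request just those references:
# find_skills(index, category="customer", tag="write") → ["update_customer_email"]
```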
Practical Implementation Patterns
Now let’s move from theory to practice. Here’s how to implement progressive disclosure in a real agent system.
Pattern 1: Lazy Loading with Tool Calls
The simplest pattern uses a tool call to trigger reference file loading. Define a tool in your agent’s system prompt:
Tool: get_skill_reference
Description: Retrieves detailed documentation for a skill.
Parameters:
skill_name (string, required): The name of the skill to get documentation for.
Returns: A JSON object containing the skill's full documentation.
Usage: When you're about to call a skill and need to understand its parameters, error codes, or response format, call this tool first.
When the agent encounters a situation where it needs more information, it calls the tool:
User: "I need to update the customer's email address to newaddress@example.com."
Agent reasoning: The user wants to update a customer's email. I should use the update_customer_email skill, but I need to check the parameter names and required fields first.
Agent action: Call get_skill_reference(skill_name="update_customer_email")
Tool response:
{
  "skill": "update_customer_email",
  "parameters": {
    "customer_id": {"type": "string", "required": true},
    "new_email": {"type": "string", "required": true},
    "send_confirmation": {"type": "boolean", "required": false, "default": true}
  },
  "response": {"success": "boolean", "message": "string"},
  "errors": {"INVALID_EMAIL": "...", "CUSTOMER_NOT_FOUND": "..."}
}
Agent action: Call update_customer_email(customer_id="...", new_email="newaddress@example.com")
This pattern is efficient because the agent only loads the reference when it needs it. For simple queries, the agent might never call get_skill_reference at all.
Pattern 2: Conditional Context Injection
For frequently-used skills or high-risk operations, you might conditionally inject reference information based on the user’s intent. This is a hybrid approach that balances context size with safety.
For example, if the user’s query mentions “billing” or “payment,” you automatically inject the reference for billing-related skills:
def prepare_agent_context(user_query, base_prompt):
    context = base_prompt
    query = user_query.lower()
    # Detect user intent from simple keyword matching
    if any(keyword in query for keyword in ["billing", "invoice", "payment", "charge"]):
        billing_ref = load_reference_file("skills/billing_operations.json")
        context += f"\n\nRelevant skill documentation:\n{billing_ref}"
    if any(keyword in query for keyword in ["delete", "remove", "cancel"]):
        destructive_ref = load_reference_file("skills/destructive_operations.json")
        context += f"\n\nIMPORTANT - Destructive operation documentation:\n{destructive_ref}"
    return context
This approach keeps context small for routine queries while ensuring the agent has detailed guidance for complex or risky operations.
Pattern 3: Skill Grouping and Namespacing
As your agent’s skill set grows, organise skills into logical groups. This makes it easier for the agent to find the right skill and reduces the cognitive load of managing dozens of separate tools.
Skill groups:
- customer.*: fetch_customer_profile, update_customer_email, list_customer_orders
- billing.*: create_invoice, update_payment_method, refund_transaction
- support.*: create_ticket, update_ticket_status, assign_ticket
- admin.*: create_user, reset_password, audit_log
Your instruction tier then includes just the group names:
Available skill groups:
- customer: Manage customer profiles, contact info, and order history
- billing: Handle invoices, payments, and refunds
- support: Create and manage support tickets
- admin: User management and system administration
To see available skills in a group, call get_skill_group(group_name="customer").
To get detailed documentation for a skill, call get_skill_reference(skill_name="fetch_customer_profile").
This reduces your instruction tier from 50+ skill definitions to 4 group definitions, while maintaining full functionality.
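The get_skill_group tool itself can be a dictionary lookup over a group registry, with the registry contents taken from the grouping above (a sketch, not a prescribed API):

```python
# Group registry; the instruction tier carries only the four group names.
SKILL_GROUPS = {
    "customer": ["fetch_customer_profile", "update_customer_email", "list_customer_orders"],
    "billing": ["create_invoice", "update_payment_method", "refund_transaction"],
    "support": ["create_ticket", "update_ticket_status", "assign_ticket"],
    "admin": ["create_user", "reset_password", "audit_log"],
}

def get_skill_group(group_name: str) -> list:
    """Second disclosure step: expand a group name into its skill names."""
    return SKILL_GROUPS.get(group_name, [])
```

From here the agent drills down one more level with get_skill_reference on whichever name it picks.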
Pattern 4: Error-Driven Reference Loading
When an agent makes a mistake—calls a skill with wrong parameters, or misinterprets the response—use that as a trigger to inject relevant reference information.
def handle_agent_error(error_type, skill_name, agent_context):
    if error_type == "INVALID_PARAMETERS":
        # Inject the reference file for this skill
        ref = load_reference_file(f"skills/{skill_name}.json")
        return f"{agent_context}\n\nError: Invalid parameters. Here's the skill documentation:\n{ref}"
    elif error_type == "RATE_LIMIT":
        # Inject rate limit information
        rate_limits = load_reference_file("rate_limits.json")
        return f"{agent_context}\n\nRate limit exceeded. Current limits:\n{rate_limits}"
    return agent_context
This pattern ensures the agent learns from mistakes without bloating the initial context.
Measuring Context Efficiency
Progressive disclosure is an optimisation strategy, and like any optimisation, you need to measure it. Track three metrics:
1. Context Window Utilisation
Measure the average context size per request:
Context utilisation = (tokens_used / max_context_window) × 100
For Claude with a 200K context window, aim for 20–40% utilisation on average. If you’re consistently using 80%+, you’re not benefiting from progressive disclosure; consider moving more content to reference files.
Before progressive disclosure: 95K tokens average (47.5% utilisation)
After progressive disclosure: 35K tokens average (17.5% utilisation)
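The utilisation figure is a one-line calculation; a small helper makes it easy to log per request (the 200K default matches the Claude window cited above):

```python
def context_utilisation(tokens_used: int, max_context_window: int = 200_000) -> float:
    """Percentage of the context window consumed by a request."""
    return 100.0 * tokens_used / max_context_window

# 95K tokens → 47.5%, 35K tokens → 17.5%
```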
2. Reference File Requests per Session
Track how often the agent requests reference files:
Reference requests = (total_get_skill_reference_calls / total_requests) × 100
This tells you whether the agent is actually using progressive disclosure or if your instruction tier is too sparse. A healthy ratio is 15–30% (the agent requests reference info for 1 in 3–7 requests). If it’s above 50%, your instruction tier needs more detail. If it’s below 5%, you might be over-documenting in the instruction tier.
3. Latency and Cost Per Request
Measure the end-to-end impact:
- Average latency per request
- Average token cost per request
- Average time-to-first-token
Progressive disclosure should reduce all three. A typical improvement:
- Latency: 8–10 seconds → 2–3 seconds (70% reduction)
- Token cost: $0.15 per request → $0.05 per request (67% reduction)
- Time-to-first-token: 2–3 seconds → 0.5–1 second (60% reduction)
If you’re not seeing these improvements, revisit your skill structure. You might have skills that are too granular (causing excessive tool calls) or reference files that are too large (negating the benefit of lazy loading).
Tracking in Practice
Implement logging at the agent level:
from datetime import datetime

class AgentMetrics:
    def __init__(self):
        self.logs = []

    def log_request(self, user_query, initial_context_tokens, final_context_tokens,
                    latency_ms, reference_calls, cost_usd):
        self.logs.append({
            "timestamp": datetime.now().isoformat(),
            "query": user_query,
            "initial_context_tokens": initial_context_tokens,
            "final_context_tokens": final_context_tokens,
            "latency_ms": latency_ms,
            "reference_calls": reference_calls,
            "cost_usd": cost_usd
        })

    def get_summary(self):
        total_requests = len(self.logs)
        if total_requests == 0:
            return {"total_requests": 0}  # guard against division by zero
        avg_latency = sum(log["latency_ms"] for log in self.logs) / total_requests
        avg_context = sum(log["final_context_tokens"] for log in self.logs) / total_requests
        total_reference_calls = sum(log["reference_calls"] for log in self.logs)
        reference_call_rate = (total_reference_calls / total_requests) * 100
        total_cost = sum(log["cost_usd"] for log in self.logs)
        return {
            "total_requests": total_requests,
            "avg_latency_ms": avg_latency,
            "avg_context_tokens": avg_context,
            "reference_call_rate_percent": reference_call_rate,
            "total_cost_usd": total_cost
        }
Review these metrics weekly. Look for trends: Is latency increasing? Are reference calls climbing? These are signs that your skill structure needs adjustment.
Real-World Scaling Scenarios
Let’s walk through how progressive disclosure works in real production scenarios.
Scenario 1: Multi-Domain Support Agent
You’re building a customer support agent that handles technical support, billing inquiries, and account management. Without progressive disclosure, your agent’s context would include:
- 20 technical support skills (troubleshooting, diagnostics, escalation)
- 15 billing skills (invoices, payments, refunds, plan changes)
- 10 account skills (password reset, email update, profile management)
- Full error documentation for all 45 skills
- Examples for common scenarios
Total: ~120KB of context, loaded on every request.
With progressive disclosure:
Instruction tier (loaded always): 5KB
- 3 skill groups (technical, billing, account)
- Brief description of each group
- Instructions for requesting group details or specific skill references
Reference tier (loaded on demand): 120KB split across 45 files
- Each skill has its own reference file
- Agent loads only the references it needs
Result:
- Average context size: 15–25KB (80% reduction)
- Latency: 8 seconds → 2 seconds
- Cost per request: $0.12 → $0.04
- Support for 45 skills without context explosion
Scenario 2: Enterprise Workflow Automation
You’re automating workflows across 8 different systems: Salesforce, Jira, Slack, Stripe, HubSpot, Asana, Zendesk, and Notion. Each system has 5–10 integrations (create, update, delete, query, etc.).
Without progressive disclosure: 50+ skills, 200KB+ context.
With progressive disclosure:
Instruction tier: 8KB
- List of 8 integration groups
- One-line description of each
- Instructions for querying available skills within a group
Reference tier: 200KB split across 50+ files, plus an integration index
- Agent loads integration reference (e.g., Salesforce API) only when working with that system
- Agent loads skill reference (e.g., create_salesforce_lead) only when about to call it
Result:
- Average context size: 20–35KB (85% reduction)
- Agent can manage 50+ skills across 8 systems without degradation
- New integrations can be added without touching the agent’s core prompt
Scenario 3: Agentic AI with Autonomous Reasoning
You’re deploying an autonomous agent that manages customer onboarding end-to-end: intake form processing, KYC verification, account creation, and welcome email. The agent needs to reason about complex workflows and make decisions about when to escalate.
Without progressive disclosure, the agent’s context would include:
- Full workflow diagrams
- Decision trees for escalation
- Detailed API docs for 10+ systems
- Compliance and regulatory notes
- Examples of edge cases
Total: 150KB+.
With progressive disclosure:
Instruction tier: 10KB
- High-level workflow overview
- Decision tree for escalation (simplified)
- Instructions for requesting detailed workflow docs or API references
Reference tier: 150KB split across workflow docs, API references, and compliance guides
- Agent loads workflow details when planning the next step
- Agent loads API reference when about to call a skill
- Agent loads compliance guide when making a decision that has regulatory implications
Result:
- Average context size: 25–40KB (75–85% reduction)
- Agent can reason about complex workflows without context bloat
- Compliance and regulatory information is always available when needed, but doesn’t clutter the reasoning context
This approach aligns with what we’ve learned at PADISO through our AI & Agents Automation work: agents that reason about their own actions (agentic AI) benefit most from progressive disclosure because they’re less dependent on a single monolithic prompt and more capable of exploring documentation on demand.
Common Pitfalls and How to Avoid Them
Progressive disclosure is powerful, but it’s easy to get wrong. Here are the pitfalls we’ve seen in production:
Pitfall 1: Instruction Tier Too Sparse
Problem: You move too much to the reference tier, leaving the instruction tier so bare that the agent doesn’t know what skills exist or when to use them.
Symptom: The agent rarely calls get_skill_reference because it doesn’t even know which skills are available. It falls back to generic responses or makes up tool calls.
Fix: The instruction tier should answer three questions for every skill:
- What does it do? (One sentence.)
- When do I use it? (Trigger conditions.)
- What are the main parameters? (Names and types, no defaults yet.)
If you can’t answer these three questions without referring to the reference file, your instruction tier is too sparse.
Pitfall 2: Reference Files That Are Too Large
Problem: You move documentation to reference files, but each file is 10KB+. When the agent requests a reference, you’ve negated the savings.
Symptom: Reference file requests are slow (2–3 seconds), or they don’t actually reduce overall latency.
Fix: Keep individual reference files under 5KB. If a reference file is larger, split it. For example, instead of a single “salesforce_api.json” file with all 50 endpoints, create separate files:
- salesforce_leads.json (create, update, query lead operations)
- salesforce_accounts.json (account operations)
- salesforce_contacts.json (contact operations)
This way, the agent loads only the subset of the API it needs.
Pitfall 3: No Indexing
Problem: The agent has no way to discover available skills or references. It either has to ask the user (“What skills do I have?”) or you have to hard-code a list in the prompt.
Symptom: The agent doesn’t use progressive disclosure effectively because it doesn’t know what’s available to request.
Fix: Maintain a skill index (2–5KB) in the main context. The agent can search this index to find the right skill, then request its reference file. This is a small, one-time cost that unlocks the full benefit of progressive disclosure.
Pitfall 4: Not Measuring
Problem: You implement progressive disclosure but don’t track whether it’s actually helping. You assume it’s working because it should theoretically work.
Symptom: After a few weeks, you realise latency hasn’t improved, or costs have actually gone up (because the agent is making extra tool calls to fetch references).
Fix: Implement the metrics from the “Measuring Context Efficiency” section. Track context size, reference call rate, latency, and cost. Review weekly. If metrics aren’t improving, adjust your skill structure.
Pitfall 5: Skill Fragmentation
Problem: You create too many granular skills, each doing one small thing. The agent has to make 10+ tool calls to accomplish a single user request.
Symptom: Latency increases despite progressive disclosure. The agent is making so many reference calls and skill calls that it’s slower than the monolithic approach.
Fix: Skills should map to meaningful business operations, not individual API calls. A skill should do something a user would ask for. For example:
- ❌ Bad: get_customer_first_name, get_customer_last_name, get_customer_email
- ✅ Good: fetch_customer_profile (returns name, email, and other core fields)
If you find yourself creating 50+ skills, you probably have too much fragmentation. Consolidate related operations into single skills with optional parameters.
Pitfall 6: Stale Reference Files
Problem: You update an API or workflow, but forget to update the corresponding reference file. The agent has outdated information.
Symptom: The agent makes calls with deprecated parameters, or uses old API endpoints.
Fix: Treat reference files as code. Version them, include them in your CI/CD pipeline, and add tests that verify they match your actual APIs. When you update an API, update the reference file in the same commit.
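One way to enforce this in CI is a schema check that runs over every reference file in the repo. A sketch, assuming the JSON shape used throughout this article (skill, parameters, response, errors) and the convention that each file is named after its skill:

```python
import json
from pathlib import Path

# Keys every skill reference file must carry in this (assumed) schema.
REQUIRED_KEYS = {"skill", "parameters", "response", "errors"}

def validate_reference_file(path: Path) -> list:
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    doc = json.loads(path.read_text())
    missing = REQUIRED_KEYS - doc.keys()
    if missing:
        problems.append(f"{path.name}: missing keys {sorted(missing)}")
    if doc.get("skill") != path.stem:
        problems.append(f"{path.name}: 'skill' field does not match filename")
    for name, spec in doc.get("parameters", {}).items():
        if "type" not in spec or "required" not in spec:
            problems.append(f"{path.name}: parameter '{name}' lacks type/required")
    return problems
```

Run it over `reference_files/skills/*.json` in the same pipeline that tests your API clients, and a stale or malformed reference file fails the build instead of reaching the agent.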
Next Steps: Building Your Skill Framework
If you’re ready to implement progressive disclosure in your agent, here’s a step-by-step path:
Week 1: Audit Your Current Skills
List every skill your agent needs. For each skill, write down:
- What it does
- When the agent should use it
- What parameters it takes
- What it returns
- What errors it can throw
Calculate the total size of this documentation. This is your baseline.
Week 2: Design Your Tier Structure
Decide what goes in each tier:
Instruction tier: The absolute minimum needed for the agent to know a skill exists and when to use it. Aim for well under 10% of the baseline size (the scaling scenarios above land around 4–7%).
Reference tier: Full documentation. This is your baseline, split into individual files.
Examples tier: Worked examples (optional, load only for learning or after errors).
Create a template for each tier:
# Instruction tier template
Skill: [name]
Description: [one sentence]
When to use: [trigger conditions]
Parameters: [names and types only]
# Reference tier template
{
  "skill": "[name]",
  "parameters": { ... },
  "response": { ... },
  "errors": { ... },
  "rate_limit": "...",
  "notes": "..."
}
Week 3: Build the Infrastructure
Implement the tools and processes:
- Create a get_skill_reference tool that the agent can call.
- Set up a reference file store (filesystem, database, or knowledge base).
- Create a skill index that the agent can search.
- Update the agent’s system prompt to include instructions for using progressive disclosure.
Week 4: Test and Measure
Deploy to a test environment. Run your agent against a representative set of user queries. Measure:
- Context size (before and after)
- Latency
- Reference call rate
- Cost
Compare to your baseline. Aim for 50%+ reduction in context size and latency.
Week 5: Iterate and Optimise
Based on your measurements:
- If latency improved but reference call rate is high (>50%), your instruction tier might be too sparse.
- If reference call rate is low (<5%), your instruction tier might be over-documented.
- If context size didn’t improve much, you might have too much in the instruction tier still.
Adjust and re-measure.
Ongoing: Maintain and Extend
Once progressive disclosure is working, maintain it as part of your agent development process:
- When you add a new skill, create instruction and reference tiers from the start.
- When you update an API, update the reference file.
- Review metrics monthly. If they drift, investigate.
- Share your skill framework with other teams. Progressive disclosure scales across multiple agents.
Advanced: Progressive Disclosure at Scale
If you’re managing dozens of agents or hundreds of skills, consider these advanced patterns:
Multi-Agent Skill Sharing
One reference file can serve multiple agents. If you have a customer support agent, a billing agent, and an admin agent, they can all reference the same fetch_customer_profile skill. This reduces duplication and makes maintenance easier.
Dynamic Skill Loading
Instead of a static skill index, generate it dynamically based on the user’s context. For example, if the user is a billing admin, only load billing-related skills in the index. This keeps the instruction tier even smaller.
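A sketch of that filtering, assuming a hypothetical role-to-category mapping (your permission model will differ):

```python
# Hypothetical role → allowed skill categories; adjust to your permission model.
ROLE_CATEGORIES = {
    "billing_admin": {"billing", "customer"},
    "support_agent": {"support", "customer"},
}

def index_for_user(full_index: dict, role: str) -> dict:
    """Generate a per-user skill index instead of shipping the full one."""
    allowed = ROLE_CATEGORIES.get(role, set())
    return {"skills": [s for s in full_index["skills"] if s["category"] in allowed]}
```

An unknown role yields an empty index, which fails safe: the agent simply has no skills to disclose.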
Skill Versioning
Maintain multiple versions of a skill reference if you’re rolling out API changes gradually. The agent can request the version it needs based on a feature flag or user setting.
Hierarchical References
For very large reference files, create a hierarchy:
get_salesforce_reference()
→ returns index of available Salesforce operations
→ get_salesforce_reference("leads")
→ returns detailed lead operations
→ get_salesforce_reference("leads.create")
→ returns full create_lead documentation
This allows the agent to navigate documentation incrementally, loading only what it needs.
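One way to implement that incremental navigation is a single resolver that walks a nested reference tree one dotted segment at a time. A sketch with illustrative content:

```python
# Hypothetical nested reference tree; keys mirror the dotted path segments.
SALESFORCE_REFERENCE = {
    "leads": {
        "create": {"doc": "Full create_lead documentation..."},
        "update": {"doc": "Full update_lead documentation..."},
    },
    "accounts": {
        "query": {"doc": "Full query_accounts documentation..."},
    },
}

def get_salesforce_reference(path: str = ""):
    """Walk one dotted path segment at a time; an empty path returns the top index."""
    node = SALESFORCE_REFERENCE
    for segment in filter(None, path.split(".")):
        if segment not in node:
            return f"Unknown reference path: {path}"
        node = node[segment]
    # An inner node returns its index; a leaf returns the documentation itself.
    return sorted(node) if isinstance(node, dict) and "doc" not in node else node
```

Each call discloses exactly one more level, so the agent never pays for more of the Salesforce documentation than the branch it is actually navigating.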
Conclusion: Context as a Strategic Resource
Progressive disclosure reframes how you think about context. Instead of a fixed, monolithic prompt, you build a dynamic system where context grows and shrinks based on what the agent actually needs.
This matters because context is a strategic resource. It’s expensive, it’s limited, and it directly impacts performance. As you scale your agent systems—adding more skills, integrating more platforms, supporting more use cases—progressive disclosure becomes essential. It’s the difference between an agent that works at 10 skills and one that scales to 100+.
At PADISO, we’ve integrated progressive disclosure into our AI & Agents Automation practice because it’s foundational to building production-grade agents. Whether you’re a startup building your first agent or an enterprise deploying across multiple teams, this pattern will serve you well.
The three-tier structure—instructions, reference, examples—is simple but powerful. It keeps your agent focused, your costs down, and your latency low. Start with it. Measure it. Optimise it. And as your agent systems grow, progressive disclosure will be the pattern that lets you scale without compromise.
If you’re building agentic AI systems and need guidance on skill architecture, context optimisation, or production deployment, PADISO can help. We’ve worked with startups and enterprises to architect agent systems that scale, and we’d be happy to discuss your specific challenges.