Building Reusable Agent Skills: A Padiso Skill Library Walkthrough
Tour 30+ reusable agent skills Padiso uses across clients: DD, SOC 2, ERP migration. Learn skill architecture, file structures, and triggering tactics.
Table of Contents
- Why Reusable Agent Skills Matter
- What Makes a Skill Reusable
- The Padiso 30-Skill Library: Architecture & Structure
- Core Skill Categories
- Due Diligence & M&A Skills
- SOC 2 & ISO 27001 Compliance Skills
- ERP Migration & Platform Engineering Skills
- File Structure, Triggering, and Deployment
- Skill Composition and Orchestration
- Real-World Outcomes: Skill Reuse Across Clients
- Building Your Own Skill Library
- Next Steps
Why Reusable Agent Skills Matter
Agentic AI is no longer a lab experiment. It’s shipping in production across mid-market and enterprise teams—but most organisations are building agents from scratch every time. They’re hiring contractors to write prompt chains, debugging runaway loops, and watching costs balloon because nobody documented what actually worked.
Reusable agent skills flip that equation. A skill is a self-contained, discoverable, versioned package of knowledge and decision logic that an AI agent can reliably invoke to solve a specific class of problem. Once built and validated, a skill can be reused across dozens of projects, clients, and use cases without rewriting the logic or retraining the agent.
At Padiso, we’ve spent two years building and refining a library of 30+ production-grade agent skills. These skills power our work across three core domains:
- Due diligence and M&A: financial health scoring, cap table audits, tech stack assessment, revenue quality scoring
- SOC 2 and ISO 27001 audit readiness: evidence drafting, control mapping, policy generation, risk assessment
- ERP migration and platform engineering: data mapping, schema validation, dependency analysis, cutover planning
Each skill has been tested across 10+ client engagements. Each has a documented trigger condition, a clear input/output contract, and a failure mode playbook. This guide walks you through the architecture, shows you the real file structures we use, and gives you a framework to build your own reusable skill library.
If you’re running a venture studio, a fractional CTO practice, or leading platform modernisation at scale, this is the playbook that separates one-off consulting from repeatable, scalable AI delivery.
What Makes a Skill Reusable
Not every prompt or workflow deserves to be a skill. Reusable skills share five core properties:
Clear, Narrow Scope
A skill solves one class of problem well. “Assess financial health” is a skill. “Assess financial health, review cap table, and predict runway” is three skills masquerading as one. When scope creeps, reusability dies—you end up with a monolithic prompt that breaks the moment context changes.
At Padiso, we use the “single responsibility principle” borrowed from software engineering. Each skill has exactly one purpose. A skill either extracts data, validates it, scores it, or recommends an action—not all four.
Documented Input/Output Contract
A reusable skill specifies exactly what it expects and what it returns. No surprises. No “it usually works unless the data is messy.” The contract is the covenant between the agent and the skill.
For example, our “Revenue Quality Scorer” skill expects:
Input: {
  "annual_recurring_revenue": number (USD),
  "customer_concentration": number (0-1),
  "churn_rate_annual": number (0-1),
  "net_retention": number (0-2),
  "revenue_growth_yoy": number (-1 to 5)
}
Output: {
  "score": number (0-100),
  "grade": "A" | "B" | "C" | "D" | "F",
  "risk_factors": string[],
  "confidence": number (0-1)
}
The agent knows exactly what to feed in and what to expect back. No ambiguity. No hallucination.
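Enforcing such a contract mechanically is what keeps ambiguity out of the loop. A minimal Python sketch of contract checking before the skill is invoked; the field names mirror the contract above, but the helper name and range-check logic are illustrative, not Padiso's implementation:

```python
# Hypothetical sketch: enforce the Revenue Quality Scorer's input contract
# before the agent invokes the skill. Ranges mirror the contract above.

RANGES = {
    "annual_recurring_revenue": (0, float("inf")),
    "customer_concentration": (0, 1),
    "churn_rate_annual": (0, 1),
    "net_retention": (0, 2),
    "revenue_growth_yoy": (-1, 5),
}

def validate_input(payload: dict) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    for field, (lo, hi) in RANGES.items():
        if field not in payload:
            errors.append(f"missing required field: {field}")
        elif not isinstance(payload[field], (int, float)):
            errors.append(f"{field} must be a number")
        elif not (lo <= payload[field] <= hi):
            errors.append(f"{field}={payload[field]} outside [{lo}, {hi}]")
    return errors

ok = validate_input({
    "annual_recurring_revenue": 4_200_000,
    "customer_concentration": 0.35,
    "churn_rate_annual": 0.08,
    "net_retention": 1.15,
    "revenue_growth_yoy": 0.45,
})
bad = validate_input({"churn_rate_annual": 1.4})  # out of range, rest missing
```

Rejecting a malformed payload up front means the skill never has to guess, which is exactly the "no surprises" property the contract promises.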
Grounded in Real Expertise
A reusable skill isn’t a generic LLM prompt. It encodes specific heuristics, decision trees, and domain knowledge that only an expert would know. Following the best practices for skill creators, we extract these patterns directly from our operators’ workflows—the shortcuts they use, the red flags they watch for, the exceptions they’ve learned over years.
Our SOC 2 evidence drafting skill, for instance, doesn’t just say “write evidence.” It encodes 12 years of audit experience: which control families need written policies vs. system screenshots, which auditors accept inference vs. require direct proof, which evidence types are reusable across multiple controls.
Versioned and Tested
A reusable skill is versioned like production code. v1.0, v1.1, v2.0. Each version has a changelog. Each version is tested against a corpus of real inputs from past engagements.
When a skill breaks—and they do—you know which version broke and which clients are affected. You can roll back or patch forward with confidence.
Failure Mode Documentation
Every skill has a failure playbook. What happens when the input is incomplete? When the agent hallucinates? When the data contradicts the decision heuristic? A reusable skill doesn’t just succeed—it fails gracefully, with clear signals and remediation steps.
Our ERP migration skills, for example, document exactly what happens when a data mapping fails (halt the migration, log the row, flag for manual review) versus when a schema validation fails (suggest a transformation, offer three alternatives, wait for human approval).
The Padiso 30-Skill Library: Architecture & Structure
Our skill library is organised into three tiers:
Tier 1: Foundational Skills (8 skills)
These are the building blocks. They handle data extraction, validation, scoring, and basic reasoning. Every other skill is built on top of these.
- Data Extractor: Pulls structured data from unstructured documents (PDFs, emails, spreadsheets)
- Schema Validator: Checks data against a defined schema and reports mismatches
- Risk Scorer: Converts qualitative observations into quantitative risk scores
- Dependency Mapper: Identifies relationships between entities (systems, teams, data flows)
- Gap Analyzer: Compares current state vs. desired state and identifies deltas
- Recommendation Engine: Generates prioritised, actionable recommendations
- Evidence Synthesiser: Compiles multiple data points into a coherent narrative
- Cost Estimator: Predicts effort, timeline, and budget for a given initiative
Tier 2: Domain Skills (15 skills)
These are built from Tier 1 skills and solve specific problems within our three core domains. They’re where domain expertise lives.
Due Diligence Skills:
- Financial Health Scorer
- Cap Table Auditor
- Tech Stack Assessor
- Revenue Quality Scorer
- Customer Health Analyzer
SOC 2 & Compliance Skills:
- Control Mapper (maps business processes to control requirements)
- Evidence Drafter (generates audit-ready evidence narratives)
- Policy Generator (creates control policies from templates)
- Risk Assessor (identifies gaps in control implementation)
- Audit Readiness Scorer (predicts likelihood of passing audit)
ERP & Platform Skills:
- Data Mapper (defines transformation rules between systems)
- Schema Transformer (generates migration scripts)
- Cutover Planner (sequences migration steps)
- Dependency Analyzer (identifies system and data dependencies)
- Post-Migration Validator (tests migrated data quality)
Tier 3: Orchestration Skills (7 skills)
These coordinate multiple Tier 2 skills to solve end-to-end problems. They’re where the magic happens—where a single agent request triggers a choreographed sequence of 5–10 sub-skills.
- Due Diligence Engine: Runs a complete DD assessment (financial + tech + customer)
- SOC 2 Readiness Audit: Evaluates and drafts evidence for all control families
- ERP Migration Planner: Designs a complete migration approach
- Post-Acquisition Integration: Plans 100-day integration roadmap
- Platform Modernisation Roadmap: Designs multi-year platform engineering strategy
- Compliance Remediation Plan: Generates a prioritised remediation roadmap
- Venture Studio Diligence: Runs a complete pre-seed/seed investment assessment
This three-tier architecture means:
- Reuse is maximised: A Tier 3 skill might invoke 5 Tier 2 skills, which each invoke 2–3 Tier 1 skills. But those Tier 1 skills are shared across all Tier 2 and Tier 3 skills.
- Maintenance is centralised: If we improve the Data Extractor, all downstream skills improve automatically.
- Reasoning is transparent: Each tier adds a layer of reasoning. You can see exactly where a decision came from.
Following 3 principles for designing agent skills, we ensure each skill is modular, composable, and independently testable. This is not a monolithic prompt. It’s a library of small, focused, well-tested building blocks.
Core Skill Categories
Data Extraction and Validation
Every engagement starts with messy data. Financial statements with inconsistent formatting. Spreadsheets with hidden columns. PDFs where numbers are stored as images. A reusable data extraction skill needs to handle all of this without human intervention.
Our Data Extractor skill works like this:
- Accept multiple input formats: PDF, XLSX, CSV, JSON, plain text, email body
- Identify the data type: Is this a financial statement? A customer list? A system inventory?
- Extract with high confidence: Pull the relevant fields and flag anything ambiguous
- Validate against schema: Does the data match what we expect?
- Return structured output: JSON with extracted fields + a confidence score for each field
The key insight: extraction isn’t binary. It’s probabilistic. If we’re 95% confident in a number, we include it with a confidence flag. If we’re 60% confident, we flag it for human review. This prevents silent failures.
For schema validation, we use a simple but powerful approach: define the schema upfront in JSON Schema format, then have the agent validate incoming data against it. When validation fails, the agent doesn’t just say “error”—it suggests a transformation or asks a clarifying question.
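That validate-and-suggest behaviour can be sketched in a few lines. This is a simplified stand-in for full JSON Schema validation (a production setup would more likely use a library such as `jsonschema`); the schema shape and field names here are illustrative:

```python
# Simplified sketch of schema validation that suggests a fix on failure,
# rather than just reporting "error". Field names are illustrative.

SCHEMA = {
    "required": ["arr", "gross_margin"],
    "types": {"arr": (int, float), "gross_margin": (int, float)},
}

def validate(record: dict, schema: dict) -> dict:
    """Check required fields and types; return a suggestion when invalid."""
    for field in schema["required"]:
        if field not in record:
            return {"valid": False,
                    "suggestion": f"Field '{field}' is missing; re-run extraction or ask the user."}
    for field, expected in schema["types"].items():
        value = record.get(field)
        if value is not None and not isinstance(value, expected):
            return {"valid": False,
                    "suggestion": f"Field '{field}' is {type(value).__name__}; try casting it to a number."}
    return {"valid": True, "suggestion": None}

res1 = validate({"arr": "4.2M"}, SCHEMA)                           # missing field
res2 = validate({"arr": 4_200_000, "gross_margin": 0.62}, SCHEMA)  # valid
```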
Risk and Quality Scoring
Scoring is where judgment lives. A reusable scoring skill encodes the judgment of experienced operators so it can be applied consistently across dozens of engagements.
Our Risk Scorer skill uses a weighted rubric approach:
1. Define risk dimensions (e.g., financial, technical, market, team)
2. For each dimension, define 5–7 observable signals
3. Weight each signal (e.g., high churn is weighted 3x more heavily than high CAC)
4. For each signal, define thresholds (e.g., >40% annual churn = red flag)
5. Aggregate scores using a decision tree (not a simple average)
6. Return the final score + the reasoning (which signals drove the score)
This approach is reusable because the structure is the same across all scoring tasks. What changes is the dimensions, signals, weights, and thresholds. For a financial health score, the dimensions are profitability, growth, and cash. For a tech health score, the dimensions are architecture, security, and scalability.
The agent can apply the same skill logic to different domains by swapping in different rubrics.
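The rubric-swapping structure can be sketched as follows. The dimensions, signals, weights, and thresholds below are invented examples, not Padiso's production rubric; note the decision-tree override that caps the score whenever a red flag fires, rather than taking a plain average:

```python
# Illustrative weighted-rubric scorer. The rubric values are examples only;
# the point is the structure: dimensions -> signals -> weights -> thresholds,
# aggregated with a decision-tree override instead of a simple average.

RUBRIC = {
    "financial": {
        "weight": 0.5,
        "signals": {
            "annual_churn": {"weight": 3, "red_above": 0.40},
            "cac_payback_months": {"weight": 1, "red_above": 18},
        },
    },
    "technical": {
        "weight": 0.5,
        "signals": {
            "deploys_per_week": {"weight": 1, "red_below": 1},
        },
    },
}

def score(observations: dict) -> dict:
    reasoning, total, total_weight = [], 0.0, 0.0
    for dim, cfg in RUBRIC.items():
        dim_points, dim_weight = 0.0, 0.0
        for name, sig in cfg["signals"].items():
            value = observations[name]
            red = (("red_above" in sig and value > sig["red_above"]) or
                   ("red_below" in sig and value < sig["red_below"]))
            if red:
                reasoning.append(f"{dim}: {name}={value} breached its threshold")
            dim_points += (0 if red else 100) * sig["weight"]
            dim_weight += sig["weight"]
        total += (dim_points / dim_weight) * cfg["weight"]
        total_weight += cfg["weight"]
    # Decision-tree override, not a plain average: any red flag caps the score.
    final = min(total / total_weight, 59) if reasoning else total / total_weight
    return {"score": round(final), "reasoning": reasoning}

result = score({"annual_churn": 0.45, "cac_payback_months": 10,
                "deploys_per_week": 4})
```

Swapping in a different rubric dict is all it takes to repoint the same logic at a tech health score instead of a financial one.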
Mapping and Dependency Analysis
When you’re doing M&A, ERP migration, or compliance work, you need to understand how things connect. Which systems feed which? Which teams own which processes? Which controls map to which policies?
Our Dependency Mapper skill builds a graph of relationships:
- Identify entities: Systems, teams, processes, data flows, controls, policies
- Identify relationships: A → B (e.g., Salesforce → data warehouse)
- Classify relationships: Feeds, depends on, owns, implements, audits
- Detect cycles: Are there circular dependencies? (Often a red flag)
- Compute criticality: Which entities are most central? Which are isolated?
- Return a graph: JSON representation of the dependency network
This skill is reusable across technical, organisational, and regulatory domains. The logic is the same; the entity types change.
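The graph steps above reduce to a small amount of code. A sketch using plain adjacency lists, with illustrative entities, depth-first-search cycle detection, and criticality approximated here by in-degree centrality:

```python
# Sketch of the Dependency Mapper's graph logic. Entities and edges are
# illustrative; "criticality" is approximated by in-degree centrality.

from collections import defaultdict

def build_graph(edges: list[tuple[str, str, str]]) -> dict:
    """Build an adjacency list from (source, relationship, target) edges."""
    adj = defaultdict(list)
    for src, rel, dst in edges:
        adj[src].append(dst)
    return adj

def has_cycle(adj: dict) -> bool:
    """Detect circular dependencies with a depth-first search."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)
    def dfs(node):
        color[node] = GRAY
        for nxt in adj.get(node, []):
            if color[nxt] == GRAY or (color[nxt] == WHITE and dfs(nxt)):
                return True
        color[node] = BLACK
        return False
    return any(color[n] == WHITE and dfs(n) for n in list(adj))

def criticality(adj: dict) -> dict:
    """Count how often each entity is depended upon (in-degree)."""
    indegree = defaultdict(int)
    for targets in adj.values():
        for t in targets:
            indegree[t] += 1
    return dict(indegree)

edges = [("Salesforce", "feeds", "Warehouse"),
         ("Shopify", "feeds", "Warehouse"),
         ("Warehouse", "feeds", "BI")]
g = build_graph(edges)
g_cyclic = build_graph(edges + [("BI", "feeds", "Salesforce")])
```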
Due Diligence & M&A Skills
Due diligence is where we’ve seen the most dramatic reuse. A single Tier 3 “Due Diligence Engine” skill orchestrates 5 Tier 2 skills to produce a comprehensive investment memo in 4 weeks instead of 12.
Financial Health Scorer
This skill takes financial statements (P&L, balance sheet, cash flow) and produces a health score (0–100) plus a narrative assessment.
The skill encodes heuristics like:
- Profitability: Gross margin > 60% is green. < 40% is red. (Varies by SaaS vs. marketplace vs. services.)
- Growth: YoY revenue growth > 30% is healthy. < 10% is concerning.
- Runway: Months of cash remaining at current burn rate. < 12 months is a risk.
- Unit economics: CAC payback < 12 months. LTV:CAC > 3:1. Both are green flags.
- Efficiency: Magic number (ARR growth / S&M spend) > 0.75 is efficient.
The skill weights these heuristics based on company stage (seed companies are judged differently than Series B) and industry (SaaS vs. marketplace vs. services).
Output:
{
  "score": 72,
  "grade": "B",
  "profitability_score": 65,
  "growth_score": 85,
  "runway_score": 68,
  "unit_economics_score": 75,
  "efficiency_score": 70,
  "key_risks": [
    "Low gross margin (48%)",
    "Cash runway only 14 months",
    "Declining NRR (0.92)"
  ],
  "key_strengths": [
    "Strong YoY growth (145%)",
    "CAC payback < 10 months",
    "Improving unit economics"
  ],
  "confidence": 0.94
}
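Two of the heuristics above reduce to simple arithmetic worth making explicit. A sketch with illustrative figures (the inputs are examples, not client data):

```python
# Illustrative arithmetic for two heuristics: runway in months and the
# "magic number" efficiency ratio. Figures are examples only.

def runway_months(cash_balance: float, monthly_burn: float) -> float:
    """Months of cash remaining at the current burn rate."""
    return cash_balance / monthly_burn

def magic_number(net_new_arr: float, sm_spend: float) -> float:
    """ARR added over a period divided by sales & marketing spend."""
    return net_new_arr / sm_spend

r = runway_months(1_400_000, 100_000)   # 14.0 months, as in the example output
m = magic_number(900_000, 1_200_000)    # 0.75, right at the efficiency threshold
```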
Cap Table Auditor
This skill takes a cap table (spreadsheet or document) and validates it for completeness and consistency.
It checks:
- Fully diluted ownership: Do all shares + options + warrants add up to 100%?
- Vesting schedules: Are vesting cliffs and schedules documented for each grant?
- Option pool: Is there an approved option pool? Is it fully allocated?
- Board seats and preferences: Do the cap table and board composition align?
- Recent fundraising: Are the latest funding round shares reflected?
- Liquidation preferences: Are they documented and do they align with term sheets?
Output:
{
  "is_valid": false,
  "issues": [
    {
      "severity": "critical",
      "issue": "Option pool not fully allocated",
      "detail": "5M options reserved but only 2.3M granted"
    },
    {
      "severity": "high",
      "issue": "Founder vesting not documented",
      "detail": "No vesting schedule for Founder A (5M shares)"
    }
  ],
  "fully_diluted_ownership": {
    "common": 0.65,
    "options": 0.15,
    "warrants": 0.02,
    "unallocated": 0.18
  },
  "confidence": 0.88
}
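The fully diluted ownership check is the easiest of these to automate: every share class plus the unallocated pool should sum to 100% within rounding tolerance. A sketch using the example output's figures (the function name is illustrative):

```python
# Hypothetical sketch of the fully-diluted ownership check. Ownership
# fractions mirror the example output above.

import math

def check_fully_diluted(ownership: dict, tolerance: float = 1e-6) -> dict:
    """All share classes plus the unallocated pool should sum to 1.0."""
    total = sum(ownership.values())
    return {"is_valid": math.isclose(total, 1.0, abs_tol=tolerance),
            "total": round(total, 6),
            "delta": round(1.0 - total, 6)}

complete = check_fully_diluted({
    "common": 0.65, "options": 0.15, "warrants": 0.02, "unallocated": 0.18,
})
incomplete = check_fully_diluted({"common": 0.65, "options": 0.15})
```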
Tech Stack Assessor
This skill inventories the technology stack and scores it on architecture, security, scalability, and maintainability.
It identifies:
- Core systems: Database, backend, frontend, mobile
- Infrastructure: Cloud provider, deployment model, monitoring
- Data pipeline: ETL, analytics, BI
- Security: Auth, encryption, secrets management
- DevOps: CI/CD, testing, deployment frequency
For each component, it scores on maturity (startup-grade vs. enterprise-grade), technical debt, and risk.
This skill is crucial for pre-acquisition technical due diligence and for understanding agentic AI vs traditional automation approaches when planning post-acquisition platform modernisation.
Revenue Quality Scorer
This skill takes revenue data and produces a quality score. Not all revenue is created equal.
It evaluates:
- Customer concentration: Is revenue concentrated in a few customers? (High concentration = risk.)
- Churn: Are customers leaving? At what rate?
- Net retention: Are existing customers growing or shrinking their spend?
- Revenue growth: Is the company growing? At what rate?
- Recurring vs. one-time: What % is recurring?
- Gross margin by customer: Which customers are profitable?
Output:
{
  "quality_score": 78,
  "grade": "B+",
  "concentration_risk": 0.35,
  "top_10_customers_pct": 45,
  "annual_churn": 0.08,
  "net_retention": 1.15,
  "recurring_revenue_pct": 0.92,
  "key_concerns": [
    "Customer concentration above 40%",
    "Churn accelerating (8% vs 5% last year)"
  ],
  "key_strengths": [
    "Net retention > 1.0 (customers expanding)",
    "92% recurring revenue (very sticky)"
  ]
}
Customer Health Analyzer
This skill takes customer data (usage, support tickets, renewal history, NPS) and predicts churn risk and expansion opportunity.
It identifies:
- At-risk customers: Low usage, declining engagement, recent support issues
- Expansion opportunities: High usage, high satisfaction, adjacent product interest
- Cohort patterns: Do certain customer segments churn more? Which segments expand?
This skill is reused across multiple clients because the logic is the same; the data sources change.
SOC 2 & ISO 27001 Compliance Skills
Compliance is tedious, but it’s also highly reusable. Once you’ve built a skill to draft SOC 2 evidence, you can use it across dozens of clients.
We’ve seen this firsthand. Our SOC 2 readiness skills have been used across 30+ engagements, saving an average of 6 weeks per audit cycle. Here’s why they work.
Control Mapper
This skill takes a business process description and maps it to relevant SOC 2 or ISO 27001 controls.
For example:
Input: “We store customer data in AWS S3. Access is restricted to authenticated employees via IAM roles. All access is logged in CloudTrail.”
Output:
{
  "process": "Customer data storage",
  "controls_mapped": [
    {
      "control_id": "CC6.1",
      "control_name": "Logical Access Controls",
      "evidence_type": "System configuration",
      "status": "implemented",
      "confidence": 0.95
    },
    {
      "control_id": "CC7.2",
      "control_name": "System Monitoring",
      "evidence_type": "Logs and audit trails",
      "status": "implemented",
      "confidence": 0.92
    }
  ]
}
The skill encodes the SOC 2 control framework (all 64 controls) and knows which business processes map to which controls.
Evidence Drafter
This is the workhorse skill. It takes a control ID and existing evidence (logs, screenshots, policies) and drafts an audit-ready narrative.
Auditors want to see:
- What is the control? (Definition)
- How is it implemented? (Design)
- How do we know it’s working? (Operating effectiveness)
- What’s the evidence? (Proof)
The Evidence Drafter skill generates all four sections from input data.
For example, given:
- Control: “CC6.1 – Logical Access Controls”
- Evidence: AWS IAM policy document, CloudTrail logs showing access, employee onboarding checklist
The skill generates:
Control CC6.1: Logical Access Controls
Design: The Company restricts logical access to customer data systems
using AWS Identity and Access Management (IAM). All employees are
provisioned with IAM roles corresponding to their job function.
Access is granted on a least-privilege basis.
Operating Effectiveness: During the period under review, the Company
granted access to 45 new employees and revoked access for 12 departed
employees. All access changes were documented in our HRIS and
implemented within 24 hours of hire/termination.
Evidence:
- AWS IAM policy document (Exhibit A)
- CloudTrail logs showing 847 access grants and 156 revocations (Exhibit B)
- Employee onboarding checklist with access provisioning sign-off (Exhibit C)
- Access review certification signed by IT Manager (Exhibit D)
This skill has been used across 30+ clients. The logic is identical; the evidence changes.
Policy Generator
This skill takes a control requirement and generates a policy document that satisfies it.
For example:
Input: Control CC6.2 – “Restrictions and security measures for user access to system components and data are established and implemented.”
Output: A complete “Access Control Policy” document with sections on:
- Policy objective
- Scope
- Roles and responsibilities
- Access request process
- Approval workflow
- Access review cadence
- Revocation process
- Exceptions and compensating controls
The skill generates policies that are:
- Audit-ready (written in the language auditors expect)
- Operationally feasible (don’t require impossible processes)
- Reusable (can be adapted for different organisations)
Risk Assessor
This skill evaluates whether a control is actually implemented or just documented.
It looks for:
- Design gaps: Is the control design sound?
- Implementation gaps: Is the control actually built?
- Operating effectiveness gaps: Is the control actually working?
For each gap, it produces a risk rating (critical, high, medium, low) and a remediation recommendation.
This skill is crucial for understanding SOC 2 compliance strategies and identifying which gaps need immediate attention before an audit.
Audit Readiness Scorer
This skill takes the results of a risk assessment and predicts the likelihood of passing an audit.
It considers:
- Number of critical gaps: Each critical gap reduces the probability significantly
- Control family coverage: Are all families at least partially implemented?
- Evidence quality: Is the evidence strong or weak?
- Auditor history: Have you passed SOC 2 before? With which auditor?
Output:
{
  "readiness_score": 68,
  "probability_of_passing": 0.62,
  "critical_gaps": 3,
  "high_gaps": 7,
  "medium_gaps": 12,
  "estimated_weeks_to_readiness": 8,
  "recommended_focus": [
    "CC6.1 – Logical Access Controls",
    "CC7.2 – System Monitoring",
    "CC9.2 – Encryption"
  ]
}
This skill helps organisations understand: Should we audit now or wait 8 weeks?
ERP Migration & Platform Engineering Skills
ERP migrations and platform re-platforming projects are complex, multi-month endeavours. Reusable skills compress the planning phase from 12 weeks to 4 weeks.
Data Mapper
This skill takes a source system schema and a target system schema and generates a data mapping.
For example:
Source: SAP (legacy, on-premises)
Target: NetSuite (cloud)
The skill:
- Identifies all tables and fields in the source system
- Identifies all tables and fields in the target system
- Matches source fields to target fields (exact matches, transformations, or “no mapping”)
- For each transformation, generates a rule (e.g., “SAP VBAK.VBELN → NetSuite Transaction.ID, prepend ‘SAP-’”)
- Identifies data that exists in source but has no home in target (orphaned data)
- Identifies data in target that must be manually populated (gaps)
Output:
{
  "source_system": "SAP",
  "target_system": "NetSuite",
  "mappings": [
    {
      "source_field": "VBAK.VBELN",
      "target_field": "Transaction.ID",
      "mapping_type": "transformation",
      "rule": "prepend 'SAP-'",
      "confidence": 0.98
    },
    {
      "source_field": "VBAK.NETWR",
      "target_field": "Transaction.Amount",
      "mapping_type": "exact",
      "confidence": 1.0
    }
  ],
  "orphaned_fields": [
    "VBAK.ZZCUSTOM1"
  ],
  "gaps": [
    "NetSuite.Tax_ID (not in SAP)"
  ]
}
Schema Transformer
This skill takes a data mapping and generates SQL, Python, or dbt code to perform the transformation.
For example:
Input: Data mapping from SAP to NetSuite
Output: dbt model that reads from SAP staging tables and writes to NetSuite format
select
    concat('SAP-', vbak.vbeln) as transaction_id,
    vbak.netwr as amount,
    vbak.waerk as currency,
    vbak.erdat as created_date,
    vbak.ernam as created_by,
    case
        when vbak.augru = '01' then 'Rush'
        when vbak.augru = '02' then 'Standard'
        else 'Other'
    end as fulfillment_priority
from {{ source('sap', 'vbak') }} vbak
where vbak.mandt = '100' -- Client 100 only
The skill generates transformation code that is:
- Correct (maps fields according to the mapping)
- Efficient (uses appropriate aggregations and filters)
- Debuggable (includes comments, clear field names)
- Testable (generates dbt tests automatically)
Cutover Planner
This skill takes a migration scope and generates a detailed cutover plan.
It sequences the migration in phases:
- Preparation phase (2 weeks): Data extraction, transformation testing, user training
- Pilot phase (1 week): Run transformation on a subset of data, validate results
- Parallel run (2 weeks): Run old and new systems side-by-side, reconcile differences
- Cutover (1 day): Stop the old system, run final transformation, validate
- Stabilisation (2 weeks): Monitor for issues, handle exceptions, optimise
For each phase, the skill generates:
- Activities (what needs to happen)
- Owners (who is responsible)
- Success criteria (how do we know it worked)
- Rollback plan (what if it fails)
This skill is reused across dozens of migrations because the structure is the same; the details change.
Dependency Analyzer
This skill identifies all systems that depend on the system being migrated.
For example, if you’re migrating from SAP to NetSuite:
- Which systems read from SAP? (Data warehouse, BI tools, reporting systems)
- Which systems write to SAP? (E-commerce platform, accounting software)
- Which systems are tightly coupled to SAP? (Custom integrations, plugins)
For each dependency, the skill assesses:
- Criticality: How important is this integration?
- Effort: How much work to migrate this integration?
- Risk: What could go wrong?
Output:
{
  "dependencies": [
    {
      "system": "Data Warehouse",
      "direction": "reads",
      "criticality": "high",
      "effort_weeks": 3,
      "risk": "If DW breaks, reporting breaks. Need to test daily."
    },
    {
      "system": "Shopify",
      "direction": "writes",
      "criticality": "critical",
      "effort_weeks": 2,
      "risk": "If integration breaks, orders don't sync. Revenue impact."
    }
  ],
  "critical_path": ["Shopify integration", "Data Warehouse"],
  "estimated_total_effort_weeks": 12
}
This skill prevents the “surprise dependency” problem: you think the migration is done, then discover a downstream system that still depends on the old system running.
Post-Migration Validator
This skill tests the migrated data for completeness, accuracy, and consistency.
It runs automated checks:
- Row count validation: Does the target have the same number of rows as the source?
- Field-level validation: Do amounts, dates, and IDs match?
- Referential integrity: Do foreign keys still point to valid records?
- Data quality: Are there unexpected nulls, duplicates, or outliers?
- Business logic: Do calculated fields (totals, balances) still compute correctly?
For each check, it produces a pass/fail result and, if it fails, highlights the problematic rows.
This skill is reused across all migration projects because the validation logic is the same; the data changes.
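The first few checks above can be sketched directly. A simplified in-memory version, with illustrative field names (`amount`, `id`) standing in for real migration tables:

```python
# Sketch of post-migration validation over two in-memory tables: row counts,
# amount totals, and duplicate-ID detection. Field names are illustrative.

def validate_migration(source: list[dict], target: list[dict],
                       amount_field: str = "amount") -> dict:
    checks = {}
    checks["row_count"] = len(source) == len(target)
    src_total = sum(row[amount_field] for row in source)
    tgt_total = sum(row[amount_field] for row in target)
    checks["amount_totals"] = abs(src_total - tgt_total) < 0.01
    ids = [row["id"] for row in target]
    checks["no_duplicate_ids"] = len(ids) == len(set(ids))
    failures = [name for name, ok in checks.items() if not ok]
    return {"passed": not failures, "failed_checks": failures}

source = [{"id": "SAP-1", "amount": 100.0}, {"id": "SAP-2", "amount": 250.0}]
target = [{"id": "SAP-1", "amount": 100.0}, {"id": "SAP-2", "amount": 250.0}]
report = validate_migration(source, target)
bad_report = validate_migration(source, target[:1])  # simulate a dropped row
```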
File Structure, Triggering, and Deployment
Reusable skills aren’t just logic—they’re also files, versioning, and deployment pipelines. Here’s how we structure them at Padiso.
Skill File Structure
Each skill lives in its own directory:
skills/
├── financial-health-scorer/
│ ├── skill.yaml # Metadata
│ ├── README.md # Documentation
│ ├── prompt.md # The actual prompt
│ ├── rubric.json # Scoring rubric
│ ├── tests/
│ │ ├── test_cases.json # Test inputs and expected outputs
│ │ └── test_runner.py # Test harness
│ ├── examples/
│ │ ├── input_example.json # Example input
│ │ └── output_example.json # Example output
│ └── changelog.md # Version history
├── cap-table-auditor/
│ ├── skill.yaml
│ ├── README.md
│ ├── prompt.md
│ ├── validation_rules.json # Cap table validation rules
│ ├── tests/
│ ├── examples/
│ └── changelog.md
├── ...
The skill.yaml File
This is the contract. It declares what the skill does, what it expects, and what it returns.
name: "Financial Health Scorer"
version: "2.1.0"
description: "Scores financial health of a company based on P&L, balance sheet, and cash flow."
inputs:
  - name: "annual_recurring_revenue"
    type: "number"
    unit: "USD"
    required: true
    description: "Annual recurring revenue"
  - name: "gross_margin"
    type: "number"
    unit: "percentage"
    required: true
    description: "Gross profit / revenue"
  - name: "cash_balance"
    type: "number"
    unit: "USD"
    required: true
  - name: "monthly_burn"
    type: "number"
    unit: "USD"
    required: true
  - name: "company_stage"
    type: "string"
    enum: ["seed", "series-a", "series-b", "series-c+"]
    required: true
outputs:
  - name: "score"
    type: "number"
    range: [0, 100]
    description: "Overall health score"
  - name: "grade"
    type: "string"
    enum: ["A", "B", "C", "D", "F"]
  - name: "key_risks"
    type: "array"
    items: "string"
  - name: "confidence"
    type: "number"
    range: [0, 1]
triggers:
  - event: "due_diligence_started"
    condition: "company_stage in [seed, series-a, series-b]"
  - event: "quarterly_health_check"
    condition: "always"
owner: "padiso-dd-team"
last_updated: "2025-02-15"
This YAML file is machine-readable. An orchestration engine can parse it and understand:
- What inputs the skill needs
- What outputs it produces
- When to trigger it automatically
- Who owns it
Following skill authoring best practices, we ensure each skill has a clear contract and explicit triggering logic.
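An orchestration engine that consumes this manifest might enforce the input contract like so. The sketch assumes the YAML has already been parsed into a dict (e.g. with PyYAML); the helper name is illustrative:

```python
# Hypothetical sketch of contract enforcement against a parsed skill.yaml.
# MANIFEST is an abbreviated, parsed form of the YAML manifest above.

MANIFEST = {
    "name": "Financial Health Scorer",
    "version": "2.1.0",
    "inputs": [
        {"name": "annual_recurring_revenue", "type": "number", "required": True},
        {"name": "gross_margin", "type": "number", "required": True},
        {"name": "company_stage", "type": "string",
         "enum": ["seed", "series-a", "series-b", "series-c+"], "required": True},
    ],
}

def check_inputs(manifest: dict, payload: dict) -> list[str]:
    """Return contract violations before the skill is invoked."""
    errors = []
    for spec in manifest["inputs"]:
        name = spec["name"]
        if spec.get("required") and name not in payload:
            errors.append(f"missing required input: {name}")
            continue
        if name in payload and "enum" in spec and payload[name] not in spec["enum"]:
            errors.append(f"{name}={payload[name]!r} not in {spec['enum']}")
    return errors

errors = check_inputs(MANIFEST, {
    "annual_recurring_revenue": 4_200_000,
    "gross_margin": 0.62,
    "company_stage": "series-a",
})
bad = check_inputs(MANIFEST, {"company_stage": "ipo"})
```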
The prompt.md File
This is the actual prompt that the agent executes. It’s written to be:
- Clear and specific
- Grounded in domain expertise
- Robust to edge cases
- Explicit about reasoning steps
Example:
# Financial Health Scorer
You are an expert investor evaluating the financial health of a startup.
## Your Task
Given financial metrics, score the company's financial health on a scale of 0-100.
## Scoring Framework
### Profitability (30% weight)
- Gross margin > 70%: Excellent (90-100 points)
- Gross margin 50-70%: Good (70-89 points)
- Gross margin 30-50%: Fair (50-69 points)
- Gross margin < 30%: Poor (0-49 points)
### Growth (25% weight)
- YoY growth > 100%: Exceptional (90-100 points)
- YoY growth 50-100%: Strong (70-89 points)
- YoY growth 20-50%: Moderate (50-69 points)
- YoY growth < 20%: Weak (0-49 points)
### Runway (25% weight)
- Runway > 24 months: Excellent (90-100 points)
- Runway 12-24 months: Good (70-89 points)
- Runway 6-12 months: Fair (50-69 points)
- Runway < 6 months: Critical (0-49 points)
### Unit Economics (20% weight)
- CAC payback < 12 months AND LTV:CAC > 3: Excellent (90-100 points)
- CAC payback < 15 months AND LTV:CAC > 2: Good (70-89 points)
- CAC payback < 18 months AND LTV:CAC > 1.5: Fair (50-69 points)
- Otherwise: Poor (0-49 points)
## Adjustment for Stage
If company_stage is "seed":
- Reduce growth expectations by 20%
- Reduce profitability expectations by 30%
- Increase runway importance by 10%
If company_stage is "series-b":
- Increase growth expectations by 20%
- Increase profitability expectations by 20%
## Output Format
Return a JSON object with:
- score (0-100)
- grade (A/B/C/D/F)
- key_risks (list of 2-4 risks)
- key_strengths (list of 2-4 strengths)
- reasoning (explain your score in 2-3 sentences)
- confidence (0-1)
## Important
- Be conservative: If data is ambiguous, lower the confidence.
- Be specific: Don't say "growth is good"—say "YoY growth is 145%, which is strong for a Series A."
- Flag edge cases: If a metric is unusual, note it.
Triggering Logic
Skills aren’t triggered manually. They’re triggered automatically based on events and conditions.
We use a simple trigger engine:
{
  "triggers": [
    {
      "id": "dd_financial_assessment",
      "skill": "financial-health-scorer",
      "event": "due_diligence_started",
      "condition": "company_stage in ['seed', 'series-a', 'series-b']",
      "inputs_from": {
        "annual_recurring_revenue": "company.arr",
        "gross_margin": "company.financials.gross_margin",
        "cash_balance": "company.financials.cash_balance",
        "monthly_burn": "company.financials.monthly_burn",
        "company_stage": "company.stage"
      },
      "on_success": "store_result_in_dd_memo",
      "on_failure": "escalate_to_human"
    }
  ]
}
When a due diligence is started, the orchestration engine:
- Checks if the company stage matches the condition
- Extracts the required inputs from the company data
- Invokes the skill
- Stores the result in the DD memo
- If the skill fails, escalates to a human analyst
This approach means skills are composable. A Tier 3 “Due Diligence Engine” skill can trigger 5 Tier 2 skills in sequence, each waiting for the previous one to complete.
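The trigger flow above can be sketched as a small function. This is a simplified stand-in, not Padiso's engine: the condition string is reduced to a stage allow-list, and the skill itself is stubbed:

```python
# Simplified trigger-engine sketch: evaluate the condition, resolve inputs
# from dotted paths in the company record, invoke the skill, and escalate
# on failure. Names mirror the trigger JSON; the skill is a stub.

def get_path(record: dict, path: str):
    """Resolve a dotted path like 'company.financials.monthly_burn'."""
    for key in path.split("."):
        record = record[key]
    return record

def run_trigger(trigger: dict, context: dict, skills: dict) -> dict:
    stage = get_path(context, "company.stage")
    if stage not in trigger["allowed_stages"]:
        return {"status": "skipped"}
    inputs = {name: get_path(context, path)
              for name, path in trigger["inputs_from"].items()}
    try:
        return {"status": "ok", "result": skills[trigger["skill"]](inputs)}
    except Exception as exc:  # on_failure: escalate_to_human
        return {"status": "escalated", "reason": str(exc)}

trigger = {
    "skill": "financial-health-scorer",
    "allowed_stages": ["seed", "series-a", "series-b"],
    "inputs_from": {"annual_recurring_revenue": "company.arr",
                    "company_stage": "company.stage"},
}
context = {"company": {"arr": 4_200_000, "stage": "series-a"}}
skills = {"financial-health-scorer": lambda inp: {"score": 72, "inputs": inp}}

out = run_trigger(trigger, context, skills)
skipped = run_trigger(trigger, {"company": {"arr": 0, "stage": "series-c+"}}, skills)
```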
Deployment Pipeline
Skills are versioned and deployed like production code:
dev branch → test against corpus → staging → production
Before a skill is deployed to production:
- Corpus testing: Run the skill against 10+ real examples from past engagements. Does it produce the expected output?
- Regression testing: Run the skill against examples that previously failed. Are we better?
- Edge case testing: Run against malformed inputs, missing data, extreme values. Does it fail gracefully?
- Peer review: Another operator reviews the skill logic. Are the heuristics sound?
- Staged rollout: Deploy to 1 client first. Monitor for issues. Then roll out to all clients.
When a skill is deployed, we increment the version number in skill.yaml and add an entry to changelog.md.
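The corpus-testing gate can be sketched as a small harness that replays saved cases from past engagements against a skill and reports mismatches. The file layout (one JSON case per file with `inputs` and `expected` keys) and the `runway_months` demo skill are assumptions for illustration, not Padiso's actual format.

```python
import json
import tempfile
from pathlib import Path

def run_corpus(skill, corpus_dir: str) -> list:
    """Return the names of corpus cases the skill fails."""
    failures = []
    for case_file in sorted(Path(corpus_dir).glob("*.json")):
        case = json.loads(case_file.read_text())
        if skill(case["inputs"]) != case["expected"]:
            failures.append(case_file.name)
    return failures

# Demo: a throwaway corpus of one case and a trivial deterministic "skill".
def runway_months(inputs):
    return round(inputs["cash_balance"] / inputs["monthly_burn"], 1)

with tempfile.TemporaryDirectory() as corpus:
    case = {"inputs": {"cash_balance": 1_800_000, "monthly_burn": 150_000},
            "expected": 12.0}
    Path(corpus, "acme_2023.json").write_text(json.dumps(case))
    print(run_corpus(runway_months, corpus))  # [] means every case passed
```

For LLM-backed skills the equality check would be replaced by a looser assertion (score within tolerance, required sections present), but the replay-the-corpus shape stays the same.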
Skill Composition and Orchestration
The real power of reusable skills is composition. A single agent request can orchestrate 10+ skills, each invoking sub-skills, to solve a complex problem.
For example, here’s how the “Due Diligence Engine” skill works:
User: "Run a complete DD on Acme Corp"
Orchestration:
1. Invoke "Financial Health Scorer"
└─ Input: Financial statements
└─ Output: Financial health score + narrative
2. Invoke "Tech Stack Assessor"
└─ Input: System inventory, architecture docs
└─ Output: Tech health score + risk assessment
3. Invoke "Cap Table Auditor"
└─ Input: Cap table spreadsheet
└─ Output: Validation results + issues
4. Invoke "Revenue Quality Scorer"
└─ Input: Customer data, revenue records
└─ Output: Revenue quality score + churn prediction
5. Invoke "Customer Health Analyzer"
└─ Input: Customer usage, NPS, support tickets
└─ Output: At-risk customers + expansion opportunities
6. Invoke "Evidence Synthesiser" (Tier 1 skill)
└─ Input: Results from all above skills
└─ Output: Cohesive DD memo (executive summary, detailed findings, recommendations)
Total time: 4 weeks (vs. 12 weeks if done manually)
Each skill runs independently. If one fails, the orchestration engine:
- Logs the failure
- Continues with the remaining skills (if possible)
- Flags the issue for human review
This resilience is crucial for production agentic AI. Real data is messy. Failures happen. But the orchestration shouldn’t collapse because one skill failed.
Because each skill is a distilled, self-contained task, it is independently valuable: you can use “Financial Health Scorer” in isolation or as part of a larger orchestration.
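The log-continue-flag behaviour described above can be sketched in a few lines. The skill callables and their outputs are illustrative stand-ins, not the real skill runtime:

```python
def run_pipeline(skills: dict, inputs: dict):
    """Run each skill independently; log failures and keep going."""
    results, failures = {}, []
    for name, skill in skills.items():
        try:
            results[name] = skill(inputs)
        except Exception as exc:
            # Real data is messy: one failed skill must not collapse the run.
            failures.append((name, str(exc)))  # flagged for human review
    return results, failures

def broken_auditor(inputs):
    raise ValueError("malformed cap table")

skills = {
    "financial-health-scorer": lambda inputs: {"score": 78, "confidence": 0.9},
    "cap-table-auditor": broken_auditor,
}
results, failures = run_pipeline(skills, {"company": "Acme Corp"})
print(results)   # the scorer's output survives the auditor's failure
print(failures)  # [('cap-table-auditor', 'malformed cap table')]
```

A Tier 3 skill like the Due Diligence Engine would add sequencing and pass earlier results into later skills, but the per-skill isolation shown here is what keeps the orchestration resilient.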
Real-World Outcomes: Skill Reuse Across Clients
Here’s where the rubber meets the road. Let’s look at three real examples of how skill reuse has delivered outcomes.
Example 1: Due Diligence Acceleration (4 weeks vs. 12 weeks)
Client: A Sydney-based PE firm running a 10-company roll-up in the logistics space.
Challenge: They needed to DD 10 companies in 12 weeks. Manual DD would take 12 weeks per company (120 weeks total). Impossible.
Solution: We deployed our Due Diligence Engine skill, which orchestrates 5 Tier 2 skills. Each company took 4 weeks instead of 12.
Outcome:
- 10 companies DD’d in 12 weeks (vs. 120 weeks manually)
- 90% time saving
- Consistent DD quality across all 10 companies
- Identified 3 companies with critical tech debt, saving the PE firm from bad acquisitions
- Total engagement value: $500K+
Example 2: SOC 2 Readiness Audit (6 weeks vs. 16 weeks)
Client: A Series B SaaS company preparing for their first SOC 2 audit.
Challenge: They had 16 weeks until audit. Manual evidence drafting and control mapping would take 12–16 weeks, leaving no buffer.
Solution: We deployed our SOC 2 Readiness Audit skill, which orchestrates:
- Control Mapper (identifies relevant controls)
- Risk Assessor (finds gaps)
- Evidence Drafter (generates audit-ready narratives)
- Policy Generator (creates missing policies)
Outcome:
- Evidence drafted and reviewed in 6 weeks
- 10 weeks of buffer before audit
- Passed SOC 2 Type II audit on first attempt
- Identified and remediated 8 medium-risk gaps before audit
- Total engagement value: $180K
Example 3: ERP Migration Planning (4 weeks vs. 12 weeks)
Client: A mid-market Australian manufacturing company migrating from SAP to NetSuite.
Challenge: Migration planning usually takes 12 weeks. They had 8 weeks before go-live.
Solution: We deployed our ERP Migration Planner skill, which orchestrates:
- Data Mapper (generates field mappings)
- Dependency Analyzer (identifies all dependent systems)
- Cutover Planner (sequences the migration)
- Post-Migration Validator (generates test cases)
Outcome:
- Migration plan completed in 4 weeks
- Identified 12 dependent systems that would have been missed
- Cutover executed on schedule with 0 critical issues
- Post-migration data validation passed 99.8% of checks
- Total engagement value: $240K
These outcomes aren’t anomalies. They’re the norm when you have reusable skills. Skills are leverage. Once built and validated, they scale.
Building Your Own Skill Library
If you’re running a venture studio, a fractional CTO practice, or leading platform engineering at scale, you should be building your own skill library. Here’s how.
Step 1: Identify Your Repeating Problems
What problems do you solve repeatedly? What do your best operators know by heart?
For us, it’s:
- Due diligence (we’ve done 50+ DD engagements)
- SOC 2 audits (30+ clients)
- ERP migrations (15+ projects)
For you, it might be different. The point is: identify the 3–5 problem classes you solve repeatedly.
Step 2: Extract the Heuristics
Interview your best operators. What shortcuts do they use? What red flags do they watch for? What exceptions have they learned?
Write these down. This is domain knowledge. This is what makes your skills valuable.
Step 3: Encode as Skills
Take the heuristics and encode them as skills. Start with Tier 1 (foundational) skills. Then build Tier 2 (domain) skills on top.
For each skill:
- Write a skill.yaml file (the contract)
- Write a prompt.md file (the logic)
- Write test cases (the validation)
- Write documentation (the usage guide)
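A cheap way to enforce this contract is a sanity check that a skill package contains all four artefacts before it can be deployed. This is a hedged sketch: the `tests` directory name and `docs.md` filename are assumptions, since the article names the artefacts but not their exact paths.

```python
import tempfile
from pathlib import Path

# The four artefacts per skill; "tests" and "docs.md" are assumed names.
REQUIRED = ("skill.yaml", "prompt.md", "tests", "docs.md")

def missing_artefacts(skill_dir: str) -> list:
    """Return the required files/directories absent from a skill package."""
    root = Path(skill_dir)
    return [name for name in REQUIRED if not (root / name).exists()]

# Demo: a half-finished skill package with only the contract and the logic.
with tempfile.TemporaryDirectory() as skill_dir:
    Path(skill_dir, "skill.yaml").write_text("name: financial-health-scorer\nversion: 1.0\n")
    Path(skill_dir, "prompt.md").write_text("# Financial Health Scorer\n")
    print(missing_artefacts(skill_dir))  # ['tests', 'docs.md'] still to write
```

Wiring a check like this into the dev → staging → production pipeline means an incomplete skill fails fast, before corpus testing even starts.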
Step 4: Test Against Real Data
Run each skill against 10+ real examples from past engagements. Does it produce the expected output? Is the reasoning sound?
If the skill fails, debug it. Refine the heuristics. Test again.
Step 5: Version and Deploy
Once a skill is validated, version it (v1.0) and deploy it to production. Use it on your next engagement.
Monitor its performance. If it fails, log the failure. Refine the skill. Release v1.1.
Step 6: Compose into Tier 3 Skills
Once you have 8–10 Tier 2 skills, start composing them into Tier 3 orchestration skills.
For example, if you have:
- Financial Health Scorer
- Tech Stack Assessor
- Cap Table Auditor
- Revenue Quality Scorer
Compose them into a “Due Diligence Engine” skill that runs all four and synthesises the results.
Step 7: Measure and Iterate
For each engagement:
- How much time did skills save vs. manual work?
- How much higher quality is the output?
- Did any skills fail? Why?
Use this data to refine your skills and identify new skills to build.
Over time, you’ll build a library of 20–30 skills that compress your delivery timeline by 50–70%.
This is how we’ve built our practice at Padiso. We’re not consultants. We’re operators with leverage. Our skills are our leverage.
Next Steps
Building reusable agent skills is a journey, not a destination. Here’s what to do next:
If You’re Exploring Agentic AI
Understanding the difference between agentic AI and traditional automation is crucial before you build skills. Read our guide on agentic AI vs traditional automation to understand when skills are the right approach.
Also, learn from real agentic AI production failures so you don’t make the same mistakes. Skills are powerful, but they can fail in production. Know how to handle failures gracefully.
If You’re Building a Venture Studio
Skills are the playbook for venture studios. They let you co-build with founders faster. Read about how agentic AI makes automation accessible to non-technical people to understand how to extend your studio’s reach.
Also, explore AI agency methodology to understand how to structure your studio for skill reuse and rapid iteration.
If You’re Modernising Enterprise IT
Enterprise IT is being reborn around agentic AI. Read about the $2 trillion renaissance in enterprise IT to understand the opportunity.
Then, explore how agentic AI integrates with tools like Apache Superset to understand how skills can democratise data access.
If You’re Ready to Partner
If you’re a founder, operator, or engineering leader ready to build or modernise with agentic AI, we’re here to help. We’ve built 30+ reusable skills across DD, compliance, and platform engineering. We know how to compose them into delivery engines that ship in weeks instead of months.
We partner with:
- Founders and CEOs of seed-to-Series-B startups needing fractional CTO leadership and co-build support
- Operators at mid-market and enterprise companies modernising with agentic AI, workflow automation, and platform re-platforming
- Engineering and security leaders pursuing SOC 2 or ISO 27001 compliance via Vanta
- Non-technical founders and domain experts looking for a venture studio partner
- Private equity firms and portfolio companies running modernisation and roll-up projects
Visit PADISO to learn more about our services: CTO as a Service, AI & Agents Automation, AI Strategy & Readiness, Security Audit (SOC 2 / ISO 27001), Platform Design & Engineering, and Venture Studio & Co-Build.
If You’re Building Skills Yourself
Start small. Pick one repeating problem you solve. Extract the heuristics. Build a Tier 1 skill. Test it against real data. Deploy it. Measure the outcome.
Then, build the next skill. And the next.
Over time, you’ll have a library. That library is leverage. That leverage is how you scale from consulting to a product-like delivery model.
Reusable skills are the future of AI delivery. The teams that build them first will win.
Summary
Reusable agent skills are the bridge between one-off AI projects and scalable, repeatable AI delivery. They encode domain expertise in a way that AI agents can reliably invoke. They’re versioned, tested, and deployed like production code. They compose into larger orchestrations to solve complex problems.
At Padiso, we’ve built 30+ reusable skills across three core domains: due diligence, compliance, and platform engineering. These skills have saved our clients thousands of hours and millions of dollars. They’ve become the backbone of our venture studio, fractional CTO, and AI automation practices.
If you’re building agentic AI at scale, you need reusable skills. This guide shows you how we build them, how we structure them, and how we deploy them. Now it’s your turn to build your own library.
Start with one skill. Validate it. Then build the next. Your future self—and your clients—will thank you.