Building Reusable Agent Skills: A Padiso Skill Library Walkthrough
Tour 30+ reusable agent skills Padiso uses across clients: DD, SOC 2, ERP migration. Learn skill architecture, file structures, and triggering tactics.
Table of Contents
- Why Reusable Agent Skills Matter
- What Makes a Skill Reusable
- The Padiso 30-Skill Library: Architecture & Structure
- Core Skill Categories
- Due Diligence & M&A Skills
- SOC 2 & ISO 27001 Compliance Skills
- ERP Migration & Platform Engineering Skills
- File Structure, Triggering, and Deployment
- Skill Composition and Orchestration
- Real-World Outcomes: Skill Reuse Across Clients
- Building Your Own Skill Library
- Next Steps
Why Reusable Agent Skills Matter
Agentic AI is no longer a lab experiment. It’s shipping in production across mid-market and enterprise teams—but most organisations are building agents from scratch every time. They’re hiring contractors to write prompt chains, debugging runaway loops, and watching costs balloon because nobody documented what actually worked.
Reusable agent skills flip that equation. A skill is a self-contained, discoverable, versioned package of knowledge and decision logic that an AI agent can reliably invoke to solve a specific class of problem. Once built and validated, a skill can be reused across dozens of projects, clients, and use cases without rewriting the logic or retraining the agent.
At Padiso, we’ve spent two years building and refining a library of 30+ production-grade agent skills. These skills power our work across three core domains:
- Due diligence and M&A: financial health scoring, cap table audits, tech stack assessment, revenue quality scoring
- SOC 2 and ISO 27001 audit readiness: evidence drafting, control mapping, policy generation, risk assessment
- ERP migration and platform engineering: data mapping, schema validation, dependency analysis, cutover planning
Each skill has been tested across 10+ client engagements. Each has a documented trigger condition, a clear input/output contract, and a failure mode playbook. This guide walks you through the architecture, shows you the real file structures we use, and gives you a framework to build your own reusable skill library.
If you’re running a venture studio, a fractional CTO practice, or leading platform modernisation at scale, this is the playbook that separates one-off consulting from repeatable, scalable AI delivery.
What Makes a Skill Reusable
Not every prompt or workflow deserves to be a skill. Reusable skills share five core properties:
Clear, Narrow Scope
A skill solves one class of problem well. “Assess financial health” is a skill. “Assess financial health, review cap table, and predict runway” is three skills masquerading as one. When scope creeps, reusability dies—you end up with a monolithic prompt that breaks the moment context changes.
At Padiso, we use the “single responsibility principle” borrowed from software engineering. Each skill has exactly one purpose. A skill either extracts data, validates it, scores it, or recommends an action—not all four.
Documented Input/Output Contract
A reusable skill specifies exactly what it expects and what it returns. No surprises. No “it usually works unless the data is messy.” The contract is the covenant between the agent and the skill.
For example, our “Revenue Quality Scorer” skill expects:
Input: {
  "annual_recurring_revenue": number (USD),
  "customer_concentration": number (0-1),
  "churn_rate_annual": number (0-1),
  "net_retention": number (0-2),
  "revenue_growth_yoy": number (-1 to 5)
}
Output: {
  "score": number (0-100),
  "grade": "A" | "B" | "C" | "D" | "F",
  "risk_factors": string[],
  "confidence": number (0-1)
}
The agent knows exactly what to feed in and what to expect back. No ambiguity. No hallucination.
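Enforcing such a contract mechanically is what keeps ambiguity out of the loop. A minimal Python sketch of contract checking before the skill is invoked; the field names mirror the contract above, but the helper name and range-check logic are illustrative, not Padiso's implementation:

```python
# Hypothetical sketch: enforce the Revenue Quality Scorer's input contract
# before the agent invokes the skill. Ranges mirror the contract above.

RANGES = {
    "annual_recurring_revenue": (0, float("inf")),
    "customer_concentration": (0, 1),
    "churn_rate_annual": (0, 1),
    "net_retention": (0, 2),
    "revenue_growth_yoy": (-1, 5),
}

def validate_input(payload: dict) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    for field, (lo, hi) in RANGES.items():
        if field not in payload:
            errors.append(f"missing required field: {field}")
        elif not isinstance(payload[field], (int, float)):
            errors.append(f"{field} must be a number")
        elif not (lo <= payload[field] <= hi):
            errors.append(f"{field}={payload[field]} outside [{lo}, {hi}]")
    return errors

ok = validate_input({
    "annual_recurring_revenue": 4_200_000,
    "customer_concentration": 0.35,
    "churn_rate_annual": 0.08,
    "net_retention": 1.15,
    "revenue_growth_yoy": 0.45,
})
bad = validate_input({"churn_rate_annual": 1.4})  # out of range, rest missing
```

Rejecting a malformed payload up front means the skill never has to guess, which is exactly the "no surprises" property the contract promises.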
Grounded in Real Expertise
A reusable skill isn’t a generic LLM prompt. It encodes specific heuristics, decision trees, and domain knowledge that only an expert would know. Following the best practices for skill creators, we extract these patterns directly from our operators’ workflows—the shortcuts they use, the red flags they watch for, the exceptions they’ve learned over years.
Our SOC 2 evidence drafting skill, for instance, doesn’t just say “write evidence.” It encodes 12 years of audit experience: which control families need written policies vs. system screenshots, which auditors accept inference vs. require direct proof, which evidence types are reusable across multiple controls.
Versioned and Tested
A reusable skill is versioned like production code. v1.0, v1.1, v2.0. Each version has a changelog. Each version is tested against a corpus of real inputs from past engagements.
When a skill breaks—and they do—you know which version broke and which clients are affected. You can roll back or patch forward with confidence.
Failure Mode Documentation
Every skill has a failure playbook. What happens when the input is incomplete? When the agent hallucinates? When the data contradicts the decision heuristic? A reusable skill doesn’t just succeed—it fails gracefully, with clear signals and remediation steps.
Our ERP migration skills, for example, document exactly what happens when a data mapping fails (halt the migration, log the row, flag for manual review) versus when a schema validation fails (suggest a transformation, offer three alternatives, wait for human approval).
The Padiso 30-Skill Library: Architecture & Structure
Our skill library is organised into three tiers:
Tier 1: Foundational Skills (8 skills)
These are the building blocks. They handle data extraction, validation, scoring, and basic reasoning. Every other skill is built on top of these.
- Data Extractor: Pulls structured data from unstructured documents (PDFs, emails, spreadsheets)
- Schema Validator: Checks data against a defined schema and reports mismatches
- Risk Scorer: Converts qualitative observations into quantitative risk scores
- Dependency Mapper: Identifies relationships between entities (systems, teams, data flows)
- Gap Analyzer: Compares current state vs. desired state and identifies deltas
- Recommendation Engine: Generates prioritised, actionable recommendations
- Evidence Synthesiser: Compiles multiple data points into a coherent narrative
- Cost Estimator: Predicts effort, timeline, and budget for a given initiative
Tier 2: Domain Skills (15 skills)
These are built from Tier 1 skills and solve specific problems within our three core domains. They’re where domain expertise lives.
Due Diligence Skills:
- Financial Health Scorer
- Cap Table Auditor
- Tech Stack Assessor
- Revenue Quality Scorer
- Customer Health Analyzer
SOC 2 & Compliance Skills:
- Control Mapper (maps business processes to control requirements)
- Evidence Drafter (generates audit-ready evidence narratives)
- Policy Generator (creates control policies from templates)
- Risk Assessor (identifies gaps in control implementation)
- Audit Readiness Scorer (predicts likelihood of passing audit)
ERP & Platform Skills:
- Data Mapper (defines transformation rules between systems)
- Schema Transformer (generates migration scripts)
- Cutover Planner (sequences migration steps)
- Dependency Analyzer (identifies system and data dependencies)
- Post-Migration Validator (tests migrated data quality)
Tier 3: Orchestration Skills (7 skills)
These coordinate multiple Tier 2 skills to solve end-to-end problems. They’re where the magic happens—where a single agent request triggers a choreographed sequence of 5–10 sub-skills.
- Due Diligence Engine: Runs a complete DD assessment (financial + tech + customer)
- SOC 2 Readiness Audit: Evaluates and drafts evidence for all control families
- ERP Migration Planner: Designs a complete migration approach
- Post-Acquisition Integration: Plans 100-day integration roadmap
- Platform Modernisation Roadmap: Designs multi-year platform engineering strategy
- Compliance Remediation Plan: Generates a prioritised remediation roadmap
- Venture Studio Diligence: Runs a complete pre-seed/seed investment assessment
This three-tier architecture means:
- Reuse is maximised: A Tier 3 skill might invoke 5 Tier 2 skills, which each invoke 2–3 Tier 1 skills. But those Tier 1 skills are shared across all Tier 2 and Tier 3 skills.
- Maintenance is centralised: If we improve the Data Extractor, all downstream skills improve automatically.
- Reasoning is transparent: Each tier adds a layer of reasoning. You can see exactly where a decision came from.
Following 3 principles for designing agent skills, we ensure each skill is modular, composable, and independently testable. This is not a monolithic prompt. It’s a library of small, focused, well-tested building blocks.
Core Skill Categories
Data Extraction and Validation
Every engagement starts with messy data. Financial statements with inconsistent formatting. Spreadsheets with hidden columns. PDFs where numbers are stored as images. A reusable data extraction skill needs to handle all of this without human intervention.
Our Data Extractor skill works like this:
- Accept multiple input formats: PDF, XLSX, CSV, JSON, plain text, email body
- Identify the data type: Is this a financial statement? A customer list? A system inventory?
- Extract with high confidence: Pull the relevant fields and flag anything ambiguous
- Validate against schema: Does the data match what we expect?
- Return structured output: JSON with extracted fields + a confidence score for each field
The key insight: extraction isn’t binary. It’s probabilistic. If we’re 95% confident in a number, we include it with a confidence flag. If we’re 60% confident, we flag it for human review. This prevents silent failures.
For schema validation, we use a simple but powerful approach: define the schema upfront in JSON Schema format, then have the agent validate incoming data against it. When validation fails, the agent doesn’t just say “error”—it suggests a transformation or asks a clarifying question.
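That validate-and-suggest behaviour can be sketched in a few lines. This is a simplified stand-in for full JSON Schema validation (a production setup would more likely use a library such as `jsonschema`); the schema shape and field names here are illustrative:

```python
# Simplified sketch of schema validation that suggests a fix on failure,
# rather than just reporting "error". Field names are illustrative.

SCHEMA = {
    "required": ["arr", "gross_margin"],
    "types": {"arr": (int, float), "gross_margin": (int, float)},
}

def validate(record: dict, schema: dict) -> dict:
    """Check required fields and types; return a suggestion when invalid."""
    for field in schema["required"]:
        if field not in record:
            return {"valid": False,
                    "suggestion": f"Field '{field}' is missing; re-run extraction or ask the user."}
    for field, expected in schema["types"].items():
        value = record.get(field)
        if value is not None and not isinstance(value, expected):
            return {"valid": False,
                    "suggestion": f"Field '{field}' is {type(value).__name__}; try casting it to a number."}
    return {"valid": True, "suggestion": None}

res1 = validate({"arr": "4.2M"}, SCHEMA)                           # missing field
res2 = validate({"arr": 4_200_000, "gross_margin": 0.62}, SCHEMA)  # valid
```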
Risk and Quality Scoring
Scoring is where judgment lives. A reusable scoring skill encodes the judgment of experienced operators so it can be applied consistently across dozens of engagements.
Our Risk Scorer skill uses a weighted rubric approach:
1. Define risk dimensions (e.g., financial, technical, market, team)
2. For each dimension, define 5–7 observable signals
3. Weight each signal (e.g., high churn is weighted 3x more heavily than high CAC)
4. For each signal, define thresholds (e.g., >40% annual churn = red flag)
5. Aggregate scores using a decision tree (not a simple average)
6. Return the final score + the reasoning (which signals drove the score)
This approach is reusable because the structure is the same across all scoring tasks. What changes is the dimensions, signals, weights, and thresholds. For a financial health score, the dimensions are profitability, growth, and cash. For a tech health score, the dimensions are architecture, security, and scalability.
The agent can apply the same skill logic to different domains by swapping in different rubrics.
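The rubric-swapping structure can be sketched as follows. The dimensions, signals, weights, and thresholds below are invented examples, not Padiso's production rubric; note the decision-tree override that caps the score whenever a red flag fires, rather than taking a plain average:

```python
# Illustrative weighted-rubric scorer. The rubric values are examples only;
# the point is the structure: dimensions -> signals -> weights -> thresholds,
# aggregated with a decision-tree override instead of a simple average.

RUBRIC = {
    "financial": {
        "weight": 0.5,
        "signals": {
            "annual_churn": {"weight": 3, "red_above": 0.40},
            "cac_payback_months": {"weight": 1, "red_above": 18},
        },
    },
    "technical": {
        "weight": 0.5,
        "signals": {
            "deploys_per_week": {"weight": 1, "red_below": 1},
        },
    },
}

def score(observations: dict) -> dict:
    reasoning, total, total_weight = [], 0.0, 0.0
    for dim, cfg in RUBRIC.items():
        dim_points, dim_weight = 0.0, 0.0
        for name, sig in cfg["signals"].items():
            value = observations[name]
            red = (("red_above" in sig and value > sig["red_above"]) or
                   ("red_below" in sig and value < sig["red_below"]))
            if red:
                reasoning.append(f"{dim}: {name}={value} breached its threshold")
            dim_points += (0 if red else 100) * sig["weight"]
            dim_weight += sig["weight"]
        total += (dim_points / dim_weight) * cfg["weight"]
        total_weight += cfg["weight"]
    # Decision-tree override, not a plain average: any red flag caps the score.
    final = min(total / total_weight, 59) if reasoning else total / total_weight
    return {"score": round(final), "reasoning": reasoning}

result = score({"annual_churn": 0.45, "cac_payback_months": 10,
                "deploys_per_week": 4})
```

Swapping in a different rubric dict is all it takes to repoint the same logic at a tech health score instead of a financial one.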
Mapping and Dependency Analysis
When you’re doing M&A, ERP migration, or compliance work, you need to understand how things connect. Which systems feed which? Which teams own which processes? Which controls map to which policies?
Our Dependency Mapper skill builds a graph of relationships:
- Identify entities: Systems, teams, processes, data flows, controls, policies
- Identify relationships: A → B (e.g., Salesforce → data warehouse)
- Classify relationships: Feeds, depends on, owns, implements, audits
- Detect cycles: Are there circular dependencies? (Often a red flag)
- Compute criticality: Which entities are most central? Which are isolated?
- Return a graph: JSON representation of the dependency network
This skill is reusable across technical, organisational, and regulatory domains. The logic is the same; the entity types change.
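The graph steps above reduce to a small amount of code. A sketch using plain adjacency lists, with illustrative entities, depth-first-search cycle detection, and criticality approximated here by in-degree centrality:

```python
# Sketch of the Dependency Mapper's graph logic. Entities and edges are
# illustrative; "criticality" is approximated by in-degree centrality.

from collections import defaultdict

def build_graph(edges: list[tuple[str, str, str]]) -> dict:
    """Build an adjacency list from (source, relationship, target) edges."""
    adj = defaultdict(list)
    for src, rel, dst in edges:
        adj[src].append(dst)
    return adj

def has_cycle(adj: dict) -> bool:
    """Detect circular dependencies with a depth-first search."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)
    def dfs(node):
        color[node] = GRAY
        for nxt in adj.get(node, []):
            if color[nxt] == GRAY or (color[nxt] == WHITE and dfs(nxt)):
                return True
        color[node] = BLACK
        return False
    return any(color[n] == WHITE and dfs(n) for n in list(adj))

def criticality(adj: dict) -> dict:
    """Count how often each entity is depended upon (in-degree)."""
    indegree = defaultdict(int)
    for targets in adj.values():
        for t in targets:
            indegree[t] += 1
    return dict(indegree)

edges = [("Salesforce", "feeds", "Warehouse"),
         ("Shopify", "feeds", "Warehouse"),
         ("Warehouse", "feeds", "BI")]
g = build_graph(edges)
g_cyclic = build_graph(edges + [("BI", "feeds", "Salesforce")])
```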
Due Diligence & M&A Skills
Due diligence is where we’ve seen the most dramatic reuse. A single Tier 3 “Due Diligence Engine” skill orchestrates 5 Tier 2 skills to produce a comprehensive investment memo in 4 weeks instead of 12.
Financial Health Scorer
This skill takes financial statements (P&L, balance sheet, cash flow) and produces a health score (0–100) plus a narrative assessment.
The skill encodes heuristics like:
- Profitability: Gross margin > 60% is green. < 40% is red. (Varies by SaaS vs. marketplace vs. services.)
- Growth: YoY revenue growth > 30% is healthy. < 10% is concerning.
- Runway: Months of cash remaining at current burn rate. < 12 months is a risk.
- Unit economics: CAC payback < 12 months. LTV:CAC > 3:1. Both are green flags.
- Efficiency: Magic number (ARR growth / S&M spend) > 0.75 is efficient.
The skill weights these heuristics based on company stage (seed companies are judged differently than Series B) and industry (SaaS vs. marketplace vs. services).
Output:
{
  "score": 72,
  "grade": "B",
  "profitability_score": 65,
  "growth_score": 85,
  "runway_score": 68,
  "unit_economics_score": 75,
  "efficiency_score": 70,
  "key_risks": [
    "Low gross margin (48%)",
    "Cash runway only 14 months",
    "Declining NRR (0.92)"
  ],
  "key_strengths": [
    "Strong YoY growth (145%)",
    "CAC payback < 10 months",
    "Improving unit economics"
  ],
  "confidence": 0.94
}
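Two of the heuristics above reduce to simple arithmetic worth making explicit. A sketch with illustrative figures (the inputs are examples, not client data):

```python
# Illustrative arithmetic for two heuristics: runway in months and the
# "magic number" efficiency ratio. Figures are examples only.

def runway_months(cash_balance: float, monthly_burn: float) -> float:
    """Months of cash remaining at the current burn rate."""
    return cash_balance / monthly_burn

def magic_number(net_new_arr: float, sm_spend: float) -> float:
    """ARR added over a period divided by sales & marketing spend."""
    return net_new_arr / sm_spend

r = runway_months(1_400_000, 100_000)   # 14.0 months, as in the example output
m = magic_number(900_000, 1_200_000)    # 0.75, right at the efficiency threshold
```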
Cap Table Auditor
This skill takes a cap table (spreadsheet or document) and validates it for completeness and consistency.
It checks:
- Fully diluted ownership: Do all shares + options + warrants add up to 100%?
- Vesting schedules: Are vesting cliffs and schedules documented for each grant?
- Option pool: Is there an approved option pool? Is it fully allocated?
- Board seats and preferences: Do the cap table and board composition align?
- Recent fundraising: Are the latest funding round shares reflected?
- Liquidation preferences: Are they documented and do they align with term sheets?
Output:
{
  "is_valid": false,
  "issues": [
    {
      "severity": "critical",
      "issue": "Option pool not fully allocated",
      "detail": "5M options reserved but only 2.3M granted"
    },
    {
      "severity": "high",
      "issue": "Founder vesting not documented",
      "detail": "No vesting schedule for Founder A (5M shares)"
    }
  ],
  "fully_diluted_ownership": {
    "common": 0.65,
    "options": 0.15,
    "warrants": 0.02,
    "unallocated": 0.18
  },
  "confidence": 0.88
}
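The fully diluted ownership check is the easiest of these to automate: every share class plus the unallocated pool should sum to 100% within rounding tolerance. A sketch using the example output's figures (the function name is illustrative):

```python
# Hypothetical sketch of the fully-diluted ownership check. Ownership
# fractions mirror the example output above.

import math

def check_fully_diluted(ownership: dict, tolerance: float = 1e-6) -> dict:
    """All share classes plus the unallocated pool should sum to 1.0."""
    total = sum(ownership.values())
    return {"is_valid": math.isclose(total, 1.0, abs_tol=tolerance),
            "total": round(total, 6),
            "delta": round(1.0 - total, 6)}

complete = check_fully_diluted({
    "common": 0.65, "options": 0.15, "warrants": 0.02, "unallocated": 0.18,
})
incomplete = check_fully_diluted({"common": 0.65, "options": 0.15})
```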
Tech Stack Assessor
This skill inventories the technology stack and scores it on architecture, security, scalability, and maintainability.
It identifies:
- Core systems: Database, backend, frontend, mobile
- Infrastructure: Cloud provider, deployment model, monitoring
- Data pipeline: ETL, analytics, BI
- Security: Auth, encryption, secrets management
- DevOps: CI/CD, testing, deployment frequency
For each component, it scores on maturity (startup-grade vs. enterprise-grade), technical debt, and risk.
This skill is crucial for pre-acquisition technical due diligence and for understanding agentic AI vs traditional automation approaches when planning post-acquisition platform modernisation.
Revenue Quality Scorer
This skill takes revenue data and produces a quality score. Not all revenue is created equal.
It evaluates:
- Customer concentration: Is revenue concentrated in a few customers? (High concentration = risk.)
- Churn: Are customers leaving? At what rate?
- Net retention: Are existing customers growing or shrinking their spend?
- Revenue growth: Is the company growing? At what rate?
- Recurring vs. one-time: What % is recurring?
- Gross margin by customer: Which customers are profitable?
Output:
{
  "quality_score": 78,
  "grade": "B+",
  "concentration_risk": 0.35,
  "top_10_customers_pct": 45,
  "annual_churn": 0.08,
  "net_retention": 1.15,
  "recurring_revenue_pct": 0.92,
  "key_concerns": [
    "Customer concentration above 40%",
    "Churn accelerating (8% vs 5% last year)"
  ],
  "key_strengths": [
    "Net retention > 1.0 (customers expanding)",
    "92% recurring revenue (very sticky)"
  ]
}
Customer Health Analyzer
This skill takes customer data (usage, support tickets, renewal history, NPS) and predicts churn risk and expansion opportunity.
It identifies:
- At-risk customers: Low usage, declining engagement, recent support issues
- Expansion opportunities: High usage, high satisfaction, adjacent product interest
- Cohort patterns: Do certain customer segments churn more? Which segments expand?
This skill is reused across multiple clients because the logic is the same; the data sources change.
SOC 2 & ISO 27001 Compliance Skills
Compliance is tedious, but it’s also highly reusable. Once you’ve built a skill to draft SOC 2 evidence, you can use it across dozens of clients.
We’ve seen this firsthand. Our SOC 2 readiness skills have been used across 30+ engagements, saving an average of 6 weeks per audit cycle. Here’s why they work.
Control Mapper
This skill takes a business process description and maps it to relevant SOC 2 or ISO 27001 controls.
For example:
Input: “We store customer data in AWS S3. Access is restricted to authenticated employees via IAM roles. All access is logged in CloudTrail.”
Output:
{
  "process": "Customer data storage",
  "controls_mapped": [
    {
      "control_id": "CC6.1",
      "control_name": "Logical Access Controls",
      "evidence_type": "System configuration",
      "status": "implemented",
      "confidence": 0.95
    },
    {
      "control_id": "CC7.2",
      "control_name": "System Monitoring",
      "evidence_type": "Logs and audit trails",
      "status": "implemented",
      "confidence": 0.92
    }
  ]
}
The skill encodes the SOC 2 control framework (all 64 controls) and knows which business processes map to which controls.
Evidence Drafter
This is the workhorse skill. It takes a control ID and existing evidence (logs, screenshots, policies) and drafts an audit-ready narrative.
Auditors want to see:
- What is the control? (Definition)
- How is it implemented? (Design)
- How do we know it’s working? (Operating effectiveness)
- What’s the evidence? (Proof)
The Evidence Drafter skill generates all four sections from input data.
For example, given:
- Control: “CC6.1 – Logical Access Controls”
- Evidence: AWS IAM policy document, CloudTrail logs showing access, employee onboarding checklist
The skill generates:
Control CC6.1: Logical Access Controls
Design: The Company restricts logical access to customer data systems
using AWS Identity and Access Management (IAM). All employees are
provisioned with IAM roles corresponding to their job function.
Access is granted on a least-privilege basis.
Operating Effectiveness: During the period under review, the Company
granted access to 45 new employees and revoked access for 12 departed
employees. All access changes were documented in our HRIS and
implemented within 24 hours of hire/termination.
Evidence:
- AWS IAM policy document (Exhibit A)
- CloudTrail logs showing 847 access grants and 156 revocations (Exhibit B)
- Employee onboarding checklist with access provisioning sign-off (Exhibit C)
- Access review certification signed by IT Manager (Exhibit D)
This skill has been used across 30+ clients. The logic is identical; the evidence changes.
Policy Generator
This skill takes a control requirement and generates a policy document that satisfies it.
For example:
Input: Control CC6.2 – “Restrictions and security measures for user access to system components and data are established and implemented.”
Output: A complete “Access Control Policy” document with sections on:
- Policy objective
- Scope
- Roles and responsibilities
- Access request process
- Approval workflow
- Access review cadence
- Revocation process
- Exceptions and compensating controls
The skill generates policies that are:
- Audit-ready (written in the language auditors expect)
- Operationally feasible (don’t require impossible processes)
- Reusable (can be adapted for different organisations)
Risk Assessor
This skill evaluates whether a control is actually implemented or just documented.
It looks for:
- Design gaps: Is the control design sound?
- Implementation gaps: Is the control actually built?
- Operating effectiveness gaps: Is the control actually working?
For each gap, it produces a risk rating (critical, high, medium, low) and a remediation recommendation.
This skill is crucial for understanding SOC 2 compliance strategies and identifying which gaps need immediate attention before an audit.
Audit Readiness Scorer
This skill takes the results of a risk assessment and predicts the likelihood of passing an audit.
It considers:
- Number of critical gaps: Each critical gap reduces the probability significantly
- Control family coverage: Are all families at least partially implemented?
- Evidence quality: Is the evidence strong or weak?
- Auditor history: Have you passed SOC 2 before? With which auditor?
Output:
{
  "readiness_score": 68,
  "probability_of_passing": 0.62,
  "critical_gaps": 3,
  "high_gaps": 7,
  "medium_gaps": 12,
  "estimated_weeks_to_readiness": 8,
  "recommended_focus": [
    "CC6.1 – Logical Access Controls",
    "CC7.2 – System Monitoring",
    "CC9.2 – Encryption"
  ]
}
This skill helps organisations understand: Should we audit now or wait 8 weeks?
ERP Migration & Platform Engineering Skills
ERP migrations and platform re-platforming projects are complex, multi-month endeavours. Reusable skills compress the planning phase from 12 weeks to 4 weeks.
Data Mapper
This skill takes a source system schema and a target system schema and generates a data mapping.
For example:
Source: SAP (legacy, on-premises)
Target: NetSuite (cloud)
The skill:
- Identifies all tables and fields in the source system
- Identifies all tables and fields in the target system
- Matches source fields to target fields (exact matches, transformations, or “no mapping”)
- For each transformation, generates a rule (e.g., “SAP VBAK.VBELN → NetSuite Transaction.ID, prepend ‘SAP-’”)
- Identifies data that exists in source but has no home in target (orphaned data)
- Identifies data in target that must be manually populated (gaps)
Output:
{
  "source_system": "SAP",
  "target_system": "NetSuite",
  "mappings": [
    {
      "source_field": "VBAK.VBELN",
      "target_field": "Transaction.ID",
      "mapping_type": "transformation",
      "rule": "prepend 'SAP-'",
      "confidence": 0.98
    },
    {
      "source_field": "VBAK.NETWR",
      "target_field": "Transaction.Amount",
      "mapping_type": "exact",
      "confidence": 1.0
    }
  ],
  "orphaned_fields": [
    "VBAK.ZZCUSTOM1"
  ],
  "gaps": [
    "NetSuite.Tax_ID (not in SAP)"
  ]
}
Schema Transformer
This skill takes a data mapping and generates SQL, Python, or dbt code to perform the transformation.
For example:
Input: Data mapping from SAP to NetSuite
Output: dbt model that reads from SAP staging tables and writes to NetSuite format
select
    concat('SAP-', vbak.vbeln) as transaction_id,
    vbak.netwr as amount,
    vbak.waerk as currency,
    vbak.erdat as created_date,
    vbak.ernam as created_by,
    case
        when vbak.augru = '01' then 'Rush'
        when vbak.augru = '02' then 'Standard'
        else 'Other'
    end as fulfillment_priority
from {{ source('sap', 'vbak') }} vbak
where vbak.mandt = '100' -- Client 100 only
The skill generates transformation code that is:
- Correct (maps fields according to the mapping)
- Efficient (uses appropriate aggregations and filters)
- Debuggable (includes comments, clear field names)
- Testable (generates dbt tests automatically)
Cutover Planner
This skill takes a migration scope and generates a detailed cutover plan.
It sequences the migration in phases:
- Preparation phase (2 weeks): Data extraction, transformation testing, user training
- Pilot phase (1 week): Run transformation on a subset of data, validate results
- Parallel run (2 weeks): Run old and new systems side-by-side, reconcile differences
- Cutover (1 day): Stop the old system, run final transformation, validate
- Stabilisation (2 weeks): Monitor for issues, handle exceptions, optimise
For each phase, the skill generates:
- Activities (what needs to happen)
- Owners (who is responsible)
- Success criteria (how do we know it worked)
- Rollback plan (what if it fails)
This skill is reused across dozens of migrations because the structure is the same; the details change.
Dependency Analyzer
This skill identifies all systems that depend on the system being migrated.
For example, if you’re migrating from SAP to NetSuite:
- Which systems read from SAP? (Data warehouse, BI tools, reporting systems)
- Which systems write to SAP? (E-commerce platform, accounting software)
- Which systems are tightly coupled to SAP? (Custom integrations, plugins)
For each dependency, the skill assesses:
- Criticality: How important is this integration?
- Effort: How much work to migrate this integration?
- Risk: What could go wrong?
Output:
{
  "dependencies": [
    {
      "system": "Data Warehouse",
      "direction": "reads",
      "criticality": "high",
      "effort_weeks": 3,
      "risk": "If DW breaks, reporting breaks. Need to test daily."
    },
    {
      "system": "Shopify",
      "direction": "writes",
      "criticality": "critical",
      "effort_weeks": 2,
      "risk": "If integration breaks, orders don't sync. Revenue impact."
    }
  ],
  "critical_path": ["Shopify integration", "Data Warehouse"],
  "estimated_total_effort_weeks": 12
}
This skill prevents the “surprise dependency” problem: you think the migration is done, then discover a downstream system that still depends on the old system running.
Post-Migration Validator
This skill tests the migrated data for completeness, accuracy, and consistency.
It runs automated checks:
- Row count validation: Does the target have the same number of rows as the source?
- Field-level validation: Do amounts, dates, and IDs match?
- Referential integrity: Do foreign keys still point to valid records?
- Data quality: Are there unexpected nulls, duplicates, or outliers?
- Business logic: Do calculated fields (totals, balances) still compute correctly?
For each check, it produces a pass/fail result and, if it fails, highlights the problematic rows.
This skill is reused across all migration projects because the validation logic is the same; the data changes.
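The first few checks above can be sketched directly. A simplified in-memory version, with illustrative field names (`amount`, `id`) standing in for real migration tables:

```python
# Sketch of post-migration validation over two in-memory tables: row counts,
# amount totals, and duplicate-ID detection. Field names are illustrative.

def validate_migration(source: list[dict], target: list[dict],
                       amount_field: str = "amount") -> dict:
    checks = {}
    checks["row_count"] = len(source) == len(target)
    src_total = sum(row[amount_field] for row in source)
    tgt_total = sum(row[amount_field] for row in target)
    checks["amount_totals"] = abs(src_total - tgt_total) < 0.01
    ids = [row["id"] for row in target]
    checks["no_duplicate_ids"] = len(ids) == len(set(ids))
    failures = [name for name, ok in checks.items() if not ok]
    return {"passed": not failures, "failed_checks": failures}

source = [{"id": "SAP-1", "amount": 100.0}, {"id": "SAP-2", "amount": 250.0}]
target = [{"id": "SAP-1", "amount": 100.0}, {"id": "SAP-2", "amount": 250.0}]
report = validate_migration(source, target)
bad_report = validate_migration(source, target[:1])  # simulate a dropped row
```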
File Structure, Triggering, and Deployment
Reusable skills aren’t just logic—they’re also files, versioning, and deployment pipelines. Here’s how we structure them at Padiso.
Skill File Structure
Each skill lives in its own directory:
skills/
├── financial-health-scorer/
│ ├── skill.yaml # Metadata
│ ├── README.md # Documentation
│ ├── prompt.md # The actual prompt
│ ├── rubric.json # Scoring rubric
│ ├── tests/
│ │ ├── test_cases.json # Test inputs and expected outputs
│ │ └── test_runner.py # Test harness
│ ├── examples/
│ │ ├── input_example.json # Example input
│ │ └── output_example.json # Example output
│ └── changelog.md # Version history
├── cap-table-auditor/
│ ├── skill.yaml
│ ├── README.md
│ ├── prompt.md
│ ├── validation_rules.json # Cap table validation rules
│ ├── tests/
│ ├── examples/
│ └── changelog.md
├── ...
The skill.yaml File
This is the contract. It declares what the skill does, what it expects, and what it returns.
name: "Financial Health Scorer"
version: "2.1.0"
description: "Scores financial health of a company based on P&L, balance sheet, and cash flow."
inputs:
  - name: "annual_recurring_revenue"
    type: "number"
    unit: "USD"
    required: true
    description: "Annual recurring revenue"
  - name: "gross_margin"
    type: "number"
    unit: "percentage"
    required: true
    description: "Gross profit / revenue"
  - name: "cash_balance"
    type: "number"
    unit: "USD"
    required: true
  - name: "monthly_burn"
    type: "number"
    unit: "USD"
    required: true
  - name: "company_stage"
    type: "string"
    enum: ["seed", "series-a", "series-b", "series-c+"]
    required: true
outputs:
  - name: "score"
    type: "number"
    range: [0, 100]
    description: "Overall health score"
  - name: "grade"
    type: "string"
    enum: ["A", "B", "C", "D", "F"]
  - name: "key_risks"
    type: "array"
    items: "string"
  - name: "confidence"
    type: "number"
    range: [0, 1]
triggers:
  - event: "due_diligence_started"
    condition: "company_stage in [seed, series-a, series-b]"
  - event: "quarterly_health_check"
    condition: "always"
owner: "padiso-dd-team"
last_updated: "2025-02-15"
This YAML file is machine-readable. An orchestration engine can parse it and understand:
- What inputs the skill needs
- What outputs it produces
- When to trigger it automatically
- Who owns it
Following skill authoring best practices, we ensure each skill has a clear contract and explicit triggering logic.
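An orchestration engine that consumes this manifest might enforce the input contract like so. The sketch assumes the YAML has already been parsed into a dict (e.g. with PyYAML); the helper name is illustrative:

```python
# Hypothetical sketch of contract enforcement against a parsed skill.yaml.
# MANIFEST is an abbreviated, parsed form of the YAML manifest above.

MANIFEST = {
    "name": "Financial Health Scorer",
    "version": "2.1.0",
    "inputs": [
        {"name": "annual_recurring_revenue", "type": "number", "required": True},
        {"name": "gross_margin", "type": "number", "required": True},
        {"name": "company_stage", "type": "string",
         "enum": ["seed", "series-a", "series-b", "series-c+"], "required": True},
    ],
}

def check_inputs(manifest: dict, payload: dict) -> list[str]:
    """Return contract violations before the skill is invoked."""
    errors = []
    for spec in manifest["inputs"]:
        name = spec["name"]
        if spec.get("required") and name not in payload:
            errors.append(f"missing required input: {name}")
            continue
        if name in payload and "enum" in spec and payload[name] not in spec["enum"]:
            errors.append(f"{name}={payload[name]!r} not in {spec['enum']}")
    return errors

errors = check_inputs(MANIFEST, {
    "annual_recurring_revenue": 4_200_000,
    "gross_margin": 0.62,
    "company_stage": "series-a",
})
bad = check_inputs(MANIFEST, {"company_stage": "ipo"})
```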
The prompt.md File
This is the actual prompt that the agent executes. It’s written to be:
- Clear and specific
- Grounded in domain expertise
- Robust to edge cases
- Explicit about reasoning steps
Example:
# Financial Health Scorer
You are an expert investor evaluating the financial health of a startup.
## Your Task
Given financial metrics, score the company's financial health on a scale of 0-100.
## Scoring Framework
### Profitability (30% weight)
- Gross margin > 70%: Excellent (90-100 points)
- Gross margin 50-70%: Good (70-89 points)
- Gross margin 30-50%: Fair (50-69 points)
- Gross margin < 30%: Poor (0-49 points)
### Growth (25% weight)
- YoY growth > 100%: Exceptional (90-100 points)
- YoY growth 50-100%: Strong (70-89 points)
- YoY growth 20-50%: Moderate (50-69 points)
- YoY growth < 20%: Weak (0-49 points)
### Runway (25% weight)
- Runway > 24 months: Excellent (90-100 points)
- Runway 12-24 months: Good (70-89 points)
- Runway 6-12 months: Fair (50-69 points)
- Runway < 6 months: Critical (0-49 points)
### Unit Economics (20% weight)
- CAC payback < 12 months AND LTV:CAC > 3: Excellent (90-100 points)
- CAC payback < 15 months AND LTV:CAC > 2: Good (70-89 points)
- CAC payback < 18 months AND LTV:CAC > 1.5: Fair (50-69 points)
- Otherwise: Poor (0-49 points)
## Adjustment for Stage
If company_stage is "seed":
- Reduce growth expectations by 20%
- Reduce profitability expectations by 30%
- Increase runway importance by 10%
If company_stage is "series-b":
- Increase growth expectations by 20%
- Increase profitability expectations by 20%
## Output Format
Return a JSON object with:
- score (0-100)
- grade (A/B/C/D/F)
- key_risks (list of 2-4 risks)
- key_strengths (list of 2-4 strengths)
- reasoning (explain your score in 2-3 sentences)
- confidence (0-1)
## Important
- Be conservative: If data is ambiguous, lower the confidence.
- Be specific: Don't say "growth is good"—say "YoY growth is 145%, which is strong for a Series A."
- Flag edge cases: If a metric is unusual, note it.
Triggering Logic
Skills aren’t triggered manually. They’re triggered automatically based on events and conditions.
We use a simple trigger engine:
{
  "triggers": [
    {
      "id": "dd_financial_assessment",
      "skill": "financial-health-scorer",
      "event": "due_diligence_started",
      "condition": "company_stage in ['seed', 'series-a', 'series-b']",
      "inputs_from": {
        "annual_recurring_revenue": "company.arr",
        "gross_margin": "company.financials.gross_margin",
        "cash_balance": "company.financials.cash_balance",
        "monthly_burn": "company.financials.monthly_burn",
        "company_stage": "company.stage"
      },
      "on_success": "store_result_in_dd_memo",
      "on_failure": "escalate_to_human"
    }
  ]
}
When a due diligence is started, the orchestration engine:
- Checks if the company stage matches the condition
- Extracts the required inputs from the company data
- Invokes the skill
- Stores the result in the DD memo
- If the skill fails, escalates to a human analyst
This approach means skills are composable. A Tier 3 “Due Diligence Engine” skill can trigger 5 Tier 2 skills in sequence, each waiting for the previous one to complete.
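The trigger flow above can be sketched as a small function. This is a simplified stand-in, not Padiso's engine: the condition string is reduced to a stage allow-list, and the skill itself is stubbed:

```python
# Simplified trigger-engine sketch: evaluate the condition, resolve inputs
# from dotted paths in the company record, invoke the skill, and escalate
# on failure. Names mirror the trigger JSON; the skill is a stub.

def get_path(record: dict, path: str):
    """Resolve a dotted path like 'company.financials.monthly_burn'."""
    for key in path.split("."):
        record = record[key]
    return record

def run_trigger(trigger: dict, context: dict, skills: dict) -> dict:
    stage = get_path(context, "company.stage")
    if stage not in trigger["allowed_stages"]:
        return {"status": "skipped"}
    inputs = {name: get_path(context, path)
              for name, path in trigger["inputs_from"].items()}
    try:
        return {"status": "ok", "result": skills[trigger["skill"]](inputs)}
    except Exception as exc:  # on_failure: escalate_to_human
        return {"status": "escalated", "reason": str(exc)}

trigger = {
    "skill": "financial-health-scorer",
    "allowed_stages": ["seed", "series-a", "series-b"],
    "inputs_from": {"annual_recurring_revenue": "company.arr",
                    "company_stage": "company.stage"},
}
context = {"company": {"arr": 4_200_000, "stage": "series-a"}}
skills = {"financial-health-scorer": lambda inp: {"score": 72, "inputs": inp}}

out = run_trigger(trigger, context, skills)
skipped = run_trigger(trigger, {"company": {"arr": 0, "stage": "series-c+"}}, skills)
```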
Deployment Pipeline
Skills are versioned and deployed like production code:
dev branch → test against corpus → staging → production
Before a skill is deployed to production:
- Corpus testing: Run the skill against 10+ real examples from past engagements. Does it produce the expected output?
- Regression testing: Run the skill against examples that previously failed. Are we better?
- Edge case testing: Run against malformed inputs, missing data, extreme values. Does it fail gracefully?
- Peer review: Another operator reviews the skill logic. Are the heuristics sound?
- Staged rollout: Deploy to 1 client first. Monitor for issues. Then roll out to all clients.
When a skill is deployed, we increment the version number in skill.yaml and add an entry to changelog.md.
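The corpus-testing gate can be sketched as a small harness that replays saved cases from past engagements against a skill and reports mismatches. The file layout (one JSON case per file with `inputs` and `expected` keys) and the `runway_months` demo skill are assumptions for illustration, not Padiso's actual format.

```python
import json
import tempfile
from pathlib import Path

def run_corpus(skill, corpus_dir: str) -> list:
    """Return the names of corpus cases the skill fails."""
    failures = []
    for case_file in sorted(Path(corpus_dir).glob("*.json")):
        case = json.loads(case_file.read_text())
        if skill(case["inputs"]) != case["expected"]:
            failures.append(case_file.name)
    return failures

# Demo: a throwaway corpus of one case and a trivial deterministic "skill".
def runway_months(inputs):
    return round(inputs["cash_balance"] / inputs["monthly_burn"], 1)

with tempfile.TemporaryDirectory() as corpus:
    case = {"inputs": {"cash_balance": 1_800_000, "monthly_burn": 150_000},
            "expected": 12.0}
    Path(corpus, "acme_2023.json").write_text(json.dumps(case))
    print(run_corpus(runway_months, corpus))  # [] means every case passed
```

For LLM-backed skills the equality check would be replaced by a looser assertion (score within tolerance, required sections present), but the replay-the-corpus shape stays the same.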
Skill Composition and Orchestration
The real power of reusable skills is composition. A single agent request can orchestrate 10+ skills, each invoking sub-skills, to solve a complex problem.
For example, here’s how the “Due Diligence Engine” skill works:
User: "Run a complete DD on Acme Corp"
Orchestration:
1. Invoke "Financial Health Scorer"
└─ Input: Financial statements
└─ Output: Financial health score + narrative
2. Invoke "Tech Stack Assessor"
└─ Input: System inventory, architecture docs
└─ Output: Tech health score + risk assessment
3. Invoke "Cap Table Auditor"
└─ Input: Cap table spreadsheet
└─ Output: Validation results + issues
4. Invoke "Revenue Quality Scorer"
└─ Input: Customer data, revenue records
└─ Output: Revenue quality score + churn prediction
5. Invoke "Customer Health Analyzer"
└─ Input: Customer usage, NPS, support tickets
└─ Output: At-risk customers + expansion opportunities
6. Invoke "Evidence Synthesiser" (Tier 1 skill)
└─ Input: Results from all above skills
└─ Output: Cohesive DD memo (executive summary, detailed findings, recommendations)
Total time: 4 weeks (vs. 12 weeks if done manually)
Each skill runs independently. If one fails, the orchestration engine:
- Logs the failure
- Continues with the remaining skills (if possible)
- Flags the issue for human review
This resilience is crucial for production agentic AI. Real data is messy. Failures happen. But the orchestration shouldn’t collapse because one skill failed.
Because each skill is a distilled, self-contained task, it is independently valuable: you can use “Financial Health Scorer” in isolation or as part of a larger orchestration.
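The log-continue-flag behaviour described above can be sketched in a few lines. The skill callables and their outputs are illustrative stand-ins, not the real skill runtime:

```python
def run_pipeline(skills: dict, inputs: dict):
    """Run each skill independently; log failures and keep going."""
    results, failures = {}, []
    for name, skill in skills.items():
        try:
            results[name] = skill(inputs)
        except Exception as exc:
            # Real data is messy: one failed skill must not collapse the run.
            failures.append((name, str(exc)))  # flagged for human review
    return results, failures

def broken_auditor(inputs):
    raise ValueError("malformed cap table")

skills = {
    "financial-health-scorer": lambda inputs: {"score": 78, "confidence": 0.9},
    "cap-table-auditor": broken_auditor,
}
results, failures = run_pipeline(skills, {"company": "Acme Corp"})
print(results)   # the scorer's output survives the auditor's failure
print(failures)  # [('cap-table-auditor', 'malformed cap table')]
```

A Tier 3 skill like the Due Diligence Engine would add sequencing and pass earlier results into later skills, but the per-skill isolation shown here is what keeps the orchestration resilient.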
Real-World Outcomes: Skill Reuse Across Clients
Here’s where the rubber meets the road. Let’s look at three real examples of how skill reuse has delivered outcomes.
Example 1: Due Diligence Acceleration (4 weeks vs. 12 weeks)
Client: A Sydney-based PE firm running a 10-company roll-up in the logistics space.
Challenge: They needed to DD 10 companies in 12 weeks. Manual DD would take 12 weeks per company (120 weeks total). Impossible.
Solution: We deployed our Due Diligence Engine skill, which orchestrates 5 Tier 2 skills. Each company took 4 weeks instead of 12.
Outcome:
- 10 companies DD’d in 12 weeks (vs. 120 weeks manually)
- 90% time saving
- Consistent DD quality across all 10 companies
- Identified 3 companies with critical tech debt, saving the PE firm from bad acquisitions
- Total engagement value: $500K+
Example 2: SOC 2 Readiness Audit (6 weeks vs. 16 weeks)
Client: A Series B SaaS company preparing for their first SOC 2 audit.
Challenge: They had 16 weeks until audit. Manual evidence drafting and control mapping would take 12–16 weeks, leaving no buffer.
Solution: We deployed our SOC 2 Readiness Audit skill, which orchestrates:
- Control Mapper (identifies relevant controls)
- Risk Assessor (finds gaps)
- Evidence Drafter (generates audit-ready narratives)
- Policy Generator (creates missing policies)
Outcome:
- Evidence drafted and reviewed in 6 weeks
- 10 weeks of buffer before audit
- Passed SOC 2 Type II audit on first attempt
- Identified and remediated 8 medium-risk gaps before audit
- Total engagement value: $180K
Example 3: ERP Migration Planning (4 weeks vs. 12 weeks)
Client: A mid-market Australian manufacturing company migrating from SAP to NetSuite.
Challenge: Migration planning usually takes 12 weeks. They had 8 weeks before go-live.
Solution: We deployed our ERP Migration Planner skill, which orchestrates:
- Data Mapper (generates field mappings)
- Dependency Analyzer (identifies all dependent systems)
- Cutover Planner (sequences the migration)
- Post-Migration Validator (generates test cases)
Outcome:
- Migration plan completed in 4 weeks
- Identified 12 dependent systems that would have been missed
- Cutover executed on schedule with 0 critical issues
- Post-migration data validation passed 99.8% of checks
- Total engagement value: $240K
These outcomes aren’t anomalies. They’re the norm when you have reusable skills. Skills are leverage. Once built and validated, they scale.
Building Your Own Skill Library
If you’re running a venture studio, a fractional CTO practice, or leading platform engineering at scale, you should be building your own skill library. Here’s how.
Step 1: Identify Your Repeating Problems
What problems do you solve repeatedly? What do your best operators know by heart?
For us, it’s:
- Due diligence (we’ve done 50+ DD engagements)
- SOC 2 audits (30+ clients)
- ERP migrations (15+ projects)
For you, it might be different. The point is: identify the 3–5 problem classes you solve repeatedly.
Step 2: Extract the Heuristics
Interview your best operators. What shortcuts do they use? What red flags do they watch for? What exceptions have they learned?
Write these down. This is domain knowledge. This is what makes your skills valuable.
Step 3: Encode as Skills
Take the heuristics and encode them as skills. Start with Tier 1 (foundational) skills. Then build Tier 2 (domain) skills on top.
For each skill:
- Write a skill.yaml file (the contract)
- Write a prompt.md file (the logic)
- Write test cases (the validation)
- Write documentation (the usage guide)
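A cheap way to enforce this contract is a sanity check that a skill package contains all four artefacts before it can be deployed. This is a hedged sketch: the `tests` directory name and `docs.md` filename are assumptions, since the article names the artefacts but not their exact paths.

```python
import tempfile
from pathlib import Path

# The four artefacts per skill; "tests" and "docs.md" are assumed names.
REQUIRED = ("skill.yaml", "prompt.md", "tests", "docs.md")

def missing_artefacts(skill_dir: str) -> list:
    """Return the required files/directories absent from a skill package."""
    root = Path(skill_dir)
    return [name for name in REQUIRED if not (root / name).exists()]

# Demo: a half-finished skill package with only the contract and the logic.
with tempfile.TemporaryDirectory() as skill_dir:
    Path(skill_dir, "skill.yaml").write_text("name: financial-health-scorer\nversion: 1.0\n")
    Path(skill_dir, "prompt.md").write_text("# Financial Health Scorer\n")
    print(missing_artefacts(skill_dir))  # ['tests', 'docs.md'] still to write
```

Wiring a check like this into the dev → staging → production pipeline means an incomplete skill fails fast, before corpus testing even starts.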
Step 4: Test Against Real Data
Run each skill against 10+ real examples from past engagements. Does it produce the expected output? Is the reasoning sound?
If the skill fails, debug it. Refine the heuristics. Test again.
Step 5: Version and Deploy
Once a skill is validated, version it (v1.0) and deploy it to production. Use it on your next engagement.
Monitor its performance. If it fails, log the failure. Refine the skill. Release v1.1.
Step 6: Compose into Tier 3 Skills
Once you have 8–10 Tier 2 skills, start composing them into Tier 3 orchestration skills.
For example, if you have:
- Financial Health Scorer
- Tech Stack Assessor
- Cap Table Auditor
- Revenue Quality Scorer
Compose them into a “Due Diligence Engine” skill that runs all four and synthesises the results.
Step 7: Measure and Iterate
For each engagement:
- How much time did skills save vs. manual work?
- How much higher quality is the output?
- Did any skills fail? Why?
Use this data to refine your skills and identify new skills to build.
Over time, you’ll build a library of 20–30 skills that compress your delivery timeline by 50–70%.
This is how we’ve built our practice at Padiso. We’re not consultants. We’re operators with leverage. Our skills are our leverage.
Next Steps
Building reusable agent skills is a journey, not a destination. Here’s what to do next:
If You’re Exploring Agentic AI
Understanding the difference between agentic AI and traditional automation is crucial before you build skills. Read our guide on agentic AI vs traditional automation to understand when skills are the right approach.
Also, learn from real agentic AI production failures so you don’t make the same mistakes. Skills are powerful, but they can fail in production. Know how to handle failures gracefully.
If You’re Building a Venture Studio
Skills are the playbook for venture studios. They let you co-build with founders faster. Read about how agentic AI makes automation accessible to non-technical people to understand how to extend your studio’s reach.
Also, explore AI agency methodology to understand how to structure your studio for skill reuse and rapid iteration.
If You’re Modernising Enterprise IT
Enterprise IT is being reborn around agentic AI. Read about the $2 trillion renaissance in enterprise IT to understand the opportunity.
Then, explore how agentic AI integrates with tools like Apache Superset to understand how skills can democratise data access.
If You’re Ready to Partner
If you’re a founder, operator, or engineering leader ready to build or modernise with agentic AI, we’re here to help. We’ve built 30+ reusable skills across DD, compliance, and platform engineering. We know how to compose them into delivery engines that ship in weeks instead of months.
We partner with:
- Founders and CEOs of seed-to-Series-B startups needing fractional CTO leadership and co-build support
- Operators at mid-market and enterprise companies modernising with agentic AI, workflow automation, and platform re-platforming
- Engineering and security leaders pursuing SOC 2 or ISO 27001 compliance via Vanta
- Non-technical founders and domain experts looking for a venture studio partner
- Private equity firms and portfolio companies running modernisation and roll-up projects
Visit PADISO to learn more about our services: CTO as a Service, AI & Agents Automation, AI Strategy & Readiness, Security Audit (SOC 2 / ISO 27001), Platform Design & Engineering, and Venture Studio & Co-Build.
If You’re Building Skills Yourself
Start small. Pick one repeating problem you solve. Extract the heuristics. Build a Tier 1 skill. Test it against real data. Deploy it. Measure the outcome.
Then, build the next skill. And the next.
Over time, you’ll have a library. That library is leverage. That leverage is how you scale from consulting to a product-like delivery model.
Reusable skills are the future of AI delivery. The teams that build them first will win.
Summary
Reusable agent skills are the bridge between one-off AI projects and scalable, repeatable AI delivery. They encode domain expertise in a way that AI agents can reliably invoke. They’re versioned, tested, and deployed like production code. They compose into larger orchestrations to solve complex problems.
At Padiso, we’ve built 30+ reusable skills across three core domains: due diligence, compliance, and platform engineering. These skills have saved our clients thousands of hours and millions of dollars. They’ve become the backbone of our venture studio, fractional CTO, and AI automation practices.
If you’re building agentic AI at scale, you need reusable skills. This guide shows you how we build them, how we structure them, and how we deploy them. Now it’s your turn to build your own library.
Start with one skill. Validate it. Then build the next. Your future self—and your clients—will thank you.