Building an MCP Server for Vanta: Compliance Evidence as Agent Tools
Learn how to build an MCP server that exposes Vanta as compliance tools for Claude agents. Automate evidence collection, classification, and policy drafts without manual clicking.
Table of Contents
- Why MCP Servers Matter for Compliance
- Understanding Vanta and the MCP Architecture
- Prerequisites and Setup
- Building Your Vanta MCP Server
- Exposing Vanta Tools to Claude Agents
- Evidence Collection and Classification Workflows
- Automating Policy Updates and Audit Drafts
- Real-World Production Patterns
- Scaling and Monitoring Your MCP Server
- Next Steps and Implementation Roadmap
Why MCP Servers Matter for Compliance
Compliance audits—particularly SOC 2 and ISO 27001—are time-consuming evidence hunts. Security teams spend weeks clicking through Vanta dashboards, exporting CSVs, cross-referencing findings, and manually drafting policy updates. Each audit cycle repeats the same manual work: collect evidence, classify it by control, update policies, gather attestations, rinse and repeat.
The Model Context Protocol (MCP) changes this. By exposing Vanta as a set of tools that Claude agents can call directly, you eliminate the clicking. Instead of a human navigating screens, an agent queries your compliance stack, retrieves evidence, classifies findings against control frameworks, and drafts policy updates in minutes—not weeks.
At PADISO, our Security Audit team has built exactly this pattern for clients pursuing SOC 2 and ISO 27001 audit-readiness via Vanta. The result: 60–70% reduction in manual audit prep work, faster evidence gathering, and fewer human errors in control mapping. This guide walks you through the architecture, implementation, and production patterns we’ve validated across 50+ compliance engagements.
The core insight is simple: compliance evidence is just data. If you expose that data as tools to an AI agent, you unlock automation at scale. Let’s build it.
Understanding Vanta and the MCP Architecture
What Is Vanta?
Vanta is a compliance and security operations platform that continuously monitors your infrastructure, collects evidence, and maps findings to control frameworks (SOC 2, ISO 27001, HIPAA, etc.). Instead of manually gathering audit evidence, Vanta automates the collection: it connects to your cloud providers, identity systems, code repositories, and security tools, then ingests logs, configuration data, and test results.
Vanta’s API exposes this evidence programmatically. You can query:
- Compliance tests (pass/fail status for specific controls)
- Security findings (misconfigurations, policy violations, vulnerabilities)
- Framework mappings (which evidence maps to which control)
- Audit readiness (overall compliance posture across frameworks)
The problem is that accessing this data requires clicking through Vanta’s UI or writing custom API clients. An MCP server bridges that gap.
What Is the Model Context Protocol (MCP)?
The Model Context Protocol is a standard for connecting AI models (like Claude) to external tools and data sources. Think of it as a standardised way to say: “Here are the tools this agent can call, here’s how to call them, and here’s what they return.”
An MCP server is a process that:
- Exposes tools (functions the agent can invoke)
- Handles requests (receives tool calls from the agent)
- Returns results (sends tool outputs back to the agent)
The official Introduction to Model Context Protocol and Anthropic’s MCP documentation detail the core architecture. The protocol is transport-agnostic (stdio, HTTP, WebSocket) and language-agnostic (Python, Node.js, Rust, etc.).
For compliance, this means: build an MCP server that wraps Vanta’s API, expose tools like query_compliance_tests, fetch_security_findings, and map_evidence_to_controls, then connect Claude to that server. Claude becomes your compliance agent.
How Vanta MCP Fits In
Vanta has already published their own MCP server in public preview. The official GitHub repository contains a production-ready implementation. However, the canonical Vanta MCP exposes Vanta’s tools in their raw form—useful for general compliance queries, but not optimised for the specific workflows we’ll describe: evidence classification, policy drafting, and audit-ready output generation.
This guide shows you how to extend or wrap Vanta’s MCP server (or build your own from scratch) to add domain-specific tools that turn raw compliance data into audit-ready artefacts. You’ll learn the patterns we use at PADISO to turn compliance evidence into actionable automation.
Prerequisites and Setup
Required Tools and Accounts
Before you start coding, ensure you have:
- Vanta Account with API access enabled
  - Sign up at Vanta and generate an API key from your workspace settings
  - Ensure your Vanta workspace is connected to your infrastructure (AWS, Azure, GitHub, identity provider, etc.)
  - Verify that compliance tests are running and evidence is being collected
- Claude Access (Claude 3.5 Sonnet or later)
  - API key from Anthropic
  - Or use Claude Desktop with MCP server support (recommended for local development)
- Development Environment
  - Python 3.9+ or Node.js 18+
  - Git and a code editor (VS Code recommended)
  - Docker (optional, but useful for containerising your MCP server)
- Compliance Framework Knowledge
  - Familiarity with SOC 2 Type II or ISO 27001 control structures
  - Understanding of your organisation’s control mapping (which Vanta findings map to which controls)
Installation and Environment Setup
Start by cloning the Vanta MCP server repository:
git clone https://github.com/VantaInc/vanta-mcp-server.git
cd vanta-mcp-server
If you’re using Python, install dependencies:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
For Node.js:
npm install
Set up your environment variables:
cp .env.example .env
# Edit .env and add your Vanta API key
echo "VANTA_API_KEY=your_api_key_here" >> .env
echo "VANTA_WORKSPACE_ID=your_workspace_id" >> .env
Test the connection to Vanta:
python -m vanta_mcp_server.test_connection
# Or for Node.js:
node test-connection.js
You should see a successful response listing your Vanta workspace’s compliance frameworks and test statuses. If the connection fails, verify your API key and workspace ID in the Vanta dashboard.
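If your clone doesn’t include a connection test, a minimal one is easy to write yourself. The sketch below assumes the same Bearer-token authentication used throughout this guide; the `/frameworks` endpoint path is illustrative, so substitute any read-only endpoint your Vanta API version exposes:

```python
# check_connection.py: a minimal stand-alone connection check.
# NOTE: the /frameworks endpoint path is illustrative; substitute any
# read-only endpoint your Vanta API version exposes.
import os
import requests

def auth_headers() -> dict:
    """Build the Bearer-token headers Vanta expects, from environment variables."""
    return {
        "Authorization": f"Bearer {os.environ['VANTA_API_KEY']}",
        "Content-Type": "application/json",
    }

def check_connection(base_url: str = "https://api.vanta.com/v1") -> bool:
    """Return True if the credentials can reach the API, False otherwise."""
    try:
        response = requests.get(
            f"{base_url}/frameworks",
            headers=auth_headers(),
            params={"workspace_id": os.environ["VANTA_WORKSPACE_ID"]},
            timeout=10,
        )
        return response.ok
    except requests.RequestException:
        return False
```

Export your credentials, then run `python -c "from check_connection import check_connection; print(check_connection())"` to verify connectivity.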
Connecting to Claude Desktop
Once your MCP server is running locally, configure Claude Desktop to connect to it. Follow the official Vanta MCP connection guide for step-by-step instructions on OAuth setup and manual configuration.
Edit your Claude Desktop configuration file (usually ~/.config/Claude/claude_desktop_config.json on Linux/macOS or %APPDATA%\Claude\claude_desktop_config.json on Windows):
{
  "mcpServers": {
    "vanta": {
      "command": "python",
      "args": ["-m", "vanta_mcp_server"],
      "env": {
        "VANTA_API_KEY": "your_api_key",
        "VANTA_WORKSPACE_ID": "your_workspace_id"
      }
    }
  }
}
Restart Claude Desktop. You should now see a hammer icon (tools) in the Claude interface, indicating that the Vanta MCP server is connected and ready to receive tool calls.
Building Your Vanta MCP Server
Architecture Overview
Your MCP server sits between Claude and Vanta’s API. The flow is:
- Claude calls a tool (e.g., “Fetch all failed SOC 2 compliance tests”)
- MCP Server receives the tool call, translates it into a Vanta API request
- Vanta API returns compliance data (tests, findings, evidence)
- MCP Server processes and formats the response
- Claude receives the data and uses it in its reasoning
For compliance, we recommend structuring your MCP server with these core modules:
- vanta_client.py – Wrapper around Vanta’s REST API
- compliance_tools.py – Tool definitions for Claude (query tests, fetch findings, etc.)
- evidence_classifier.py – Logic to classify evidence by control
- policy_generator.py – Templates and logic to draft policy updates
- mcp_server.py – Main MCP server that exposes tools
Implementing the Vanta API Client
Create a wrapper around Vanta’s API to simplify data fetching:
# vanta_client.py
import os
import requests
from typing import Dict, List, Any

class VantaClient:
    def __init__(self):
        self.api_key = os.getenv("VANTA_API_KEY")
        self.workspace_id = os.getenv("VANTA_WORKSPACE_ID")
        self.base_url = "https://api.vanta.com/v1"
        self.headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json"
        }

    def get_compliance_tests(self, framework: str = "SOC_2") -> List[Dict[str, Any]]:
        """Fetch all compliance tests for a given framework."""
        endpoint = f"{self.base_url}/compliance-tests"
        params = {"framework": framework, "workspace_id": self.workspace_id}
        response = requests.get(endpoint, headers=self.headers, params=params)
        response.raise_for_status()
        return response.json().get("tests", [])

    def get_failed_tests(self, framework: str = "SOC_2") -> List[Dict[str, Any]]:
        """Fetch only failed compliance tests."""
        tests = self.get_compliance_tests(framework)
        return [t for t in tests if t.get("status") == "FAILED"]

    def get_security_findings(self) -> List[Dict[str, Any]]:
        """Fetch all security findings (misconfigurations, vulnerabilities)."""
        endpoint = f"{self.base_url}/security-findings"
        params = {"workspace_id": self.workspace_id}
        response = requests.get(endpoint, headers=self.headers, params=params)
        response.raise_for_status()
        return response.json().get("findings", [])

    def get_control_evidence(self, control_id: str) -> List[Dict[str, Any]]:
        """Fetch evidence mapped to a specific control."""
        endpoint = f"{self.base_url}/controls/{control_id}/evidence"
        params = {"workspace_id": self.workspace_id}
        response = requests.get(endpoint, headers=self.headers, params=params)
        response.raise_for_status()
        return response.json().get("evidence", [])

    def get_audit_readiness(self) -> Dict[str, Any]:
        """Fetch overall audit readiness across all frameworks."""
        endpoint = f"{self.base_url}/audit-readiness"
        params = {"workspace_id": self.workspace_id}
        response = requests.get(endpoint, headers=self.headers, params=params)
        response.raise_for_status()
        return response.json()
This client abstracts Vanta’s API, making it easy to call methods like get_failed_tests() or get_security_findings() without worrying about HTTP details.
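Before wiring the client into an MCP server, you can sanity-check its filtering logic offline. This self-contained sketch restates the `get_failed_tests` filter against sample payloads (the payload shape is an assumption based on the client above):

```python
from typing import Any, Dict, List

def filter_failed(tests: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Mirrors the status filter inside get_failed_tests."""
    return [t for t in tests if t.get("status") == "FAILED"]

# Sample payloads shaped like the client's return value (illustrative only)
sample_tests = [
    {"id": "test-1", "status": "PASSED", "control_id": "CC6.1"},
    {"id": "test-2", "status": "FAILED", "control_id": "CC6.1"},
    {"id": "test-3", "status": "FAILED", "control_id": "CC7.2"},
]
failed = filter_failed(sample_tests)
print([t["id"] for t in failed])  # → ['test-2', 'test-3']
```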
Defining MCP Tools
Next, define the tools Claude will call. Each tool is a function that the MCP server exposes:
# compliance_tools.py
from typing import Dict, Any, List
from vanta_client import VantaClient

class ComplianceTools:
    def __init__(self):
        self.vanta = VantaClient()

    def query_compliance_tests(self, framework: str = "SOC_2", status: str = None) -> Dict[str, Any]:
        """
        Query compliance tests for a given framework.

        Args:
            framework: 'SOC_2', 'ISO_27001', 'HIPAA', etc.
            status: 'PASSED', 'FAILED', or None for all

        Returns:
            Dictionary with test results and metadata
        """
        tests = self.vanta.get_compliance_tests(framework)
        if status:
            tests = [t for t in tests if t.get("status") == status]
        return {
            "framework": framework,
            "total_tests": len(tests),
            "tests": tests
        }

    def fetch_security_findings(self, severity: str = None) -> Dict[str, Any]:
        """
        Fetch security findings, optionally filtered by severity.

        Args:
            severity: 'CRITICAL', 'HIGH', 'MEDIUM', 'LOW', or None for all

        Returns:
            Dictionary with findings and metadata
        """
        findings = self.vanta.get_security_findings()
        if severity:
            findings = [f for f in findings if f.get("severity") == severity]
        return {
            "total_findings": len(findings),
            "severity_filter": severity,
            "findings": findings
        }

    def get_evidence_for_control(self, control_id: str) -> Dict[str, Any]:
        """
        Retrieve all evidence mapped to a specific control.

        Args:
            control_id: e.g., 'CC6.1' for SOC 2

        Returns:
            Dictionary with evidence items and their status
        """
        evidence = self.vanta.get_control_evidence(control_id)
        return {
            "control_id": control_id,
            "evidence_count": len(evidence),
            "evidence": evidence
        }

    def check_audit_readiness(self) -> Dict[str, Any]:
        """
        Get overall audit readiness across all frameworks.

        Returns:
            Dictionary with readiness percentage and framework breakdowns
        """
        return self.vanta.get_audit_readiness()
Each tool method corresponds to a capability Claude can invoke. The method signature defines the parameters Claude can pass, and the return value is what Claude receives.
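Because ComplianceTools instantiates VantaClient directly, unit testing is easiest with a small variant that takes the client as a constructor argument. The sketch below shows the pattern; FakeVantaClient and its canned payloads are illustrative, not part of Vanta’s API:

```python
from typing import Any, Dict, List

class FakeVantaClient:
    """Returns canned data instead of calling the real Vanta API."""
    def get_compliance_tests(self, framework: str) -> List[Dict[str, Any]]:
        return [
            {"id": "t1", "status": "PASSED"},
            {"id": "t2", "status": "FAILED"},
        ]

class ComplianceToolsForTest:
    """Same shape as ComplianceTools above, but with the client injected."""
    def __init__(self, vanta):
        self.vanta = vanta

    def query_compliance_tests(self, framework: str = "SOC_2", status: str = None) -> Dict[str, Any]:
        tests = self.vanta.get_compliance_tests(framework)
        if status:
            tests = [t for t in tests if t.get("status") == status]
        return {"framework": framework, "total_tests": len(tests), "tests": tests}

tools = ComplianceToolsForTest(FakeVantaClient())
result = tools.query_compliance_tests(status="FAILED")
print(result["total_tests"])  # → 1
```

Injecting the client this way keeps the tool logic testable without network access; in production you pass the real VantaClient instead.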
Building the MCP Server
Now, create the MCP server that exposes these tools:
# mcp_server.py
import json
from typing import Any, Dict, List
from mcp.types import Tool, TextContent
from mcp.server import Server
from compliance_tools import ComplianceTools

app = Server("vanta-compliance-mcp")
tools_handler = ComplianceTools()
# Define tools that Claude can call
tools = [
    Tool(
        name="query_compliance_tests",
        description="Query compliance tests for a given framework (SOC 2, ISO 27001, etc.). Returns test status, control mappings, and evidence.",
        inputSchema={
            "type": "object",
            "properties": {
                "framework": {
                    "type": "string",
                    "enum": ["SOC_2", "ISO_27001", "HIPAA"],
                    "description": "Compliance framework to query"
                },
                "status": {
                    "type": "string",
                    "enum": ["PASSED", "FAILED"],
                    "description": "Filter by test status (optional; omit for all)"
                }
            },
            "required": ["framework"]
        }
    ),
    Tool(
        name="fetch_security_findings",
        description="Fetch security findings (misconfigurations, vulnerabilities). Optionally filter by severity.",
        inputSchema={
            "type": "object",
            "properties": {
                "severity": {
                    "type": "string",
                    "enum": ["CRITICAL", "HIGH", "MEDIUM", "LOW"],
                    "description": "Filter by severity (optional)"
                }
            }
        }
    ),
    Tool(
        name="get_evidence_for_control",
        description="Retrieve all evidence mapped to a specific control (e.g., 'CC6.1' for SOC 2).",
        inputSchema={
            "type": "object",
            "properties": {
                "control_id": {
                    "type": "string",
                    "description": "Control ID (e.g., 'CC6.1', 'A.5.1.1')"
                }
            },
            "required": ["control_id"]
        }
    ),
    Tool(
        name="check_audit_readiness",
        description="Get overall audit readiness across all frameworks.",
        inputSchema={"type": "object", "properties": {}}
    )
]
@app.list_tools()
async def list_tools() -> List[Tool]:
    """Advertise the available tools to the connected client."""
    return tools

@app.call_tool()
async def call_tool(name: str, arguments: Dict[str, Any]) -> List[TextContent]:
    """Handle tool calls from Claude."""
    if name == "query_compliance_tests":
        result = tools_handler.query_compliance_tests(
            framework=arguments.get("framework", "SOC_2"),
            status=arguments.get("status")
        )
    elif name == "fetch_security_findings":
        result = tools_handler.fetch_security_findings(
            severity=arguments.get("severity")
        )
    elif name == "get_evidence_for_control":
        result = tools_handler.get_evidence_for_control(
            control_id=arguments["control_id"]
        )
    elif name == "check_audit_readiness":
        result = tools_handler.check_audit_readiness()
    else:
        return [TextContent(type="text", text=f"Unknown tool: {name}")]
    return [TextContent(type="text", text=json.dumps(result, indent=2))]
if __name__ == "__main__":
    import asyncio
    from mcp.server.stdio import stdio_server

    async def main():
        async with stdio_server() as (read_stream, write_stream):
            await app.run(read_stream, write_stream, app.create_initialization_options())

    asyncio.run(main())
This MCP server defines four tools and routes tool calls from Claude to the appropriate handler methods. When Claude calls a tool, the call_tool() function executes the corresponding method and returns the result.
Exposing Vanta Tools to Claude Agents
Connecting Claude to Your MCP Server
Once your MCP server is running, connect Claude to it. You have two options:
Option 1: Claude Desktop (Local Development)
Edit your Claude Desktop configuration (as described in Prerequisites) and restart the app. Claude will automatically discover your MCP server’s tools.
Option 2: Claude API with MCP
For production use, deploy your MCP server as a standalone process and connect to it from your Claude API client using the MCP Python SDK:
# claude_agent.py
import asyncio
import anthropic
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the MCP server as a subprocess, speaking MCP over stdio
    server_params = StdioServerParameters(
        command="python",
        args=["-m", "mcp_server"],
    )
    async with stdio_client(server_params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()

            # Discover the tools the MCP server exposes
            tools_result = await session.list_tools()

            # Convert MCP tool definitions into the Anthropic tools format
            tools = [
                {
                    "name": tool.name,
                    "description": tool.description,
                    "input_schema": tool.inputSchema,
                }
                for tool in tools_result.tools
            ]

            # Initialize Claude client and pass the MCP tools along
            client = anthropic.Anthropic()
            message = client.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=4096,
                tools=tools,  # Pass MCP tools to Claude
                messages=[
                    {
                        "role": "user",
                        "content": "What are our failed SOC 2 compliance tests?"
                    }
                ],
            )
            print(message.content)

if __name__ == "__main__":
    asyncio.run(main())
This approach lets Claude call your Vanta MCP tools via the API, enabling automated compliance workflows at scale.
Agentic Workflows: Agents Using Your Tools
The real power emerges when you build agentic workflows—where Claude autonomously calls your compliance tools to accomplish multi-step tasks. For example, an agent that:
- Queries all failed SOC 2 tests
- Fetches evidence for each failed control
- Classifies findings by risk
- Drafts policy updates
- Generates an audit-ready report
Understanding agentic AI patterns is crucial here. We’ve documented the differences between agentic AI and traditional automation in our guide on Agentic AI vs Traditional Automation: Which AI Strategy Actually Delivers ROI for Your Startup, which explores when autonomous agents deliver measurable returns versus simpler automation.
Here’s a simple agentic loop:
# compliance_agent.py
import anthropic
import json

def run_compliance_agent(user_query: str):
    """
    Run an autonomous compliance agent that uses MCP tools to answer questions.
    """
    client = anthropic.Anthropic()

    # Define your MCP tools (in production, these come from your MCP server)
    tools = [
        {
            "name": "query_compliance_tests",
            "description": "Query compliance tests for a given framework",
            "input_schema": {
                "type": "object",
                "properties": {
                    "framework": {"type": "string"},
                    "status": {"type": "string"}
                },
                "required": ["framework"]
            }
        },
        # ... other tools ...
    ]

    messages = [
        {"role": "user", "content": user_query}
    ]

    # Agentic loop: Claude calls tools, we execute them, Claude sees results
    while True:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )

        # Check if Claude wants to call a tool
        if response.stop_reason == "tool_use":
            # Extract tool calls
            tool_calls = [block for block in response.content if block.type == "tool_use"]

            # Add Claude's response to message history
            messages.append({"role": "assistant", "content": response.content})

            # Execute tools and collect results
            tool_results = []
            for tool_call in tool_calls:
                # In production, route to your MCP server here
                result = execute_tool(tool_call.name, tool_call.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": tool_call.id,
                    "content": json.dumps(result)
                })

            # Add tool results to message history
            messages.append({"role": "user", "content": tool_results})
        else:
            # Claude has finished (stop_reason == "end_turn")
            # Extract final text response
            final_response = "".join(
                block.text for block in response.content if hasattr(block, "text")
            )
            return final_response

def execute_tool(tool_name: str, tool_input: dict):
    """
    Execute a tool. In production, this calls your MCP server.
    """
    # Placeholder: route to your MCP server
    pass

# Example usage
query = "What failed SOC 2 tests do we have, and what evidence is missing?"
response = run_compliance_agent(query)
print(response)
This loop is the foundation of agentic compliance automation. Claude decides which tools to call, in what order, based on the user’s query. No human intervention needed.
Evidence Collection and Classification Workflows
Automating Evidence Discovery
One of the highest-value use cases for your Vanta MCP server is automating evidence discovery. Instead of manually searching Vanta for evidence of a specific control, an agent can:
- Query all tests for a control
- Fetch evidence from connected sources (AWS CloudTrail, GitHub audit logs, identity provider logs)
- Classify evidence by type (logs, configs, policies, attestations)
- Flag evidence gaps
Create a tool that orchestrates this:
# evidence_classifier.py
from typing import Dict, List, Any
from vanta_client import VantaClient

class EvidenceClassifier:
    def __init__(self):
        self.vanta = VantaClient()
        # Keywords are lower-case to match the lower-cased descriptions below.
        # Order matters: more specific categories come first, since "policy
        # document" would otherwise be caught by the broader "policy" keyword
        # under configs.
        self.control_categories = {
            "logs": ["cloudtrail", "audit log", "event log"],
            "policies": ["policy document", "procedure", "control plan"],
            "attestations": ["certification", "attestation", "sign-off"],
            "configs": ["configuration", "policy", "setting"]
        }

    def classify_evidence(self, control_id: str) -> Dict[str, Any]:
        """
        Retrieve evidence for a control and classify it by type.

        Returns:
            Dictionary with evidence grouped by category and gaps identified
        """
        evidence = self.vanta.get_control_evidence(control_id)
        classified = {
            "control_id": control_id,
            "total_evidence": len(evidence),
            "by_category": {
                "logs": [],
                "configs": [],
                "attestations": [],
                "policies": [],
                "other": []
            },
            "gaps": []
        }

        for item in evidence:
            description = item.get("description", "").lower()
            categorised = False
            for category, keywords in self.control_categories.items():
                if any(keyword in description for keyword in keywords):
                    classified["by_category"][category].append(item)
                    categorised = True
                    break
            if not categorised:
                classified["by_category"]["other"].append(item)

        # Identify gaps
        if not classified["by_category"]["logs"]:
            classified["gaps"].append("No log evidence found")
        if not classified["by_category"]["policies"]:
            classified["gaps"].append("No policy documentation found")
        if not classified["by_category"]["attestations"]:
            classified["gaps"].append("No attestations or sign-offs found")

        return classified

    def generate_evidence_summary(self, control_ids: List[str]) -> Dict[str, Any]:
        """
        Generate a summary of evidence across multiple controls.
        Useful for audit readiness reports.
        """
        summary = {
            "controls_assessed": len(control_ids),
            "controls": {},
            "total_gaps": 0,
            "critical_gaps": []
        }
        for control_id in control_ids:
            classified = self.classify_evidence(control_id)
            summary["controls"][control_id] = classified
            gap_count = len(classified["gaps"])
            summary["total_gaps"] += gap_count
            if gap_count > 1:
                summary["critical_gaps"].append({
                    "control": control_id,
                    "gaps": classified["gaps"]
                })
        return summary
Add this as an MCP tool:
# In mcp_server.py, add to tools list:
Tool(
    name="classify_evidence_for_control",
    description="Classify evidence for a control by type (logs, configs, policies, attestations). Identifies gaps.",
    inputSchema={
        "type": "object",
        "properties": {
            "control_id": {"type": "string"}
        },
        "required": ["control_id"]
    }
)
Now you can ask Claude to “Classify evidence for control CC6.1” and receive a structured breakdown of what evidence exists and what’s missing.
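To see the bucketing logic in isolation, here is a condensed, self-contained version of the classifier run on sample evidence (the evidence descriptions are illustrative, and keywords are lower-cased to match the lower-cased descriptions):

```python
# Keyword buckets mirroring the classifier above (lower-case on both sides)
CATEGORIES = {
    "logs": ["cloudtrail", "audit log", "event log"],
    "configs": ["configuration", "setting"],
    "attestations": ["certification", "attestation", "sign-off"],
}

def classify(items):
    """Bucket evidence items by the first category whose keyword matches."""
    buckets = {name: [] for name in CATEGORIES}
    buckets["other"] = []
    for item in items:
        desc = item.get("description", "").lower()
        for category, keywords in CATEGORIES.items():
            if any(k in desc for k in keywords):
                buckets[category].append(item)
                break
        else:
            buckets["other"].append(item)
    return buckets

# Illustrative evidence items
sample = [
    {"description": "CloudTrail audit log export for us-east-1"},
    {"description": "Quarterly access review sign-off"},
    {"description": "Penetration test report"},
]
buckets = classify(sample)
print({k: len(v) for k, v in buckets.items()})  # → {'logs': 1, 'configs': 0, 'attestations': 1, 'other': 1}
```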
Building Evidence Gap Reports
With evidence classification, you can automatically generate gap reports:
# gap_report.py
from datetime import datetime
from typing import Dict, Any
from vanta_client import VantaClient
from evidence_classifier import EvidenceClassifier

def generate_gap_report(framework: str = "SOC_2") -> Dict[str, Any]:
    """
    Generate a report of evidence gaps across all controls in a framework.
    """
    vanta = VantaClient()
    classifier = EvidenceClassifier()
    tests = vanta.get_compliance_tests(framework)

    report = {
        "framework": framework,
        "generated_at": datetime.now().isoformat(),
        "controls_with_gaps": [],
        "total_gaps": 0,
        "priority_actions": []
    }

    for test in tests:
        control_id = test.get("control_id")
        if not control_id:
            continue
        classified = classifier.classify_evidence(control_id)
        if classified["gaps"]:
            report["controls_with_gaps"].append(classified)
            report["total_gaps"] += len(classified["gaps"])

    # Prioritise by number of gaps
    report["controls_with_gaps"].sort(
        key=lambda x: len(x["gaps"]), reverse=True
    )

    # Top 5 controls to fix
    report["priority_actions"] = [
        f"Collect evidence for {c['control_id']}: {', '.join(c['gaps'][:2])}"
        for c in report["controls_with_gaps"][:5]
    ]

    return report
This generates a prioritised list of evidence gaps, telling your team exactly where to focus effort.
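The prioritisation step can be verified on its own against sample gap data (the control IDs and gap strings are illustrative):

```python
# Sample classified output shaped like the gap report's intermediate data
controls_with_gaps = [
    {"control_id": "CC6.1", "gaps": ["No log evidence found"]},
    {"control_id": "CC7.2", "gaps": ["No policy documentation found",
                                     "No attestations or sign-offs found"]},
]

# Sort so the controls with the most gaps come first
controls_with_gaps.sort(key=lambda c: len(c["gaps"]), reverse=True)

# Turn the top entries into actionable to-do lines
priority_actions = [
    f"Collect evidence for {c['control_id']}: {', '.join(c['gaps'][:2])}"
    for c in controls_with_gaps[:5]
]
print(priority_actions[0])
```

CC7.2 surfaces first because it has two gaps, which is exactly the triage order an audit prep team wants.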
Automating Policy Updates and Audit Drafts
Generating Policy Drafts from Evidence
Once evidence is classified, the next step is automating policy updates. Many controls require documented policies (e.g., “Data Classification Policy”, “Access Control Policy”). Instead of writing from scratch, your agent can:
- Query the control requirement
- Fetch existing policies from your documentation system
- Review evidence of current practices
- Generate an updated policy draft
Create a policy generator:
# policy_generator.py
from typing import Dict, Any
from datetime import datetime

class PolicyGenerator:
    def __init__(self):
        self.policy_templates = {
            "CC6.1": {
                "title": "Data Classification and Handling Policy",
                "sections": [
                    "1. Purpose",
                    "2. Scope",
                    "3. Data Classification Levels",
                    "4. Handling Requirements",
                    "5. Review and Updates"
                ]
            },
            "CC7.2": {
                "title": "User Access Management Policy",
                "sections": [
                    "1. Purpose",
                    "2. Scope",
                    "3. Access Request Process",
                    "4. Access Review and Revocation",
                    "5. Privileged Access Management"
                ]
            }
            # ... more templates ...
        }

    def generate_policy_draft(self, control_id: str, evidence: Dict[str, Any]) -> str:
        """
        Generate a policy draft for a control based on evidence.
        """
        template = self.policy_templates.get(control_id, {})
        if not template:
            return f"No template found for control {control_id}"

        policy = f"""# {template['title']}

**Control ID:** {control_id}
**Last Updated:** {datetime.now().strftime('%Y-%m-%d')}
**Status:** Draft (Generated by Compliance Agent)

## Sections
"""
        for section in template.get("sections", []):
            policy += f"\n### {section}\n[Content to be filled based on evidence]\n"

        # Add evidence summary
        policy += "\n## Evidence\n\nThis policy is supported by the following evidence:\n"
        for category, items in evidence.get("by_category", {}).items():
            if items:
                policy += f"\n### {category.title()}\n"
                for item in items[:3]:  # Show first 3 items per category
                    policy += f"- {item.get('description', 'Evidence item')}\n"

        policy += "\n## Review Checklist\n"
        policy += "- [ ] Review policy content for accuracy\n"
        policy += "- [ ] Verify evidence supports all sections\n"
        policy += "- [ ] Obtain management approval\n"
        policy += "- [ ] Distribute to relevant teams\n"

        return policy

    def generate_audit_response(self, control_id: str, test_status: str, evidence: Dict[str, Any]) -> str:
        """
        Generate an audit response for a control.
        """
        response = f"""## Audit Response: {control_id}

**Test Status:** {test_status}
**Response Date:** {datetime.now().strftime('%Y-%m-%d')}

### Control Objective
This control addresses [control requirement].

### Current Implementation
We have implemented the following measures:
"""
        # Add evidence summary
        for category, items in evidence.get("by_category", {}).items():
            if items:
                response += f"\n**{category.title()}:**\n"
                for item in items[:2]:
                    response += f"- {item.get('description', 'Evidence item')}\n"

        response += """
### Conclusion
Based on the evidence provided, we are in compliance with this control.

### Supporting Documentation
The following documents are available for auditor review:
- [Link to policy document]
- [Link to evidence]
- [Link to test results]
"""
        return response
Add these as MCP tools:
# In mcp_server.py
Tool(
    name="generate_policy_draft",
    description="Generate a policy draft for a control based on classified evidence.",
    inputSchema={
        "type": "object",
        "properties": {
            "control_id": {"type": "string"},
            "evidence": {"type": "object"}
        },
        "required": ["control_id", "evidence"]
    }
),
Tool(
    name="generate_audit_response",
    description="Generate an audit response document for a control.",
    inputSchema={
        "type": "object",
        "properties": {
            "control_id": {"type": "string"},
            "test_status": {"type": "string"},
            "evidence": {"type": "object"}
        },
        "required": ["control_id", "test_status", "evidence"]
    }
)
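To see the template-fill idea end to end without the full class, here is a condensed, self-contained run of the draft assembly on sample data (the template and evidence values are illustrative):

```python
from datetime import datetime

# Illustrative template and classified evidence (shapes mirror PolicyGenerator)
template = {
    "title": "Data Classification and Handling Policy",
    "sections": ["1. Purpose", "2. Scope", "3. Data Classification Levels"],
}
evidence = {"by_category": {"logs": [{"description": "CloudTrail export"}]}}

# Assemble the markdown draft: header, section stubs, then evidence summary
draft = f"# {template['title']}\n**Last Updated:** {datetime.now():%Y-%m-%d}\n"
for section in template["sections"]:
    draft += f"\n### {section}\n[Content to be filled based on evidence]\n"
for category, items in evidence["by_category"].items():
    if items:
        draft += f"\n### {category.title()}\n"
        for item in items:
            draft += f"- {item['description']}\n"

print(draft.splitlines()[0])  # → # Data Classification and Handling Policy
```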
Multi-Step Audit Preparation Workflow
Now build an agent that orchestrates the entire audit prep workflow:
# audit_prep_agent.py
import anthropic
import json

def run_audit_prep_agent(framework: str = "SOC_2"):
    """
    Run an autonomous agent that prepares audit documentation.
    """
    client = anthropic.Anthropic()

    # Your MCP tools (in production, from your MCP server)
    tools = [
        # ... all compliance tools ...
    ]

    system_prompt = f"""You are a SOC 2 compliance automation agent. Your job is to:
1. Query failed compliance tests for {framework}
2. For each failed control, classify evidence and identify gaps
3. Generate policy drafts for controls lacking documentation
4. Generate audit response documents
5. Produce a final audit readiness report

Be thorough and precise. Ensure all evidence is properly classified and documented."""

    user_query = f"Prepare audit documentation for {framework}. Start by checking our audit readiness, then focus on failed controls."

    messages = [
        {"role": "user", "content": user_query}
    ]

    # Run agentic loop
    iteration = 0
    max_iterations = 20
    while iteration < max_iterations:
        iteration += 1
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=4096,
            system=system_prompt,
            tools=tools,
            messages=messages
        )

        if response.stop_reason == "tool_use":
            # Extract and execute tool calls
            tool_calls = [block for block in response.content if block.type == "tool_use"]
            messages.append({"role": "assistant", "content": response.content})

            tool_results = []
            for tool_call in tool_calls:
                result = execute_tool(tool_call.name, tool_call.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": tool_call.id,
                    "content": json.dumps(result, default=str)[:4000]  # Limit response size
                })
            messages.append({"role": "user", "content": tool_results})
        else:
            # Agent has completed
            final_response = "".join(
                block.text for block in response.content if hasattr(block, "text")
            )
            return {
                "status": "completed",
                "iterations": iteration,
                "response": final_response
            }

    return {
        "status": "max_iterations_reached",
        "iterations": iteration
    }

# Run it
result = run_audit_prep_agent()
print(json.dumps(result, indent=2))
This agent autonomously prepares audit documentation, eliminating weeks of manual work. It’s a concrete example of how agentic AI delivers operational value—something we’ve explored in detail in our analysis of Agentic AI Production Horror Stories (And What We Learned), which covers real failures and remediation patterns from production deployments.
Real-World Production Patterns
Error Handling and Resilience
Production compliance agents need robust error handling. Vanta API calls can fail, timeouts occur, and evidence data can be incomplete. Implement defensive patterns:
# resilience.py
import time
from functools import wraps
from typing import Callable, Any

def retry_with_backoff(max_retries: int = 3, backoff_factor: float = 2.0):
    """
    Decorator to retry a function with exponential backoff.
    """
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs) -> Any:
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_retries - 1:
                        raise
                    wait_time = backoff_factor ** attempt
                    print(f"Attempt {attempt + 1} failed. Retrying in {wait_time}s...")
                    time.sleep(wait_time)
        return wrapper
    return decorator

class SafeVantaClient:
    def __init__(self, vanta_client):
        self.vanta = vanta_client

    @retry_with_backoff(max_retries=3)
    def get_compliance_tests(self, framework: str):
        return self.vanta.get_compliance_tests(framework)

    def get_compliance_tests_safe(self, framework: str):
        """
        Get tests with fallback if Vanta API fails.
        """
        try:
            return self.get_compliance_tests(framework)
        except Exception as e:
            print(f"Failed to fetch tests: {e}")
            return {
                "framework": framework,
                "tests": [],
                "error": str(e),
                "fallback": True
            }
Use the safe client in your MCP server to prevent agent failures from cascading.
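A quick way to convince yourself the decorator behaves as intended is to wrap a deliberately flaky function. The decorator is restated here so the snippet runs on its own, with the backoff factor shrunk so the demo finishes fast:

```python
import time
from functools import wraps
from typing import Any, Callable

def retry_with_backoff(max_retries: int = 3, backoff_factor: float = 2.0):
    """Same retry decorator as above, restated for a self-contained demo."""
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs) -> Any:
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(backoff_factor ** attempt)
        return wrapper
    return decorator

attempts = {"count": 0}

@retry_with_backoff(max_retries=3, backoff_factor=0.01)
def flaky_fetch():
    """Fails twice with a simulated transient error, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient Vanta API error")
    return {"tests": ["t1"]}

result = flaky_fetch()
print(result, attempts["count"])  # → {'tests': ['t1']} 3
```

The function is invoked three times in total: two transient failures absorbed by the decorator, then a success.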
Logging and Audit Trails
Compliance agents must be auditable. Log every action:
# audit_logging.py
import logging
import json
from datetime import datetime
from typing import Any

class ComplianceAuditLogger:
    def __init__(self, log_file: str = "compliance_audit.log"):
        self.logger = logging.getLogger("compliance_agent")
        handler = logging.FileHandler(log_file)
        formatter = logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        )
        handler.setFormatter(formatter)
        self.logger.addHandler(handler)
        self.logger.setLevel(logging.INFO)

    def log_tool_call(self, tool_name: str, tool_input: dict, result: Any):
        """Log a tool call for audit trail."""
        self.logger.info(json.dumps({
            "event": "tool_call",
            "tool": tool_name,
            "input": tool_input,
            "result_type": type(result).__name__,
            "timestamp": datetime.now().isoformat()
        }))

    def log_agent_decision(self, query: str, decision: str):
        """Log agent reasoning."""
        self.logger.info(json.dumps({
            "event": "agent_decision",
            "query": query,
            "decision": decision,
            "timestamp": datetime.now().isoformat()
        }))
Integrate this into your MCP server’s call_tool() function to create an immutable record of every action.
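One convenient integration point is a decorator around each tool handler, so every call is logged without touching handler bodies. A standalone sketch (the handler name and in-memory `audit_log` list are illustrative; in production you would write through `ComplianceAuditLogger` instead):

```python
# Sketch: decorate tool handlers so every call lands in the audit trail.
# audit_log is an in-memory stand-in for the file-backed logger above.
import json
from datetime import datetime, timezone
from functools import wraps

audit_log: list[str] = []

def audited(tool_name: str):
    def decorator(func):
        @wraps(func)
        def wrapper(**tool_input):
            result = func(**tool_input)
            audit_log.append(json.dumps({
                "event": "tool_call",
                "tool": tool_name,
                "input": tool_input,
                "result_type": type(result).__name__,
                "timestamp": datetime.now(timezone.utc).isoformat(),
            }))
            return result
        return wrapper
    return decorator

@audited("query_compliance_tests")
def query_compliance_tests(framework: str) -> dict:
    return {"framework": framework, "tests": []}  # hypothetical handler body

query_compliance_tests(framework="soc2")
print(json.loads(audit_log[0])["tool"])  # query_compliance_tests
```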
Rate Limiting and Cost Control
Vanta API calls and Claude API calls have costs. Implement rate limiting:
# rate_limiter.py
from collections import deque
from time import sleep, time

class RateLimiter:
    def __init__(self, calls_per_minute: int = 60):
        self.calls_per_minute = calls_per_minute
        self.call_times = deque()

    def is_allowed(self) -> bool:
        now = time()
        # Remove calls older than 1 minute
        while self.call_times and self.call_times[0] < now - 60:
            self.call_times.popleft()
        if len(self.call_times) < self.calls_per_minute:
            self.call_times.append(now)
            return True
        return False

    def wait_if_needed(self):
        if not self.is_allowed():
            oldest_call = self.call_times[0]
            wait_time = 60 - (time() - oldest_call)
            if wait_time > 0:
                sleep(wait_time)
            self.call_times.popleft()
            self.call_times.append(time())
Use this in your MCP server to prevent hitting API rate limits.
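The sliding-window behaviour is easy to verify deterministically: with a limit of two calls per minute, the third check within the window is refused. The limiter is reproduced here in compact form so the snippet runs standalone:

```python
# Sketch: verifying the sliding-window limiter's admit/refuse behaviour.
from collections import deque
from time import time

class RateLimiter:
    def __init__(self, calls_per_minute: int = 60):
        self.calls_per_minute = calls_per_minute
        self.call_times = deque()

    def is_allowed(self) -> bool:
        now = time()
        # Drop timestamps that have aged out of the 60-second window
        while self.call_times and self.call_times[0] < now - 60:
            self.call_times.popleft()
        if len(self.call_times) < self.calls_per_minute:
            self.call_times.append(now)
            return True
        return False

limiter = RateLimiter(calls_per_minute=2)
print(limiter.is_allowed())  # True
print(limiter.is_allowed())  # True
print(limiter.is_allowed())  # False: window is full, caller should back off
```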
Monitoring and Alerting
Set up monitoring for your compliance agent:
# monitoring.py
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class AgentMetrics:
    total_calls: int = 0
    successful_calls: int = 0
    failed_calls: int = 0
    total_tokens_used: int = 0
    total_cost: float = 0.0
    start_time: Optional[float] = None

    def record_call(self, success: bool, tokens_used: int = 0):
        self.total_calls += 1
        if success:
            self.successful_calls += 1
        else:
            self.failed_calls += 1
        self.total_tokens_used += tokens_used
        # Estimate cost (adjust based on Claude pricing)
        self.total_cost += (tokens_used / 1000) * 0.003

    def get_summary(self) -> dict:
        elapsed = time.time() - self.start_time if self.start_time else 0
        return {
            "total_calls": self.total_calls,
            "success_rate": self.successful_calls / max(self.total_calls, 1),
            "failed_calls": self.failed_calls,
            "total_tokens": self.total_tokens_used,
            "estimated_cost": f"${self.total_cost:.2f}",
            "elapsed_seconds": elapsed
        }
Track these metrics to understand agent performance and costs.
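To make the arithmetic concrete, here is what the metrics math produces for a short hypothetical run of three calls (the token counts and the $0.003-per-1K rate are illustrative; substitute your actual model pricing):

```python
# Sketch: the metrics math above, applied to a hypothetical three-call run.
tokens = [1200, 800, 0]         # tokens used per call (hypothetical values)
outcomes = [True, True, False]  # success/failure per call

total_tokens = sum(tokens)
success_rate = sum(outcomes) / len(outcomes)
estimated_cost = (total_tokens / 1000) * 0.003  # same rate as record_call above

print(total_tokens)              # 2000
print(round(success_rate, 2))    # 0.67
print(f"${estimated_cost:.3f}")  # $0.006
```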
Scaling and Monitoring Your MCP Server
Containerisation and Deployment
For production, containerise your MCP server:
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENV VANTA_API_KEY=""
ENV VANTA_WORKSPACE_ID=""
CMD ["python", "-m", "mcp_server"]
Build and run:
docker build -t vanta-mcp-server .
docker run -e VANTA_API_KEY="your_key" -e VANTA_WORKSPACE_ID="your_id" vanta-mcp-server
Horizontal Scaling
For high-volume compliance operations, run multiple MCP server instances behind a load balancer:
# docker-compose.yml
version: '3.8'
services:
  mcp-server-1:
    build: .
    environment:
      - VANTA_API_KEY=${VANTA_API_KEY}
      - VANTA_WORKSPACE_ID=${VANTA_WORKSPACE_ID}
      - INSTANCE_ID=1
  mcp-server-2:
    build: .
    environment:
      - VANTA_API_KEY=${VANTA_API_KEY}
      - VANTA_WORKSPACE_ID=${VANTA_WORKSPACE_ID}
      - INSTANCE_ID=2
  nginx:
    image: nginx:latest
    ports:
      - "8000:8000"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
    depends_on:
      - mcp-server-1
      - mcp-server-2
This allows you to scale compliance automation horizontally as demand grows.
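The compose file mounts an `nginx.conf` that the document does not show. A minimal round-robin sketch is below; it assumes each MCP server instance exposes an HTTP/SSE transport on port 8000 inside the compose network (a stdio-based MCP server would need a different fronting strategy, so treat this as an illustrative assumption, not a drop-in config):

```nginx
# nginx.conf (sketch): round-robin load balancing across the two instances.
events {}
http {
    upstream mcp_servers {
        server mcp-server-1:8000;
        server mcp-server-2:8000;
    }
    server {
        listen 8000;
        location / {
            proxy_pass http://mcp_servers;
            proxy_set_header Host $host;
        }
    }
}
```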
Performance Optimisation
Cache Vanta API responses to reduce latency:
# cache.py
import time

class CachedVantaClient:
    def __init__(self, vanta_client, cache_ttl: int = 300):
        self.vanta = vanta_client
        self.cache_ttl = cache_ttl
        self.cache = {}
        self.cache_times = {}

    def _get_cached(self, key: str):
        if key in self.cache:
            if time.time() - self.cache_times[key] < self.cache_ttl:
                return self.cache[key]
            else:
                del self.cache[key]
                del self.cache_times[key]
        return None

    def get_compliance_tests(self, framework: str):
        key = f"tests_{framework}"
        cached = self._get_cached(key)
        if cached is not None:
            return cached
        result = self.vanta.get_compliance_tests(framework)
        self.cache[key] = result
        self.cache_times[key] = time.time()
        return result
Caching reduces API calls and improves agent response times.
Next Steps and Implementation Roadmap
Phase 1: Foundation (Weeks 1–2)
- Set up Vanta API access and confirm data connectivity
- Deploy the basic Vanta MCP server (use the official GitHub repository)
- Connect Claude Desktop to your MCP server
- Test basic tool calls: query_compliance_tests, fetch_security_findings
- Document your API credentials and MCP server setup
Phase 2: Compliance-Specific Tools (Weeks 3–4)
- Implement EvidenceClassifier to categorise evidence by type
- Add the classify_evidence_for_control tool to your MCP server
- Build evidence gap reports
- Test with a single control (e.g., CC6.1 for SOC 2)
- Validate that evidence classification matches your audit expectations
Phase 3: Automation and Policy Generation (Weeks 5–6)
- Implement PolicyGenerator with templates for your top 10 controls
- Add the generate_policy_draft and generate_audit_response tools
- Build the multi-step audit prep agent
- Run the agent on a test framework (ISO 27001 or SOC 2)
- Review generated policies and audit responses with your security team
Phase 4: Production Hardening (Weeks 7–8)
- Add error handling, retry logic, and rate limiting
- Implement audit logging and monitoring
- Containerise your MCP server
- Deploy to your infrastructure (AWS, GCP, Azure, or on-premise)
- Run load tests to validate performance
Phase 5: Scaling and Continuous Improvement (Ongoing)
- Monitor agent performance and costs
- Iterate on policy templates based on auditor feedback
- Expand to additional compliance frameworks (HIPAA, PCI-DSS, etc.)
- Build dashboards for compliance metrics and agent activity
- Establish SLAs for audit readiness (e.g., “70% of controls ready within 48 hours”)
Key Metrics to Track
- Audit Readiness Score: Percentage of controls with complete evidence
- Time to Evidence: How long to collect evidence for a control (target: <5 minutes via agent)
- Policy Currency: How recent are your control policies (target: <3 months old)
- Manual Effort Reduction: Hours saved per audit cycle (target: 60–70% reduction)
- Agent Success Rate: Percentage of autonomous tasks completed without human intervention (target: >90%)
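The Audit Readiness Score is straightforward to compute once evidence classification is in place. A sketch using hypothetical control data (the control IDs and the `evidence_complete` flag are illustrative; your classifier's output shape will differ):

```python
# Sketch: Audit Readiness Score = percentage of controls with complete evidence.
controls = {
    "CC6.1": {"evidence_complete": True},
    "CC6.2": {"evidence_complete": False},
    "CC7.1": {"evidence_complete": True},
    "CC7.2": {"evidence_complete": True},
}

ready = sum(1 for c in controls.values() if c["evidence_complete"])
readiness_score = 100 * ready / len(controls)
print(f"Audit Readiness Score: {readiness_score:.0f}%")  # Audit Readiness Score: 75%
```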
Common Pitfalls to Avoid
- Trusting LLM output blindly: LLMs can hallucinate, so always validate agent-generated policies with your security team before using them
- Ignoring rate limits: Implement rate limiting early to avoid costly API overages
- Insufficient logging: You need audit trails for compliance—log every tool call
- Assuming evidence is complete: Always check for gaps and flag them to the team
- Skipping error handling: Vanta API failures will happen; build resilience from day one
For deeper insights on agentic AI patterns and pitfalls, review our guide on Agentic AI Production Horror Stories, which documents real failures and remediation strategies from production systems.
Conclusion
Building an MCP server for Vanta transforms compliance from a manual, error-prone process into an automated workflow. By exposing Vanta’s compliance data as tools that Claude agents can call, you:
- Eliminate manual clicking: Agents query Vanta programmatically
- Automate evidence classification: Evidence is categorised by type and gaps are identified
- Generate audit-ready documentation: Policies and audit responses are drafted automatically
- Reduce audit prep time: 60–70% time savings per audit cycle
- Improve consistency: Agents follow the same process every time
- Scale compliance operations: One agent can handle multiple frameworks and controls
The patterns in this guide—evidence classification, policy generation, agentic workflows, error handling, and monitoring—are battle-tested across PADISO’s 50+ compliance engagements. Start with Phase 1, validate with your security team, and scale incrementally.
For support implementing these patterns, PADISO’s Security Audit team specialises in Vanta integration and compliance automation. We help startups and enterprises achieve SOC 2 and ISO 27001 audit-readiness via Vanta, with measurable reductions in manual work. If you’re pursuing compliance at scale, let’s talk about how agentic AI can accelerate your audit timeline.
Your compliance stack is just data. Expose it as tools, and watch automation do the rest.
Additional Resources
Official Documentation:
- Introduction to Model Context Protocol
- Anthropic’s MCP Documentation
- Vanta’s MCP Server Announcement
- Vanta MCP Connection Guide
PADISO Guides:
- Agentic AI vs Traditional Automation: Which Strategy Delivers ROI
- Agentic AI Production Horror Stories and Remediation Patterns
- AI Automation Agency Sydney: The Complete Guide