Plan Mode + Subagents: Architectural Reviews That Don't Need a Senior
Master architectural reviews using Plan Mode and subagents. Ship faster, reduce senior bottlenecks, and maintain code quality without expensive architects.
Table of Contents
- Why Architectural Reviews Fail Without Seniors
- What Plan Mode + Subagents Actually Does
- The Architecture of Plan Mode
- Building Your Subagent Reviewer Layer
- Planner Subagent: From Spec to Design
- Reviewer Subagent: Catching What Humans Miss
- Running the Full Cycle: Real Workflow
- Common Pitfalls and How to Avoid Them
- Measuring Quality Without Senior Review
- Next Steps: Getting Started Today
Why Architectural Reviews Fail Without Seniors
Most engineering teams face the same bottleneck: architectural decisions get reviewed by the one senior engineer who understands the codebase, the business constraints, and the technical debt. That person becomes a gate. Sprints slip. Junior engineers wait. Code ships without proper review, or review takes three weeks.
The core problem isn’t that seniors are lazy. It’s that architectural review requires holding multiple contexts simultaneously: the spec, the existing system, edge cases, security implications, scalability constraints, and team capability. One person doing this synchronously is inefficient and doesn’t scale.
When you try to skip architectural review entirely, you get:
- Fragmented designs that don’t talk to each other
- Rework cycles where code ships, breaks, then gets rewritten
- Security gaps that slip through because no one validated the auth layer
- Scalability surprises discovered in production
- Team confusion about why decisions were made
The solution isn’t to hire another senior. It’s to structure the review process so that it is parallelisable, asynchronous, and repeatable. That’s where Plan Mode and subagents come in.
What Plan Mode + Subagents Actually Does
Plan Mode is a structured thinking approach where an AI model breaks down a complex problem into discrete, reviewable steps before executing. Rather than jumping straight to code, the model says: “Here’s my plan. Here’s why. Here are the assumptions. Here are the risks.”
Subagents are specialised AI agents, each with a narrow, well-defined role. Instead of one agent doing everything, you deploy a Planner subagent to draft architecture, then hand off to a Reviewer subagent to challenge it.
Together, they create a pattern that works:
- Planner subagent takes the feature spec and produces a detailed design document: data models, API contracts, deployment topology, dependency list, risk surface.
- Reviewer subagent (a separate instance with a different system prompt) reads that design and actively tries to break it: missing edge cases, security holes, scalability problems, team capability gaps.
- Iteration loop happens asynchronously. The Planner refines based on Reviewer feedback. No senior in the loop.
- Ship gate happens when Reviewer signs off, or when the design reaches a confidence threshold.
The key insight: you’re not replacing the senior. You’re automating the structure of review so that seniors can focus on high-stakes decisions, not every architectural choice.
This pattern has been validated in production by teams using Plan Mode with Model Steering and subagent-driven development workflows. The result: 40–60% faster architectural sign-off, fewer rework cycles, and better junior engineer learning.
The Architecture of Plan Mode
Plan Mode works by forcing the model to think step-by-step before acting. When you invoke Plan Mode, the model:
- Reads the brief (feature spec, existing codebase, constraints)
- Generates a plan in structured format (numbered steps, dependencies, assumptions)
- Exposes the plan for human or agent review
- Executes the plan only after approval or iteration
For architectural review, this is powerful because it makes reasoning visible. You can see why the Planner chose a certain database, why they proposed this API shape, where they think risk lives.
Plan Structure for Architecture
A good architectural plan includes:
- Problem restatement: What are we actually solving?
- Constraints: Latency, throughput, team size, deployment environment, compliance requirements
- Data model: Tables, relationships, indexing strategy
- API surface: Endpoints, request/response shapes, error handling
- Integration points: Where does this touch existing systems?
- Deployment topology: Containers, orchestration, scaling strategy
- Failure modes: What breaks? How do we recover?
- Security assumptions: Auth, encryption, audit logging
- Team capability: Do we have the skills? Do we need training?
- Rollback strategy: How do we undo this if it fails?
When the Planner generates this explicitly, the Reviewer has something concrete to attack. They’re not guessing. They’re reading a plan and asking: “Is this right?”
This structured approach aligns with work on architectural design decisions in AI agent harnesses, which identifies subagent architecture and task decomposition as key dimensions of reliable agent systems.
Building Your Subagent Reviewer Layer
Subagents aren’t magic. They’re specialised instances of a language model, each with:
- A narrow role (e.g., “You are an architect reviewer for backend systems”)
- Specific instructions (e.g., “Find security gaps. Find scalability problems. Challenge assumptions.”)
- Context isolation (they don’t see the full codebase, only the design doc)
- Output format (structured feedback: issue category, severity, recommendation)
Setting Up the Planner Subagent
Your Planner subagent gets:
- The feature spec
- The existing codebase (or a summary of it)
- Constraints (latency budget, team size, deployment environment)
- A system prompt that says: “You are an expert backend architect. Your job is to produce a detailed, reviewable architecture design. Be specific. Include data models, API contracts, deployment topology, and risk assessment.”
The Planner outputs a design document in Markdown. It’s not code yet. It’s a blueprint.
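If you’re scripting this yourself rather than relying on built-in subagent support, a minimal sketch using the Anthropic Python SDK might look like the following. The helper name, model alias, and exact prompt wording are illustrative assumptions, not a fixed API:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# System prompt from the section above; tune the wording for your stack.
PLANNER_SYSTEM = (
    "You are an expert backend architect. Your job is to produce a detailed, "
    "reviewable architecture design. Be specific. Include data models, API "
    "contracts, deployment topology, and risk assessment. Output Markdown. "
    "Do not write code."
)

def run_planner(spec: str, codebase_summary: str, constraints: str) -> str:
    """Hypothetical helper: ask the Planner subagent for a design document."""
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder: use whichever model you run
        max_tokens=4096,
        system=PLANNER_SYSTEM,
        messages=[{
            "role": "user",
            "content": (
                f"Feature spec:\n{spec}\n\n"
                f"Codebase summary:\n{codebase_summary}\n\n"
                f"Constraints:\n{constraints}\n\n"
                "Produce the design document."
            ),
        }],
    )
    return response.content[0].text
```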
Setting Up the Reviewer Subagent
Your Reviewer subagent gets:
- The same spec
- The design document from the Planner
- A system prompt that says: “You are a critical architect. Your job is to find flaws. Look for: missing edge cases, security gaps, scalability problems, team capability mismatches, deployment risks. Be specific. Propose fixes.”
The Reviewer outputs structured feedback. For each issue found, they note:
- Category (security, scalability, design, team capability, etc.)
- Severity (blocker, high, medium, low)
- Description (what’s the problem?)
- Recommendation (how to fix it?)
This pattern is detailed in the Architect Reviewer Subagent documentation, which shows how to structure a subagent specifically for catching architectural flaws.
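Continuing the same sketch (same assumptions as the Planner helper above), the Reviewer receives only the spec and the design document — context isolation in practice — and is told to emit JSON so the feedback is machine-checkable:

```python
import json

import anthropic

client = anthropic.Anthropic()

REVIEWER_SYSTEM = (
    "You are a critical architect. Your job is to find flaws. Look for: "
    "missing edge cases, security gaps, scalability problems, team capability "
    "mismatches, deployment risks. Be specific. Propose fixes. Respond ONLY "
    "with a JSON array of objects with keys: category, severity, issue, "
    "recommendation. Severity is one of: blocker, high, medium, low."
)

def run_reviewer(spec: str, design_doc: str) -> list[dict]:
    """Hypothetical helper: get structured feedback on a design document."""
    response = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model alias
        max_tokens=2048,
        system=REVIEWER_SYSTEM,
        messages=[{
            "role": "user",
            "content": f"Feature spec:\n{spec}\n\nDesign document:\n{design_doc}",
        }],
    )
    return json.loads(response.content[0].text)  # assumes bare JSON came back
```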
Planner Subagent: From Spec to Design
The Planner subagent’s job is to take ambiguity and turn it into a concrete design. Here’s what a real workflow looks like.
Input: The Feature Spec
```
Feature: Multi-tenant reporting dashboard

Business requirement: Customers need to see usage metrics (API calls, errors, latency)
for their own tenant only. Data must be queryable in <2 seconds.
We expect 100+ tenants, 1M+ API calls per day across all tenants.

Constraints:
- We're a 4-person engineering team
- We run on AWS (ECS + RDS)
- Budget: no new services
- Security: SOC 2 audit in 6 months
```
Planner Output: The Design Document
The Planner generates something like:
```markdown
## Architecture: Multi-Tenant Reporting Dashboard

### Problem Restatement
Provide per-tenant visibility into API usage metrics (calls, errors, latency)
with <2s query latency, supporting 100+ tenants and 1M+ daily API calls.

### Data Model
**api_events** (fact table, partitioned by date)
- tenant_id (indexed)
- timestamp
- endpoint
- status_code
- latency_ms
- error_message (nullable)

**tenants** (dimension table)
- tenant_id (PK)
- name
- created_at

### Query Strategy
For "calls in last 24h":
SELECT COUNT(*) FROM api_events
WHERE tenant_id = ? AND timestamp > NOW() - INTERVAL 1 DAY

Index: (tenant_id, timestamp DESC) ensures <2s response for typical tenant volumes.

### API Surface
GET /v1/metrics/usage?period=24h
- Returns: {calls: 1234, errors: 12, p95_latency: 450}
- Auth: Bearer token (existing auth layer)
- Rate limit: 100 req/min per tenant

### Deployment
New service: metrics-api (single container, ECS)
- Reads from existing RDS (read replica if needed)
- Caches responses in Redis (5-min TTL)
- Deployed alongside existing services

### Risk Assessment
**High risk**: Slow queries if indexing is wrong. Mitigation: load test with 100 tenants.
**Medium risk**: Cache invalidation. Mitigation: TTL-based, no manual invalidation.
**Low risk**: Auth bypass. Mitigation: reuse existing auth middleware.

### Team Capability
This requires: SQL optimization (we have it), API design (we have it),
ECS deployment (we have it). No new skills needed.

### Rollback
metrics-api is a new, isolated service. If it fails, turn off the feature flag.
No impact on existing systems.
```
This is what the Planner produces. It’s detailed, reviewable, and specific enough to code from.
Reviewer Subagent: Catching What Humans Miss
The Reviewer subagent reads that design and systematically attacks it. Here’s what they might find.
Issue 1: Missing Tenant Isolation Check
Severity: Blocker (Security)
Problem: The API endpoint doesn’t validate that the requesting tenant can only see their own data. If a token is stolen, an attacker could enumerate other tenants’ data by changing the tenant_id parameter that feeds the WHERE clause.
Recommendation: Add a middleware check: extract tenant_id from the auth token, verify it matches the query parameter. Reject mismatches with 403.
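A minimal, framework-agnostic sketch of that fix (the token-decoding step is whatever your existing auth layer already provides):

```python
def enforce_tenant_isolation(token_tenant_id: str, requested_tenant_id: str) -> None:
    """Blocker fix: a caller may only query data for the tenant in their token.

    Call this in middleware before any metrics query runs; translate the
    exception into an HTTP 403 in your web framework of choice.
    """
    if requested_tenant_id != token_tenant_id:
        raise PermissionError(
            f"tenant mismatch: token is for {token_tenant_id!r}, "
            f"request asked for {requested_tenant_id!r}"
        )
```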
Issue 2: Scalability Cliff at 1M Calls/Day
Severity: High (Scalability)
Problem: The design assumes a single RDS instance can handle metric writes plus reporting queries. At 1M calls/day that’s only ~12 writes/sec, but the reporting side runs analytical scans over millions of raw rows on the same instance; those scans will contend with writes and blow the <2s budget.
Recommendation: Either (a) use a time-series database (TimescaleDB), (b) pre-aggregate metrics into hourly buckets, or (c) use a read replica specifically for reporting.
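Option (b) is usually the cheapest of the three because it needs no new infrastructure. A hedged sketch of the hourly rollup, assuming RDS is running Postgres and an api_events_hourly table exists (both assumptions; the raw table and columns come from the design doc):

```python
# Run this once per hour (cron, ECS scheduled task, etc.). Reporting queries
# then scan at most 24 hourly buckets per tenant instead of raw event rows.
HOURLY_ROLLUP_SQL = """
INSERT INTO api_events_hourly (tenant_id, bucket_start, calls, errors, p95_latency_ms)
SELECT
    tenant_id,
    date_trunc('hour', timestamp) AS bucket_start,
    COUNT(*) AS calls,
    COUNT(*) FILTER (WHERE status_code >= 500) AS errors,  -- error definition is an assumption
    percentile_cont(0.95) WITHIN GROUP (ORDER BY latency_ms) AS p95_latency_ms
FROM api_events
WHERE timestamp >= %(start)s AND timestamp < %(end)s
GROUP BY tenant_id, date_trunc('hour', timestamp)
"""
```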
Issue 3: No Audit Logging
Severity: Medium (Compliance)
Problem: SOC 2 audit in 6 months will require proof that metric queries are logged. Current design doesn’t log who queried what.
Recommendation: Log all metric API calls to CloudWatch: timestamp, tenant_id, user_id, query params, response time. Retain for 90 days.
Issue 4: Cache Invalidation Race
Severity: Medium (Correctness)
Problem: With a flat 5-minute TTL, a response cached at 11:59pm can still be served until 12:04am, so a user might see yesterday’s data presented as today’s.
Recommendation: Use time-based cache keys: metrics:tenant_id:date:hour. Invalidate explicitly at hour boundaries.
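This is a cheap fix: because the hour is part of the key, each boundary naturally starts a fresh cache entry and old keys simply age out via TTL. A sketch:

```python
from datetime import datetime, timezone

def metrics_cache_key(tenant_id: str, now: datetime | None = None) -> str:
    """Time-bucketed key (metrics:tenant_id:date:hour): a response cached at
    23:59 can never be served for the next day's bucket."""
    now = now or datetime.now(timezone.utc)
    return f"metrics:{tenant_id}:{now:%Y-%m-%d}:{now:%H}"
```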
This is what good review looks like. The Reviewer isn’t saying “looks good.” They’re finding real problems and proposing solutions.
Running the Full Cycle: Real Workflow
Here’s how this works in practice, end-to-end.
Day 1: Planner Runs
You invoke the Planner subagent:
```
Feature spec: Multi-tenant reporting dashboard
Existing codebase: [link to repo]
Constraints: 4-person team, AWS, SOC 2 in 6 months
```
The Planner takes 2–5 minutes and produces a design document. You post it in Slack. The team skims it.
Day 1 Evening: Reviewer Runs
You invoke the Reviewer subagent with the design document. They spend 5–10 minutes finding issues. They output:
```json
[
  {
    "category": "security",
    "severity": "blocker",
    "issue": "Missing tenant isolation check",
    "recommendation": "Add middleware to validate tenant_id matches auth token"
  },
  {
    "category": "scalability",
    "severity": "high",
    "issue": "RDS bottleneck at 1M calls/day",
    "recommendation": "Use TimescaleDB or pre-aggregate to hourly buckets"
  },
  ...
]
```
Day 2: Planner Refines
The Planner reads the Reviewer’s feedback and updates the design:
- Adds tenant isolation check to the API surface
- Proposes hourly pre-aggregation to reduce write load
- Adds audit logging section
- Updates cache strategy to use time-based keys
You post the updated design in Slack.
Day 2 Evening: Reviewer Re-checks
The Reviewer looks at the updated design. They find:
- The tenant isolation check is good
- Hourly pre-aggregation works, but they note a new issue: what happens if a metric write is delayed? (Mitigation: use a 5-minute buffer before aggregating)
- Audit logging looks good
They mark the design as approved with minor notes.
Day 3: Code Starts
A junior engineer takes the approved design and starts coding. They have a clear blueprint. They don’t guess. They don’t make architectural decisions on the fly. They implement.
The whole cycle took 2 days, asynchronously, with zero senior involvement.
Common Pitfalls and How to Avoid Them
Pitfall 1: Planner Generates Code, Not Design
If your Planner subagent starts writing SQL or Python, you’ve lost the review step. The Reviewer can’t effectively critique code. They need a design document.
Fix: Explicitly instruct the Planner: “Output a design document in Markdown. Do not write code. Do not write SQL. Write data models, API contracts, deployment topology, and risk assessment.”
Pitfall 2: Reviewer Becomes a Rubber Stamp
If the Reviewer just says “looks good,” they’re not doing their job. Review should be adversarial.
Fix: Instruct the Reviewer: “Your job is to find problems. Be critical. Assume the Planner made mistakes. Look for: security gaps, scalability bottlenecks, missing edge cases, team capability mismatches.”
Pitfall 3: No Iteration Loop
If the Planner generates once and you’re done, you’re missing the point. The Planner needs to see Reviewer feedback and refine.
Fix: Build a loop: Planner → Reviewer → Planner (if feedback) → Reviewer (if changes) → Ship. Make this asynchronous and tracked in your project management tool.
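If you scripted the Planner and Reviewer as sketched earlier in this post, the loop itself is a few lines. A sketch, assuming the hypothetical run_planner and run_reviewer helpers from those snippets and capping refinement at three rounds:

```python
import json

def plan_review_cycle(spec: str, codebase_summary: str, constraints: str,
                      max_rounds: int = 3) -> tuple[str, list[dict]]:
    """Planner -> Reviewer -> Planner until no blocking issues remain."""
    design = run_planner(spec, codebase_summary, constraints)
    for _ in range(max_rounds):
        feedback = run_reviewer(spec, design)
        blocking = [f for f in feedback if f["severity"] in ("blocker", "high")]
        if not blocking:
            return design, feedback  # ship gate: Reviewer has signed off
        design = run_planner(
            spec,
            codebase_summary,
            constraints + "\n\nReviewer feedback to address:\n"
            + json.dumps(blocking, indent=2),
        )
    raise RuntimeError("Design didn't converge; escalate to a human architect.")
```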
Pitfall 4: Losing Context
If the Reviewer doesn’t have access to the existing codebase, they’ll miss integration points and suggest designs that clash with reality.
Fix: Give the Reviewer context: the existing codebase (or a summary), the team’s tech stack, deployment constraints, and any architectural patterns you’ve already established.
Pitfall 5: No Approval Gate
If designs ship without explicit sign-off, you’ll get fragmented architectures.
Fix: Require the Reviewer to explicitly approve the design before coding starts. Use a simple checklist: security ✓, scalability ✓, team capability ✓, deployment ✓. No check = no code.
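The checklist can be enforced mechanically from the Reviewer’s structured output. A sketch, assuming the JSON feedback format used throughout this post:

```python
REQUIRED_CHECKS = ("security", "scalability", "team capability", "deployment")

def design_approved(feedback: list[dict]) -> bool:
    """No check = no code: fail the gate if any required category still has
    a blocker or high-severity issue open."""
    failing = {f["category"] for f in feedback
               if f["severity"] in ("blocker", "high")}
    return not failing.intersection(REQUIRED_CHECKS)
```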
Measuring Quality Without Senior Review
You can’t just assume this works. You need metrics.
Metric 1: Time to Architectural Sign-Off
Before Plan Mode + Subagents: How long did it take from spec to approved design? (Probably 1–2 weeks, waiting for a senior.)
After: Track the cycle time. You should see 2–5 days, asynchronously.
Metric 2: Rework Cycles
How many times does a feature get redesigned after coding starts? Before, this was common (a senior finds a flaw mid-implementation). After, it should drop by 70%+.
Metric 3: Security Issues Found in Review vs. Production
Before: Security issues slipped through to production. After: The Reviewer should catch most of them. Track: issues found in review vs. issues found in production. The ratio should shift dramatically.
Metric 4: Team Satisfaction
Ask junior engineers: “Do you feel confident in your architectural decisions?” Before, the answer was often “I’m guessing.” After, it should be “I have a clear design to implement.”
Metric 5: Feature Ship Time
From spec to production, how long does it take? Plan Mode + Subagents should reduce this by 20–40% because you’re not waiting for seniors and you’re not reworking mid-implementation.
Next Steps: Getting Started Today
You don’t need a fancy platform to start. You can begin with Claude Code and basic prompt engineering.
Step 1: Define Your Planner Prompt
Write a system prompt for your Planner subagent. Include:
- Role: “You are a backend architect.”
- Task: “Given a feature spec, produce a detailed design document.”
- Format: “Output Markdown with sections: Problem, Data Model, API Surface, Deployment, Risk Assessment, Team Capability, Rollback.”
- Constraints: “Be specific. No code. No hand-waving.”
Step 2: Define Your Reviewer Prompt
Write a system prompt for your Reviewer subagent. Include:
- Role: “You are a critical architect.”
- Task: “Review this design. Find flaws.”
- Format: “Output JSON with issue category, severity, description, and recommendation.”
- Tone: “Be adversarial. Assume the Planner made mistakes.”
Refer to best practices for AI agents, subagents, skills & MCP for guidance on designing effective subagent prompts and context isolation.
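Putting Steps 1 and 2 together, one option is to keep both prompts as version-controlled constants so the team can iterate on them (see Step 4). The wording below just restates the bullets above; treat it as a first draft, not a canonical prompt:

```python
# Draft subagent prompts, kept in version control so the team can refine them.
PLANNER_PROMPT = """You are a backend architect.
Given a feature spec, produce a detailed design document.
Output Markdown with sections: Problem, Data Model, API Surface,
Deployment, Risk Assessment, Team Capability, Rollback.
Be specific. No code. No hand-waving."""

REVIEWER_PROMPT = """You are a critical architect.
Review this design. Find flaws. Be adversarial.
Assume the Planner made mistakes.
Output a JSON array of objects with keys:
category, severity, issue, recommendation."""
```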
Step 3: Test with One Feature
Pick a mid-sized feature (not too small, not too big). Run it through the Planner → Reviewer cycle. Measure:
- Time to sign-off
- Number of issues found
- Quality of the final design
- Junior engineer confidence
Step 4: Iterate on Prompts
After the first feature, refine your prompts. Did the Reviewer miss anything? Did the Planner over-specify? Adjust and try again.
Step 5: Integrate into Your Workflow
Once you’re confident, make this standard: every feature spec goes through Planner → Reviewer before coding starts. Track it in your project management tool (Jira, Linear, whatever you use).
For additional guidance on implementing Plan Mode in your development workflow, see Plan Mode in Claude Code and Create Custom Subagents for detailed technical documentation.
Why This Matters for Your Team
Architectural review isn’t a luxury. It’s how you prevent 6-month rework cycles and security breaches. But it doesn’t have to be a senior bottleneck.
Plan Mode + Subagents automates the structure of review. You’re not replacing human judgment. You’re making review asynchronous, repeatable, and scalable.
The result:
- Faster ship cycles: 2–5 days to architectural sign-off, not 2 weeks
- Better junior engineers: They learn by seeing detailed, reviewed designs
- Fewer surprises: Issues caught in review, not production
- Senior engineers freed up: They focus on high-stakes decisions, not every design
- Consistent quality: Same review process, same rigor, every time
This is how modern engineering teams scale without hiring a team of architects. It’s how you ship fast and maintain quality.
Implementation at PADISO
At PADISO, we’ve built this pattern into our AI & Agents Automation and Platform Design & Engineering services. We use Plan Mode + Subagents to help Sydney-based startups and enterprises ship architecturally sound systems without the senior bottleneck.
If you’re a founder or engineering leader looking to scale your architectural review process, we can help you implement this pattern. Whether you need CTO as a Service guidance, fractional CTO support, or a full venture studio co-build, we apply these principles to ship faster and maintain quality.
We also help teams pursuing SOC 2 compliance and ISO 27001 compliance by ensuring architectural decisions are security-first from day one. Our Security Audit and Vanta implementation services build on solid architectural foundations.
Ready to eliminate your architectural review bottleneck? Get in touch.