Claude Code Agency Workflow -- Social Animal

TL;DR

We run a headless web agency where Claude Code handles 60-70% of implementation work that used to need a full team. Our typical cost-per-MVP dropped from $35,000-$50,000 to $8,000-$15,000. Typical time-to-first-deploy went from 6-8 weeks to 10-18 days. But AI didn't replace everything--it replaced specific, well-scoped tasks. Here's what works, what doesn't, and what we still pay humans for.

Table of Contents

Why We Rebuilt Our Agency Around Claude Code
What Does a Claude Code Agency Workflow Actually Look Like?
What AI Handles in Our Projects
What We Still Hire Humans For
Real Numbers: Cost-Per-MVP and Time-to-Deploy
Our Claude Code Project Setup
Is the One-Person Billion-Dollar Company Real?
What Doesn't Work Yet
How We Scope Client Projects Now
The Founder Math: Hours Per Week Breakdown
FAQ

Why We Rebuilt Our Agency Around Claude Code

We didn't plan this. In late 2024, we were a 4-person headless dev shop billing $150/hour for Next.js and headless CMS work. By March 2025, after integrating Claude Code--specifically Claude 3.5 Sonnet initially, now Claude Sonnet 4--into every project, two of those roles had fundamentally changed. Not eliminated. Changed. One senior dev became a full-time AI-directed engineer. The other shifted entirely to code review and architecture.

The catalyst: a Sanity + Next.js 14 project where we used Claude Code to scaffold the entire schema layer, generate GROQ queries, build 14 page templates, and write the deployment pipeline. What would have been 120 billable hours came in at 34. We looked at each other and said: "We need to restructure everything."

That's the honest origin. Not a grand strategy. A project that finished too fast.

What Does a Claude Code Agency Workflow Actually Look Like?

Here's a typical week on an active client build:

Monday: Architecture + Kickoff

Me: 2 hours defining component architecture, data model, API contracts
Me: 1 hour writing CLAUDE.md project instructions (more on this below)
Claude Code: generates initial project scaffold, installs dependencies, configures TypeScript strict mode, sets up linting

Tuesday-Thursday: Build Sprint

Me: 1-2 hours per day reviewing Claude Code output, catching errors, redirecting
Claude Code: 6-8 tasks per day--page components, API routes, CMS schema definitions, utility functions, test files
Me: architecture pivots, complex state management decisions, client Slack threads

Friday: Integration + QA

Me: 3-4 hours of manual QA, accessibility audit, performance testing
Claude Code: fixing bugs identified in QA, writing missing tests, generating documentation
Me: client demo prep, deployment to staging

Total human hours per week on an active build: 18-24. Down from 35-45 in our pre-AI workflow.

What AI Handles in Our Projects

Here's the specific task inventory--things Claude Code does on real client projects every week:

Code Generation (70-80% automated)

React/Next.js components: Page layouts, UI components from Figma specs described in prompts, form handlers
CMS schemas: Sanity schema types, Contentful content models as migration scripts, Payload CMS collection configs
API routes: Next.js Route Handlers, tRPC procedures, webhook endpoints
Database operations: Prisma schema changes, migration files, seed scripts
TypeScript types: Generating types from API responses, Zod validation schemas, shared type packages
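To make the CMS-schema and type-generation items concrete, here is a minimal sketch of the kind of output we mean. Everything in it is hypothetical: the `post` document type, its fields, the `postsQuery` string, and the `isPost` guard are illustrative names, not code from a client project. The schema is written as a plain object rather than with Sanity's `defineType` helper so the snippet needs no imports, and the hand-written guard stands in for what would normally be a Zod schema.

```typescript
// Hypothetical Sanity document schema, written as a plain object.
// In a real studio you would likely wrap this in defineType() from "sanity".
const postSchema = {
  name: "post",
  title: "Post",
  type: "document",
  fields: [
    { name: "title", title: "Title", type: "string" },
    { name: "slug", title: "Slug", type: "slug", options: { source: "title" } },
    { name: "publishedAt", title: "Published at", type: "datetime" },
  ],
};

// GROQ query for the schema above; the projection flattens slug.current.
const postsQuery = `*[_type == "post" && defined(slug.current)]
  | order(publishedAt desc) { title, "slug": slug.current, publishedAt }`;

// Hand-written result type matching the projection in postsQuery.
interface Post {
  title: string;
  slug: string;
  publishedAt: string;
}

// Runtime guard for untyped query results (a stand-in for a Zod schema).
function isPost(value: unknown): value is Post {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.title === "string" &&
    typeof v.slug === "string" &&
    typeof v.publishedAt === "string"
  );
}
```

The point of the guard is the review step: generated types describe what the query should return, and a runtime check catches the cases where the CMS content disagrees.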

Code Audits (saves 4-6 hours/week)

Reviewing existing codebases before refactor projects
Identifying unused dependencies, dead code, type inconsistencies
Generating audit reports with specific file:line references

Content Drafts (saves 3-5 hours/week)

RFP responses and technical proposals
Project documentation and README files
Client-facing technical explanations
SOW first drafts (always human-reviewed and rewritten)

Testing (saves 5-8 hours/week)

Vitest unit tests for utility functions
Playwright e2e test scaffolds
Test data generation and fixtures
Edge case identification we might miss

What We Still Hire Humans For

Task	Why AI Can't Do It (Yet)	Who We Hire	Typical Cost
Brand strategy	Requires understanding client's market position, competitors, customer psychology at a level AI hallucinates on	Contract brand strategist	$3,000-$8,000/project
Copy direction	Tone, voice, and persuasion architecture need human judgment	Freelance copywriter	$2,000-$5,000/project
Sales calls	Clients want to talk to a person who understands their business	We do this ourselves	Our time
Visual design	Figma work, art direction, design systems	Contract designer	$4,000-$12,000/project
Complex DevOps	Kubernetes configs, multi-region deployments, CI/CD for regulated industries	Contract DevOps engineer	$150-$200/hour
Legal review	Contracts, MSAs, IP clauses	Attorney	$350-$500/hour
Accessibility audits	Automated tools catch 30-40% of issues; real screen reader testing needs a human	A11y specialist	$1,500-$3,000/audit
User research	Talking to actual users, synthesizing feedback	UX researcher	$100-$150/hour

That's 8 categories where humans are non-negotiable.

Real Numbers: Cost-Per-MVP and Time-to-Deploy

Here are actual numbers from our last 6 client projects (Q1-Q2 2025), anonymized:

Project	Stack	Legacy Estimate	AI-Assisted Actual	Time-to-Deploy
SaaS marketing site	Next.js 15 + Sanity v3	$38,000	$11,500	12 days
E-commerce storefront	Next.js 15 + Shopify Storefront API	$52,000	$18,200	18 days
Portfolio/CMS for creative agency	Astro 5 + Payload CMS 3.0	$28,000	$8,400	10 days
SaaS dashboard MVP	Next.js 15 + Supabase + Prisma	$45,000	$14,800	16 days
Nonprofit site redesign	Next.js 14 + Contentful	$32,000	$9,200	11 days
Developer docs site	Astro 5 + MDX + Algolia	$22,000	$7,600	8 days

"Legacy estimate" is what we would have quoted in 2023 with our old team structure. "AI-assisted actual" is what the client paid in 2025.

Average cost reduction: 68% ($217,000 in legacy estimates vs. $69,700 actual across the six projects). Average time-to-first-deploy: 12.5 days.
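If you want to check the summary figures against the table yourself, the arithmetic is short. The sketch below copies the six rows from the table; the `Project` shape and the `summarize` helper are names made up for this example. It computes the reduction on the totals (sum of actuals against sum of legacy estimates), which is one reasonable definition among several.

```typescript
interface Project {
  name: string;
  legacyEstimate: number; // USD, what we would have quoted in 2023
  actual: number; // USD, what the client paid in 2025
  daysToDeploy: number;
}

// Rows copied from the table above.
const projects: Project[] = [
  { name: "SaaS marketing site", legacyEstimate: 38000, actual: 11500, daysToDeploy: 12 },
  { name: "E-commerce storefront", legacyEstimate: 52000, actual: 18200, daysToDeploy: 18 },
  { name: "Portfolio/CMS for creative agency", legacyEstimate: 28000, actual: 8400, daysToDeploy: 10 },
  { name: "SaaS dashboard MVP", legacyEstimate: 45000, actual: 14800, daysToDeploy: 16 },
  { name: "Nonprofit site redesign", legacyEstimate: 32000, actual: 9200, daysToDeploy: 11 },
  { name: "Developer docs site", legacyEstimate: 22000, actual: 7600, daysToDeploy: 8 },
];

// Totals, percentage reduction on the totals, and mean days to first deploy.
function summarize(rows: Project[]) {
  const legacyTotal = rows.reduce((sum, p) => sum + p.legacyEstimate, 0);
  const actualTotal = rows.reduce((sum, p) => sum + p.actual, 0);
  const totalDays = rows.reduce((sum, p) => sum + p.daysToDeploy, 0);
  return {
    legacyTotal,
    actualTotal,
    costReductionPct: (1 - actualTotal / legacyTotal) * 100,
    avgDaysToDeploy: totalDays / rows.length,
  };
}

const summary = summarize(projects);
console.log(summary.costReductionPct.toFixed(1), summary.avgDaysToDeploy);
```

On these six rows the totals are $217,000 legacy and $69,700 actual, which works out to a reduction of about 68% and an average of 12.5 days.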

These are all projects in our sweet spot--headless CMS sites and Next.js applications. Enterprise RBAC systems, real-time collaborative apps, or anything involving complex distributed systems would look different.

Our Claude Code Project Setup

Every project starts with a CLAUDE.md file in the repo root. This is the single most impactful thing we've done to improve AI output quality. Here's our template structure:

# Project: [Client Name]

## Tech Stack
- Framework: Next.js 15.1 (App Router)
- CMS: Sanity v3.72
- Styling: Tailwind CSS v4.0
- Language: TypeScript 5.7 (strict mode)
- Package manager: pnpm 9.x
- Node: 22 LTS

## Architecture Decisions
- All data fetching in Server Components
- Client components only for interactivity
- GROQ queries co-located with page components
- No barrel exports
- Prefer named exports

## Code Conventions
- Use `cn()` utility for conditional classes (already in lib/utils.ts)
- Error boundaries at route segment level
- All images through next/image with explicit dimensions
- Forms use react-hook-form + zod

## File Structure
[tree output of src/ directory]

## Known Constraints
- Client requires WCAG 2.2 AA
- Must support IE-- just kidding. Chrome 120+, Safari 17+, Firefox 121+
- Deploy target: Vercel (Pro plan, us-east-1)

## Do NOT
- Install new dependencies without asking
- Create files outside src/
- Use default exports (except for Next.js pages/layouts)
- Write CSS outside of Tailwind classes

This file eliminates roughly 40% of the "Claude went off the rails" incidents. Without it, you get generic code that doesn't match your project's patterns. With it, Claude Code generates components that look like your team wrote them.
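Because the file only helps when it is complete, it can be worth checking that none of the sections has been dropped. What follows is a minimal sketch of such a check, not a description of our actual tooling: `REQUIRED_SECTIONS` and `missingSections` are hypothetical names, and the section list simply mirrors the headings in the template above.

```typescript
// Section headings taken from the template above.
const REQUIRED_SECTIONS = [
  "Tech Stack",
  "Architecture Decisions",
  "Code Conventions",
  "File Structure",
  "Known Constraints",
  "Do NOT",
];

// Returns the required sections that have no matching "## " heading.
function missingSections(
  markdown: string,
  required: string[] = REQUIRED_SECTIONS,
): string[] {
  const headings = new Set(
    markdown
      .split(/\r?\n/)
      .filter((line) => line.startsWith("## "))
      .map((line) => line.slice(3).trim()),
  );
  return required.filter((section) => !headings.has(section));
}

// Example: a draft that only has two of the six sections so far.
const draft = "# Project: Example\n\n## Tech Stack\n- Node: 22 LTS\n\n## Do NOT\n- Create files outside src/\n";
console.log(missingSections(draft));
```

In practice you would read the file from disk and fail a pre-commit hook or CI step when the returned list is not empty.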

We also use claude --dangerously-skip-permissions during scaffolding phases (never in production branches) and switch to the interactive approval mode once we're past initial setup. Cost per project in API usage: typically $40-$120 for a full build, running on Claude Sonnet 4.

Is the One-Person Billion-Dollar Company Real?

No. But it's a thought experiment that reveals something real about where we are.

Evartology's piece on Substack--"How to Run a Company Alone in 2026"--lays out an impressive stack: AI for engineering, marketing, sales, operations, even hiring. It's a well-organized playbook, and I agree with about 60% of it. The parts about using AI for content drafts, code generation, and operational docs match our experience. But the piece underestimates the irreducibility of trust. Clients don't buy code. They buy confidence that someone understands their problem. That's a human thing.

Henry's piece (henrythe9th on Substack) about a solo founder who "cloned himself" with AI agents is more grounded. The specific example of using AI to handle customer support triage and first-draft responses resonates--we do something similar with technical proposal drafts. But the framing of "cloning" oversells it. What actually happened is task delegation to AI. You didn't clone your judgment. You offloaded your typing.

Nate's executive briefing on one-person businesses touches on the Carta data showing a growing percentage of solo-founder startups. That's real. Carta's data from early 2025 showed solo incorporations trending upward. But a solo-incorporated company on Carta isn't the same as a solo-operated company. Most of those founders hire contractors, agencies (like us), and fractional roles. They're solo on the cap table, not solo in practice.

Our take: the realistic version of this isn't one person doing a billion dollars. It's one person (or a very small team) doing $1M-$5M in revenue with 70-80% margins, handling the work that used to require 8-12 people. That's not a fantasy. We're watching it happen. But it requires AI competence, domain expertise, and an existing professional network. Not just a ChatGPT subscription.

What Doesn't Work Yet

1. Complex Multi-File Refactors

Claude Code can refactor a single file brilliantly. But when you need coordinated changes across 15+ files--say, changing a data model that touches API routes, components, types, tests, and CMS schemas simultaneously--it loses coherence around file 8-10. We've had it introduce breaking circular dependencies, forget to update imports in files it touched earlier in the session, and silently skip files. Our workaround: break refactors into 3-4 file batches and verify between each.
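The batching workaround is simple enough to sketch. The helper below splits a list of touched files into groups of at most four, so each group can be handed to Claude Code and verified before the next one starts. The `toBatches` name and the file paths are hypothetical, chosen only to illustrate a data-model change.

```typescript
// Split a list of items into batches of at most `size` items each.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical files touched by a single data-model change.
const touchedFiles = [
  "src/sanity/schemas/post.ts",
  "src/types/post.ts",
  "src/lib/queries/posts.ts",
  "src/app/blog/page.tsx",
  "src/app/blog/[slug]/page.tsx",
  "src/app/api/revalidate/route.ts",
  "src/components/PostCard.tsx",
  "src/components/PostList.tsx",
  "src/lib/queries/posts.test.ts",
  "src/components/PostCard.test.tsx",
];

// Ten files at four per batch gives batches of 4, 4, and 2.
const refactorBatches = toBatches(touchedFiles, 4);
refactorBatches.forEach((batch, index) => {
  console.log(`Batch ${index + 1}: ${batch.join(", ")}`);
});
```

Ordering matters more than the helper: putting schemas and types in the first batch means the type checker can flag whatever the later batches have not caught up with yet.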

2. Design-to-Code from Figma

Despite the hype, generating production-quality components from Figma designs is still a 60% accuracy task at best. Claude Code (or any LLM) can't see your Figma file directly. You're describing layouts in words or pasting screenshots. The output gets the structure roughly right but misses spacing, responsive breakpoints, and interaction states. We still have a human translate designs to components, then use Claude Code to flesh out variants and states.

3. Performance Optimization

Claude Code will tell you to add React.memo() and call it a day. Real performance work--identifying unnecessary re-renders through React DevTools profiling, optimizing GROQ queries by analyzing Sanity's execution plans, reducing CLS by auditing third-party scripts--requires human observation of runtime behavior. AI can't profile your app.

4. Debugging Production Issues

When something breaks at 2 AM and the error is a cryptic Vercel Edge Runtime timeout, Claude Code can suggest possibilities. But it can't look at your Datadog dashboard, correlate the timing with a deploy, check if the CDN cache was purged, or realize that the issue is actually a DNS propagation delay from a domain transfer that happened 48 hours ago. Production debugging is context-heavy and AI context windows are still too narrow.

5. Anything Requiring Visual Judgment

Is this animation too fast? Does this color combination feel right for a luxury brand? Is the whitespace balanced? Claude Code has zero opinions here. Don't ask.

6. Long-Running Session Coherence

After about 45-60 minutes of continuous work in a single Claude Code session, we notice quality degradation. It starts repeating patterns from earlier in the session even when the context has changed. It forgets constraints from the CLAUDE.md. We restart sessions every 45 minutes as a rule. This is a real productivity tax--probably 20-30 minutes of re-orientation time per day.

How We Scope Client Projects Now

Our scoping process changed fundamentally. Here's the before and after:

Before (2023)

1. Discovery call (1 hour)
2. Internal architecture discussion (2 hours)
3. Detailed SOW with hourly estimates per feature (4-6 hours)
4. Client review cycle (1-2 weeks)
5. Signed contract → kickoff

After (2025)

1. Discovery call (45 minutes)
2. Claude Code generates SOW first draft from call notes (15 minutes of prompting)
3. I review and rewrite the SOW (1 hour)
4. We build a throwaway proof-of-concept of the hardest technical challenge using Claude Code (2-3 hours)
5. Scope is now based on actual implementation data, not guesses
6. Client review (3-5 days)
7. Signed contract → kickoff

Step 4 is the key difference. We used to estimate "Shopify Storefront API integration: 40 hours" based on experience. Now we actually build a rough version in 2-3 hours and know it's 22 hours with AI assistance. Our estimates are within 15% of actuals. They used to be within 30-40%.

This costs us 3-4 hours of unbilled pre-sales work per project. But our close rate went from ~35% to ~55% because clients see a working prototype before signing.

The Founder Math: Hours Per Week Breakdown

Here's how my week actually breaks down as an agency founder using Claude Code:

Activity	Hours/Week	AI-Assisted?
Client calls and Slack	6	No
Architecture and technical decisions	5	Partially (Claude Code for research)
Code review of AI output	8	No
Directing Claude Code sessions	6	N/A (this IS the AI work)
Business ops (invoicing, contracts, planning)	3	Partially (drafts)
Sales and proposals	3	Partially (first drafts)
Manual QA and testing	3	No
Learning and staying current	2	No
Total	36

36 hours a week. Not 80. Not 20. And that's running an agency doing $60K-$80K/month in revenue with 2 active client projects at any time.

Pre-AI, this same output required 3.5 FTEs and my 50-hour weeks. The math is real. But notice: 19 of those 36 hours (the rows marked "No" above) are still entirely human work. AI didn't eliminate work. It changed the ratio of thinking-to-typing.
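For anyone who wants to rerun the tally, here is a small sketch that sums the table by its AI-Assisted column. The `Activity` shape and the `totalHours` helper are hypothetical names for this example; the rows are copied from the table, with "n/a" used for the session-directing row.

```typescript
type AiAssisted = "no" | "partial" | "n/a";

interface Activity {
  name: string;
  hours: number;
  aiAssisted: AiAssisted;
}

// Rows copied from the table above.
const week: Activity[] = [
  { name: "Client calls and Slack", hours: 6, aiAssisted: "no" },
  { name: "Architecture and technical decisions", hours: 5, aiAssisted: "partial" },
  { name: "Code review of AI output", hours: 8, aiAssisted: "no" },
  { name: "Directing Claude Code sessions", hours: 6, aiAssisted: "n/a" },
  { name: "Business ops (invoicing, contracts, planning)", hours: 3, aiAssisted: "partial" },
  { name: "Sales and proposals", hours: 3, aiAssisted: "partial" },
  { name: "Manual QA and testing", hours: 3, aiAssisted: "no" },
  { name: "Learning and staying current", hours: 2, aiAssisted: "no" },
];

// Sum hours, optionally restricted to one AI-assisted category.
function totalHours(rows: Activity[], only?: AiAssisted): number {
  return rows
    .filter((row) => only === undefined || row.aiAssisted === only)
    .reduce((sum, row) => sum + row.hours, 0);
}

console.log({
  total: totalHours(week),
  humanOnly: totalHours(week, "no"),
  partiallyAssisted: totalHours(week, "partial"),
  directingAi: totalHours(week, "n/a"),
});
```

On these rows the split is 19 hours with no AI involvement, 11 hours partially assisted, and 6 hours directing Claude Code, for 36 in total.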

FAQ

How much does Claude Code cost per month for agency work?

We spend approximately $180-$300/month on Claude API usage for Claude Code across all projects. This is on the Claude Sonnet 4 model. Individual project costs range from $40-$120 depending on scope and session count.

Can Claude Code replace a junior developer?

It replaces the output of a junior developer but not the role. Someone still needs to direct, review, and correct the AI's work. That someone needs senior-level judgment. AI-generated code without expert review ships bugs faster.

What's the best CMS to pair with a Claude Code workflow?

Sanity v3, because its schema definitions are TypeScript files that Claude Code generates exceptionally well. Payload CMS 3.0 is a close second. Contentful works but its management API is more complex for AI to work with reliably.

Does Claude Code work for mobile app development?

We've used it for React Native (Expo SDK 52) projects with decent results for component generation and navigation setup. It struggles more with native module configuration and platform-specific debugging. Roughly 40-50% productivity gain vs. 60-70% for web projects.

How do you handle client IP concerns with AI-generated code?

Our MSA includes a clause stating all deliverables are original work product regardless of tooling used. Anthropic's terms (as of June 2025) grant users rights to outputs. We don't send client proprietary data to the API--only code patterns and generic implementations.

What happens when Claude Code generates incorrect code?

It happens on roughly 15-20% of tasks. Our workflow accounts for this with mandatory human code review on every PR. Common failure modes: incorrect TypeScript generics, stale API patterns from training data, and missing error handling for edge cases. We budget review time into every estimate.
