If you've been building SaaS products in 2025-2026, you've probably noticed that every product manager now wants "AI features." Fair enough. But the real question isn't whether to add AI -- it's how to give AI models structured, secure access to your application's data and capabilities. That's exactly what the Model Context Protocol (MCP) solves, and deploying MCP servers on Vercel with Next.js has become one of the most practical patterns I've seen for SaaS teams that want to move fast without spinning up new infrastructure.

I've spent the last several months building MCP servers for clients -- some simple tool-serving setups, others with complex multi-tenant auth flows. This article covers everything I've learned about building, deploying, and scaling MCP servers on Vercel with Next.js.

MCP Server Development: Deploy on Vercel with Next.js for SaaS

What Is MCP and Why SaaS Teams Should Care

The Model Context Protocol (MCP) is an open standard -- originally developed by Anthropic and now widely adopted -- that defines how AI models interact with external tools and data sources. Think of it as a USB-C port for AI: a standardized interface that any AI client can use to connect to any MCP-compatible server.

Before MCP, if you wanted Claude, GPT, or any other model to interact with your SaaS app, you'd build custom integrations for each AI provider. Function calling with OpenAI looked different from tool use with Anthropic. MCP changes that. You build one server, and any MCP-compatible client can use it.

For SaaS teams, this matters because:

  • Your users expect AI integrations. By mid-2026, roughly 68% of B2B SaaS users report using AI assistants alongside their primary tools (Gartner, Q1 2026).
  • MCP is becoming the default. Claude Desktop, Cursor, Windsurf, GitHub Copilot in VS Code, and dozens of other clients now support MCP natively.
  • Building an MCP server is cheaper than building custom integrations for every AI provider.

MCP vs. Traditional API Integrations

| Aspect | Traditional API | MCP Server |
| --- | --- | --- |
| Client compatibility | One-to-one per provider | Any MCP-compatible client |
| Discovery | Manual docs reading | Automatic tool/resource discovery |
| Auth flow | Custom per integration | Standardized OAuth 2.1 / API keys |
| Maintenance burden | High (N integrations) | Low (1 server) |
| Real-time data | Polling or webhooks | Server-sent events / streaming |
| Setup time | Days to weeks per client | Hours for the server, minutes per client |
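
To make the "automatic discovery" row concrete: MCP messages are JSON-RPC 2.0, and a client learns what a server offers by calling `tools/list`. The sketch below builds that request and a matching response by hand, without the SDK. The `get-projects` descriptor is illustrative and mirrors the tool built later in this article; the helper names are mine, not part of any library.

```typescript
// Minimal shapes for MCP tool discovery over JSON-RPC 2.0 (no SDK involved).
type JsonSchema = {
  type: 'object';
  properties: Record<string, { type: string; description?: string }>;
  required?: string[];
};

type ToolDescriptor = { name: string; description: string; inputSchema: JsonSchema };

// What a client sends to discover the server's tools.
function listToolsRequest(id: number) {
  return { jsonrpc: '2.0' as const, id, method: 'tools/list' as const };
}

// What the server answers with: each tool plus a JSON Schema for its input.
function listToolsResponse(id: number, tools: ToolDescriptor[]) {
  return { jsonrpc: '2.0' as const, id, result: { tools } };
}

// Illustrative descriptor (assumed, not taken from a real server).
const getProjectsTool: ToolDescriptor = {
  name: 'get-projects',
  description: 'List all projects for the authenticated user',
  inputSchema: {
    type: 'object',
    properties: {
      status: { type: 'string', description: 'active, archived, or all' },
      limit: { type: 'number', description: 'Maximum number of projects' },
    },
  },
};

const discoveryRequest = listToolsRequest(1);
const discoveryResponse = listToolsResponse(discoveryRequest.id, [getProjectsTool]);
console.log(JSON.stringify(discoveryResponse, null, 2));
```

Because the schema travels with the tool, a client needs no hand-written docs to know which arguments `get-projects` accepts.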

Architecture Overview: MCP on Vercel

Here's the architecture I've landed on after iterating through several approaches:

┌─────────────────┐     ┌──────────────────────┐     ┌─────────────────┐
│  MCP Clients    │     │  Vercel (Next.js)    │     │  Your SaaS      │
│                 │     │                      │     │  Backend        │
│  - Claude       │────▶│  /api/mcp (HTTP+SSE) │────▶│  - Database     │
│  - Cursor       │     │  /api/mcp/sse        │     │  - APIs         │
│  - Custom Apps  │◀────│  /api/auth/[...mcp]  │     │  - Services     │
└─────────────────┘     └──────────────────────┘     └─────────────────┘

The key insight: your MCP server doesn't replace your existing API. It sits in front of it as a translation layer. The MCP server exposes tools and resources that map to your existing SaaS functionality, but in a format AI models can discover and use.
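
A minimal sketch of that translation layer, assuming a REST backend: each tool name maps to an existing endpoint, and the tool arguments become a query string or a JSON body. The base URL, the route table, and the `create-project` tool are placeholders I made up for illustration.

```typescript
// Hypothetical mapping from MCP tool names to endpoints your API already has.
type RestCall = { method: 'GET' | 'POST'; url: string; body?: string };
type ToolArgs = Record<string, string | number | boolean>;

const API_BASE = 'https://api.your-saas.example/v1'; // placeholder base URL

const toolRoutes: Record<string, { method: 'GET' | 'POST'; path: string } | undefined> = {
  'get-projects': { method: 'GET', path: '/projects' },
  'create-project': { method: 'POST', path: '/projects' },
};

// Translate one tool invocation into the HTTP request your backend expects.
function toRestCall(tool: string, args: ToolArgs): RestCall {
  const route = toolRoutes[tool];
  if (!route) throw new Error(`Unknown tool: ${tool}`);

  const url = new URL(API_BASE + route.path);
  if (route.method === 'GET') {
    // Read-style tools: arguments become query parameters.
    for (const [key, value] of Object.entries(args)) {
      url.searchParams.set(key, String(value));
    }
    return { method: 'GET', url: url.toString() };
  }
  // Write-style tools: arguments become the JSON body.
  return { method: 'POST', url: url.toString(), body: JSON.stringify(args) };
}

console.log(toRestCall('get-projects', { status: 'active', limit: 20 }));
```

The MCP layer stays thin this way: validation and business rules keep living in the API behind it.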

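On Vercel, this runs as serverless functions. Recent MCP spec revisions (2025-03-26 onward) define a Streamable HTTP transport in which the server can answer over plain HTTP or upgrade a response to a Server-Sent Events (SSE) stream, and that model works well with Vercel's streaming support in Next.js route handlers.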

Why Next.js on Vercel?

You could build an MCP server with any framework -- Express, Fastify, Hono, whatever. But Next.js on Vercel gives you some real advantages for SaaS:

  1. Your marketing site, app, and MCP server live in one repo. Less infrastructure to manage.
  2. Edge middleware handles auth before requests hit your MCP endpoints.
  3. Vercel's streaming support works well with SSE-based MCP transport.
  4. Automatic scaling -- you don't think about servers.
  5. If you're already running Next.js (and statistically, you probably are), there's zero new infrastructure.

We do a lot of Next.js development at Social Animal, and this pattern has become one of our most requested architectures.

Setting Up Your Next.js MCP Server

Let's build this. I'm assuming you're on Next.js 15+ with the App Router.

Installing Dependencies

pnpm add @modelcontextprotocol/sdk zod
pnpm add -D @types/node

The @modelcontextprotocol/sdk package (v1.12+ as of early 2026) includes everything you need for HTTP+SSE transport. The stdio transport that most local MCP servers use doesn't work on serverless, because there is no long-lived process for a client to attach to.

Creating the MCP Route Handler

// app/api/mcp/route.ts
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
// Transport helper names and import paths differ between SDK releases and
// adapters; confirm `httpTransport` against the version you install.
import { httpTransport } from '@modelcontextprotocol/sdk/server/http';
import { z } from 'zod';
import { fetchProjects } from '@/lib/projects'; // your existing data layer (path is illustrative)

const server = new McpServer({
  name: 'your-saas-mcp',
  version: '1.0.0',
  description: 'MCP server for YourSaaS platform',
});

// Register tools (we'll flesh these out next)
server.tool(
  'get-projects',
  'List all projects for the authenticated user',
  {
    status: z.enum(['active', 'archived', 'all']).optional().default('active'),
    limit: z.number().min(1).max(100).optional().default(20),
  },
  async ({ status, limit }, context) => {
    // `context.auth` is assumed to be populated by your auth layer (see
    // "Authentication and Multi-Tenancy"). The stock SDK exposes verified
    // token data on this argument as `authInfo`.
    const projects = await fetchProjects(context.auth.userId, { status, limit });
    return {
      content: [
        {
          type: 'text',
          text: JSON.stringify(projects, null, 2),
        },
      ],
    };
  }
);

const handler = httpTransport(server, {
  sessionManagement: true,
  cors: {
    origin: '*', // Lock this down in production
  },
});

// MCP clients use GET (stream), POST (messages), and DELETE (end session)
export const GET = handler;
export const POST = handler;
export const DELETE = handler;

SSE Endpoint for Streaming

Some MCP clients prefer the SSE transport for long-running operations:

// app/api/mcp/sse/route.ts
import { sseTransport } from '@modelcontextprotocol/sdk/server/sse';
import { server } from '../mcp-server'; // Extract server config to shared module

export const GET = sseTransport(server, {
  // Vercel has a 30-second timeout on Hobby, 300s on Pro
  // For long-running tools, you'll need Pro plan minimum
  keepAliveInterval: 15000,
});

Implementing MCP Tools and Resources

This is where the real work happens. MCP distinguishes between tools (actions the AI can take) and resources (data the AI can read). Getting this right makes the difference between an MCP server that AI clients love and one they struggle with.

Designing Good Tools

The biggest mistake I see: tools that are too granular or too broad. If you expose 50 tiny tools, AI models get overwhelmed. If you expose 3 mega-tools that each take 20 parameters, models make mistakes.

My rule of thumb: one tool per user intention. If a user would say "show me my recent invoices," that's one tool. Don't split it into list-invoices + filter-invoices + format-invoices.

// Good: clear intent, reasonable parameters
server.tool(
  'search-customers',
  'Search for customers by name, email, or account ID. Returns matching customer profiles with recent activity.',
  {
    query: z.string().describe('Search term - can be name, email, or account ID'),
    includeInactive: z.boolean().optional().default(false),
  },
  async ({ query, includeInactive }, context) => {
    const customers = await customerService.search({
      query,
      tenantId: context.auth.tenantId,
      includeInactive,
    });
    
    return {
      content: [{
        type: 'text',
        text: JSON.stringify(customers.map(c => ({
          id: c.id,
          name: c.name,
          email: c.email,
          plan: c.plan,
          mrr: c.mrr,
          lastActive: c.lastActiveAt,
        })), null, 2),
      }],
    };
  }
);
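
One way to keep the "one tool per user intention" rule honest is a small audit that runs in CI. This is a sketch under assumptions: the 20-tool ceiling echoes the guidance in this article, while the parameter and description thresholds are numbers I picked for illustration and should be tuned for your product.

```typescript
type ToolSpec = { name: string; description: string; params: string[] };

// Thresholds are assumptions, not values from the MCP spec.
const MAX_TOOLS = 20;
const MAX_PARAMS = 8;
const MIN_DESCRIPTION_LENGTH = 30;

// Returns human-readable warnings; an empty array means the surface looks sane.
function auditTools(tools: ToolSpec[]): string[] {
  const warnings: string[] = [];

  if (tools.length > MAX_TOOLS) {
    warnings.push(`${tools.length} tools exposed; consider merging tools that serve one intention`);
  }
  for (const tool of tools) {
    if (tool.params.length > MAX_PARAMS) {
      warnings.push(`${tool.name}: ${tool.params.length} parameters; consider splitting the tool`);
    }
    if (tool.description.trim().length < MIN_DESCRIPTION_LENGTH) {
      warnings.push(`${tool.name}: description is too short to guide tool selection`);
    }
  }
  return warnings;
}

console.log(
  auditTools([
    {
      name: 'search-customers',
      description: 'Search for customers by name, email, or account ID.',
      params: ['query', 'includeInactive'],
    },
  ])
);
```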

Exposing Resources

Resources are read-only data that AI clients can pull in as context. Think of them as files the model can reference:

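import { promises as fs } from 'node:fs';
import { ResourceTemplate } from '@modelcontextprotocol/sdk/server/mcp.js';

// Static resource: registered under a fixed URI (the scheme is illustrative)
server.resource(
  'api-docs',
  'docs://api-reference',
  { description: 'Your SaaS API documentation', mimeType: 'text/markdown' },
  async (uri) => {
    const docs = await fs.readFile('./docs/api-reference.md', 'utf-8');
    return { contents: [{ uri: uri.href, mimeType: 'text/markdown', text: docs }] };
  }
);

// Dynamic resource with URI template
server.resource(
  'project-analytics',
  new ResourceTemplate('project://{projectId}/analytics', { list: undefined }),
  { description: 'Analytics summary for a specific project', mimeType: 'application/json' },
  async (uri, { projectId }, context) => {
    // context.auth is assumed to come from your auth layer, as in the tool examples
    const analytics = await analyticsService.getSummary(projectId, context.auth.tenantId);
    return {
      contents: [{ uri: uri.href, mimeType: 'application/json', text: JSON.stringify(analytics) }],
    };
  }
);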
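
If it helps to see what a URI template does, here is a rough sketch of the matching step: turn a template such as `project/{projectId}/analytics` into a matcher that extracts the variables. It only handles simple `{name}` placeholders within a single path segment, not the full URI Template syntax, and `compileTemplate` is my own helper, not an SDK function.

```typescript
// Compile a template with {placeholders} into a function that extracts variables.
function compileTemplate(template: string) {
  const names: string[] = [];
  const pattern = template
    .split(/(\{[^}]+\})/) // keep each {placeholder} as its own segment
    .map((segment) => {
      const placeholder = /^\{([^}]+)\}$/.exec(segment);
      if (placeholder) {
        names.push(placeholder[1]);
        return '([^/]+)'; // a variable matches exactly one path segment
      }
      return segment.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // escape literal text
    })
    .join('');
  const regex = new RegExp(`^${pattern}$`);

  // Returns the extracted variables, or null when the URI does not match.
  return (uri: string): Record<string, string> | null => {
    const match = regex.exec(uri);
    if (!match) return null;
    const vars: Record<string, string> = {};
    names.forEach((varName, i) => {
      vars[varName] = decodeURIComponent(match[i + 1]);
    });
    return vars;
  };
}

const matchAnalytics = compileTemplate('project/{projectId}/analytics');
console.log(matchAnalytics('project/abc123/analytics'));
console.log(matchAnalytics('project/abc123/billing'));
```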

Authentication and Multi-Tenancy

This is the part everyone gets wrong on the first try. MCP supports OAuth 2.1 for authentication, and if you're building a multi-tenant SaaS, you absolutely need this.

OAuth 2.1 Flow for MCP

// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

export function middleware(request: NextRequest) {
  if (request.nextUrl.pathname.startsWith('/api/mcp')) {
    const authHeader = request.headers.get('authorization');
    
    if (!authHeader?.startsWith('Bearer ')) {
      return NextResponse.json(
        { error: 'Missing or invalid authorization header' },
        { status: 401 }
      );
    }
    
    // Validate the token and inject tenant context
    // This is where your existing auth system plugs in
  }
}

export const config = {
  matcher: '/api/mcp/:path*',
};

For the OAuth discovery endpoint that MCP clients need:

// app/.well-known/oauth-authorization-server/route.ts
export function GET() {
  return Response.json({
    issuer: 'https://your-saas.com',
    authorization_endpoint: 'https://your-saas.com/oauth/authorize',
    token_endpoint: 'https://your-saas.com/api/oauth/token',
    registration_endpoint: 'https://your-saas.com/api/oauth/register',
    scopes_supported: ['mcp:read', 'mcp:write', 'mcp:admin'],
    response_types_supported: ['code'],
    code_challenge_methods_supported: ['S256'],
  });
}
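
The `S256` entry in `code_challenge_methods_supported` refers to PKCE (RFC 7636). As a sketch of what your token endpoint has to check: the client sends a challenge when it starts the flow, then proves possession by sending the original verifier when it exchanges the code. The function names below are my own; the sample values are the test vector from the RFC's appendix.

```typescript
import { createHash } from 'node:crypto';

// S256: BASE64URL(SHA-256(ASCII(code_verifier))), with no padding.
function s256Challenge(codeVerifier: string): string {
  return createHash('sha256').update(codeVerifier, 'ascii').digest('base64url');
}

// Token endpoint check: the verifier must hash to the challenge stored at /authorize.
function verifyPkce(codeVerifier: string, storedChallenge: string): boolean {
  return s256Challenge(codeVerifier) === storedChallenge;
}

// Test vector from RFC 7636, Appendix B.
const rfcVerifier = 'dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk';
const rfcChallenge = 'E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM';

console.log(verifyPkce(rfcVerifier, rfcChallenge)); // true
```

In a real token endpoint you would probably also want a constant-time comparison and single-use authorization codes; both are left out here to keep the sketch short.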

Multi-Tenant Isolation

Every MCP tool invocation must be scoped to the authenticated tenant. I use a pattern where the auth context is injected into every tool handler automatically:

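import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';

// Wraps a tool handler so the tenant is always resolved from the auth token
const withTenant = (handler) => async (params, context) => {
  const tenant = await resolveTenant(context.auth.token);
  // McpError takes a JSON-RPC error code first, then the message
  if (!tenant) throw new McpError(ErrorCode.InvalidRequest, 'Invalid tenant');
  return handler(params, { ...context, tenant });
};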

Never trust tool parameters for tenant identification. Always derive it from the auth token.
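
To illustrate the failure mode this rule prevents, here is a sketch in which a model (or a prompt-injected document) passes a `tenantId` argument. The scoping helper throws that value away and uses the tenant from the verified token claims instead. `AuthClaims` and `scopeToTenant` are hypothetical names for this example.

```typescript
type AuthClaims = { tenantId: string; userId: string };
type ToolParams = Record<string, unknown>;

// Build the query scope: the tenant comes from verified token claims only.
function scopeToTenant(params: ToolParams, claims: AuthClaims) {
  // Drop any client-supplied tenantId before merging in the trusted one.
  const { tenantId: _untrusted, ...safeParams } = params;
  return { ...safeParams, tenantId: claims.tenantId };
}

const scoped = scopeToTenant(
  { query: 'acme', tenantId: 'someone-elses-tenant' }, // attacker-influenced input
  { tenantId: 'tenant_123', userId: 'user_9' } // derived from the access token
);
console.log(scoped); // tenantId is 'tenant_123'
```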

Deploying to Vercel: Configuration and Gotchas

vercel.json Configuration

{
  "functions": {
    "app/api/mcp/route.ts": {
      "maxDuration": 60
    },
    "app/api/mcp/sse/route.ts": {
      "maxDuration": 300
    }
  },
  "headers": [
    {
      "source": "/api/mcp/(.*)",
      "headers": [
        { "key": "Cache-Control", "value": "no-store" }
      ]
    }
  ]
}

The Gotchas Nobody Tells You

1. Function timeouts. Vercel Hobby plan maxes at 30 seconds. Pro gets you 300 seconds. For MCP tools that call slow APIs or process data, you need Pro minimum. At $20/month per team member, it's not a big deal for most SaaS teams.

2. Cold starts. Serverless cold starts can add 200-800ms to the first request. MCP clients generally handle this fine -- they're not expecting sub-50ms responses. But if it bothers you, use Vercel's cron to keep functions warm.

3. SSE and streaming. Vercel supports streaming responses, but there are edge cases with their CDN layer. Set Cache-Control: no-store on all MCP routes. I learned this the hard way when cached SSE responses caused clients to receive stale tool lists.

4. Request body size. Vercel limits request bodies to 4.5MB on serverless functions. If your MCP tools handle file uploads or large payloads, you'll need to use signed upload URLs instead.

5. Environment variables. Don't forget to set your MCP server's public URL as an env var. During development, you'll use something like ngrok or Vercel's preview URLs, but in production, it needs to be your canonical domain.

# .env.production
MCP_SERVER_URL=https://your-saas.com/api/mcp
MCP_SERVER_NAME=your-saas-mcp
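
A small startup check catches a missing or wrong URL before a client does. This is a sketch: the two variable names come from the env file above, and the rule that production must use https is my assumption about what you'd want.

```typescript
type McpConfig = { serverUrl: URL; serverName: string };

// Validate MCP-related env vars once, at startup, and fail loudly.
function loadMcpConfig(
  env: Record<string, string | undefined>,
  isProduction: boolean
): McpConfig {
  const rawUrl = env.MCP_SERVER_URL;
  const serverName = env.MCP_SERVER_NAME;
  if (!rawUrl) throw new Error('MCP_SERVER_URL is not set');
  if (!serverName) throw new Error('MCP_SERVER_NAME is not set');

  const serverUrl = new URL(rawUrl); // throws on a malformed URL
  if (isProduction && serverUrl.protocol !== 'https:') {
    throw new Error('MCP_SERVER_URL must use https in production');
  }
  return { serverUrl, serverName };
}

// In the app you would pass process.env; a literal keeps the sketch self-contained.
const mcpConfig = loadMcpConfig(
  { MCP_SERVER_URL: 'https://your-saas.com/api/mcp', MCP_SERVER_NAME: 'your-saas-mcp' },
  true
);
console.log(mcpConfig.serverUrl.hostname);
```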

Performance Optimization and Scaling

Caching Strategies

MCP tool responses can be cached when the data doesn't change frequently:

import { unstable_cache } from 'next/cache';

const getCachedAnalytics = unstable_cache(
  async (tenantId: string, projectId: string) => {
    // Argument order matches the resource handler: project first, then tenant
    return analyticsService.getSummary(projectId, tenantId);
  },
  ['analytics-summary'],
  { revalidate: 300 } // 5 minutes
);
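
Whatever cache you use, the tenant has to be part of the key, or one customer's analytics can be served to another. The sketch below is an in-memory stand-in I wrote to show the idea, not Next.js's cache: a TTL map keyed by tenant and resource, with an injectable clock so the expiry logic is easy to test.

```typescript
// In-memory TTL cache whose keys always include the tenant.
class TenantCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(tenantId: string, key: string, compute: () => T): T {
    // JSON-encode the pair so 'a:b' + 'c' can never collide with 'a' + 'b:c'.
    const cacheKey = JSON.stringify([tenantId, key]);
    const hit = this.entries.get(cacheKey);
    if (hit && hit.expiresAt > this.now()) return hit.value;

    const value = compute();
    this.entries.set(cacheKey, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}

const analyticsCache = new TenantCache<string>(300_000); // 5 minutes
console.log(analyticsCache.get('tenant-a', 'proj-1', () => 'summary for tenant A'));
console.log(analyticsCache.get('tenant-b', 'proj-1', () => 'summary for tenant B'));
```

On serverless this kind of per-instance memory only helps within a warm function, which is one reason to prefer a shared cache for anything expensive.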

Connection Pooling

If your MCP tools hit a database, use connection pooling. On Vercel, each function invocation gets its own execution context, so without pooling, you'll exhaust database connections fast.

import { Pool } from '@neondatabase/serverless';

// Neon's serverless driver handles pooling automatically
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
});

I'd recommend Neon or PlanetScale for database-backed MCP tools on Vercel. Both handle the serverless connection model well.

Benchmarks

Here's what we've measured across several production MCP deployments on Vercel Pro:

| Metric | Cold Start | Warm | P99 |
| --- | --- | --- | --- |
| Simple tool (no DB) | 420ms | 45ms | 180ms |
| DB-backed tool (Neon) | 680ms | 95ms | 320ms |
| Tool with external API | 850ms | 280ms | 1200ms |
| SSE connection setup | 520ms | 60ms | 250ms |
| Tool discovery (list) | 380ms | 30ms | 120ms |

These numbers are fine for AI client interactions. Models take seconds to generate responses anyway -- your MCP server won't be the bottleneck.
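
If you want to produce a P99 column for your own deployment, a nearest-rank percentile over logged durations is enough. This sketch shows one common definition, not necessarily the exact method behind the table above, and the sample durations are made up.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');

  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Made-up warm-request durations in milliseconds.
const durationsMs = [42, 45, 44, 51, 47, 180, 46, 43, 49, 45];
console.log('p50:', percentile(durationsMs, 50));
console.log('p99:', percentile(durationsMs, 99));
```

With small sample counts the P99 is just the slowest request, so collect at least a few hundred invocations before reading much into it.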

Monitoring and Observability

You need to know what's happening in production. MCP servers have unique observability needs because the "users" are AI models, not humans.

What to Track

  • Tool invocation frequency -- which tools are models actually using?
  • Error rates per tool -- is a specific tool failing more than others?
  • Token/tenant distribution -- is one tenant hammering your server?
  • Response payload sizes -- oversized responses waste model context windows

// Simple logging middleware for MCP tools
type ToolContext = { auth?: { tenantId?: string } };
type ToolHandler<P, R> = (params: P, context: ToolContext) => Promise<R>;

const withLogging = <P, R>(toolName: string, handler: ToolHandler<P, R>): ToolHandler<P, R> => {
  return async (params, context) => {
    const start = performance.now();
    try {
      const result = await handler(params, context);
      const duration = performance.now() - start;

      console.log(JSON.stringify({
        type: 'mcp_tool_invocation',
        tool: toolName,
        tenant: context.auth?.tenantId,
        duration,
        success: true,
        // String length in UTF-16 code units: a cheap proxy for payload size
        responseSize: JSON.stringify(result).length,
      }));

      return result;
    } catch (error) {
      console.error(JSON.stringify({
        type: 'mcp_tool_error',
        tool: toolName,
        tenant: context.auth?.tenantId,
        // `error` is `unknown` in strict TypeScript, so narrow before reading .message
        error: error instanceof Error ? error.message : String(error),
      }));
      throw error;
    }
  };
};
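
Logging payload size tells you there is a problem; capping it fixes one. Below is a rough sketch, assuming your tools return arrays of rows: measure the serialized size in real bytes (string `.length` undercounts non-ASCII text) and halve the row count until the payload fits a budget. `capRows` and the 200-byte budget are illustrative choices of mine.

```typescript
const utf8Encoder = new TextEncoder();

// Size of the JSON-serialized value in UTF-8 bytes.
function byteLength(value: unknown): number {
  return utf8Encoder.encode(JSON.stringify(value)).length;
}

// Halve the row count until the serialized payload fits within maxBytes.
function capRows<T>(rows: T[], maxBytes: number): { rows: T[]; truncated: boolean } {
  let kept = rows;
  while (kept.length > 0 && byteLength(kept) > maxBytes) {
    kept = kept.slice(0, Math.floor(kept.length / 2));
  }
  return { rows: kept, truncated: kept.length < rows.length };
}

const customerRows = Array.from({ length: 100 }, (_, i) => ({ id: i, name: `customer-${i}` }));
const capped = capRows(customerRows, 200);
console.log(capped.rows.length, capped.truncated, byteLength(capped.rows));
```

When `truncated` is true, say so in the tool's text output so the model knows to narrow its query rather than assume it saw everything.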

Pipe these logs to Axiom (available as a Vercel integration), Datadog, or whatever you're already using. Vercel's built-in Log Drains make this straightforward.

Cost Analysis: Running MCP Servers on Vercel

Let's talk money. Here's a realistic cost breakdown for a mid-size SaaS running an MCP server on Vercel in 2026:

| Component | Hobby | Pro | Enterprise |
| --- | --- | --- | --- |
| Base plan | $0/mo | $20/mo per seat | Custom |
| Function invocations (included) | 100K | 1M | Custom |
| Additional invocations | N/A | $0.60 per 1M | Negotiable |
| Bandwidth (included) | 100GB | 1TB | Custom |
| Max function duration | 30s | 300s | 900s |
| Edge middleware | Included | Included | Included |
| Estimated monthly (10K MCP requests/day) | Not viable | ~$25-40 | Custom |

For most SaaS products, the Pro plan handles MCP traffic comfortably. At 10,000 MCP tool invocations per day (which is quite active), you're looking at ~300K function executions per month -- well within Pro's included allocation.
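
The arithmetic behind that estimate, as a sketch you can rerun with your own traffic. It uses the Pro figures from the table above and assumes one function invocation per MCP request and a 30-day month; it covers seats and invocation overage only, so bandwidth and other usage-based charges are not included.

```typescript
// Pro plan figures as quoted in the table above.
const PRO_PLAN = {
  seatUsd: 20,
  includedInvocations: 1_000_000,
  overageUsdPerMillion: 0.6,
};

function estimateProMonthlyUsd(requestsPerDay: number, seats: number, daysPerMonth = 30) {
  const invocations = requestsPerDay * daysPerMonth;
  const overageInvocations = Math.max(0, invocations - PRO_PLAN.includedInvocations);
  const overageUsd = (overageInvocations / 1_000_000) * PRO_PLAN.overageUsdPerMillion;
  return { invocations, overageUsd, totalUsd: seats * PRO_PLAN.seatUsd + overageUsd };
}

// 10,000 requests/day on a single seat: 300,000 invocations, no overage.
console.log(estimateProMonthlyUsd(10_000, 1));
```

Even at ten times that traffic, the overage comes to a couple of dollars a month; seat count dominates the bill in this model.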

Compare this to running a dedicated MCP server on AWS: you'd need at minimum an EC2 instance ($30-50/mo), load balancer ($18/mo), and your time managing infrastructure. Vercel wins on operational simplicity.

If you're evaluating the right architecture for your SaaS, we can help scope this out. Check our pricing page or reach out directly.

FAQ

What is the Model Context Protocol (MCP) and how does it differ from function calling?

MCP is an open standard for connecting AI models to external tools and data. Unlike provider-specific function calling (OpenAI's function calling, Anthropic's tool use), MCP is universal. You build one MCP server, and any compatible client -- Claude, Cursor, custom apps -- can discover and use your tools automatically. Function calling requires you to define tools separately for each AI provider.

Can I deploy an MCP server on Vercel's free Hobby plan?

Technically yes, but I wouldn't recommend it for production. The 30-second function timeout is too restrictive for MCP tools that query databases or call external APIs. You also get limited invocations (100K/month). The Pro plan at $20/month per seat is the minimum I'd suggest for any real workload.

How do I handle authentication between MCP clients and my SaaS?

The MCP spec supports OAuth 2.1. You expose a .well-known/oauth-authorization-server endpoint that MCP clients discover automatically. When a user connects via an AI client like Claude, they're redirected to your standard OAuth flow, grant permissions, and the client receives a scoped access token. This token is sent with every MCP request.

What's the difference between MCP tools and MCP resources?

Tools are actions -- things the AI can do (create a project, send an email, run a query). Resources are data -- things the AI can read for context (documentation, configuration files, analytics summaries). Tools are invoked on demand; resources are loaded into the model's context window. Design tools for actions, resources for reference material.

How many MCP tools should my server expose?

From my experience, 5-15 tools is the sweet spot for most SaaS products. Fewer than 5 and your MCP server isn't very useful. More than 20 and AI models start making poor tool selection decisions. Group related operations into single tools with parameter options rather than exposing every CRUD operation separately.

Does this work with frameworks other than Next.js?

Absolutely. The @modelcontextprotocol/sdk works with any Node.js framework. You could use Hono, Express, or even Astro with SSR endpoints. Next.js on Vercel is just a particularly convenient combination because of the built-in streaming support, edge middleware, and zero-config deployment. If you're using a different stack, our headless CMS development team has built MCP servers across multiple frameworks.

How do I test my MCP server during development?

The MCP Inspector (part of the official MCP toolkit) is your best friend. It connects to your local server and lets you invoke tools, browse resources, and debug responses interactively. For automated testing, write integration tests that instantiate your MCP server in-process and call tools programmatically -- the SDK supports this without needing HTTP transport.

What happens when Vercel functions cold start during an MCP request?

MCP clients are designed to be tolerant of latency -- they're typically waiting for AI model responses that take seconds anyway. A 200-800ms cold start is imperceptible in practice. If you're concerned, Vercel's Pro plan lets you configure cron-based warming, and the SDK includes automatic retry logic for transient failures. In six months of production usage, cold starts have never been a user-reported issue for our clients.