Building a 137K Listing Global Directory with Next.js, Supabase & Vercel ISR
Last year, we shipped a global directory with 137,000 listings. Not a prototype. Not a "we'll optimize later" MVP. A production system that serves millions of page views, ranks for thousands of long-tail keywords, and regenerates pages on-demand without breaking a sweat. This is the story of how we built it — and the architectural decisions that made it possible.
The stack: Next.js 14 (App Router), Supabase (PostgreSQL + Edge Functions), Vercel (hosting + ISR), and a healthy dose of pragmatism. We made mistakes. We hit walls. We rewrote things we thought were finished. But the final architecture handles 137,000+ dynamic pages with sub-200ms TTFB globally, and our Supabase bill stays under $100/month.
If you're building something similar — a marketplace, a directory, a listings platform — this is the article I wish existed when we started.
Table of Contents
- Why This Stack
- The Data Layer: Supabase at Scale
- Page Generation Strategy: ISR, SSG, and the 137K Problem
- URL Architecture and SEO at Scale
- Search and Filtering: The Hard Part
- Performance Budgets and Edge Caching
- Monitoring and Observability in Production
- Cost Breakdown: What This Actually Costs
- What We'd Do Differently
- FAQ

Why This Stack
We evaluated a lot of options before landing on Next.js + Supabase + Vercel. The core requirements were:
- 137,000+ unique pages that search engines could crawl and index
- Sub-second page loads globally (users in 40+ countries)
- Dynamic data — listings update daily, some hourly
- Full-text search with faceted filtering
- Budget-conscious — this wasn't a VC-funded moonshot
We considered Astro (great for static sites, but we needed more dynamic interactivity — though our Astro development team has shipped excellent directory projects with it). We looked at WordPress + WPEngine. We briefly considered a pure SPA with Algolia.
Next.js won because of one killer feature: Incremental Static Regeneration. ISR meant we didn't have to choose between static performance and dynamic content. We could have both.
Supabase won over PlanetScale and Neon because of the full package — auth, storage, edge functions, and a genuinely good Postgres implementation with Row Level Security. For a directory, you need all of that.
Vercel was the deployment target because ISR works best on Vercel (unsurprisingly). The integration is native. On-demand revalidation just works.
What About Self-Hosting?
We prototyped a self-hosted Next.js setup on Railway. It worked, but ISR on self-hosted Next.js has quirks. The cache invalidation story is worse. You need to manage your own CDN layer. For a team of 3 engineers, the operational overhead wasn't worth the $200/month we'd save.
The Data Layer: Supabase at Scale
Our Supabase database holds 137,000 listings, each with 40-60 fields. Categories, locations, contact info, rich descriptions, images, ratings, operating hours — the works.
Schema Design
The biggest decision was whether to use a normalized relational schema or a more document-oriented approach with JSONB columns. We went hybrid:
```sql
CREATE TABLE listings (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  slug TEXT UNIQUE NOT NULL,
  title TEXT NOT NULL,
  description TEXT,
  category_id UUID REFERENCES categories(id),
  city_id UUID REFERENCES cities(id),
  country_code TEXT NOT NULL,
  coordinates GEOGRAPHY(POINT, 4326),
  contact JSONB DEFAULT '{}',
  attributes JSONB DEFAULT '{}',
  media JSONB DEFAULT '[]',
  rating_avg NUMERIC(3,2) DEFAULT 0,
  rating_count INTEGER DEFAULT 0,
  status TEXT DEFAULT 'active',
  published_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ DEFAULT NOW(),
  search_vector TSVECTOR
);

CREATE INDEX idx_listings_category ON listings(category_id) WHERE status = 'active';
CREATE INDEX idx_listings_city ON listings(city_id) WHERE status = 'active';
CREATE INDEX idx_listings_country ON listings(country_code) WHERE status = 'active';
CREATE INDEX idx_listings_coordinates ON listings USING GIST(coordinates);
CREATE INDEX idx_listings_search ON listings USING GIN(search_vector);
CREATE INDEX idx_listings_slug ON listings(slug);
```
Structured relational data for things we filter on (categories, cities, countries). JSONB for semi-structured stuff that varies per listing (contact methods, custom attributes, media arrays). This gave us the best of both worlds — fast indexed queries on the relational columns and flexibility on the rest.
The Search Vector
That search_vector column is critical. We populate it with a trigger:
```sql
CREATE OR REPLACE FUNCTION update_search_vector()
RETURNS TRIGGER AS $$
BEGIN
  NEW.search_vector :=
    setweight(to_tsvector('english', COALESCE(NEW.title, '')), 'A') ||
    setweight(to_tsvector('english', COALESCE(NEW.description, '')), 'B') ||
    setweight(to_tsvector('english', COALESCE(NEW.attributes->>'keywords', '')), 'C');
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Attach the function so it fires on every insert and update
CREATE TRIGGER listings_search_vector
  BEFORE INSERT OR UPDATE ON listings
  FOR EACH ROW EXECUTE FUNCTION update_search_vector();
```
This means every listing is full-text searchable through Postgres itself. No external search service needed for the first 100K listings. We'll talk about when this breaks down later.
Connection Pooling
Supabase uses PgBouncer for connection pooling. With ISR, you get bursts of serverless function invocations — each one needs a database connection. Without pooling, you'll exhaust connections in minutes.
We use the pooled connection string (port 6543) for all serverless contexts and the direct connection (port 5432) only for migrations and admin tasks. This is one of those things that sounds obvious but catches people.
```typescript
// lib/supabase.ts
import { createClient } from '@supabase/supabase-js'

export const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!, // Server-side only -- never expose to the client
  {
    db: { schema: 'public' },
    auth: { persistSession: false }
  }
)
```
Page Generation Strategy: ISR, SSG, and the 137K Problem
This is where things get interesting. And where we made our biggest early mistake.
The Naive Approach (Don't Do This)
Our first attempt: generate all 137,000 pages at build time using generateStaticParams. The build took 4 hours and 22 minutes. Vercel's free tier has a 45-minute build limit. Even the Pro tier caps at 6 hours. But the real problem wasn't the timeout — it was the feedback loop. Every deploy took half a day. That's unworkable.
The ISR Approach (What Actually Works)
Here's the strategy that shipped:
- At build time: Generate the top 5,000 pages (by traffic) statically
- On first request: Generate remaining pages on-demand and cache them
- Revalidation: Time-based (every 3600 seconds) + on-demand via webhook
```tsx
// app/listing/[slug]/page.tsx
import { supabase } from '@/lib/supabase'
import { notFound } from 'next/navigation'
import { ListingDetail } from '@/components/ListingDetail' // path illustrative

export async function generateStaticParams() {
  // Only pre-generate top listings by traffic
  const { data } = await supabase
    .from('listings')
    .select('slug')
    .eq('status', 'active')
    .order('rating_count', { ascending: false })
    .limit(5000)

  return (data || []).map((listing) => ({
    slug: listing.slug,
  }))
}

export const revalidate = 3600 // Revalidate every hour

export default async function ListingPage({ params }: { params: { slug: string } }) {
  const { data: listing, error } = await supabase
    .from('listings')
    .select(`
      *,
      category:categories(*),
      city:cities(*, country:countries(*))
    `)
    .eq('slug', params.slug)
    .eq('status', 'active')
    .single()

  if (!listing || error) notFound()

  return <ListingDetail listing={listing} />
}
```
On-Demand Revalidation
When a listing owner updates their data, we don't want to wait up to an hour for the page to refresh. Supabase webhooks trigger a Next.js API route:
```typescript
// app/api/revalidate/route.ts
import { revalidatePath } from 'next/cache'
import { NextRequest, NextResponse } from 'next/server'

export async function POST(request: NextRequest) {
  const secret = request.headers.get('x-revalidation-secret')
  if (secret !== process.env.REVALIDATION_SECRET) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 })
  }

  const { slug, type } = await request.json()

  if (type === 'listing') {
    revalidatePath(`/listing/${slug}`)
    revalidatePath(`/`) // Revalidate the homepage too
  }

  return NextResponse.json({ revalidated: true })
}
```
This gives us the best of both worlds: static-site performance with dynamic-site freshness. Builds complete in under 8 minutes. Pages that haven't been pre-generated get created on first visit and cached at the edge.
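The calling side of that webhook can be a plain HTTP request. Here's a minimal sketch of building it — `buildRevalidationRequest` is a hypothetical helper, not from our codebase, but the URL, header, and payload shape match the route above:

```typescript
// Sketch: construct the request the Supabase webhook (or any
// server-side caller) sends to the revalidation route.
// buildRevalidationRequest is a hypothetical helper name.
export function buildRevalidationRequest(
  baseUrl: string,
  secret: string,
  slug: string
) {
  return {
    url: `${baseUrl}/api/revalidate`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-revalidation-secret': secret, // must match REVALIDATION_SECRET
      },
      body: JSON.stringify({ slug, type: 'listing' }),
    },
  }
}
```

A caller would then do `fetch(req.url, req.init)` after a listing row changes.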
The Numbers
| Metric | Full SSG (Naive) | ISR (Production) |
|---|---|---|
| Build time | 4h 22m | 7m 40s |
| Pages at deploy | 137,000 | 5,000 |
| First visit (uncached) | N/A | ~800ms |
| Subsequent visits | ~120ms | ~120ms |
| Revalidation latency | Full redeploy | < 2 seconds |
| Monthly build minutes | Way over limit | ~230 minutes |

URL Architecture and SEO at Scale
With 137,000 pages, URL structure isn't an afterthought — it's architecture. Every URL is a ranking opportunity.
The URL Hierarchy
```
/                                    → Homepage
/categories/[category-slug]          → Category pages (48 categories)
/locations/[country]/[city]          → Location pages
/listing/[listing-slug]              → Individual listing
/search?q=...&category=...&city=...  → Search results (noindex)
```
Category + location intersection pages are the real SEO goldmine:
```
/categories/restaurants/us/new-york  → "Restaurants in New York"
/categories/hotels/uk/london         → "Hotels in London"
```
These intersection pages are generated dynamically with ISR. There are roughly 12,000 valid combinations. Each one targets a specific long-tail keyword.
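Enumerating which combinations deserve a page is straightforward. The sketch below assumes you have per-combination listing counts; the input shape and the minimum-listings threshold are our illustration, not the production code:

```typescript
// Sketch: build the list of category × location intersection paths,
// skipping thin combinations that would produce near-empty pages.
// ComboCount and minListings are assumptions for illustration.
interface ComboCount {
  categorySlug: string
  countryCode: string
  citySlug: string
  listingCount: number
}

export function intersectionPaths(
  combos: ComboCount[],
  minListings = 3 // below this, the page adds no SEO value
): string[] {
  return combos
    .filter((c) => c.listingCount >= minListings)
    .map((c) => `/categories/${c.categorySlug}/${c.countryCode}/${c.citySlug}`)
}
```

Filtering out thin combinations matters: thousands of intersection pages with one or two listings each look like doorway pages to Google.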
Sitemap Generation
With 137K URLs, you need sitemap index files. Google's limit is 50,000 URLs per sitemap.
```typescript
// app/sitemap/[id]/route.ts
import { supabase } from '@/lib/supabase'

export async function GET(request: Request, { params }: { params: { id: string } }) {
  const page = parseInt(params.id, 10)
  const perPage = 45000 // Stay under the 50K limit
  const offset = page * perPage

  const { data: listings } = await supabase
    .from('listings')
    .select('slug, updated_at')
    .eq('status', 'active')
    .order('id')
    .range(offset, offset + perPage - 1)

  const xml = generateSitemapXml(listings)

  return new Response(xml, {
    headers: { 'Content-Type': 'application/xml' },
  })
}
```
We split into 4 sitemaps: sitemap-0.xml through sitemap-3.xml, referenced by a sitemap index. Google Search Console indexed 98% of submitted URLs within 6 weeks.
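The `generateSitemapXml` helper referenced in the route isn't shown above. A minimal sketch might look like this — the domain is a placeholder, and the field names follow the query in the route:

```typescript
// Sketch of generateSitemapXml: turn listing rows into a
// sitemaps.org-compliant <urlset>. Domain is a placeholder.
interface SitemapEntry {
  slug: string
  updated_at: string
}

export function generateSitemapXml(listings: SitemapEntry[] | null): string {
  const urls = (listings || [])
    .map(
      (l) =>
        `  <url>\n` +
        `    <loc>https://example.com/listing/${l.slug}</loc>\n` +
        `    <lastmod>${new Date(l.updated_at).toISOString()}</lastmod>\n` +
        `  </url>`
    )
    .join('\n')

  return (
    `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${urls}\n` +
    `</urlset>`
  )
}
```

The sitemap index file is the same idea one level up: a `<sitemapindex>` element whose `<sitemap><loc>` entries point at sitemap-0.xml through sitemap-3.xml.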
Structured Data
Every listing page includes JSON-LD structured data. For a directory, LocalBusiness schema is critical:
```typescript
const structuredData = {
  '@context': 'https://schema.org',
  '@type': 'LocalBusiness',
  name: listing.title,
  description: listing.description,
  address: {
    '@type': 'PostalAddress',
    addressLocality: listing.city.name,
    addressCountry: listing.city.country.code,
  },
  geo: {
    '@type': 'GeoCoordinates',
    latitude: listing.coordinates?.lat,
    longitude: listing.coordinates?.lng,
  },
  aggregateRating: listing.rating_count > 0 ? {
    '@type': 'AggregateRating',
    ratingValue: listing.rating_avg,
    reviewCount: listing.rating_count,
  } : undefined,
}
```
Search and Filtering: The Hard Part
Search is always the hard part. Always.
Phase 1: Postgres Full-Text Search
For our initial launch, Postgres tsvector search handled everything. It's fast enough for 137K rows with a GIN index. Query times averaged 40-80ms.
```typescript
const { data } = await supabase
  .from('listings')
  .select('id, slug, title, description, category:categories(name)')
  .textSearch('search_vector', query, { type: 'websearch' })
  .eq('status', 'active')
  .eq('country_code', countryFilter)
  .order('rating_avg', { ascending: false })
  .range(0, 19) // First 20 results
```
Phase 2: When Postgres Wasn't Enough
At around 80,000 listings, complex faceted searches (category + location + text + sort) started hitting 300-500ms. Acceptable for most apps, but our users expected instant results.
We added Typesense as a search layer. Not Algolia (too expensive at our scale — we'd be paying $500+/month). Not Meilisearch (great, but Typesense's geo-search was better for our use case).
Typesense runs on a single $48/month Hetzner instance. Syncs from Supabase via a nightly full reindex + real-time webhook updates. Search queries now average 8-15ms.
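The sync job mostly comes down to flattening each Supabase row into a Typesense document. A sketch of that transform — the field names and document shape are illustrative, though the geopoint format ([lat, lng] number pair) is how Typesense expects geo fields:

```typescript
// Sketch: flatten a joined Supabase listing row into a Typesense
// document. Input/output field names are assumptions for illustration.
interface ListingRow {
  id: string
  slug: string
  title: string
  description: string | null
  category_name: string
  city_name: string
  country_code: string
  rating_avg: number
  lat: number | null
  lng: number | null
}

export function toTypesenseDoc(row: ListingRow) {
  return {
    id: row.id,
    slug: row.slug,
    title: row.title,
    description: row.description ?? '',
    category: row.category_name, // facet field
    city: row.city_name,         // facet field
    country: row.country_code,   // facet field
    rating: row.rating_avg,      // used for sorting
    // Typesense geopoint fields take a [lat, lng] pair
    location:
      row.lat != null && row.lng != null ? [row.lat, row.lng] : undefined,
  }
}
```

The nightly reindex maps every active row through a transform like this and bulk-imports the result; the webhook path does the same for single rows.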
| Search Solution | Query Time (p50) | Query Time (p99) | Monthly Cost | Faceted Search |
|---|---|---|---|---|
| Postgres FTS | 45ms | 320ms | $0 (included) | Limited |
| Typesense | 9ms | 28ms | $48 | Excellent |
| Algolia | ~5ms | ~15ms | $500+ | Excellent |
| Meilisearch | ~8ms | ~22ms | $48 (self-hosted) | Good |
Performance Budgets and Edge Caching
We set aggressive performance targets from day one:
- TTFB: < 200ms (global p75)
- LCP: < 1.5s
- CLS: < 0.05
- Total page weight: < 300KB (initial load)
Vercel Edge Network
ISR pages are cached at Vercel's edge network — 100+ PoPs globally. Once a page is generated and cached, it serves from the nearest edge location. This is why TTFB stays under 200ms even for users in Southeast Asia or South America.
Image Optimization
Each listing has 1-8 images. That's potentially over a million images. We use Vercel's built-in image optimization with next/image:
```tsx
{/* Rendered inside a map over listing.media, so `index` is in scope */}
<Image
  src={listing.media[index]?.url}
  alt={listing.title}
  width={800}
  height={600}
  sizes="(max-width: 768px) 100vw, (max-width: 1200px) 50vw, 33vw"
  loading={index === 0 ? 'eager' : 'lazy'}
  quality={75}
/>
```
Images are stored in Supabase Storage and served through Vercel's image CDN. The original images are often 2-5MB; after optimization, they're 40-120KB. This alone saved us roughly 80% on bandwidth.
Monitoring and Observability in Production
Running 137K pages in production without monitoring is like driving blindfolded. Here's our stack:
- Vercel Analytics: Core Web Vitals, real user monitoring
- Sentry: Error tracking (we catch ~50 errors/day, mostly from bots sending garbage)
- Supabase Dashboard: Database performance, query analysis
- Checkly: Synthetic monitoring, 5-minute intervals on critical paths
- Google Search Console: Index coverage, crawl stats
The most valuable monitoring we set up was a daily Supabase query that counts indexed pages vs. total active listings. If the ratio drops below 95%, we get an alert. This caught a sitemap regression within 24 hours of deploying a bad change.
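The check itself is simple arithmetic. A sketch of the core logic — in practice the counts come from Search Console and a Supabase query, but here they're plain parameters:

```typescript
// Sketch: the index-coverage check described above. Flags an alert
// when indexed pages fall below 95% of active listings.
export function coverageAlert(
  indexedPages: number,
  activeListings: number,
  threshold = 0.95
): { ratio: number; alert: boolean } {
  // Guard against division by zero on an empty directory
  const ratio = activeListings === 0 ? 1 : indexedPages / activeListings
  return { ratio, alert: ratio < threshold }
}
```

Wiring this into a daily cron that posts to Slack when `alert` is true is what caught our sitemap regression.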
Cost Breakdown: What This Actually Costs
People always ask about cost. Here's the real monthly spend as of Q1 2025:
| Service | Plan | Monthly Cost |
|---|---|---|
| Vercel | Pro | $20 |
| Vercel Bandwidth (overages) | Pay-as-you-go | ~$35 |
| Supabase | Pro | $25 |
| Supabase Database (compute) | Small instance | $48 |
| Typesense (Hetzner) | CX31 | $48 |
| Checkly | Starter | $7 |
| Sentry | Team | $26 |
| Domain + DNS (Cloudflare) | Free tier | $0 |
| Total | | ~$209/month |
Serving 137,000 pages with millions of monthly page views for about $200/month. Try doing that with a traditional server setup running WordPress.
If you're considering a similar project and want to understand how an architecture like this maps to your budget, our pricing page breaks down how we typically scope directory and marketplace projects.
What We'd Do Differently
Start with ISR from day one. We wasted two weeks trying to make full SSG work before accepting the math didn't add up.
Use Typesense from the start. Postgres FTS was fine early on, but migrating search mid-project was disruptive. The $48/month would have been worth it from launch.
Invest in data validation earlier. With 137K listings imported from various sources, data quality was a nightmare. We should have built stricter Zod schemas and validation pipelines before the first import, not after we found thousands of broken records in production.
Test with realistic data volumes in staging. Our staging environment had 500 listings. Queries that worked great on 500 rows fell apart at 137K. We now seed staging with a 20% random sample of production data.
If you're planning a directory or marketplace build and want to avoid these same pitfalls, reach out to our team. We've been through this enough times to know where the landmines are.
FAQ
How long does it take to build a 100K+ listing directory with Next.js? For our team, the initial architecture and core features took about 10 weeks. Data import, cleaning, and validation added another 3-4 weeks. Total from kickoff to production launch was roughly 14 weeks. If you're working with a Next.js development team that's done this before, you can shave 2-3 weeks off that.
Can Supabase handle 100,000+ rows for a directory? Absolutely. Supabase runs on Postgres, which handles millions of rows without breaking a sweat. The key is proper indexing — without indexes on your most-queried columns, performance degrades fast. With the indexes we described above, our queries on 137K rows consistently return in under 50ms for single-record lookups.
What's the difference between ISR and SSG for large sites? SSG (Static Site Generation) builds every page at deploy time. ISR (Incremental Static Regeneration) builds a subset at deploy time and generates the rest on-demand. For sites with more than ~10,000 pages, ISR is practically required — full SSG builds become too slow for reasonable deployment cycles.
How do you handle SEO for 137,000 dynamically generated pages? Three things matter most: proper sitemap generation split across multiple files, unique structured data (JSON-LD) on every listing page, and ensuring ISR-generated pages return proper HTTP 200 status codes (not soft 404s). We also generate unique meta titles and descriptions per page using the listing data — no duplicate meta content.
Is Vercel ISR reliable for production at scale? In our experience, yes. We've been running this setup for over 8 months with 99.98% uptime. The only incidents were self-inflicted — a bad deploy that broke our revalidation webhook, and one Supabase maintenance window that caused 15 minutes of degraded search. Vercel's edge cache is rock solid.
Should I use Algolia or Typesense for a large directory? It depends on your budget. Algolia is the industry standard with the best developer experience, but it gets expensive past 100K records — expect $500-1000+/month. Typesense delivers 90% of the functionality at a fraction of the cost when self-hosted. We chose Typesense and haven't regretted it.
How do you keep 137,000 listings up to date? We use a combination of approaches: on-demand revalidation triggered by Supabase webhooks when individual listings change, time-based ISR revalidation (hourly) as a safety net, and a nightly batch job that checks for stale data and triggers bulk revalidation. Listing owners can also manually request a page refresh through their dashboard.
Can this architecture work with a headless CMS instead of Supabase? Yes, but with trade-offs. A headless CMS setup like Sanity or Contentful works well for the content management side, but you'll likely still need a database for search and complex queries. We've built directory projects where the editorial content lives in a headless CMS and the listing data lives in Postgres — it's a valid hybrid approach.