Listings stored in Supabase PostgreSQL with PostGIS extensions, synced via event-driven pipelines to Elasticsearch 8.x for geo-indexed faceted search. Next.js App Router with ISR generates programmatic city-category pages at edge, with Sanity CMS providing editorial content blocks. Claim workflows modeled as finite state machines with Supabase RLS enforcing ownership boundaries.
Where enterprise projects fail
Scaling search is a genuinely hard problem. Once you're past 50,000 listings, search performance starts degrading in ways that are painful to diagnose and fix. Query times balloon. Users notice. And when users notice slow search, they leave. It's not complicated: slow directory equals abandoned directory. Traffic drops, and ad or lead-generation revenue drops with it. You're suddenly in a hole that's difficult to climb out of.

We've watched this happen to directories in markets like Chicago, Atlanta, and Brisbane -- places where listing volume grew faster than the underlying architecture could handle. The operators weren't doing anything wrong. They'd simply outgrown their infrastructure. And that's what makes it so frustrating -- there's no obvious moment where you made a mistake. You just kept adding listings, kept growing, and one day search is returning results in 4 seconds instead of 400 milliseconds, and you can't say exactly when it got this bad.

That's the real problem. It's not a code quality issue or a bad early engineering decision -- it's a scaling threshold that catches most teams off guard, usually at the worst possible moment, like when you're finalizing a partnership deal or pitching advertisers on your traffic numbers. The infrastructure fails quietly, then all at once. So what actually fixes it? Architecture built for scale from the start, not bolted on after things break.
Build-time page generation delays sound harmless on paper, but in practice they're catastrophic. New listings sit in limbo -- a business opens in Denver on Tuesday, but your directory doesn't reflect it until Thursday, after someone manually triggers a deploy. That's stale content, which means missed indexing opportunities, frustrated users, and deployment pipelines that buckle under the pressure of generating thousands of pages at once. And honestly? Your dev team starts dreading every push to production. That's a morale problem as much as a technical one.
Claim workflows fail in predictable ways: two people claim the same business, or a previous owner disputes a transfer six months later -- and now you're sorting through email threads trying to reconstruct what happened. Without a proper dispute resolution process and a clear audit trail, you're exposed, legally and reputationally. There's no paper trail, no defined process, no way to demonstrate who had access and when. That erodes trust fast, both with the businesses listed and with the users relying on accurate ownership information.
Listings with wrong coordinates show up in the wrong city. Duplicate entries split your review counts. Categories get inconsistent -- "Restaurant" vs. "Restaurants" vs. "Food & Dining" -- which wrecks filtering entirely. But without an automated pipeline catching these problems continuously, you're either paying someone to manually clean records or just living with the mess. And the mess compounds. Every week of inaccurate listings is another week of eroding user trust and declining search relevance. It doesn't fix itself.
What we deliver
Enterprise Directory Platforms That Actually Scale
Most directory platforms hit a wall somewhere between 10,000 and 50,000 listings. Search slows down. Category pages take seconds to render. The claim workflow breaks under concurrent requests. And programmatic page generation — the backbone of directory SEO — becomes a brittle mess of timeouts and stale caches.
We build directory platforms designed from day one to handle 190,000+ listings with sub-second search, programmatic city-category page generation, and production-grade claim workflows. Not by bolting Elasticsearch onto a WordPress plugin, but by building systems where every component — from geo-indexing to ISR page generation — is purpose-built for the dataset size you actually have.
Why In-House Teams Struggle with Directory Platforms at Scale
Directory platforms look deceptively simple. A listing has a name, address, category, and maybe some reviews. But enterprise-scale directories introduce compounding complexity that most teams badly underestimate.
The Search Problem
Relational databases can't do geo-radius queries across 190K listings with faceted filtering in under 200ms. Teams start with PostgreSQL LIKE queries, graduate to PostGIS, then realize they need Elasticsearch for the full picture — faceted search, fuzzy matching, geo-distance sorting, and autocomplete. By that point, they've built three search implementations and none of them work well together.
The Programmatic Pages Problem
A directory with 500 cities and 40 categories generates 20,000 city-category landing pages. Each needs unique content signals, proper internal linking, structured data, and fast load times. Most teams try to generate these at build time — and hit 45-minute builds that fail intermittently. Or they go fully dynamic and lose the SEO benefits entirely.
The Claim Workflow Problem
Business owners need to find their listing, verify ownership, and gain edit access. This sounds like a simple CRUD flow until you account for duplicate claims, ownership disputes, verification via phone/email/postcard, role-based permissions for multi-location businesses, and audit trails for compliance. Most teams ship a basic version and spend the next six months patching edge cases.
The Data Integrity Problem
With 190K listings, data quality is a constant battle. Duplicate detection, geocoding accuracy, category normalization, and stale listing cleanup all require automated pipelines — not manual review queues.
Our Architecture for Enterprise Directory Platforms
We've built and shipped directory platforms managing 137,000+ listings in production. Here's the architecture pattern that works.
Search Layer: Elasticsearch with Geo-Point Indexing
Every listing is indexed in Elasticsearch with geo-point fields, enabling radius-based search with faceted filtering in a single query. The index schema supports:
- Geo-distance sorting — results ranked by proximity to user location or searched city centroid
- Multi-facet filtering — category, subcategory, city, state, rating, verification status, all applied simultaneously without performance degradation
- Fuzzy matching and autocomplete — typo-tolerant search with completion suggesters for instant results
- Synonym expansion — "dentist" matches "dental office," "dental clinic," etc.
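As a sketch of how these capabilities combine into a single request, here is what an Elasticsearch query body might look like. Field names (`location`, `category`, `rating`, `name`) are illustrative, not the production schema:

```typescript
// Illustrative Elasticsearch query body: a fuzzy text match, a category
// facet filter, and a geo-radius filter, with facet aggregations and
// geo-distance sorting -- all in one request. Field names are assumptions.
const searchBody = {
  query: {
    bool: {
      must: [{ match: { name: { query: "dentist", fuzziness: "AUTO" } } }],
      filter: [
        { term: { category: "dental" } },
        {
          geo_distance: {
            distance: "25km",
            location: { lat: 41.88, lon: -87.63 }, // searched city centroid
          },
        },
      ],
    },
  },
  aggs: {
    by_category: { terms: { field: "category" } }, // facet counts for the UI
    top_rated: { range: { field: "rating", ranges: [{ from: 4 }] } },
  },
  sort: [
    {
      _geo_distance: {
        location: { lat: 41.88, lon: -87.63 },
        order: "asc",
        unit: "m",
      },
    },
  ],
  size: 20,
};
```

A request shaped like this returns hits, facet counts, and distance-sorted ordering in one round trip, which is a large part of why a single index stays fast where joined relational tables do not.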
We sync from the source database (typically Supabase or a headless CMS like Sanity) to Elasticsearch via event-driven pipelines. When a listing updates, the index updates within seconds — not on a nightly batch job.
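A minimal sketch of that event-driven sync step, assuming a change event shaped like a Supabase webhook payload. The `esUpsert`/`esDelete` helpers and the in-memory `index` are stand-ins for an Elasticsearch client, not real API calls:

```typescript
// Hypothetical sync handler: a database change event arrives and the
// corresponding search document is upserted or deleted within seconds.
type ListingEvent = {
  type: "INSERT" | "UPDATE" | "DELETE";
  record: { id: string; name: string; lat: number; lon: number };
};

// Stand-in index for the sketch; production code would call the
// Elasticsearch client here instead.
const index = new Map<string, object>();

function esUpsert(id: string, doc: object) { index.set(id, doc); }
function esDelete(id: string) { index.delete(id); }

function handleListingEvent(event: ListingEvent): void {
  if (event.type === "DELETE") {
    esDelete(event.record.id);
    return;
  }
  // Map the relational row into the search document shape,
  // including the geo-point field the search layer expects.
  esUpsert(event.record.id, {
    name: event.record.name,
    location: { lat: event.record.lat, lon: event.record.lon },
  });
}

handleListingEvent({
  type: "INSERT",
  record: { id: "a1", name: "Acme Dental", lat: 41.9, lon: -87.6 },
});
```

The design point is that index freshness is driven by change events, not by a nightly batch window.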
Programmatic Page Generation: ISR on Next.js or Astro
City-category pages are generated using Incremental Static Regeneration on Next.js or on-demand rendering in Astro. This gives us:
- Static performance — pages serve from CDN at edge, sub-100ms TTFB globally
- Dynamic freshness — pages regenerate on a configurable interval (typically 60-300 seconds) or on-demand when listings change
- Build scalability — we don't generate 20,000 pages at build time. We generate the top 2,000 by traffic and let ISR handle the long tail on first request
Each programmatic page includes dynamically assembled content blocks: listing counts, top-rated businesses, category descriptions pulled from CMS, breadcrumb navigation, and JSON-LD structured data for local business search results.
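As one concrete example of those assembled blocks, here is a sketch of building the JSON-LD payload for a city-category page. The `Listing` shape and the `buildItemListJsonLd` helper are illustrative, not our production code:

```typescript
// Illustrative JSON-LD assembly for a programmatic city-category page,
// using the schema.org ItemList/LocalBusiness vocabulary. The Listing
// fields are assumptions for the sketch.
type Listing = { name: string; url: string };

function buildItemListJsonLd(city: string, category: string, listings: Listing[]) {
  return {
    "@context": "https://schema.org",
    "@type": "ItemList",
    name: `${category} in ${city}`,
    itemListElement: listings.map((l, i) => ({
      "@type": "ListItem",
      position: i + 1, // schema.org positions are 1-based
      item: { "@type": "LocalBusiness", name: l.name, url: l.url },
    })),
  };
}

const jsonLd = buildItemListJsonLd("Denver", "Dentists", [
  { name: "Acme Dental", url: "https://example.com/acme" },
]);
```

In a Next.js App Router page, an object like this gets serialized into a `<script type="application/ld+json">` tag, while `export const revalidate = 300` (or on-demand revalidation) controls how fresh the surrounding listing data is.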
Claim Workflow: State Machine with Audit Trail
We model the claim workflow as a finite state machine with these states: unclaimed → claim_requested → verification_pending → verified → disputed → transferred. Each transition triggers specific actions:
- claim_requested — sends verification challenge (email OTP, phone verification, or document upload)
- verification_pending → verified — grants role-based edit access, notifies admin, logs to audit trail
- disputed — freezes edit access, escalates to admin review queue with full history
This runs on Supabase with Row Level Security policies enforcing that verified owners can only edit their own listings. Admin dashboards built in Next.js give operations teams full visibility into the claim pipeline.
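The state machine above can be sketched as an explicit transition table. The states mirror the ones listed; the exact allowed edges (including the rollback paths) are assumptions for illustration, not the production rules:

```typescript
// Sketch of the claim workflow as a finite state machine with an
// explicit transition table. Edge choices are illustrative.
type ClaimState =
  | "unclaimed" | "claim_requested" | "verification_pending"
  | "verified" | "disputed" | "transferred";

const transitions: Record<ClaimState, ClaimState[]> = {
  unclaimed: ["claim_requested"],
  claim_requested: ["verification_pending", "unclaimed"],
  verification_pending: ["verified", "unclaimed"],
  verified: ["disputed", "transferred"],
  disputed: ["verified", "unclaimed"],
  transferred: ["verified"],
};

// Routing every state change through one guard is what makes the audit
// trail trustworthy: illegal jumps are rejected before they happen.
function canTransition(from: ClaimState, to: ClaimState): boolean {
  return transitions[from].includes(to);
}
```

Each legal transition is then the single place to hang side effects: verification challenges, role grants, admin notifications, audit log writes.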
Data Pipeline: Automated Quality Enforcement
For a 190K listing dataset, we build automated pipelines handling:
- Geocoding validation — every address is geocoded via Google Maps Platform or Mapbox, with confidence scores. Low-confidence results get flagged for review.
- Duplicate detection — fuzzy matching on name + address using Elasticsearch's `more_like_this` query identifies potential duplicates for merge review
- Category normalization — listings are mapped to a canonical taxonomy, with AI-assisted categorization for ambiguous cases
- Stale listing detection — automated checks against Google Places API or business registration databases flag listings that may be closed or relocated
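As a sketch of the cheap first pass before fuzzy matching, duplicate candidates can be grouped by a normalized blocking key on name + address. The `dedupeKey` helper and its normalization rules are assumptions for illustration:

```typescript
// Illustrative first-pass duplicate detection: normalize name + address
// into a blocking key so only plausible pairs reach the expensive
// fuzzy-matching stage. Normalization rules are assumptions.
function dedupeKey(name: string, address: string): string {
  const norm = (s: string) =>
    s.toLowerCase()
     .replace(/\b(inc|llc|ltd|co)\.?\b/g, "") // drop common legal suffixes
     .replace(/[^a-z0-9]+/g, " ")             // strip punctuation
     .trim()
     .replace(/\s+/g, " ");                   // collapse whitespace
  return `${norm(name)}|${norm(address)}`;
}
```

Listings sharing a key become merge-review candidates; everything else never reaches the fuzzy matcher, which keeps the pipeline tractable at six-figure listing counts.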
Technology Stack in Production
Our directory platform stack is proven across multiple enterprise deployments:
- Next.js 14+ with App Router for the frontend and API routes
- Elasticsearch 8.x for geo-indexed search with vector search capability for future AI features
- Supabase (PostgreSQL + Auth + Realtime) for the listing database, user management, and claim workflow state
- Sanity CMS for editorial content — city descriptions, category pages, blog content that supports programmatic pages
- Vercel for deployment with edge functions and ISR
- Mapbox GL JS for interactive map views with clustering at zoom levels
- Node.js workers on Railway or Vercel Cron for data pipeline jobs
Proven at Scale: 137,000+ Listings in Production
Our NAS directory platform manages 137,000+ business listings with geo-indexed search, programmatic city-category pages, and a full claim-and-verify workflow. Key production metrics:
- Search latency: p95 under 120ms for geo-radius queries with 3+ active facets
- Page generation: 91,000+ pages indexed by Google, all serving with Lighthouse scores above 95
- Claim completion rate: 73% of initiated claims reach verified status within 48 hours
- Zero downtime during a bulk import of 40,000 listings — Elasticsearch reindexing ran in parallel with zero search degradation
Delivery Model and SLA
Enterprise directory platforms typically run 14-20 weeks depending on data complexity and custom workflow requirements.
Phase 1: Data Architecture and Search (Weeks 1-5)
Schema design, Elasticsearch index mapping, geo-indexing pipeline, search API with faceted filtering.
Phase 2: Programmatic Pages and Frontend (Weeks 4-10)
City-category page generation, listing detail pages, map integration, structured data, ISR configuration.
Phase 3: Claim Workflow and Admin (Weeks 8-14)
State machine implementation, verification flows, admin dashboard, role-based access control.
Phase 4: Data Pipeline and Launch (Weeks 12-18)
Bulk import tooling, duplicate detection, geocoding validation, performance testing, production deployment.
We overlap phases where dependencies allow and run weekly architecture reviews with your engineering team throughout.
Post-Launch Support
All enterprise directory engagements include 90 days of post-launch support covering search tuning, index optimization, and workflow refinement based on real user behavior data.
See this capability in action
Frequently asked
How does Elasticsearch handle geo-indexed search across 190K+ listings?
Elasticsearch stores each listing with geo-point fields, which means radius queries, geo-distance sorting, and bounding-box filtering all run against one index -- not across joined tables doing expensive cross-referencing. Combine that with faceted filtering on categories, ratings, and verification status, and you're hitting p95 latency under 120ms across 190K+ documents. We've stress-tested this setup to 500,000 listings without structural changes to the architecture. The real kicker? Most directories never need to change anything once it's built right the first time.
How do you generate thousands of city-category pages without breaking builds?
We use Incremental Static Regeneration on Next.js -- pretty straightforward in concept, but the implementation details matter a lot. The top 2,000 pages by traffic get pre-built at deploy time. Everything else generates on first request and caches at the edge. Each page revalidates on a configurable interval, so a new listing in Portland shows up within minutes rather than after a full rebuild. It scales to 50,000+ programmatic pages without making your CI pipeline miserable. And honestly, your developers will thank you for it.
What does the business claim workflow look like technically?
We model claims as a finite state machine: `unclaimed → claim_requested → verification_pending → verified → disputed → transferred`. Each transition triggers automated actions -- verification challenges, role grants, admin notifications, audit logs. Supabase Row Level Security enforces that verified owners can only edit their own listings, not anyone else's. The whole flow is fully auditable. And it handles multi-location businesses -- a franchise with 200 locations, say -- without special-casing each scenario. That matters more than it sounds.
Can you migrate our existing directory data into this platform?
Yes. We build custom ETL pipelines for bulk imports that handle geocoding validation, duplicate detection, and category normalization upfront. We've imported 40,000+ listings in a single batch with zero search downtime by running Elasticsearch reindexing in parallel with the import. Your existing data gets cleaned, geocoded, and deduplicated as part of the migration -- you're not just dumping raw records into a new system and hoping for the best. The pipeline does the work.
How do you handle SEO for programmatic directory pages?
Each city-category page gets unique content signals: dynamic listing counts, top-rated business highlights, CMS-managed category descriptions, breadcrumb navigation, and JSON-LD LocalBusiness structured data. Internal linking between related cities and categories builds topical authority across the whole site. Across our directory deployments, we've hit 91,000+ indexed pages with Lighthouse scores above 95. That combination -- scale plus performance plus structured data -- is what actually moves organic traffic numbers.
What's the typical timeline and budget for an enterprise directory platform?
Enterprise directory platforms run 14 to 20 weeks across four overlapping phases: data architecture and search, programmatic pages and frontend, claim workflow and admin tools, then data pipeline and launch. Budget ranges from $80,000 to $250,000 depending on listing volume, custom workflow complexity, and integration requirements. All engagements include 90 days of post-launch support -- because launch is never actually the end of the project.
Why not use an off-the-shelf directory solution like eDirectory or Brilliant Directories?
Look, off-the-shelf solutions work fine under 20,000 listings. But push past that -- especially toward 190K+ -- and things start breaking in ways that are hard to patch. Search slows down. Page generation chokes. Claim workflows fall apart under concurrent verification requests. Custom architecture gives you full data ownership, sub-200ms search at any scale, and programmatic page generation that actually ranks in Google. At enterprise scale, that difference shows up directly in your organic traffic numbers. It's not a philosophical argument for custom builds -- it's just what the data shows.
Browse all 15 enterprise capability tracks or compare with our SME-scale industry solutions.
Schedule Discovery Session
We map your platform architecture, surface non-obvious risks, and give you a realistic scope — free, no commitment.
Schedule Discovery Call
Let's build something together.
Whether it's a migration, a new build, or an SEO challenge — the Social Animal team would love to hear from you.