Last year, we migrated a 34,000-page e-commerce site from a monolithic WordPress installation to a headless architecture using Next.js and a headless CMS. The client's organic traffic accounted for 72% of their revenue. No pressure, right?

The migration took 14 weeks of planning and 6 weeks of execution. When we flipped the switch, organic traffic dipped 3.2% in week one, recovered by week three, and was up 11% by month two. That's not luck -- it's process.

I've seen migrations go catastrophically wrong. A competitor of that same client had migrated six months earlier and lost 40% of their organic traffic overnight. Eight months later, they still hadn't recovered. The difference between a successful large-scale migration and a disaster comes down to preparation, redirect management, and having a rollback plan you actually trust.

This article walks through everything we do when migrating sites with tens of thousands of pages. It's the same process whether you're moving from WordPress to Next.js, Drupal to Astro, or any other platform shift.

How to Migrate a 30,000-Page Website Without Losing SEO

Why Large-Scale Migrations Fail

Most migration failures share the same root causes. Understanding them upfront saves you from joining the graveyard of botched launches.

The Redirect Problem

On a 500-page site, you can manually map every URL. On a 30,000-page site, you can't. Teams end up writing regex-based redirect rules that cover 90% of URLs and assume the remaining 10% will sort itself out. That remaining 10%? It's 3,000 pages, many of which are your highest-performing content.

A 2025 Ahrefs study found that sites losing more than 15% of their indexed pages during migration experienced an average organic traffic decline of 34%. And recovery took 4-8 months on average.

The Parity Problem

Google doesn't just care about content -- it cares about structure. Internal linking patterns, heading hierarchies, structured data, canonical tags, pagination handling, faceted navigation. Change too many of these simultaneously and Google essentially has to re-evaluate your entire site from scratch.

The Timing Problem

I've seen teams spend months perfecting the new site and then rush the actual migration because leadership is impatient. You don't migrate a 30,000-page site on a Friday afternoon. You don't migrate during your peak traffic season. And you definitely don't migrate without a tested rollback plan.

Phase 1: Pre-Migration Audit and Crawl

Before you touch anything, you need a complete picture of what exists today. This is your baseline, and you'll reference it constantly throughout the migration.

Full Site Crawl

Run a complete crawl using Screaming Frog, Sitebulb, or a cloud-based crawler like Lumar (formerly Deepcrawl). For 30,000+ pages, we prefer the cloud option -- desktop crawlers can struggle on sites this size unless you tune their memory and storage settings, and you need the crawl data to be shareable across your team.

Capture everything:

  • Every URL and its HTTP status code
  • Title tags and meta descriptions
  • H1 tags
  • Canonical tags
  • Hreflang tags (if applicable)
  • Internal links (both inbound and outbound per page)
  • Structured data types present
  • Page load times
  • Word count per page
  • Images and alt text

Analytics Baseline

Export the last 12 months of Google Analytics data and Google Search Console data. You need:

  • Top 1,000 landing pages by organic sessions
  • Top 5,000 queries by clicks and impressions
  • Crawl stats (pages crawled per day, response times)
  • Core Web Vitals scores
  • Index coverage report (indexed, excluded, errors)

Tag your top 500 organic landing pages. These are the pages that cannot break. Period. Every one of them gets individually verified during and after migration.

Pull backlink data from Ahrefs, Semrush, and Google Search Console. Cross-reference to find every URL that has external links pointing to it. These URLs need perfect 301 redirects -- losing backlink equity on high-authority pages is one of the fastest ways to tank rankings.

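# Example: extract, merge, and deduplicate backlinked URLs
# Assumes each export is a CSV with a header row and the target URL in column 1
# (awk's simple comma split does not handle quoted fields that contain commas)
for f in ahrefs-export.csv semrush-export.csv gsc-export.csv; do
  tail -n +2 "$f" | awk -F',' '{print $1}'
done | sort -u > unique_backlinked_urls.txt

wc -l < unique_backlinked_urls.txt
# Example output from our project: 8247 (unique URLs with backlinks)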
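The two baselines (top organic landing pages and backlinked URLs) are easier to work with once they are merged into one priority list. Here is a minimal Python sketch of that idea. The function names, the input shapes, and the decision to compare URLs by path only are our assumptions for illustration, not a fixed part of the process.

```python
from urllib.parse import urlsplit

def to_path(url: str) -> str:
    """Strip scheme, host, query, and trailing slash so exports from different tools line up."""
    path = urlsplit(url.strip()).path or "/"
    return path if path == "/" else path.rstrip("/")

def build_priority_list(landing_pages, backlinked_urls, top_n=500):
    """landing_pages: (url, organic_sessions) pairs; backlinked_urls: plain URLs."""
    ranked = sorted(landing_pages, key=lambda row: row[1], reverse=True)
    top_traffic = {to_path(url) for url, _ in ranked[:top_n]}
    backlinked = {to_path(url) for url in backlinked_urls}
    return {
        "top_traffic": top_traffic,
        "backlinked": backlinked,
        "both": top_traffic & backlinked,        # traffic and links: verify these first
        "all_priority": top_traffic | backlinked,
    }
```

Dropping the query string is a simplification. If filter pages such as /products?color=blue earn traffic or links on your site, keep the query in the comparison key.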

Phase 2: URL Mapping and Redirect Strategy

This is where migrations are won or lost. On a 30,000-page site, you need a systematic approach that combines automated mapping with manual verification for critical pages.

Building the Redirect Map

Start by categorizing your URLs into patterns. Most large sites have a relatively small number of URL patterns that account for the majority of pages:

| URL Pattern | Example | Page Count | Strategy |
|---|---|---|---|
| Product pages | /products/blue-widget-123 | 18,000 | Regex + ID mapping |
| Category pages | /category/widgets | 450 | Manual mapping |
| Blog posts | /blog/2024/03/post-title | 3,200 | Slug preservation |
| Tag/filter pages | /products?color=blue | 6,500 | Evaluate: redirect or noindex |
| Static pages | /about, /contact | 85 | Manual mapping |
| Paginated pages | /category/widgets/page/3 | 1,800 | Map to new pagination |
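To get page counts like the ones in the table, you can bucket the crawl's URL list by pattern. The sketch below assumes URLs are stored as paths and uses hypothetical regexes modeled on the example URLs above. Your real patterns will differ.

```python
import re
from collections import Counter

# Hypothetical patterns modeled on the table's examples; order matters
# (paginated URLs must be tested before plain category URLs).
PATTERNS = [
    ("paginated", re.compile(r"^/category/[^/]+/page/\d+/?$")),
    ("category", re.compile(r"^/category/[^/]+/?$")),
    ("filter", re.compile(r"^/products/?\?.+")),
    ("product", re.compile(r"^/products/[^/?]+/?$")),
    ("blog", re.compile(r"^/blog/\d{4}/\d{2}/[^/]+/?$")),
]

def classify(url: str) -> str:
    """Return the name of the first matching pattern, or 'other'."""
    for name, pattern in PATTERNS:
        if pattern.match(url):
            return name
    return "other"

def count_patterns(urls):
    """Count how many URLs fall into each bucket."""
    return Counter(classify(url) for url in urls)
```

A large "other" bucket is the useful signal here: it usually means a URL pattern nobody has accounted for yet.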

The Three-Tier Approach

Tier 1: Manual mapping (top 500 pages) Your highest-traffic, highest-revenue pages get individually mapped. A human verifies each redirect. No exceptions.

Tier 2: Pattern-based mapping (next ~25,000 pages) Write transformation rules that convert old URL patterns to new ones. Test these rules against your full URL list before deployment.

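# Example redirect rule generation
import csv
import re

def generate_redirect(old_url):
    """Return (new_url, status) for a known pattern, or (None, None) if no rule applies."""
    # Product pages: /products/blue-widget-123 -> /shop/blue-widget
    # [a-z0-9-] so slugs that contain digits (e.g. usb-3-hub-456) still match
    product_match = re.match(r'/products/([a-z0-9-]+)-(\d+)$', old_url)
    if product_match:
        slug = product_match.group(1)
        return f'/shop/{slug}', 301

    # Blog posts: /blog/2024/03/post-title -> /blog/post-title
    blog_match = re.match(r'/blog/\d{4}/\d{2}/(.+)$', old_url)
    if blog_match:
        slug = blog_match.group(1)
        return f'/blog/{slug}', 301

    return None, None

# Process all URLs (all_urls.csv: one URL path per row, in the first column)
mapped, unmapped = [], []
with open('all_urls.csv', newline='') as f:
    for row in csv.reader(f):
        if not row:
            continue  # skip blank lines
        old_url = row[0].strip()
        new_url, status = generate_redirect(old_url)
        if new_url is None:
            unmapped.append(old_url)
        else:
            mapped.append((old_url, new_url, status))

# Save the generated rules so they can be reviewed and tested before deployment
with open('tier2_redirects.csv', 'w', newline='') as f:
    csv.writer(f).writerows(mapped)

print(f"Mapped URLs: {len(mapped)}")
print(f"Unmapped URLs: {len(unmapped)}")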
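One thing worth testing in pattern rules like the product rule above: dropping the numeric ID means two old URLs can collapse onto the same destination. That may be intended (merged variants) or a sign of lost pages. A small sketch for surfacing these collisions, assuming the mapping is available as (old_url, new_url) pairs:

```python
from collections import defaultdict

def find_collisions(redirect_pairs):
    """Return {destination: [old_urls]} for destinations shared by two or more old URLs."""
    by_destination = defaultdict(list)
    for old_url, new_url in redirect_pairs:
        by_destination[new_url].append(old_url)
    return {dest: olds for dest, olds in by_destination.items() if len(olds) > 1}
```

Each collision deserves a human decision: either the pages really are duplicates, or the rule needs to keep something that distinguishes them.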

Tier 3: Remaining unmapped pages (~4,500 pages) These are your edge cases. Go through them manually. Some will be pages you're intentionally sunsetting (redirect to nearest relevant page). Some will be URLs you missed in your pattern analysis. Don't leave any 404s for pages that had traffic or backlinks.
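With a few thousand edge cases, it helps to sort the pile before going through it by hand. This sketch splits unmapped URLs into those that must get a redirect (they had traffic or backlinks) and those that can be reviewed for retirement. The function name and input shapes are illustrative assumptions.

```python
def triage_unmapped(unmapped, sessions_by_url, backlinked):
    """Split unmapped URLs into (must_redirect, review).

    sessions_by_url: dict of URL -> organic sessions from the analytics baseline.
    backlinked: set of URLs that have external links pointing at them.
    """
    must_redirect, review = [], []
    for url in unmapped:
        if sessions_by_url.get(url, 0) > 0 or url in backlinked:
            must_redirect.append(url)
        else:
            review.append(url)
    # Highest-traffic URLs first, so the manual pass starts where it matters most
    must_redirect.sort(key=lambda u: sessions_by_url.get(u, 0), reverse=True)
    return must_redirect, review
```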

Redirect Chains and Loops

If the old site already has redirects in place, your new redirects might create chains (A → B → C). Resolve these before launch. Every redirect should go directly from old URL to final destination in a single hop. Chains add latency on every hop and give crawlers more chances to give up before they reach the final page -- Google's John Mueller has said multiple times that while Googlebot will follow chains, a direct redirect is always preferable.
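Chains can be collapsed offline before anything is deployed. The sketch below assumes the legacy redirects and the new ones have been merged into one dictionary of source to destination. It rewrites every source to point at its final destination and reports sources stuck in loops or overly long chains. The max_hops cutoff of 10 is an arbitrary choice for this example.

```python
def flatten_redirects(redirects, max_hops=10):
    """Return (flattened, unresolved).

    flattened: dict of source -> final destination (single hop).
    unresolved: set of sources caught in a loop or a chain longer than max_hops.
    """
    flattened, unresolved = {}, set()
    for start in redirects:
        seen = {start}
        target = redirects[start]
        hops = 1
        while target in redirects:
            if target in seen or hops >= max_hops:
                unresolved.add(start)
                break
            seen.add(target)
            target = redirects[target]
            hops += 1
        else:
            # The loop ended without a break: target is a real final destination
            flattened[start] = target
    return flattened, unresolved
```

Anything in the unresolved set needs a manual fix before launch, since no amount of flattening gives a loop a valid destination.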

Phase 3: Technical SEO Parity Checklist

The new site needs to maintain technical SEO parity with the old site -- and ideally improve on it. Here's what we check:

Critical Parity Items

  • Title tags: Same or improved. Never leave them blank during migration.
  • Meta descriptions: Carry them over, even if you plan to rewrite later.
  • H1 structure: One H1 per page, matching the old site's keyword targeting.
  • Canonical tags: Self-referencing canonicals on every page. If the old site had cross-domain canonicals, preserve them.
  • Robots.txt: Don't accidentally block Googlebot on launch. I've seen this happen more than I'd like to admit.
  • XML Sitemaps: Generate new sitemaps with all new URLs. Submit within hours of launch.
  • Structured data: Migrate all schema markup. Product schema, FAQ schema, breadcrumb schema -- all of it.
  • Internal linking: The new site's internal link graph should closely mirror the old site's.
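Several of these checks can be automated by comparing the old crawl with a crawl of the new site. A minimal sketch follows, assuming both crawls are loaded as dictionaries keyed by URL with title, meta_description, h1, and canonical fields. Those field names are ours; map them to whatever your crawler exports.

```python
def parity_issues(old_pages, new_pages, redirect_map):
    """Compare on-page fields between old and new crawls.

    Returns a list of (old_url, field, problem) tuples.
    """
    issues = []
    for old_url, old_row in old_pages.items():
        new_url = redirect_map.get(old_url, old_url)
        new_row = new_pages.get(new_url)
        if new_row is None:
            issues.append((old_url, "page", "missing on new site"))
            continue
        for field in ("title", "meta_description", "h1"):
            old_value = (old_row.get(field) or "").strip()
            new_value = (new_row.get(field) or "").strip()
            if old_value and not new_value:
                issues.append((old_url, field, "blank on new site"))
            elif old_value != new_value:
                issues.append((old_url, field, "changed"))
        # Canonicals are expected to change with the URL, so check self-reference instead
        if (new_row.get("canonical") or "").strip() != new_url:
            issues.append((old_url, "canonical", "not self-referencing"))
    return issues
```

A "changed" flag is not automatically a problem, since improved titles are fine. A "blank" or "missing" flag almost always is. Pages with intentional cross-domain canonicals will show up as "not self-referencing" and need to be whitelisted.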

Performance Requirements

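Google's Core Web Vitals (LCP, INP, CLS) feed into its ranking systems. TTFB is not a Core Web Vital itself, but it directly affects LCP, so we track it alongside them. Your new site should meet or beat the old site's performance:

| Metric | Good Threshold | Our Target |
|---|---|---|
| LCP (Largest Contentful Paint) | ≤ 2.5s | ≤ 2.0s |
| INP (Interaction to Next Paint) | ≤ 200ms | ≤ 150ms |
| CLS (Cumulative Layout Shift) | ≤ 0.1 | ≤ 0.05 |
| TTFB (Time to First Byte) | ≤ 800ms | ≤ 400ms |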

This is one area where migrating to a modern stack like Next.js or Astro actually gives you an advantage. Static generation and edge rendering can dramatically improve TTFB. We've seen TTFB drop from 1.2s to under 200ms when moving from traditional WordPress to Next.js with ISR or Astro with static output.
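If you collect these metrics per template on staging, a small check against the thresholds above makes regressions obvious. This is only a sketch. The metric keys and the three-level grading are our own naming, and the numbers are copied from the table.

```python
GOOD = {"lcp_s": 2.5, "inp_ms": 200, "cls": 0.1, "ttfb_ms": 800}
TARGET = {"lcp_s": 2.0, "inp_ms": 150, "cls": 0.05, "ttfb_ms": 400}

def grade_metrics(metrics):
    """Grade each measured metric as 'target', 'good', or 'fail'."""
    grades = {}
    for name, value in metrics.items():
        if value <= TARGET[name]:
            grades[name] = "target"
        elif value <= GOOD[name]:
            grades[name] = "good"
        else:
            grades[name] = "fail"
    return grades
```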

Phase 4: Content Migration and Validation

Automated Content Extraction

For 30,000 pages, you need automated content extraction. We typically build custom scrapers or use the CMS's export APIs to pull content into a structured format (usually JSON or CSV) before importing into the new headless CMS.

Key validations after import:

  • Character encoding (watch for broken special characters)
  • Image references (do all images resolve?)
  • Internal links (are they updated to new URL patterns?)
  • Embedded media (videos, iframes, widgets)
  • Table formatting
  • Code blocks
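The first and third checks are easy to script over the imported HTML. The sketch below is a heuristic, not a complete validator: the mojibake markers cover only a few common cases of UTF-8 text decoded with the wrong charset, and the old-link regex is a hypothetical pattern based on the example URLs used earlier in this article.

```python
import re

# Character sequences that typically appear when UTF-8 text was decoded as Windows-1252
MOJIBAKE_MARKERS = ("Ã©", "Ã¨", "Ã¼", "â€™", "â€œ", "\ufffd")

# Hypothetical old URL patterns: /products/... and dated /blog/YYYY/MM/... links
OLD_LINK = re.compile(r'href="(/products/[^"]+|/blog/\d{4}/\d{2}/[^"]+)"')

def check_imported_html(html: str) -> dict:
    """Flag likely encoding damage and internal links that still use old URL patterns."""
    return {
        "encoding_suspect": any(marker in html for marker in MOJIBAKE_MARKERS),
        "stale_links": OLD_LINK.findall(html),
    }
```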

Content Diff Testing

We run automated comparisons between old and new pages for our top 500 URLs. The script fetches both versions, strips HTML, and compares the text content. Any page with less than 95% text similarity gets flagged for manual review.

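// Simplified content comparison (Node 18+ for the built-in fetch)
const diff = require('fast-diff'); // fast-diff exports the diff function directly
const cheerio = require('cheerio');

// Collapse whitespace so markup differences don't register as content changes
const normalize = (text) => text.replace(/\s+/g, ' ').trim();

async function comparePages(oldUrl, newUrl) {
  const [oldHtml, newHtml] = await Promise.all(
    [oldUrl, newUrl].map((url) => fetch(url).then((r) => r.text()))
  );

  // Assumes both templates wrap the primary content in <main>
  const oldText = normalize(cheerio.load(oldHtml)('main').text());
  const newText = normalize(cheerio.load(newHtml)('main').text());

  // fast-diff returns [type, text] tuples; diff.EQUAL (0) marks unchanged runs
  const changes = diff(oldText, newText);
  const unchanged = changes
    .filter(([type]) => type === diff.EQUAL)
    .reduce((sum, [, text]) => sum + text.length, 0);

  // If neither page has any <main> text, score 0 so the page gets reviewed
  const longest = Math.max(oldText.length, newText.length);
  const similarity = longest === 0 ? 0 : unchanged / longest;

  return {
    similarity: Math.round(similarity * 100),
    oldLength: oldText.length,
    newLength: newText.length,
    needsReview: similarity < 0.95
  };
}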

Phase 5: Staging Environment Testing

Never launch a migration without thorough staging testing. Here's what we validate:

Redirect Testing

Test every single redirect. Yes, all 30,000. Use a script that follows the redirect chain and validates the final destination:

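# Test redirects from mapping file (CSV rows: old_url,new_url as absolute URLs)
while IFS=, read -r old_url new_url; do
  # First hop: status code and the URL it redirects to
  first=$(curl -s -o /dev/null -w "%{http_code} %{redirect_url}" "$old_url")
  status=${first%% *}
  redirect=${first#* }
  # Full chain: number of hops and the status of the final destination
  chain=$(curl -s -o /dev/null -L --max-redirs 10 -w "%{num_redirects} %{http_code}" "$old_url")
  hops=${chain%% *}
  final=${chain#* }
  if [ "$status" != "301" ] || [ "$redirect" != "$new_url" ] \
     || [ "$hops" != "1" ] || [ "$final" != "200" ]; then
    echo "FAIL: $old_url -> $status $redirect, $hops hop(s), final $final (expected 301 $new_url, 1 hop, final 200)"
  fi
done < redirect_map.csv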

Rendering Validation

If you're using client-side rendering (CSR) or hydration-heavy approaches, verify that Googlebot can actually see your content. Use Google's Rich Results Test or the URL Inspection tool in Search Console to check rendered output.

This is a particularly common issue with React-based frameworks. If your content requires JavaScript to render and you haven't implemented SSR or SSG properly, Google might see a blank page. We always use server-side rendering or static generation for SEO-critical pages.
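A quick first-pass check is to look at the raw HTML response, before any JavaScript runs, and confirm the key content is already there. The sketch below uses only Python's standard library. The helper names are ours, and it does not replace the URL Inspection tool, since Google does render JavaScript. It only tells you whether your content depends on that rendering.

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect text outside <script>, <style>, and <noscript> elements."""
    SKIP = ("script", "style", "noscript")

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def visible_text(raw_html: str) -> str:
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)

def content_in_raw_html(raw_html: str, expected_phrases):
    """Map each expected phrase to whether it appears in the pre-JavaScript HTML."""
    text = visible_text(raw_html)
    return {phrase: phrase in text for phrase in expected_phrases}
```

Feed it the response body for each top landing page along with that page's H1 or product name. Any False result is a page whose content only exists after client-side rendering.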

Phase 6: Launch Day Execution

The Launch Checklist

  1. DNS TTL: Lower DNS TTL to 300 seconds at least 48 hours before migration
  2. Deploy redirects: Get all 301 redirects live on whatever will serve the domain after cutover (the new platform or the CDN in front of it), not only on the old server
  3. Switch DNS: Point domain to new infrastructure
  4. Verify redirects: Run automated redirect tests against production
  5. Submit sitemaps: Submit new XML sitemaps in Google Search Console
  6. Request indexing: Use the URL Inspection tool to request indexing of your top 50 pages
  7. Monitor: Watch real-time analytics for anomalies
  8. Verify robots.txt: Confirm Googlebot isn't blocked
  9. Check CDN/caching: Ensure redirect headers aren't being cached incorrectly
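The robots.txt step in this checklist is worth automating, because it is the one that fails silently. Here is a small sketch using Python's standard urllib.robotparser. It takes the robots.txt body as a string, so fetching it from production is left to you. Note that Python's parser does not replicate every detail of Google's matching rules (wildcards in particular), so treat a pass as a sanity check rather than proof.

```python
from urllib.robotparser import RobotFileParser

def googlebot_allowed(robots_txt: str, urls, user_agent="Googlebot"):
    """Map each URL to whether the given robots.txt lets the user agent fetch it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}
```

Running this against a handful of top landing pages right after the DNS switch catches the classic "staging robots.txt went to production" mistake within minutes.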

Timing

Launch on a Tuesday or Wednesday morning. Never Friday. You want the rest of the business week to monitor and fix issues before the weekend. Avoid launching during high-traffic periods or major shopping events.

We also make sure someone is monitoring through the night after launch. Google often crawls more aggressively during off-peak hours, and if your redirects have issues, you want to catch them fast.

Rollback Plan

Have a tested rollback plan that can be executed in under 15 minutes. This usually means keeping the old infrastructure running in parallel for at least 2 weeks post-migration. The cost of maintaining two environments temporarily is nothing compared to the cost of a failed migration.

Phase 7: Post-Migration Monitoring

Daily Monitoring (Weeks 1-2)

  • Crawl errors: Check Google Search Console daily for new 404s and server errors
  • Index coverage: Monitor the index coverage report for drops
  • Organic traffic: Compare daily organic sessions to your baseline
  • Rankings: Track your top 200 keywords daily
  • Server logs: Analyze Googlebot's crawl patterns on the new site
  • Core Web Vitals: Verify field data as it starts coming in
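For the server log item, a short script that summarizes what Googlebot is actually receiving goes a long way. The sketch below assumes access logs in the common "combined" format and identifies Googlebot by user-agent string only. That string can be spoofed, so verify by reverse DNS if the numbers look odd.

```python
import re
from collections import Counter

# Matches the request, status, and trailing user-agent of a combined-format log line
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)

def googlebot_summary(lines):
    """Return (status_counts, not_found_paths) for requests whose user agent mentions Googlebot."""
    status_counts, not_found = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if not match or "Googlebot" not in match.group("agent"):
            continue
        status_counts[match.group("status")] += 1
        if match.group("status") == "404":
            not_found[match.group("path")] += 1
    return status_counts, not_found
```

The 404 paths it returns are the ones to feed straight back into the redirect map.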

Weekly Monitoring (Weeks 3-8)

  • Compare organic traffic week-over-week
  • Monitor for ranking volatility
  • Check for new crawl issues
  • Verify redirect chains haven't been accidentally created
  • Monitor backlink profile for lost links

Expected Traffic Patterns

A well-executed migration typically shows:

  • Week 1: 5-15% traffic dip (Google is processing the changes)
  • Week 2-3: Recovery to pre-migration levels
  • Week 4-8: If the new site is technically superior, you'll often see a traffic increase

If you see a 30%+ drop that doesn't recover by week 3, something went wrong with your redirects or technical implementation. Dig into Search Console immediately.
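These rules of thumb are simple enough to turn into an automated daily check. The sketch below compares recent organic sessions against the pre-migration daily average. The 15% and 30% cutoffs come from the ranges above, while the function name and the three status labels are our own.

```python
def assess_traffic(baseline_daily_avg, recent_daily_sessions, weeks_since_launch):
    """Return (change_ratio, status) for recent traffic versus the pre-migration baseline."""
    current = sum(recent_daily_sessions) / len(recent_daily_sessions)
    change = (current - baseline_daily_avg) / baseline_daily_avg
    if change <= -0.30 and weeks_since_launch >= 3:
        status = "investigate"   # large drop that has not recovered by week 3
    elif change < -0.15:
        status = "watch"         # deeper than the typical 5-15% dip
    else:
        status = "normal"
    return change, status
```

Compare like with like: use the same weekdays in the baseline and the recent window, or weekly seasonality will trigger false alarms.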

Redirect Implementation at Scale

Where you implement redirects matters. For 30,000+ redirects, don't stuff them all into an .htaccess file or a Next.js redirects config array -- that kills performance.

Edge-level redirects (best for performance) Implement redirects at the CDN/edge level using Cloudflare Workers, Vercel Edge Middleware, or Netlify's _redirects file. Edge redirects execute before your application code, so they're extremely fast.

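// Vercel Edge Middleware example
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

// Load redirect map (pre-built at deploy time)
// Watch the bundle size: very large maps can exceed edge function size limits
import redirectMap from './redirects.json';

// Shape of each entry in redirects.json, keyed by the old path
type RedirectEntry = { destination: string; permanent: boolean };
const redirects = redirectMap as Record<string, RedirectEntry>;

export function middleware(request: NextRequest) {
  const path = request.nextUrl.pathname;
  const redirect = redirects[path];

  if (redirect) {
    const target = new URL(redirect.destination, request.url);
    // Carry the original query string (e.g. UTM parameters) over
    // unless the destination defines its own
    if (!target.search) {
      target.search = request.nextUrl.search;
    }
    return NextResponse.redirect(target, redirect.permanent ? 301 : 302);
  }

  return NextResponse.next();
}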

Database-backed redirects (best for flexibility) Store redirects in a database and look them up at request time. This lets you add, modify, and audit redirects without redeploying. Add aggressive caching (Redis or similar) so the database lookup doesn't add latency.
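As a rough illustration of the lookup-plus-cache idea, here is a sketch using SQLite and an in-process LRU cache. Both are stand-ins chosen so the example runs anywhere. In production the store would be your real database and the cache something shared like Redis. The table layout and function names are assumptions for this example.

```python
import sqlite3
from functools import lru_cache

def open_store(path=":memory:"):
    """Open (or create) the redirect store. Schema is illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS redirects ("
        "source TEXT PRIMARY KEY, "
        "destination TEXT NOT NULL, "
        "status INTEGER NOT NULL DEFAULT 301)"
    )
    return conn

def make_lookup(conn, cache_size=50_000):
    """Return a cached lookup(path) -> (destination, status), or None if no redirect exists."""
    @lru_cache(maxsize=cache_size)
    def lookup(path):
        return conn.execute(
            "SELECT destination, status FROM redirects WHERE source = ?", (path,)
        ).fetchone()
    return lookup
```

Misses are cached too, which keeps normal page requests from hitting the database at all. The trade-off is staleness: after editing redirects, call lookup.cache_clear() (or expire the keys in your real cache) so changes take effect.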

Hybrid approach (what we usually do) Pattern-based redirects at the edge, individual redirects in a database. Best of both worlds.

Handling International and Multi-Language Sites

If your 30,000-page site includes multiple languages or regions, the complexity multiplies. Each language version needs its own redirect map. Hreflang tags need to be updated to reference new URLs. And you need to verify that the language/region targeting in Search Console still works correctly.

Common pitfalls:

  • Forgetting to update hreflang annotations across all language versions simultaneously
  • Breaking the hreflang reciprocal requirement (if page A points to page B, page B must point back to page A)
  • Losing language-specific URL structures that Google uses as signals
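The reciprocal requirement in particular is easy to verify mechanically once hreflang annotations have been extracted from a crawl. A minimal sketch, assuming the data is shaped as page URL mapped to a dictionary of language code to alternate URL:

```python
def missing_return_links(hreflang):
    """Return (page, alternate) pairs where the alternate does not link back to the page.

    hreflang: dict of page_url -> {lang_code: alternate_url}
    """
    missing = []
    for page, alternates in hreflang.items():
        for alternate in alternates.values():
            if alternate == page:
                continue  # self-reference, nothing to reciprocate
            if page not in hreflang.get(alternate, {}).values():
                missing.append((page, alternate))
    return missing
```

Run it against the new site's crawl on staging. Every pair it returns is an annotation Google is likely to ignore.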

Common Mistakes That Kill Rankings

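  1. Using 302 instead of 301: A temporary redirect tells Google the old URL may come back, so it can keep the old URL indexed and take longer to consolidate signals on the new one. Triple-check your redirect status codes.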
  2. Blocking the staging site and forgetting to unblock: Your robots.txt on staging says Disallow: /. You deploy staging to production. Googlebot can't crawl anything.
  3. Changing content and URLs simultaneously: Google sees a new URL with different content. Is it a new page? A moved page? Reduce ambiguity -- migrate URLs first, change content later.
  4. Redirecting everything to the homepage: Lazy redirect implementations that send all old URLs to the homepage destroy your long-tail rankings instantly.
  5. Ignoring JavaScript rendering: Your new React app looks great in Chrome. Googlebot sees an empty <div id="root"></div>.
  6. Not handling trailing slashes consistently: /products/widget and /products/widget/ are different URLs. Pick one and redirect the other.
  7. Removing pages without redirects: If a page had traffic, it needs a redirect. Even if you're sunsetting that content, redirect to the nearest relevant page.
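Mistake 6 is a good candidate for a single normalization rule rather than thousands of individual redirects. The sketch below shows one way to compute the redirect target for a chosen trailing-slash policy. The function name is ours, and treating any last segment containing a dot as a file is a simplifying assumption.

```python
from urllib.parse import urlsplit, urlunsplit

def slash_redirect_target(url: str, trailing_slash: bool = False):
    """Return the URL to redirect to, or None if the URL already follows the policy."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path == "/":
        return None  # the root is the same either way
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    if "." in last_segment:
        return None  # looks like a file (e.g. /sitemap.xml): leave untouched
    wanted = path.rstrip("/") + ("/" if trailing_slash else "")
    if wanted == path:
        return None
    return urlunsplit((parts.scheme, parts.netloc, wanted, parts.query, parts.fragment))
```

Whichever policy you pick, apply the same one to canonical tags, sitemaps, and internal links so the redirect rarely has to fire.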

Tools and Stack We Use

| Tool | Purpose | Cost (2026) |
|---|---|---|
| Screaming Frog | Desktop crawling | $259/year |
| Lumar (Deepcrawl) | Cloud crawling for large sites | Custom pricing |
| Ahrefs | Backlink analysis, rank tracking | From $129/month |
| Google Search Console | Index monitoring, crawl stats | Free |
| Redirectchecker.com | Bulk redirect testing | Free tier available |
| ContentKing | Real-time SEO monitoring | From $99/month |
| Custom Python/Node scripts | Redirect mapping, content diffing | Your time |

For the actual site build, we typically use Next.js or Astro depending on the project's needs, paired with a headless CMS like Sanity, Contentful, or Storyblok. If you're planning a migration and want to discuss architecture, check our pricing or get in touch.

FAQ

How long does it take to migrate a 30,000-page website?

Expect 12-20 weeks total. The planning and URL mapping phase takes the longest -- usually 8-14 weeks. The actual technical migration and launch is typically 4-6 weeks. Rushing the planning phase is the single biggest predictor of migration failure.

Will I definitely lose some SEO traffic during migration?

A temporary dip of 5-15% is normal and expected, even with a perfect migration. Google needs time to process tens of thousands of redirects and re-crawl your new site. The dip typically resolves within 2-3 weeks. If you see a larger drop or it doesn't recover, investigate your redirects and technical implementation immediately.

Should I change my URL structure during migration?

Only if there's a strong reason to do so. Every URL change adds risk. If your current URL structure is functional and descriptive, keep it. If it's genuinely bad (e.g., URLs with query parameters instead of clean paths), the migration is a good opportunity to fix it -- but plan your redirect map accordingly.

Can I migrate my site in phases instead of all at once?

Yes, and for very large sites it's often the safer approach. You can migrate section by section -- blog first, then product pages, then category pages. This reduces risk but increases complexity because you're running two platforms simultaneously, usually behind a reverse proxy. We've done this successfully several times, but it requires careful routing configuration.

What happens to my Google Ads during migration?

Update your ad landing page URLs to the new URLs before or immediately after migration. If you have redirects in place, your ads will still work, but the redirect adds latency and Google Ads quality scores can be negatively affected by redirect chains. Updating the URLs directly is always better.

How do I handle pages I want to remove during migration?

If the page had organic traffic or backlinks, redirect it to the most relevant existing page on the new site. If it had neither, you can let it return a 404 or 410 (Gone) status. Don't redirect irrelevant pages to your homepage -- Google treats mass homepage redirects as soft 404s.

Should I use 301 or 308 redirects?

Use 301 for most cases. Both are permanent redirects, but 301 is universally understood by all bots and browsers. 308 preserves the HTTP method (POST stays POST), which matters for API endpoints but not for SEO-focused page redirects.

When should I remove the old redirects?

Keep them for at least one year, preferably indefinitely. Redirects are cheap to maintain, and removing them means any old bookmarks, external links, or cached search results will hit 404s. There's almost never a good reason to remove working 301 redirects.