Multilingual Website Development: Build for 5+ Languages Without Breaking Routing
Your third language ships and the routing config collapses. Content editors open the CMS and can't tell what's translated, what's draft, what's live in German but missing in Japanese. Your bundle size hits 800KB because every locale loads on every page. And hreflang tags? Nobody remembers them until the staging link goes to the client on Thursday before launch. If you're building for five or more languages, these architecture decisions need to be locked in before you write a single route, not patched in when the translator sends their first invoice. Here's the routing strategy, CMS structure, and bundle approach that actually survives contact with real translators.
We've shipped multilingual sites supporting 8-14 languages for clients across fintech, healthcare, and e-commerce. Here's what I've learned after doing this enough times: the difference between a site that handles 2 languages and one that handles 12 isn't complexity. It's having the right abstractions. This guide covers everything from URL strategy and i18n routing to CMS modeling, translation workflows, and performance optimization.
Table of Contents
- Why Most Multilingual Implementations Fail at Scale
- URL Strategy: Subdomains vs Subdirectories vs TLDs
- Framework Selection for Multilingual Sites
- i18n Routing Architecture
- Headless CMS Modeling for Multilingual Content
- Translation Workflow Automation
- SEO for Multilingual Sites
- Performance Optimization Across Locales
- Right-to-Left (RTL) Language Support
- Testing and QA for Multilingual Sites
- FAQ
Why Most Multilingual Implementations Fail at Scale
Same story every time. A team builds a site in English, somebody asks them to add Spanish, they drop in a translation library, hardcode some locale switching logic, and ship it. Then French gets requested. Then German. Then Japanese. By language five, they're drowning in:
- Routing spaghetti: Locale prefixes that blow up the second you introduce dynamic routes
- Content drift: Translations falling weeks behind the source language — sometimes months, if we're being honest
- Bundle bloat: Every translation string loaded regardless of which locale the user actually needs
- SEO blindspots: Missing or broken hreflang annotations, duplicate content penalties absolutely tanking rankings
- Layout breakage: German text running 30-40% longer than English, Japanese needing completely different font stacks
The root cause? Teams treat multilingual as a feature. It's not a feature. When you're supporting 5+ languages, localization touches routing, data modeling, build pipelines, CDN configuration, and deployment strategy. You can't npm install something on a Friday afternoon and call it done. It's foundational — or it's a mess.
URL Strategy: Subdomains vs Subdirectories vs TLDs
Your URL structure is the single most consequential decision for multilingual SEO. And it's nearly impossible to change after launch without torching your rankings. Three real options on the table:
| Strategy | Example | SEO Authority | Implementation Complexity | Cost |
|---|---|---|---|---|
| Subdirectories | example.com/fr/about | Consolidated (single domain) | Low | Low |
| Subdomains | fr.example.com/about | Split (treated as separate sites) | Medium | Low |
| ccTLDs | example.fr/about | Independent per country | High | High ($10-50/domain/year × n) |
| Query params | example.com/about?lang=fr | Poor (not recommended) | Low | Low |
Our recommendation for 5+ languages: subdirectories. Here's why:
- Domain authority consolidation: All backlinks benefit every language version. With 8 languages on subdomains, you're basically building authority for 8 separate sites. That's brutal — and totally unnecessary.
- Simplified infrastructure: One deployment, one SSL cert, one CDN configuration. Done.
- Easier analytics: Single GA4 property with locale dimensions vs. cross-domain tracking nightmares. If you've ever burned a Thursday afternoon debugging cross-domain GA setup, you know exactly what I'm talking about.
- Lower cost: No domain registration per locale.
The exception? When you need genuinely different content per country — not just language. A German site for Germany vs. a German site for Switzerland with different pricing, legal terms, and product availability? That's a real distinction. ccTLDs or subdomains with country-specific content models actually make sense there.
# Recommended subdirectory structure
example.com/ → English (default)
example.com/fr/ → French
example.com/de/ → German
example.com/ja/ → Japanese
example.com/ar/ → Arabic
example.com/pt-br/ → Brazilian Portuguese
Note the pt-br instead of just pt. When you support 5+ languages, you'll inevitably run into language-vs-locale distinctions. Brazilian Portuguese and European Portuguese are different enough that users notice — and trust me, they will let you know about it. Plan for language-region codes from day one using BCP 47 tags. Retrofitting this later is painful in ways I can't fully convey until you've lived through it.
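One way to keep lowercase URL segments and canonical BCP 47 tags in sync is to derive one from the other instead of storing both. Here's a minimal sketch, assuming the URL segments listed above; `toBcp47` and `toUrlSegment` are hypothetical helper names, not part of any framework:

```typescript
// URL segments stay lowercase for readability; <html lang> and hreflang are
// conventionally written in canonical BCP 47 casing (pt-BR rather than pt-br).
const urlSegments = ['en', 'fr', 'de', 'ja', 'ar', 'pt-br'];

// Hypothetical helper: canonicalize a URL segment into a BCP 47 tag.
// Intl.getCanonicalLocales throws a RangeError for structurally invalid tags.
function toBcp47(segment: string): string {
  return Intl.getCanonicalLocales(segment)[0];
}

// Hypothetical helper: the reverse mapping, for building links.
function toUrlSegment(tag: string): string {
  return toBcp47(tag).toLowerCase();
}

const canonicalTags = urlSegments.map(toBcp47);
```

Deriving the tag at the edge means the routing config only ever holds one spelling per locale.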
Framework Selection for Multilingual Sites
Not all frameworks handle i18n equally. Here's where the major players stand for 5+ language support in 2026:
| Framework | Built-in i18n Routing | Static + Dynamic | Bundle Splitting by Locale | RTL Support | Best For |
|---|---|---|---|---|---|
| Next.js 15 | ✅ (App Router) | ✅ | ✅ (with config) | Manual | Full-stack apps, dynamic content |
| Astro 5 | ✅ (manual + Starlight) | ✅ | ✅ (automatic per-page) | Manual | Content-heavy, marketing sites |
| Nuxt 3 | ✅ (@nuxtjs/i18n) | ✅ | ✅ | Manual | Vue ecosystem projects |
| Remix / React Router 7 | ❌ (manual) | ✅ | Manual | Manual | Complex interactive apps |
| SvelteKit | ❌ (manual) | ✅ | Manual | Manual | Performance-critical apps |
Next.js 15 Multilingual Architecture
Next.js has the most mature i18n story right now, mostly thanks to the App Router. The [locale] dynamic segment pattern gives you clean routing without middleware hacks:
// app/[locale]/layout.tsx
import { notFound } from 'next/navigation';
const locales = ['en', 'fr', 'de', 'ja', 'ar', 'pt-br', 'es', 'ko'];
export function generateStaticParams() {
  return locales.map((locale) => ({ locale }));
}
export default async function LocaleLayout({
  children,
  params,
}: {
  children: React.ReactNode;
  params: Promise<{ locale: string }>;
}) {
  // Next.js 15 passes params as a Promise, so await it before reading the locale
  const { locale } = await params;
  if (!locales.includes(locale)) notFound();
  return (
    <html lang={locale} dir={locale === 'ar' ? 'rtl' : 'ltr'}>
      <body>{children}</body>
    </html>
  );
}
For translation string management, next-intl has basically become the standard. It supports ICU MessageFormat, server components, and — this is the big one — per-locale bundle splitting so your Japanese users aren't downloading German translations. That matters way more than most people think.
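// i18n/request.ts
import { getRequestConfig } from 'next-intl/server';
export default getRequestConfig(async ({ requestLocale }) => {
  // next-intl 3.22+ hands over the locale as a promise and expects it in the returned config
  const locale = (await requestLocale) ?? 'en';
  return {
    locale,
    messages: (await import(`../messages/${locale}.json`)).default,
  };
});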
We cover this architecture in depth in our Next.js development capabilities.
Astro for Content-Heavy Multilingual Sites
Astro's content collections are ridiculously well-suited for multilingual marketing sites and docs. Each piece of content gets organized by locale with zero JavaScript overhead:
src/content/
  blog/
    en/
      getting-started.md
      pricing-guide.md
    fr/
      getting-started.md
      pricing-guide.md
    de/
      getting-started.md
Astro 5's content layer API makes it dead simple to query content by locale and generate static pages for all languages at build time. For a 200-page site in 8 languages, Astro generates 1,600 static HTML pages in under 30 seconds — each fully optimized with zero runtime JavaScript unless you explicitly add interactivity. Think about that for a second. That's kind of insane.
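To make the locale-to-page mapping concrete, here's a sketch of the step that turns content entries into per-locale routes. It assumes each entry id mirrors the folder layout above (`locale/slug`); in a real project the ids would come from Astro's `getCollection` and the result would be returned from `getStaticPaths`. `toStaticPaths` is a hypothetical helper name:

```typescript
// Shape of the objects getStaticPaths returns for a [locale]/[...slug] route.
interface StaticPath {
  params: { locale: string; slug: string };
}

// Assumption: ids look like "fr/pricing-guide", i.e. the first segment is the locale.
function toStaticPaths(entryIds: string[]): StaticPath[] {
  return entryIds.map((id) => {
    const [locale, ...rest] = id.split('/');
    return { params: { locale, slug: rest.join('/') } };
  });
}

const entryIds = [
  'en/getting-started',
  'en/pricing-guide',
  'fr/getting-started',
  'fr/pricing-guide',
  'de/getting-started',
];

const staticPaths = toStaticPaths(entryIds);
// Filtering by locale is then a plain array operation.
const frenchPaths = staticPaths.filter((p) => p.params.locale === 'fr');
```

Note that German has no pricing guide in this layout, so no /de/pricing-guide page gets generated. Whether that should 404 or fall back to English is a product decision worth making up front.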
More on this in our Astro development practice.
i18n Routing Architecture
Middleware-Based Locale Detection
For the best UX, you want to detect the user's preferred language on first visit and redirect accordingly. In Next.js middleware:
// middleware.ts
import createMiddleware from 'next-intl/middleware';
export default createMiddleware({
  locales: ['en', 'fr', 'de', 'ja', 'ar', 'pt-br', 'es', 'ko'],
  defaultLocale: 'en',
  localeDetection: true, // Uses Accept-Language header
  localePrefix: 'as-needed', // No /en/ prefix for default locale
});
export const config = {
  matcher: ['/((?!api|_next|_vercel|.*\\..*).*)'],
};
Detection priority should go like this:
- Explicit URL locale (/fr/about): always wins, no exceptions
- Cookie (NEXT_LOCALE): respects the user's previous choice
- Accept-Language header: browser preference
- GeoIP: use cautiously; plenty of expats and travelers browse in a language that doesn't match their location
- Default locale: fallback
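If you ever need this priority order outside of next-intl (a custom edge function, say), it can be sketched as a pure function. This is an illustration under assumptions, not how next-intl implements it: `LocaleSignals`, `matchSupported`, and `resolveLocale` are hypothetical names, and the GeoIP value is assumed to arrive already mapped to a language tag.

```typescript
const supportedLocales = ['en', 'fr', 'de', 'ja', 'ar', 'pt-br', 'es', 'ko'];
const fallbackLocale = 'en';

interface LocaleSignals {
  pathname: string;        // e.g. "/fr/about"
  cookieLocale?: string;   // value of the NEXT_LOCALE cookie, if set
  acceptLanguage?: string; // raw Accept-Language header
  geoLocale?: string;      // hypothetical GeoIP-derived guess
}

// Exact match first, then the base language or one of its regional variants.
function matchSupported(candidate: string | undefined): string | undefined {
  if (!candidate) return undefined;
  const tag = candidate.trim().toLowerCase();
  if (supportedLocales.includes(tag)) return tag;
  const base = tag.split('-')[0];
  return supportedLocales.find((l) => l === base || l.startsWith(`${base}-`));
}

// Rank header entries by q-value (default 1) and return the first supported one.
function fromAcceptLanguage(header: string | undefined): string | undefined {
  if (!header) return undefined;
  const ranked = header
    .split(',')
    .map((part) => {
      const [tag, ...params] = part.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const weight = q ? parseFloat(q.slice(2)) : 1;
      return { tag, weight: Number.isFinite(weight) ? weight : 0 };
    })
    .sort((a, b) => b.weight - a.weight);
  for (const { tag } of ranked) {
    const hit = matchSupported(tag);
    if (hit) return hit;
  }
  return undefined;
}

function resolveLocale(signals: LocaleSignals): string {
  // The URL segment must match exactly; "/design" must never resolve to "de".
  const segment = signals.pathname.split('/')[1] ?? '';
  const urlLocale = supportedLocales.includes(segment) ? segment : undefined;
  return (
    urlLocale ??
    matchSupported(signals.cookieLocale) ??
    fromAcceptLanguage(signals.acceptLanguage) ??
    matchSupported(signals.geoLocale) ??
    fallbackLocale
  );
}
```

For example, `resolveLocale({ pathname: '/about', cookieLocale: 'de', acceptLanguage: 'ja' })` picks the cookie over the header, while the same call with pathname `/fr/about` ignores both.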
Locale Switching Without Full Page Reloads
Here's a mistake we see constantly: implementing locale switching as full navigations. When someone switches from English to French on /en/about, they should land on /fr/about — not /fr/. Nobody wants to get dumped back to the homepage. You need path mapping across locales:
// components/LocaleSwitcher.tsx
'use client';
import { usePathname, useRouter } from 'next/navigation';
type Props = { currentLocale: string; locales: string[] };
export function LocaleSwitcher({ currentLocale, locales }: Props) {
  const pathname = usePathname();
  const router = useRouter();
  const switchLocale = (newLocale: string) => {
    // Swap only the leading locale segment. With localePrefix 'as-needed' the
    // default locale has no prefix, so there may be nothing to strip.
    const segments = pathname.split('/');
    if (locales.includes(segments[1])) segments.splice(1, 1);
    const rest = segments.join('/');
    router.push(`/${newLocale}${rest === '/' ? '' : rest}`);
  };
  return (
    <select
      value={currentLocale}
      onChange={(e) => switchLocale(e.target.value)}
    >
      {locales.map((locale) => (
        <option key={locale} value={locale}>
          {new Intl.DisplayNames([locale], { type: 'language' }).of(locale)}
        </option>
      ))}
    </select>
  );
}
Quick tip: use Intl.DisplayNames to show language names in their own script (Français, Deutsch, 日本語) instead of in English. Small detail. Users absolutely notice though.
Headless CMS Modeling for Multilingual Content
A headless CMS is non-negotiable for 5+ languages. WordPress with WPML turns into a maintenance nightmare past three locales — we've watched it happen too many times to count. Here's how the major headless platforms stack up:
| CMS | Localization Model | Translation Workflow | API Query Pattern | Pricing Impact |
|---|---|---|---|---|
| Contentful | Field-level locales | Built-in + external integrations | ?locale=fr | Each locale counts toward entry limits |
| Sanity | Document-level (recommended) | Plugin-based (Sanity Translate) | GROQ filter by language | No per-locale pricing impact |
| Storyblok | Field-level with folder-based | Built-in translation UI | Dimension API | Included in all plans |
| Hygraph | Field-level locales | Stage-based workflow | locales: [fr] in GraphQL | Locales count toward plan limits |
| Payload CMS | Field-level or collection-level | Custom workflow | Filter by locale field | Self-hosted, no per-locale cost |
Document-Level vs Field-Level Localization
This is the most important CMS architecture decision for multilingual sites. Most agencies get this wrong.
Field-level localization (Contentful, Storyblok): Each field in a content entry holds values for every locale. A single blog post entry contains the English title, French title, German title, etc. — all crammed into one place.
Document-level localization (Sanity's recommended pattern): Each locale gets its own document, linked by a shared reference ID.
For 5+ languages, we strongly recommend document-level localization for long-form content and field-level localization for structured data — product names, metadata, UI labels. The reasoning:
- With field-level localization across 8 languages, editing a blog post means scrolling past 7 other languages' worth of content to find the field you need. Content editors hate this. Like, genuinely, viscerally hate it.
- Document-level keeps editor UIs clean — your French editors see only French content
- Translation status tracking becomes way simpler per document (draft, in-review, published per locale)
- Content can diverge by locale when it needs to — different hero images, different CTAs for different markets
In Sanity, this looks like:
// schemas/blogPost.ts
import { defineField, defineType } from 'sanity';
export default defineType({
  name: 'blogPost',
  type: 'document',
  fields: [
    defineField({
      name: 'language',
      type: 'string',
      options: {
        list: [
          { title: 'English', value: 'en' },
          { title: 'French', value: 'fr' },
          { title: 'German', value: 'de' },
          // ...
        ],
      },
    }),
    defineField({
      name: 'translationGroup',
      type: 'string', // Shared UUID across all translations of this post
      hidden: true,
    }),
    defineField({ name: 'title', type: 'string' }),
    // Portable Text is an array of blocks unless you register a custom 'portableText' type
    defineField({ name: 'body', type: 'array', of: [{ type: 'block' }] }),
  ],
});
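On the read side, the shared translationGroup is what lets you pull every version of a post in one query and fall back gracefully when a locale is missing. A sketch, assuming the schema above: the GROQ string would be passed to a Sanity client's `fetch(query, params)`, and `pickTranslation` is a hypothetical helper, not part of Sanity.

```typescript
// GROQ: every document that shares one translationGroup.
const translationsQuery =
  '*[_type == "blogPost" && translationGroup == $group]{ _id, language, title }';

interface BlogPostTranslation {
  _id: string;
  language: string;
  title: string;
}

// Hypothetical helper: prefer the requested locale, otherwise the source language.
function pickTranslation(
  docs: BlogPostTranslation[],
  locale: string,
  fallback = 'en',
): BlogPostTranslation | undefined {
  return (
    docs.find((d) => d.language === locale) ??
    docs.find((d) => d.language === fallback)
  );
}

// Sample result as the query might return it for one group.
const sampleDocs: BlogPostTranslation[] = [
  { _id: 'post-en', language: 'en', title: 'Pricing guide' },
  { _id: 'post-fr', language: 'fr', title: 'Guide des tarifs' },
];

const french = pickTranslation(sampleDocs, 'fr');
const japanese = pickTranslation(sampleDocs, 'ja'); // no ja document, so the en one is returned
```

The same result set doubles as the input for the locale switcher and hreflang tags, since it tells you exactly which locales exist for this post.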
Learn more about how we structure headless CMS projects at our CMS development page.
Translation Workflow Automation
Manual translation doesn't scale past 3 languages. Period. At 8 languages, a single blog post generates 7 translation tasks — and if your content team publishes 4 posts a week, that's 28 translations weekly. The math gets ugly fast.
Machine Translation as First Draft
The 2026 approach that actually holds up: use AI/machine translation for first drafts, then have human translators polish. DeepL Pro ($25/month) and Google Cloud Translation V3 deliver 85-92% accuracy for European languages, though accuracy drops noticeably for CJK.
// scripts/auto-translate.ts
import * as deepl from 'deepl-node';
const authKey = process.env.DEEPL_API_KEY;
if (!authKey) throw new Error('DEEPL_API_KEY is not set');
const translator = new deepl.Translator(authKey);
async function translateContent(
  text: string,
  sourceLang: deepl.SourceLanguageCode,
  targetLang: deepl.TargetLanguageCode
): Promise<string> {
  const result = await translator.translateText(text, sourceLang, targetLang, {
    preserveFormatting: true,
    formality: 'prefer_more', // Business tone; falls back to default where the target language has no formality setting
    tagHandling: 'html', // Keeps HTML tags intact; convert Markdown to HTML first
  });
  return result.text;
}
Translation Management Systems (TMS)
For enterprise-grade workflows, you'll want a dedicated TMS:
- Phrase (formerly Memsource): From $25/month, integrates with most headless CMSs
- Crowdin: From $40/month, excellent developer experience with GitHub/GitLab sync
- Lokalise: From $120/month, best Figma integration for design-to-translation workflows
- Transifex: From $150/month, strong API-first approach
Here's the workflow we've landed on for most clients:
- Content author publishes in the source language (usually English)
- Webhook triggers translation job creation in the TMS
- Machine translation generates a first draft
- Human translator reviews and approves
- Approved translation gets pushed back to the CMS via API
- Webhook triggers rebuild/revalidation of affected pages
That's a lot of moving parts — I won't pretend it isn't. But once it's wired up, content teams barely notice the machinery underneath. They just write and publish.
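The steps above can be sketched as one handler with its dependencies injected. Everything here is hypothetical: `TmsClient`, `CmsClient`, and the `/blog/` path shape are stand-ins, and real TMS and CMS SDKs will look different. In practice the human review step also takes days and comes back through a second webhook, so the single `await` below is a simplification.

```typescript
// Hypothetical interfaces standing in for a TMS SDK and a CMS SDK.
interface TmsClient {
  machineTranslate(text: string, targetLocale: string): Promise<string>;
  requestHumanReview(draft: string, targetLocale: string): Promise<string>;
}
interface CmsClient {
  saveTranslation(entryId: string, locale: string, body: string): Promise<void>;
}
interface PublishEvent {
  entryId: string;
  sourceLocale: string;
  body: string;
}

// Runs on the CMS publish webhook; returns the locales that completed.
async function handlePublish(
  event: PublishEvent,
  targetLocales: string[],
  tms: TmsClient,
  cms: CmsClient,
  revalidate: (path: string) => Promise<void>,
): Promise<string[]> {
  const pending = targetLocales.filter((l) => l !== event.sourceLocale);
  // Locales are independent, so one failing job should not block the others.
  const results = await Promise.allSettled(
    pending.map(async (locale) => {
      const draft = await tms.machineTranslate(event.body, locale);   // first draft
      const approved = await tms.requestHumanReview(draft, locale);   // human sign-off
      await cms.saveTranslation(event.entryId, locale, approved);     // push back to CMS
      await revalidate(`/${locale}/blog/${event.entryId}`);           // rebuild the page
      return locale;
    }),
  );
  return results.flatMap((r) => (r.status === 'fulfilled' ? [r.value] : []));
}
```

Injecting the clients keeps the handler testable with fakes, which matters because this is exactly the kind of glue code that fails silently in production.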
SEO for Multilingual Sites
Hreflang Implementation
Hreflang tags tell search engines which language version to serve in which market. Get these wrong and Google shows your German page to French users. We've had that conversation with a client. It wasn't fun.
Every page needs hreflang annotations pointing to all its language variants:
<!-- On /fr/about -->
<link rel="alternate" hreflang="en" href="https://example.com/about" />
<link rel="alternate" hreflang="fr" href="https://example.com/fr/about" />
<link rel="alternate" hreflang="de" href="https://example.com/de/about" />
<link rel="alternate" hreflang="ja" href="https://example.com/ja/about" />
<link rel="alternate" hreflang="ar" href="https://example.com/ar/about" />
<link rel="alternate" hreflang="x-default" href="https://example.com/about" />
The x-default tag is critical — it tells search engines which version to show when no locale matches. Don't skip it.
Automation is mandatory at scale. With 200 pages × 8 languages, you're managing 1,600 pages each needing 9 hreflang tags (8 languages + x-default). That's 14,400 hreflang annotations. You're not doing that by hand. Generate them programmatically:
// lib/generateHreflang.ts
export function generateHreflangTags(
  path: string,
  currentLocale: string,
  locales: string[],
  baseUrl: string
) {
  return locales
    .map((locale) => ({
      rel: 'alternate',
      hreflang: locale,
      href: `${baseUrl}${locale === 'en' ? '' : `/${locale}`}${path}`,
    }))
    .concat({
      rel: 'alternate',
      hreflang: 'x-default',
      href: `${baseUrl}${path}`,
    });
}
Multilingual Sitemaps
For sites with 5+ languages, use a sitemap index file pointing to per-locale sitemaps:
<!-- sitemap-index.xml -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-en.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-fr.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-de.xml</loc></sitemap>
  <!-- ... -->
</sitemapindex>
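Each locale sitemap should include xhtml:link elements for hreflang cross-references. Google's documentation treats sitemap annotations, link tags in the HTML head, and HTTP headers as equivalent ways to declare hreflang, so the choice comes down to maintainability. At 5+ languages the sitemap is usually the easiest place to generate and audit them in bulk. Whichever method you pick, apply it consistently.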
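Here's a sketch of what generating one per-locale sitemap with those cross-references might look like. It assumes English is the unprefixed default (as in the URL structure earlier) and that paths contain no characters needing XML escaping; `localizedUrl` and `sitemapForLocale` are hypothetical names.

```typescript
function localizedUrl(baseUrl: string, locale: string, path: string, defaultLocale = 'en'): string {
  return `${baseUrl}${locale === defaultLocale ? '' : `/${locale}`}${path}`;
}

// Builds sitemap-<locale>.xml: one <url> per path, each listing every language variant.
function sitemapForLocale(baseUrl: string, locale: string, locales: string[], paths: string[]): string {
  const urls = paths.map((path) => {
    const alternates = locales.map(
      (l) => `    <xhtml:link rel="alternate" hreflang="${l}" href="${localizedUrl(baseUrl, l, path)}" />`,
    );
    return [
      '  <url>',
      `    <loc>${localizedUrl(baseUrl, locale, path)}</loc>`,
      ...alternates,
      `    <xhtml:link rel="alternate" hreflang="x-default" href="${localizedUrl(baseUrl, 'en', path)}" />`,
      '  </url>',
    ].join('\n');
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">',
    ...urls,
    '</urlset>',
  ].join('\n');
}

const frSitemap = sitemapForLocale('https://example.com', 'fr', ['en', 'fr', 'de'], ['/about']);
```

Each entry lists itself among the alternates. That self-reference is expected: hreflang sets need to be reciprocal across all variants.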
Performance Optimization Across Locales
Translation Bundle Splitting
Don't ship all locale strings to every user. A typical 8-language site with 2,000 translation keys per locale generates ~400KB of uncompressed JSON. Load only what the active locale needs:
// Load translations dynamically
const messages = (await import(`@/messages/${locale}.json`)).default;
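With Next.js 15 and next-intl, Server Components get this for free: translation strings are rendered server-side and never ship as JavaScript to the client. Client Components are the exception. Any messages you pass to NextIntlClientProvider are serialized to the browser, so pass only the namespaces those components actually use.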
Font Loading for CJK Languages
Chinese, Japanese, and Korean fonts are massive. Noto Sans JP is 5.7MB for full character coverage. That'll absolutely wreck your Core Web Vitals if you're not careful. Here's what works:
- Use unicode-range subsetting: Load only the characters used on each page
- Google Fonts with display=swap: Automatic subsetting for CJK
- Variable fonts where available: Single file, multiple weights
/* The browser downloads this file only when the page contains characters in the unicode-range */
@font-face {
  font-family: 'NotoSansJP';
  src: url('/fonts/NotoSansJP-subset.woff2') format('woff2');
  unicode-range: U+3000-9FFF, U+F900-FAFF; /* CJK subset */
  font-display: swap;
}
CDN and Edge Caching
Configure your CDN to cache by locale. On Vercel, this happens automatically with the [locale] segment. On Cloudflare:
Cache-Key: ${URI}-${Accept-Language}
Vary: Accept-Language
But be careful with Vary: Accept-Language — it can fragment your cache in ugly ways. Better to use explicit locale URL paths (subdirectories) so each locale gets its own clean cache entry without header-based variation. Yet another reason subdirectories win.
Right-to-Left (RTL) Language Support
If any of your 5+ languages include Arabic, Hebrew, Persian, or Urdu, RTL support isn't optional. It touches everything:
- Document direction: <html dir="rtl">
- CSS layout: Flexbox and Grid handle direction automatically. margin-left doesn't, so use logical properties instead.
- Icons: Directional icons (arrows, navigation chevrons) need mirroring
/* Use CSS logical properties: they work for both LTR and RTL */
.card {
  margin-inline-start: 1rem; /* replaces margin-left */
  padding-inline-end: 2rem; /* replaces padding-right */
  border-inline-start: 3px solid blue; /* replaces border-left */
}
Tailwind CSS supports RTL out of the box: the rtl: variant arrived in 3.1, and logical utilities such as ms-* and me-* in 3.3:
<div class="ml-4 rtl:mr-4 rtl:ml-0">...</div>
<!-- Or better, use logical utilities -->
<div class="ms-4">...</div> <!-- margin-inline-start -->
Test RTL layouts with pseudo-localization before actual Arabic translations arrive. Tools like pseudolocalize can mirror your English text to expose layout issues early — way before they become an awkward conversation during client QA. Ask me how I know.
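If you'd rather not pull in a dependency, the core idea fits in a few lines. This is a hand-rolled sketch, not the API of the pseudolocalize package: it swaps in accented characters, pads the string to simulate text expansion, and can wrap the result in Unicode directional overrides (U+202E / U+202C) to force right-to-left rendering of Latin text. `pseudoLocalize` and its parameters are hypothetical.

```typescript
// A small accent map; unmapped characters pass through unchanged.
const accentMap: Record<string, string> = {
  a: 'á', e: 'é', i: 'í', o: 'ö', u: 'ü',
  A: 'Å', E: 'É', O: 'Ø', U: 'Ü',
};

function pseudoLocalize(text: string, expansion = 0.4, rtl = false): string {
  const accented = Array.from(text, (ch) => accentMap[ch] ?? ch).join('');
  // Pad to simulate longer translations (German often runs 30-40% longer).
  const padding = '~'.repeat(Math.ceil(text.length * expansion));
  // Brackets make truncated strings easy to spot in screenshots.
  const body = `[${accented}${padding}]`;
  // RIGHT-TO-LEFT OVERRIDE ... POP DIRECTIONAL FORMATTING
  return rtl ? `\u202E${body}\u202C` : body;
}

const expanded = pseudoLocalize('Save');
const mirrored = pseudoLocalize('OK', 0, true);
```

Run your English message file through something like this and load it as a fake locale. Any string that still shows up unaccented is hardcoded rather than translated.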
Testing and QA for Multilingual Sites
Automated Testing Strategy
// e2e/multilingual.spec.ts (Playwright)
import { test, expect } from '@playwright/test';
const locales = ['en', 'fr', 'de', 'ja', 'ar', 'pt-br', 'es', 'ko'];
for (const locale of locales) {
  test(`homepage loads correctly in ${locale}`, async ({ page }) => {
    await page.goto(`/${locale}`);
    // Verify HTML lang attribute
    const lang = await page.getAttribute('html', 'lang');
    expect(lang).toBe(locale);
    // Verify hreflang tags exist for all locales
    for (const l of locales) {
      const hreflang = page.locator(`link[hreflang="${l}"]`);
      await expect(hreflang).toHaveCount(1);
    }
    // Verify x-default exists
    await expect(page.locator('link[hreflang="x-default"]')).toHaveCount(1);
    // Verify no untranslated strings (English appearing on non-EN pages)
    if (locale !== 'en') {
      const h1 = await page.textContent('h1');
      expect(h1).not.toBe('Welcome'); // English fallback detection
    }
  });
}
Visual Regression Testing
German text averages 30-40% longer than English. Japanese can be shorter but needs different line-height. Use Percy or Chromatic to catch layout breakage across locales — set up snapshots for every supported language at both desktop and mobile breakpoints.
The investment in multilingual testing infrastructure pays for itself after the second content update that would've silently broken three locales. And there's always a second content update. Always.
Look, if this all sounds like a lot to coordinate — it is. But it's engineering work we do regularly. Reach out to discuss your multilingual project, or check our pricing for an estimate.
FAQ
How much does it cost to build a multilingual website with 5+ languages?
For a headless setup (Next.js or Astro + headless CMS), expect $30,000-$80,000 for the initial build depending on page count and complexity. On top of that, budget $500-$2,000/month for translation management tooling and ongoing translation costs of $0.08-$0.20 per word for professional human translation. Machine translation with human review can cut those per-word costs by 40-60%.
Should I use a translation plugin or build custom i18n?
For WordPress sites under 3 languages, plugins like WPML ($79/year) or Polylang work fine. Past 5 languages though, the overhead of plugin-based translation on a monolithic CMS gets unmanageable. A headless CMS with a dedicated TMS integration is the scalable path — the CMS handles content modeling, the TMS handles workflow, and your frontend framework handles routing and rendering. Clean separation of concerns.
What's the best headless CMS for multilingual websites?
Depends entirely on what you're optimizing for. Storyblok has the most polished built-in multilingual editing experience with its visual editor and field-level localization. Sanity gives you the most flexibility through document-level localization and custom workflows — it's ideal when your content models get complex. Contentful is the safest enterprise pick with strong TMS integrations, but watch the pricing — each locale counts against entry limits. There's no universal answer.
How do I handle SEO for multilingual websites?
Three non-negotiable requirements: correct hreflang tags on every page pointing to all language variants, per-locale XML sitemaps with cross-references, and an x-default hreflang pointing to your canonical/default language version. Use subdirectory URL structure (/fr/, /de/) for consolidated domain authority. Submit locale-specific sitemaps in Google Search Console and Bing Webmaster Tools. And monitor indexing per locale weekly for the first three months — you'll catch problems early instead of discovering them when organic traffic craters.
Can I use Google Translate or AI to translate my website?
Not as your production translation without human review. Google Cloud Translation V3 and DeepL hit 85-92% accuracy for European language pairs, dropping to 70-80% for CJK languages. The workflow that actually works: machine translate for first draft, human translator reviews and corrects, then publish. This hybrid approach cuts translation costs by 40-60% while maintaining quality. And never auto-translate legal, medical, or financial content without expert human review. Just don't.
How do I handle URL slugs in different languages?
Translated URL slugs (/fr/a-propos instead of /fr/about) improve SEO and user experience but add real complexity. You need a slug mapping table in your CMS and bidirectional lookup during routing. For 5+ languages, we recommend translated slugs for top-level pages and key landing pages, but keeping blog post slugs in the original language or using a transliterated version. Maintaining hundreds of translated URLs across a dozen locales is a burden that compounds fast.
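For a sense of what that bidirectional lookup involves, here's a minimal sketch. The slug table is hypothetical sample data standing in for whatever your CMS exports, and `resolvePage` and `translatePath` are illustrative names:

```typescript
// Hypothetical slug table: one row per page, one slug per locale.
const slugTable: Record<string, Record<string, string>> = {
  about: { en: 'about', fr: 'a-propos', de: 'ueber-uns' },
  pricing: { en: 'pricing', fr: 'tarifs', de: 'preise' },
};

// Reverse index built once at startup: "fr/a-propos" -> "about".
const pageBySlug = new Map<string, string>();
for (const [pageId, slugs] of Object.entries(slugTable)) {
  for (const [locale, slug] of Object.entries(slugs)) {
    pageBySlug.set(`${locale}/${slug}`, pageId);
  }
}

// Routing direction: which page does this localized slug belong to?
function resolvePage(locale: string, slug: string): string | undefined {
  return pageBySlug.get(`${locale}/${slug}`);
}

// Switcher direction: where does this page live in another locale?
function translatePath(fromLocale: string, slug: string, toLocale: string): string | undefined {
  const pageId = resolvePage(fromLocale, slug);
  const target = pageId ? slugTable[pageId]?.[toLocale] : undefined;
  return target ? `/${toLocale}/${target}` : undefined;
}
```

With this in place, switching from /fr/a-propos to German resolves to /de/ueber-uns. A missing entry returns undefined, and the caller decides whether that means a 404 or a fallback to the source language.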
What's the performance impact of supporting many languages?
With proper architecture? Near zero. Static site generation with Astro or Next.js pre-renders each locale as independent HTML pages — the server and CDN serve the French page just as fast as the English one. The main performance risks are loading all locale translation bundles at once (solved by per-locale code splitting), CJK font loading (solved by subsetting), and cache fragmentation at the CDN layer (solved by URL-based locale routing instead of header-based).
How long does it take to add a new language to an existing multilingual site?
With the right architecture already in place, adding a 9th language to an 8-language site takes 1-2 days of engineering work: add the locale to routing config, create the CMS locale/dimension, configure the TMS for the new language, and update hreflang generation. The bottleneck is always content translation, not engineering. A 50-page site with 200 translation keys takes roughly 2-3 weeks for professional translation and review.