Building a 10,000+ Page HTS Code Lookup Tool with Programmatic SEO
Last year, we built a customs tariff lookup tool that generates over 12,000 pages from the Harmonized Tariff Schedule database. Within six months, it was pulling 40,000+ organic visits per month from importers, customs brokers, and logistics professionals searching for specific HS codes. The project taught us a ton about programmatic SEO at scale, tariff data structures, and the weird edge cases you hit when turning government datasets into something people actually want to use.
This is the full breakdown of how we did it -- the architecture, the data pipeline, the SEO strategy, and the mistakes we made along the way.
Table of Contents
- What Are HS Codes and HTS Codes?
- Why Tariff Data Is Perfect for Programmatic SEO
- The Data Pipeline: From USITC to Your Database
- Page Architecture for 10,000+ HTS Code Pages
- Building the Lookup Tool Frontend
- SEO Strategy for Tariff Code Pages
- Performance and Infrastructure
- Monetization Strategies for Customs Data Sites
- Common Pitfalls and How We Fixed Them
- FAQ

What Are HS Codes and HTS Codes?
Before we get into the technical build, let's make sure we're speaking the same language. The Harmonized System (HS) is an international nomenclature developed by the World Customs Organization (WCO). It's used by more than 200 countries to classify traded goods. Every product that crosses a border gets tagged with an HS code.
Here's where it gets interesting for the US market: the United States uses the Harmonized Tariff Schedule (HTS), which extends the international 6-digit HS code to 8 or 10 digits for more granular classification. The first 6 digits are internationally standardized. Digits 7-8 are US-specific tariff lines. Digits 9-10 are statistical suffixes used by Census for trade data.
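To make the digit layout concrete, here is a minimal sketch that splits an HTS number into the levels described above. The `HTSParts` type and `parse_hts_code` function are hypothetical names for illustration, and the statistical suffix `10` in the example is a made-up value, not a real Census suffix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HTSParts:
    chapter: str                 # digits 1-2
    heading: str                 # digits 1-4
    subheading: str              # digits 1-6, the international HS level
    tariff_line: Optional[str]   # digits 1-8, the US duty-rate level
    stat_suffix: Optional[str]   # digits 9-10, the Census statistical suffix


def parse_hts_code(code: str) -> HTSParts:
    """Split a dotted or undotted HTS number into its hierarchy levels."""
    digits = code.replace(".", "").strip()
    if not digits.isdigit() or len(digits) not in (6, 8, 10):
        raise ValueError(f"expected 6, 8, or 10 digits, got {code!r}")
    return HTSParts(
        chapter=digits[:2],
        heading=digits[:4],
        subheading=digits[:6],
        tariff_line=digits[:8] if len(digits) >= 8 else None,
        stat_suffix=digits[8:10] if len(digits) == 10 else None,
    )


parts = parse_hts_code("6110.30.30.10")
print(parts.chapter, parts.heading, parts.subheading, parts.tariff_line, parts.stat_suffix)
```

A 6-digit input returns `None` for both US-specific fields, which is a convenient way to tell an international HS code from a full HTS number.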
The Scale of the Data
The HTS contains roughly:
| Level | Digits | Approximate Count | Purpose |
|---|---|---|---|
| Chapter | 2 | 99 | Broad categories (e.g., Chapter 61: Knitted apparel) |
| Heading | 4 | ~1,200 | Product groups |
| Subheading | 6 | ~5,000 | International standard level |
| US Tariff Line | 8 | ~10,000 | US-specific duty rates |
| Statistical Suffix | 10 | ~17,000 | Census reporting detail |
That's 10,000+ unique tariff lines, each with its own duty rate, unit of quantity, special program eligibility, and related notes. Every single one of them is something a customs broker, importer, or logistics company might Google.
Why Tariff Data Is Perfect for Programmatic SEO
Programmatic SEO works best when you have a large dataset with consistent structure, where each entry answers a specific search query. Tariff data checks every box:
High search volume in aggregate. Individual HTS codes might get 50-200 searches per month, but multiply that across 10,000 codes and you're looking at serious traffic.
Clear search intent. When someone Googles "HTS code 6110.30.30" or "tariff rate for cotton sweaters," they want a specific answer. You can deliver it.
Underserved market. The official USITC HTS website (hts.usitc.gov) is functional but not user-friendly. It's a PDF-based system that hasn't been meaningfully updated in years. Most competing sites are either paywalled (like those from customs brokerage firms) or poorly built.
Commercial intent. People searching for tariff codes are doing business. They're importing goods. They're spending money. That means the traffic has real value -- either for lead generation, SaaS subscriptions, or advertising.
I've seen programmatic SEO projects built around zip codes, recipe variations, and product comparisons. Tariff data sits in a sweet spot because the queries are specific enough to avoid competing with massive authority sites, but commercial enough to monetize.
The Data Pipeline: From USITC to Your Database
This is where most people give up. Getting the tariff data into a usable format is genuinely annoying. Here's how we approached it.
Data Sources
The US International Trade Commission publishes the HTS in several formats:
- PDF files -- the official format, organized by chapter. Useless for programmatic use.
- XML/JSON feeds -- the USITC has an API at api.usitc.gov that provides structured data. This is your primary source.
- Excel downloads -- available from the USITC website, decent for one-time imports but not for staying current.
We used the USITC API as our primary data source, with Excel files as a fallback for validation.
The Ingestion Script
Here's a simplified version of our data ingestion pipeline in Python:
import requests
from datetime import datetime, timezone

USITC_API_BASE = "https://api.usitc.gov/hts/v1"
REQUEST_TIMEOUT = 30  # seconds; avoids hanging forever on a stalled connection


def _get_json(path):
    """GET a path under the API base and return the decoded JSON body."""
    response = requests.get(f"{USITC_API_BASE}{path}", timeout=REQUEST_TIMEOUT)
    response.raise_for_status()  # fail loudly instead of parsing an error page
    return response.json()


def fetch_chapters():
    """Fetch all HTS chapters from USITC API"""
    return _get_json("/chapters")


def fetch_headings(chapter_id):
    """Fetch all headings within a chapter"""
    return _get_json(f"/chapters/{chapter_id}/headings")


def fetch_subheadings(heading_id):
    """Fetch tariff lines for a heading"""
    return _get_json(f"/headings/{heading_id}")


def build_tariff_record(raw_data):
    """Transform API response into our internal schema"""
    hts_code = raw_data["htsno"]
    return {
        "hts_code": hts_code,
        "description": raw_data["description"],
        "general_rate": raw_data.get("general", "Free"),
        "special_rate": raw_data.get("special", ""),
        "column_2_rate": raw_data.get("other", ""),
        "unit_of_quantity": raw_data.get("units", ""),
        "chapter": hts_code[:2],  # first two digits, e.g. "61"
        "heading": hts_code[:4],  # first four digits, e.g. "6110"
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
        "last_updated": datetime.now(timezone.utc).isoformat(),
        "notes": raw_data.get("footnotes", []),
    }
Data Enrichment
Raw HTS data is dry. To make pages that actually rank and provide value, we enriched each record with:
- Plain-English descriptions -- the official HTS descriptions are written in legal/trade jargon. We used GPT-4 to generate human-readable summaries, then had a trade compliance consultant review them.
- Related codes -- links to parent headings, sibling codes, and commonly confused alternatives.
- Historical duty rates -- we maintain a changelog showing rate changes over time, which is especially valuable given recent tariff shifts.
- Trade program eligibility -- whether the code qualifies for GSP, USMCA, CAFTA-DR, and other preferential programs.
- Section 301/232 applicability -- critical for anyone importing from China. We cross-reference with the USTR exclusion lists.
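As a rough illustration of the "related codes" enrichment, the sketch below groups codes by their 4-digit heading and lists siblings for each one. It assumes dotted codes like those produced by `build_tariff_record`; `build_related_index` and the `max_siblings` cap are hypothetical, and the sample codes are illustrative inputs only. Finding "commonly confused" alternatives needs domain knowledge and is not covered here.

```python
from collections import defaultdict


def build_related_index(hts_codes, max_siblings=8):
    """Map each code to its chapter, parent heading, and sibling codes."""
    by_heading = defaultdict(list)
    for code in hts_codes:
        by_heading[code[:4]].append(code)

    index = {}
    for code in hts_codes:
        heading = code[:4]
        siblings = sorted(c for c in by_heading[heading] if c != code)
        index[code] = {
            "chapter": code[:2],
            "heading": heading,
            "siblings": siblings[:max_siblings],
        }
    return index


sample = ["6110.20.20", "6110.30.30", "6109.10.00"]
related = build_related_index(sample)
print(related["6110.30.30"])
```

Capping the sibling list keeps the lateral links on each page to a manageable number.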
Keeping Data Current
Tariffs change. A lot. Between Section 301 tariffs, anti-dumping duties, and the 2025 tariff adjustments, the HTS gets updated frequently. We run a daily cron job that checks for changes and flags any modified records for review.
# Daily sync cron job
0 4 * * * /usr/bin/python3 /app/scripts/sync_hts_data.py --notify-on-changes
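The sync script itself is not shown, but its change-detection step could look something like the sketch below: fingerprint the fields that matter and compare yesterday's snapshot against today's. The field names follow the internal schema from the ingestion script; `fingerprint`, `diff_snapshots`, and the sample rates are assumptions for illustration.

```python
import hashlib
import json

# last_updated is deliberately excluded: it changes on every run
TRACKED_FIELDS = (
    "description",
    "general_rate",
    "special_rate",
    "column_2_rate",
    "unit_of_quantity",
)


def fingerprint(record):
    """Stable hash of the fields whose changes should trigger a review."""
    payload = json.dumps({k: record.get(k) for k in TRACKED_FIELDS}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_snapshots(previous, current):
    """Compare two lists of tariff records keyed by hts_code."""
    old = {r["hts_code"]: fingerprint(r) for r in previous}
    new = {r["hts_code"]: fingerprint(r) for r in current}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(c for c in new.keys() & old.keys() if new[c] != old[c]),
    }


yesterday = [{"hts_code": "6110.30.30", "general_rate": "32%"}]
today = [{"hts_code": "6110.30.30", "general_rate": "30%"}]
print(diff_snapshots(yesterday, today))
```

Anything in `modified` or `removed` is a candidate for manual review before the live pages are regenerated.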

Page Architecture for 10,000+ HTS Code Pages
This is where the web development gets interesting. You need a URL structure and page template system that scales to tens of thousands of pages while maintaining quality.
URL Structure
We settled on this hierarchy:
/hts/ → Main lookup tool
/hts/chapter/{chapter}/ → Chapter overview (99 pages)
/hts/heading/{heading}/ → Heading detail (1,200 pages)
/hts/code/{hts-code}/ → Individual tariff line (10,000+ pages)
/hts/search?q={query} → Search results
Each level links up and down the hierarchy. A tariff line page links to its parent heading, which links to its parent chapter. This gives crawlers a short path to every page and gives each page contextual internal links, both of which help discovery and indexing.
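The upward links can be derived from the code alone. The sketch below builds the chapter, heading, and code URLs for a dotted HTS number, assuming chapter and heading slugs are the bare 2-digit and 4-digit prefixes and that code slugs replace dots with dashes. The function names are hypothetical.

```python
def hts_code_to_slug(hts_code):
    """6110.30.30 -> 6110-30-30 (dots are awkward in URL path segments)."""
    return hts_code.replace(".", "-")


def slug_to_hts_code(slug):
    """Inverse of hts_code_to_slug, for looking a page's code back up."""
    return slug.replace("-", ".")


def hierarchy_urls(hts_code):
    """URLs for a tariff line and its parent heading and chapter."""
    digits = hts_code.replace(".", "")
    return {
        "chapter": f"/hts/chapter/{digits[:2]}/",
        "heading": f"/hts/heading/{digits[:4]}/",
        "code": f"/hts/code/{hts_code_to_slug(hts_code)}/",
    }


print(hierarchy_urls("6110.30.30"))
```

Keeping the slug conversion in one pair of functions avoids mismatches between the URLs you generate and the lookups you perform when a page is requested.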
The Page Template
Every HTS code page follows the same template but feels like a unique, valuable resource. Here's what we include:
- Hero section -- HTS code number, official description, and general duty rate displayed prominently
- Duty rate table -- General (Column 1), Special (FTA rates), and Column 2 rates
- Plain-English explanation -- what products this code covers, with examples
- Section 301/232 status -- whether additional tariffs apply
- Related codes -- similar or commonly confused HTS numbers
- Breadcrumb navigation -- Chapter → Heading → Subheading → Code
- Import data (when available) -- aggregate trade statistics from Census
- Notes and rulings -- relevant customs rulings that clarify classification
Implementation with Next.js
We built this with Next.js using static generation with incremental static regeneration (ISR). For a project like this, you really want the pages to be pre-rendered for performance and SEO, but you also need them to update when tariff data changes.
// app/hts/code/[code]/page.tsx
import { getHTSCode, getAllHTSCodes } from '@/lib/tariff-data';
import { notFound } from 'next/navigation';
// Breadcrumbs, DutyRateTable, and the other components below are project
// components; their imports are omitted for brevity.

type PageProps = { params: { code: string } };

// Every code returned here is pre-rendered at build time. Return a subset
// (or an empty array) to defer the rest to first request under ISR.
export async function generateStaticParams() {
  const codes = await getAllHTSCodes();
  return codes.map((code) => ({
    // URL slugs use dashes instead of dots: 6110.30.30 -> 6110-30-30
    code: code.hts_number.replace(/\./g, '-'),
  }));
}

export async function generateMetadata({ params }: PageProps) {
  // getHTSCode receives the dashed slug and must map it back to the dotted code
  const code = await getHTSCode(params.code);
  if (!code) return {};
  return {
    title: `HTS Code ${code.hts_number} - ${code.short_description} | Duty Rate & Details`,
    description: `Look up HTS code ${code.hts_number}: ${code.description}. General duty rate: ${code.general_rate}. Find tariff details, Section 301 status, and trade program eligibility.`,
  };
}

export default async function HTSCodePage({ params }: PageProps) {
  const code = await getHTSCode(params.code);
  if (!code) notFound();
  return (
    <article>
      <Breadcrumbs chapter={code.chapter} heading={code.heading} />
      <h1>HTS Code {code.hts_number}</h1>
      <DutyRateTable rates={code.rates} />
      <ProductDescription description={code.enriched_description} />
      <Section301Status code={code.hts_number} />
      <RelatedCodes codes={code.related} />
      <HTSCodeSchema code={code} /> {/* JSON-LD structured data */}
    </article>
  );
}

export const revalidate = 86400; // Revalidate daily (value is in seconds)
If you're considering building something like this, our team at Social Animal has deep experience with Next.js development for exactly these kinds of data-heavy programmatic SEO builds.
Building the Lookup Tool Frontend
The static pages drive organic traffic, but the interactive lookup tool is what makes people bookmark the site and come back. Here's what ours includes:
Search Functionality
Users search for HTS codes in two ways: by code number or by product description. We built a search that handles both.
For code-based search, we use prefix matching with a trie data structure loaded into memory. Typing "6110" instantly shows all codes starting with those digits.
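The production trie lives in the Next.js app, but the data structure is easy to show in a few lines. Below is a minimal sketch of digit-by-digit prefix matching, written in Python for consistency with the ingestion script. `CodeTrie` and its methods are hypothetical names, and the sample codes are illustrative inputs.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # digit -> TrieNode
        self.codes = []     # full HTS codes that end at this node


class CodeTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, hts_code):
        """Index a code by its digits, ignoring the dots."""
        node = self.root
        for digit in hts_code.replace(".", ""):
            node = node.children.setdefault(digit, TrieNode())
        node.codes.append(hts_code)

    def search(self, prefix, limit=10):
        """Return up to `limit` codes starting with `prefix`, in digit order."""
        node = self.root
        for digit in prefix.replace(".", ""):
            if digit not in node.children:
                return []
            node = node.children[digit]
        results = []
        stack = [node]
        while stack and len(results) < limit:
            current = stack.pop()
            results.extend(current.codes)
            # Push children in reverse so the smallest digit is popped first
            for digit in sorted(current.children, reverse=True):
                stack.append(current.children[digit])
        return results[:limit]


trie = CodeTrie()
for code in ["6110.20.20", "6110.30.30", "6109.10.00", "6203.42.40"]:
    trie.insert(code)
print(trie.search("6110"))
```

Lookup cost depends on the prefix length and the number of results returned, not on the total number of codes, which is why it stays responsive as the user types.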
For text-based search, we use a combination of PostgreSQL full-text search and Typesense for instant results. The key insight: people don't search using official HTS terminology. They search for "cotton t-shirt tariff" not "knitted or crocheted apparel of cotton, other." We built a synonym mapping table with about 5,000 entries to bridge that gap.
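// Simplified search endpoint
export async function GET(request: Request) {
  const { searchParams } = new URL(request.url);
  const query = searchParams.get('q')?.trim() ?? '';

  // Reject empty or missing queries before routing them anywhere
  if (!query) {
    return Response.json({ error: 'Missing query parameter "q"' }, { status: 400 });
  }

  // Detect if query looks like an HTS code (a digit followed by digits or dots)
  const isCodeSearch = /^\d[\d.]*$/.test(query);

  // searchByCode and searchByDescription are defined elsewhere in the app
  return isCodeSearch ? searchByCode(query) : searchByDescription(query);
}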
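The synonym mapping mentioned above can be pictured as query expansion before the text search runs. This is a simplified sketch in Python: the `SYNONYMS` entries and the `STOPWORDS` set are tiny made-up samples, not the real 5,000-entry table, and `expand_query` is a hypothetical helper.

```python
# Everyday term -> words that appear in official HTS descriptions (sample only)
SYNONYMS = {
    "t-shirt": ["knitted", "crocheted", "apparel"],
    "sweater": ["pullover", "knitted"],
}

# Words people type that never appear in tariff descriptions (assumed list)
STOPWORDS = {"tariff", "duty", "rate", "hts", "code", "for"}


def expand_query(query, synonyms=SYNONYMS, stopwords=STOPWORDS):
    """Lowercase, drop stopwords, and append mapped HTS vocabulary."""
    expanded = []
    for token in query.lower().split():
        if token in stopwords:
            continue
        for term in [token, *synonyms.get(token, [])]:
            if term not in expanded:
                expanded.append(term)
    return expanded


print(expand_query("cotton t-shirt tariff"))
```

The expanded term list is what gets passed to the full-text engine, so a search for "cotton t-shirt tariff" can match descriptions that only say "knitted or crocheted."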
Tariff Calculator
We added a duty calculator that lets users input an HTS code and declared value, then shows the estimated duty amount. It accounts for:
- Ad valorem rates (percentage-based)
- Specific rates (per unit, like "$0.15/kg")
- Compound rates (combination of both)
- Section 301 additional tariffs
- Applicable trade program discounts
This feature alone generates significant engagement and positions the tool as more than just a data lookup.
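The arithmetic behind the calculator is straightforward. Here is a minimal sketch covering ad valorem, specific, and compound rates plus a percentage add-on such as a Section 301 tariff. `DutyRate` and `estimate_duty` are hypothetical names, the rates in the example are invented for illustration, and trade program preferences (which replace the general rate rather than discount it) are not modeled.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class DutyRate:
    ad_valorem_pct: Decimal = Decimal("0")     # percent of declared value
    specific_per_unit: Decimal = Decimal("0")  # dollars per unit, e.g. per kg
    additional_pct: Decimal = Decimal("0")     # extra percent, e.g. Section 301


def estimate_duty(rate, declared_value, quantity=Decimal("0")):
    """Estimated duty in dollars, rounded to the nearest cent."""
    ad_valorem = declared_value * rate.ad_valorem_pct / 100
    specific = quantity * rate.specific_per_unit
    additional = declared_value * rate.additional_pct / 100
    total = ad_valorem + specific + additional
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Compound rate: 5% of value plus $0.15/kg, on $10,000 of goods weighing 2,000 kg
compound = DutyRate(ad_valorem_pct=Decimal("5"), specific_per_unit=Decimal("0.15"))
print(estimate_duty(compound, Decimal("10000"), Decimal("2000")))
```

A pure ad valorem or pure specific rate is just the compound case with the other component left at zero. `Decimal` is used instead of floats so that cent-level rounding is predictable.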
SEO Strategy for Tariff Code Pages
Generating 10,000 pages is the easy part. Getting them to rank is where the real work happens.
Structured Data
We implement JSON-LD on every page. There's no official schema.org type for tariff data, so we use a combination of Dataset, WebPage, and custom properties:
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "name": "HTS Code 6110.30.30 - Cotton Sweaters",
  "description": "Tariff details for HTS 6110.30.30",
  "mainEntity": {
    "@type": "Dataset",
    "name": "HTS Code 6110.30.30 Duty Rates",
    "description": "Current duty rates and tariff information"
  }
}
Internal Linking Strategy
This is crucial for programmatic SEO. Each page links to:
- Its parent heading and chapter (upward links)
- 5-8 related codes at the same level (lateral links)
- Relevant blog posts explaining classification nuances
- The main search/lookup tool
We also built "category hub" pages for common product types ("Apparel Tariff Codes," "Electronics HTS Codes," etc.) that serve as topical clusters.
Avoiding Thin Content Penalties
Google's helpful content update hit a lot of programmatic SEO sites hard. Here's how we kept our pages above the quality threshold:
| Risk Factor | Our Solution |
|---|---|
| Duplicate/boilerplate content | Each page has unique enriched descriptions, not just template variables |
| No unique value vs. source | Added plain-English explanations, Section 301 cross-references, and calculators |
| Shallow pages | Minimum 300 words of unique content per page, including related code analysis |
| Poor internal linking | Hierarchical + lateral link structure with meaningful anchor text |
| Missing E-E-A-T signals | Trade compliance consultant reviews, dated updates, cited sources |
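A quality gate like the 300-word minimum is easy to automate before publishing. The sketch below flags pages whose enriched text is too short or is an exact duplicate of another page's text after normalizing case and whitespace. `audit_pages` is a hypothetical helper and the sample texts are invented. Catching near-duplicates would need a fuzzier comparison than this.

```python
import re


def count_words(text):
    """Count word-like tokens (letters, digits, apostrophes)."""
    return len(re.findall(r"[A-Za-z0-9']+", text))


def audit_pages(pages, minimum_words=300):
    """pages maps hts_code -> enriched text. Returns codes needing attention."""
    seen = {}  # normalized text -> first code that used it
    too_short, duplicates = [], []
    for code in sorted(pages):
        text = pages[code]
        if count_words(text) < minimum_words:
            too_short.append(code)
        key = " ".join(text.lower().split())
        if key in seen:
            duplicates.append((seen[key], code))
        else:
            seen[key] = code
    return {"too_short": too_short, "duplicates": duplicates}


sample_pages = {
    "6110.20.20": "Sweaters and pullovers made mainly of cotton.",
    "6110.30.30": "Sweaters and  pullovers made mainly of COTTON.",
    "6109.10.00": "Cotton tees.",
}
print(audit_pages(sample_pages, minimum_words=5))
```

Running a check like this in the build pipeline keeps thin or copy-pasted pages from shipping unnoticed.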
Sitemap Strategy
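A single sitemap file can hold up to 50,000 URLs, so 10,000+ pages would technically fit in one. Segmenting them is still worthwhile, because it lets you track indexing per page type and per batch. We generate them programmatically: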
- sitemap-chapters.xml -- 99 URLs
- sitemap-headings.xml -- ~1,200 URLs
- sitemap-codes-01.xml through sitemap-codes-20.xml -- ~500 URLs each
- sitemap-index.xml -- ties them all together
We submit these through Google Search Console and monitor indexing rates weekly. Expect it to take 2-3 months for Google to fully crawl and index a site this size.
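Generating the segmented code sitemaps and the index can be done with the standard library. This is a minimal sketch, assuming code slugs in the dashed URL form and a hypothetical `build_code_sitemaps` helper. It returns XML strings keyed by filename and leaves writing files and adding the XML declaration to the caller.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_urlset(urls):
    """Serialize a list of absolute URLs as a <urlset> document."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_code_sitemaps(base_url, code_slugs, per_file=500):
    """Return {filename: xml} for the numbered code sitemaps plus an index."""
    files = {}
    for n, chunk in enumerate(chunked(code_slugs, per_file), start=1):
        urls = [f"{base_url}/hts/code/{slug}/" for slug in chunk]
        files[f"sitemap-codes-{n:02d}.xml"] = build_urlset(urls)

    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in list(files):
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = f"{base_url}/{name}"
    files["sitemap-index.xml"] = ET.tostring(index, encoding="unicode")
    return files


files = build_code_sitemaps("https://example.com", ["6110-30-30", "6110-20-20"])
print(sorted(files))
```

With roughly 10,000 slugs and the default of 500 per file, this produces the 20 numbered code sitemaps described above. The chapter and heading sitemaps would be added to the index the same way.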
Performance and Infrastructure
Hosting and Build
Our stack:
- Framework: Next.js 14 with App Router
- Database: PostgreSQL on Supabase for tariff data
- Search: Typesense (self-hosted on a $20/month Hetzner box)
- Hosting: Vercel Pro ($20/month)
- CDN: Vercel Edge Network (included)
- Data sync: Python scripts on a Railway cron job ($5/month)
Total infrastructure cost: roughly $50-60/month. That's absurdly cheap for a site serving 40,000+ monthly visitors with 10,000+ pages.
Build times were a challenge initially. Generating 10,000+ static pages on Vercel would take 30+ minutes and hit memory limits. We switched to ISR with on-demand revalidation, which cut the initial build to under 5 minutes. Pages are generated on first visit and cached.
For sites of this scale, we've also had success with Astro, which can handle static generation of huge page counts more efficiently than Next.js in some cases. The trade-off is less interactivity out of the box.
Core Web Vitals
Targets we hit:
- LCP: 1.2s (mostly static content, optimized images)
- FID/INP: 45ms (minimal JavaScript on code pages)
- CLS: 0 (no layout shifts -- everything server-rendered)
Monetization Strategies for Customs Data Sites
Once you're getting traffic from importers and customs brokers, there are several monetization paths:
Lead generation for customs brokers. Customs brokerage firms will pay $50-200 per qualified lead. A "find a customs broker" feature with geographic matching converts well.
SaaS subscription. Offer premium features like bulk code lookup, API access, duty change alerts, and classification history. We've seen tools in this space charge $49-199/month.
Advertising. Trade publication ads and supply chain software companies pay premium CPMs. Even basic display ads from Google AdSense perform well because the traffic has commercial intent.
Affiliate partnerships. Customs compliance software (like Descartes, Amber Road/E2open), trade finance platforms, and freight forwarding services all have affiliate programs.
Data licensing. If you've enriched the raw HTS data significantly (better descriptions, Section 301 mappings, classification guides), companies will pay for API access to your enriched dataset.
Common Pitfalls and How We Fixed Them
Pitfall 1: Stale data. Tariff rates change without much notice. We got burned when Section 301 rates were modified and our site showed outdated information for two weeks. Fix: daily automated sync plus a manual review trigger when the Federal Register publishes tariff notices.
Pitfall 2: Duplicate content across code levels. Chapter, heading, and individual code pages can end up saying very similar things. Fix: each level has a distinct content focus. Chapters discuss the product category broadly. Headings compare related products. Individual codes give specific duty rates and classification guidance.
Pitfall 3: Indexation issues. Google was slow to index pages beyond the first 2,000. Fix: proper sitemap segmentation, internal linking improvements, and IndexNow submissions (supported by Bing and Yandex) to accelerate crawl discovery. Patience helps too -- it took about 10 weeks to get 90%+ indexation.
Pitfall 4: Legal concerns. HTS data itself is public domain (government data), but some value-added datasets have licensing restrictions. Make sure you're sourcing from the USITC directly, not scraping from commercial databases.
Pitfall 5: User intent mismatch. Some users land on a code page but need help with classification -- they're not sure they have the right code. We added a "Not sure this is the right code?" section with links to related codes and a classification guide. This reduced bounce rate by 15%.
If you're interested in building a programmatic SEO project around trade data or any other large dataset, we specialize in this kind of headless CMS development. Feel free to reach out to discuss your project.
FAQ
What's the difference between an HS code and an HTS code?
HS (Harmonized System) codes are the international standard, consisting of 6 digits used by over 200 countries. HTS (Harmonized Tariff Schedule) codes are the US-specific extension, going up to 10 digits. The first 6 digits of any HTS code match the international HS code. The additional digits provide US-specific duty rate and statistical detail.
Is HTS tariff data freely available for use on websites?
Yes. The Harmonized Tariff Schedule is published by the US International Trade Commission and is public domain government data. You can freely use, reproduce, and build tools around it. However, be careful about using value-added datasets from commercial providers -- those often have licensing restrictions.
How often does the HTS get updated?
The HTS is updated multiple times per year. Major revisions typically happen in January, with interim modifications published throughout the year via Federal Register notices. In 2024-2025, updates have been especially frequent due to Section 301 tariff modifications, anti-dumping duty changes, and trade program adjustments. Your data pipeline needs to account for this cadence.
How many pages should an HTS code lookup tool have?
A thorough tool should cover all approximately 10,000 8-digit tariff lines, plus pages for the 99 chapters and ~1,200 headings. Including 10-digit statistical suffixes can push you past 17,000 pages. For programmatic SEO purposes, the 8-digit level is the sweet spot -- it maps directly to duty rates and generates the most search traffic.
What's the best tech stack for building a programmatic SEO site with thousands of pages?
Next.js with Incremental Static Regeneration (ISR) is our go-to for sites under 50,000 pages. Astro is excellent for purely static sites with minimal interactivity. For very large sites (100,000+ pages), consider a hybrid approach with server-side rendering and aggressive caching at the CDN level. PostgreSQL handles the data layer well, and Typesense or Meilisearch provide fast search without the cost of Algolia.
How long does it take for Google to index 10,000+ programmatic pages?
In our experience, expect 8-12 weeks for full indexation of a new domain with 10,000+ pages. Google crawls new sites conservatively. You can accelerate this with proper XML sitemaps, Search Console submission, IndexNow protocol, and strong internal linking. Sites with existing domain authority will index much faster -- sometimes within 2-3 weeks.
Do programmatic SEO pages get penalized by Google's Helpful Content Update?
They can, if you're generating thin, templated pages with no unique value. The key is ensuring each page provides information that users can't easily get from the source data. In our case, we add plain-English descriptions, Section 301 cross-references, related code suggestions, and duty calculators. Google's guidance is clear: programmatic content is fine as long as it's genuinely helpful to searchers.
What's the revenue potential of an HTS code lookup tool?
A well-built tool generating 40,000-100,000 monthly visits from trade professionals can realistically generate $3,000-15,000/month through a combination of display advertising, lead generation for customs brokers, and premium subscription features. The traffic has high commercial intent -- these are business users actively importing goods -- so RPMs tend to be significantly higher than general web traffic.