Sitemap Validator

Validate sitemap XML structure, count URLs, detect duplicates, and check sitemap compliance.

About Sitemap Validator

This sitemap validator parses XML sitemap files and reports on structure, URL count, duplicate detection, and format compliance. It supports both standard sitemaps (urlset) and sitemap indexes (sitemapindex), helping SEO specialists and developers verify sitemap correctness before submission to search engines.

The validator uses browser-based XML parsing (DOMParser) to analyze sitemap structure without sending data to any server. All processing happens locally in your browser.

Sitemap XML Structure Overview

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page1</loc>
    <lastmod>2026-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/page2</loc>
    <lastmod>2026-01-10</lastmod>
  </url>
</urlset>

Elements:
  <loc>        - Required. Absolute URL (including protocol)
  <lastmod>    - Optional. Last modification date (ISO 8601)
  <changefreq> - Optional. always/hourly/daily/weekly/monthly/yearly/never
  <priority>   - Optional. 0.0 to 1.0 (default 0.5)
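
The value rules for these elements can be checked mechanically. The following is a minimal sketch rather than the validator's actual code: `checkUrlEntry`, its input shape (a plain object of strings), and the `lastmod` pattern are assumptions made for illustration. The pattern accepts a date alone or a date followed by a time and timezone.

```javascript
// Hypothetical helper: checks one <url> entry passed as plain strings.
const CHANGEFREQ_VALUES = new Set([
  "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
]);

// Assumption: accept YYYY-MM-DD, optionally followed by a time and timezone.
const LASTMOD_PATTERN =
  /^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?$/;

function checkUrlEntry(entry) {
  const problems = [];

  // <loc> is required and must be an absolute http(s) URL.
  let protocol = "";
  try {
    protocol = new URL(entry.loc).protocol;
  } catch (err) {
    protocol = ""; // relative or malformed URLs throw without a base
  }
  if (protocol !== "http:" && protocol !== "https:") {
    problems.push("loc must be an absolute http(s) URL");
  }

  // The remaining elements are optional, so only check them when present.
  if (entry.lastmod !== undefined && !LASTMOD_PATTERN.test(entry.lastmod)) {
    problems.push("lastmod is not in YYYY-MM-DD or date-time form");
  }
  if (entry.changefreq !== undefined && !CHANGEFREQ_VALUES.has(entry.changefreq)) {
    problems.push("changefreq is not one of the allowed values");
  }
  if (entry.priority !== undefined) {
    const priority = Number(entry.priority);
    if (entry.priority.trim() === "" || Number.isNaN(priority) || priority < 0 || priority > 1) {
      problems.push("priority must be a number from 0.0 to 1.0");
    }
  }
  return problems;
}
```

An empty array means the entry passed every check; each string in the array describes one problem.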

Sitemap Index Format

For sites with more than 50,000 URLs or files larger than 50MB (uncompressed), use a sitemap index:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap1.xml</loc>
    <lastmod>2026-01-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap2.xml</loc>
    <lastmod>2026-01-14</lastmod>
  </sitemap>
</sitemapindex>
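
As a rough sketch of how a large URL list could be split to respect the 50,000-URL limit and then referenced from an index, consider the following. The function names and the `sitemap1.xml`, `sitemap2.xml` naming scheme are assumptions for illustration, and `baseUrl` is assumed to contain no characters that need XML escaping.

```javascript
const MAX_URLS_PER_SITEMAP = 50000;

// Split a flat list of URLs into groups that respect the per-file URL limit.
function chunkUrls(urls, size = MAX_URLS_PER_SITEMAP) {
  const chunks = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}

// Build the sitemap index XML that points at one file per chunk.
// Assumption: files are named sitemap1.xml, sitemap2.xml, ... under baseUrl.
function buildSitemapIndex(baseUrl, chunkCount, lastmod) {
  const entries = [];
  for (let i = 1; i <= chunkCount; i++) {
    entries.push(
      "  <sitemap>\n" +
      "    <loc>" + baseUrl + "/sitemap" + i + ".xml</loc>\n" +
      "    <lastmod>" + lastmod + "</lastmod>\n" +
      "  </sitemap>"
    );
  }
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries.join("\n") + "\n" +
    "</sitemapindex>"
  );
}
```

A site with 120,001 URLs would produce three chunks here, and the index would list three `<sitemap>` entries.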

JavaScript Validation Algorithm

// Sitemap Validator Implementation

function validateSitemap(xmlString) {
  // Parse XML using DOMParser
  const parser = new DOMParser();
  const doc = parser.parseFromString(xmlString, "text/xml");

  // Check for parse errors
  const parseError = doc.querySelector("parsererror");
  if (parseError) {
    return { valid: false, error: "Invalid XML: " + parseError.textContent };
  }

  const root = doc.documentElement;
  const report = [];

  // Check root element
  report.push("Root element: " + root.nodeName);

  if (root.nodeName === "urlset") {
    // Standard sitemap
    const urls = Array.from(doc.getElementsByTagName("url"));
    const locs = urls.map(url => {
      const loc = url.getElementsByTagName("loc")[0];
      return loc ? loc.textContent.trim() : "";
    }).filter(Boolean);

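    // Detect duplicates in a single pass with a Set; repeated indexOf
    // calls would be quadratic on sitemaps near the 50,000-URL limit
    const seen = new Set();
    const duplicates = locs.filter(
      loc => seen.has(loc) || !seen.add(loc)
    );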

    report.push("URL count: " + locs.length);
    report.push("Duplicate URLs: " + duplicates.length);
    report.push("Sample URLs: " + locs.slice(0, 5).join(", "));

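  } else if (root.nodeName === "sitemapindex") {
    // Sitemap index
    const sitemaps = Array.from(doc.getElementsByTagName("sitemap"));
    report.push("Nested sitemap count: " + sitemaps.length);
  } else {
    // Any other root element means this is not a sitemap
    return { valid: false, error: "Unexpected root element: " + root.nodeName };
  }

  return { valid: true, report: report };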
}
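
The function above does not inspect the namespace, although a wrong or missing `xmlns` is a common sitemap error. One way such a check might look is sketched below. `checkRootElement` is a hypothetical helper, not part of the validator shown above. It expects the parsed root element (`doc.documentElement`) and relies only on the standard `localName` and `namespaceURI` DOM properties.

```javascript
const SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

// Hypothetical helper: takes the parsed root element and returns a list of problems.
function checkRootElement(root) {
  const problems = [];

  // localName ignores any namespace prefix, unlike nodeName.
  if (root.localName !== "urlset" && root.localName !== "sitemapindex") {
    problems.push("Unexpected root element: " + root.localName);
  }

  // namespaceURI is null when the xmlns attribute is missing.
  if (root.namespaceURI !== SITEMAP_NS) {
    problems.push(
      "Missing or incorrect namespace: " + (root.namespaceURI || "(none)")
    );
  }
  return problems;
}
```

In the browser this could be called as `checkRootElement(doc.documentElement)` right after the parse-error check.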

Sitemap Requirements and Limits

Requirement | Specification | Notes
Max URLs per sitemap | 50,000 URLs | Use sitemapindex for more
Max file size | 50MB uncompressed | Limit applies to the uncompressed file; gzip saves bandwidth but does not raise it
URL format | Absolute URL required | Must include the protocol (http:// or https://)
Encoding | UTF-8 | Declare in XML header
Namespace | http://www.sitemaps.org/schemas/sitemap/0.9 | Required on urlset and sitemapindex
Required element | <loc> | All other elements optional
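
The two numeric limits can be checked before any parsing. The sketch below is an illustration under stated assumptions: `checkLimits` is a hypothetical name, the URL count is supplied by the caller, and the size is measured as the UTF-8 byte length of the uncompressed text using the standard `TextEncoder` API.

```javascript
const MAX_URLS = 50000;
const MAX_BYTES = 50 * 1024 * 1024; // 52,428,800 bytes, uncompressed

// Hypothetical helper: reports whether a sitemap exceeds the size or URL limits.
function checkLimits(xmlString, urlCount) {
  // Measure bytes, not characters: non-ASCII characters take more than one byte in UTF-8.
  const byteLength = new TextEncoder().encode(xmlString).length;
  const problems = [];

  if (byteLength > MAX_BYTES) {
    problems.push("File is " + byteLength + " bytes; the limit is " + MAX_BYTES);
  }
  if (urlCount > MAX_URLS) {
    problems.push("File lists " + urlCount + " URLs; the limit is " + MAX_URLS);
  }
  return { byteLength, problems };
}
```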

Common Sitemap Errors

Error | Cause | Fix
Invalid XML | Missing closing tags, unescaped characters | Use an XML validator; escape &, <, >
Relative URLs | Missing protocol/domain | Use full https://example.com/page
Duplicate URLs | Same URL listed multiple times | Remove duplicates, canonicalize
Wrong namespace | Missing or incorrect xmlns attribute | Add xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
Exceeds limits | >50,000 URLs or >50MB | Split into multiple sitemaps with index
Non-canonical URLs | Redirects, 404s, noindex pages | Only include canonical, indexable URLs
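
Unescaped characters are the most common cause of the "Invalid XML" error, especially `&` in query strings. A minimal sketch of an escaping helper for values written into `<loc>` follows. `escapeXml` is a hypothetical name, and the function assumes raw input, so text that is already escaped would be escaped a second time.

```javascript
// Hypothetical helper: escape the five XML special characters.
// The ampersand must be replaced first so the other entities are not re-escaped.
function escapeXml(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}
```

For example, `https://example.com/search?q=a&page=2` would be written into the sitemap as `https://example.com/search?q=a&amp;page=2`.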

Sitemap Best Practices

How to Submit Sitemap to Search Engines

  1. Google Search Console:
    • Verify site ownership in Search Console
    • Navigate to "Sitemaps" in left menu
    • Enter sitemap URL (e.g., sitemap.xml)
    • Click "Submit"
    • Monitor coverage and errors in report
  2. Bing Webmaster Tools:
    • Verify site in Bing Webmaster Tools
    • Go to "Sitemaps" section
    • Submit sitemap URL
    • Review crawl statistics
  3. robots.txt reference:
    • Add line: Sitemap: https://example.com/sitemap.xml
    • Place at top or bottom of robots.txt
    • Search engines discover sitemap automatically
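
To confirm that a robots.txt file really advertises the sitemap, its `Sitemap:` lines can be extracted with a few lines of code. The following is a sketch only. `findSitemapsInRobots` is a hypothetical helper, and it treats everything after `#` as a comment, which assumes sitemap URLs carry no fragment.

```javascript
// Hypothetical helper: returns every sitemap URL declared in a robots.txt body.
function findSitemapsInRobots(robotsTxt) {
  const found = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    // Drop trailing comments and surrounding whitespace.
    const line = rawLine.split("#")[0].trim();
    // Match the field name case-insensitively.
    const match = /^sitemap\s*:\s*(\S+)/i.exec(line);
    if (match) {
      found.push(match[1]);
    }
  }
  return found;
}
```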

Sitemap Priority and Change Frequency Reference

Element | Values | Default | Notes
<priority> | 0.0 to 1.0 | 0.5 | Relative importance within site
<changefreq> | always/hourly/daily/weekly/monthly/yearly/never | N/A | Hint for crawlers (often ignored)
<lastmod> | ISO 8601 date (YYYY-MM-DD) | N/A | Most important for crawl prioritization
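
When a sitemap is generated by script, these values can be produced in the expected formats with small helpers. The sketch below uses hypothetical function names. It formats dates in UTC, and it writes priority with one decimal place as a stylistic assumption, not a requirement.

```javascript
// Format a Date as YYYY-MM-DD (UTC) for use in <lastmod>.
function formatLastmod(date) {
  return date.toISOString().slice(0, 10);
}

// Clamp a number into the 0.0 to 1.0 range and format it for <priority>.
function formatPriority(value) {
  const clamped = Math.min(1, Math.max(0, value));
  return clamped.toFixed(1);
}
```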

Sitemap Example for Different Site Types

E-commerce Site Sitemap Structure:

sitemap.xml (index file)
├── sitemap-products.xml (product pages)
├── sitemap-categories.xml (category pages)
├── sitemap-brands.xml (brand pages)
├── sitemap-blog.xml (blog posts)
└── sitemap-static.xml (about, contact, etc.)

Each sub-sitemap contains up to 50,000 URLs.
Index file references all sub-sitemaps.
Submit only the index file to Search Console.
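
One possible way to sort a flat URL list into the sub-sitemaps above is to group by path prefix. This is a sketch under assumptions: the prefixes in `SECTION_RULES` (`/products/`, `/blog/`, and so on) are hypothetical and would need to match the site's real URL scheme, and any URL matching no rule falls into the static sitemap.

```javascript
// Hypothetical mapping from URL path prefix to sub-sitemap file name.
const SECTION_RULES = [
  { prefix: "/products/", file: "sitemap-products.xml" },
  { prefix: "/categories/", file: "sitemap-categories.xml" },
  { prefix: "/brands/", file: "sitemap-brands.xml" },
  { prefix: "/blog/", file: "sitemap-blog.xml" },
];
const FALLBACK_FILE = "sitemap-static.xml";

// Group absolute URLs by the sub-sitemap file they belong in.
function groupUrlsBySection(urls) {
  const groups = new Map();
  for (const url of urls) {
    const path = new URL(url).pathname;
    const rule = SECTION_RULES.find(r => path.startsWith(r.prefix));
    const file = rule ? rule.file : FALLBACK_FILE;
    if (!groups.has(file)) {
      groups.set(file, []);
    }
    groups.get(file).push(url);
  }
  return groups;
}
```

Any group that grows past 50,000 URLs would still need to be split into several files before being listed in the index.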

Frequently Asked Questions

What is a sitemap and why is it important?
A sitemap is an XML file that lists all important URLs on a website to help search engines discover and crawl content. It provides metadata about each URL (last modified date, change frequency, priority) and is essential for SEO, especially for large sites, new sites without many backlinks, and sites with deep page hierarchies.
What is the difference between urlset and sitemapindex?
A urlset is a standard sitemap containing individual URLs (up to 50,000 URLs max). A sitemapindex is a sitemap of sitemaps, used when a site exceeds 50,000 URLs or 50MB uncompressed. Each entry in a sitemapindex points to another sitemap file. Google can process both formats and will follow sitemapindex references automatically.
What are the sitemap XML requirements?
Sitemap XML must: use UTF-8 encoding, have a proper XML declaration, use the sitemaps.org namespace (http://www.sitemaps.org/schemas/sitemap/0.9), contain valid <url> elements with <loc> tags, escape special characters (&, <, >, ', "), and be well-formed XML. Each file must be no larger than 50MB uncompressed and contain at most 50,000 URLs.
How do I submit a sitemap to Google?
Submit sitemaps via Google Search Console: verify site ownership, go to Sitemaps section, enter sitemap URL (e.g., sitemap.xml), click Submit. Also reference sitemap in robots.txt using 'Sitemap: https://example.com/sitemap.xml'. Google will crawl the sitemap and add URLs to its crawl queue.
What common sitemap errors should I avoid?
Common errors: including non-canonical URLs (redirects, 404s, noindex pages), exceeding the 50,000-URL or 50MB limit, using an incorrect namespace, omitting the XML declaration, including URLs blocked by robots.txt, listing duplicate URLs, leaving lastmod dates stale, and mixing absolute and relative URLs (every URL must be absolute).
How often should I update my sitemap?
Update sitemaps whenever content changes: new pages added, URLs modified, or content significantly updated. For dynamic sites, generate sitemaps automatically. For static sites, update manually after each change. Search engines use lastmod dates to prioritize crawling, so accurate dates improve crawl efficiency.