XML Sitemap Checker

Fetches the sitemap, validates the XML, counts URLs, expands sitemap indexes, supports gzip, and flags common issues (duplicates, off-domain URLs, future lastmod dates, invalid priority, oversize files).
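
A few of the checks listed above can be sketched in Python. This is a minimal illustration and not the checker's actual implementation: the function name check_entries, the dict shape of each entry, and the rule that "off-domain" means a hostname different from site_host are all assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlsplit


def check_entries(entries, site_host):
    """Return (issue, url) tuples for a list of sitemap entries.

    Each entry is a dict with a "loc" key and an optional "priority" key
    (hypothetical shape, chosen for this sketch).
    """
    issues = []

    # Duplicates: the same <loc> value listed more than once.
    counts = Counter(entry["loc"] for entry in entries)
    for url, seen in counts.items():
        if seen > 1:
            issues.append(("duplicate", url))

    for entry in entries:
        url = entry["loc"]

        # Off-domain: assumed here to mean the hostname differs from site_host.
        host = urlsplit(url).hostname or ""
        if host.lower() != site_host.lower():
            issues.append(("off-domain", url))

        # Priority must be a number between 0.0 and 1.0 inclusive.
        priority = entry.get("priority")
        if priority is not None:
            try:
                value = float(priority)
            except ValueError:
                value = None
            if value is None or not 0.0 <= value <= 1.0:
                issues.append(("invalid-priority", url))

    return issues
```

Calling check_entries(entries, "example.com") returns an empty list when nothing is flagged, so the result can be used directly in a report or a pass/fail check.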

XML Sitemaps: Your Blueprint for Getting Every Page Discovered and Indexed

An XML sitemap is essentially a roadmap you hand to search engines: a list of the pages you consider important, with some additional context about each one. It doesn't guarantee that every URL in your sitemap will be crawled or indexed, since Google makes its own decisions about that, but it makes it more likely that your pages will be discovered. This matters most on larger sites, where some content might be several clicks away from the homepage and therefore harder for crawlers to find through link following alone.

For small websites with a handful of pages, a sitemap is less critical — Google will usually find all your content through normal crawling. But for sites with hundreds or thousands of pages, a well-structured XML sitemap is genuinely essential infrastructure.

The Basic Structure of an XML Sitemap

At its simplest, an XML sitemap is a list of URLs wrapped in XML markup. Each URL entry (called a <url> element) contains at minimum the page's location (<loc>) and optionally includes metadata like the last modification date (<lastmod>), how frequently the page changes (<changefreq>), and the page's priority relative to other pages on the site (<priority>).
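
To make the structure concrete, here is a small sketch that parses a two-entry sitemap with Python's standard library. The sample document, the example.com URLs, and the function name parse_urlset are illustrative assumptions; the namespace URI is the one defined by the sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Illustrative sample: one entry with all optional fields, one with <loc> only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>
"""


def parse_urlset(xml_bytes):
    """Return one dict per <url> element, holding only the fields present."""
    root = ET.fromstring(xml_bytes)
    entries = []
    for url in root.findall("sm:url", NS):
        entry = {}
        for field in ("loc", "lastmod", "changefreq", "priority"):
            text = url.findtext(f"sm:{field}", default=None, namespaces=NS)
            if text is not None:
                entry[field] = text.strip()
        entries.append(entry)
    return entries


entries = parse_urlset(SAMPLE)
```

Here entries holds two dicts: the first carries all four fields, the second only "loc", which reflects that everything except the location is optional.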

Be aware, however, that Google has publicly stated it ignores the changefreq and priority values in sitemaps, because site owners routinely set them inaccurately (everyone marks everything as high priority). The lastmod date, on the other hand, is actively used by Google, provided it is accurate and consistent. A trustworthy lastmod helps Googlebot prioritize re-crawling updated content more efficiently.
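
Since lastmod is the field that matters, it is worth validating. The sketch below classifies a lastmod string as "ok", "future", or "invalid". The function name lastmod_status and the choice to treat values without a timezone offset as UTC are assumptions for this example. It handles full dates and full timestamps only; the year-only and year-month forms that the sitemap date format also allows are not handled here and may be reported as invalid.

```python
from datetime import datetime, timezone


def lastmod_status(value, now=None):
    """Classify a <lastmod> string as "ok", "future", or "invalid"."""
    now = now or datetime.now(timezone.utc)
    text = value.strip()

    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"

    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return "invalid"

    # Assumption: a value without an offset (e.g. a plain date) is UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)

    return "future" if parsed > now else "ok"
```

Passing now explicitly keeps the function deterministic, which makes it easier to test than a version that always reads the system clock.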

Sitemap Index Files

A single XML sitemap file is limited to 50,000 URLs and 50 MB (uncompressed), so large websites frequently need to split their content across multiple sitemap files. A sitemap index file is a special sitemap that simply lists the locations of all your individual sitemap files. This gives search engines a single entry point at /sitemap.xml, while your actual URL lists are organized into logical sub-sitemaps: one for blog posts, one for product pages, one for category pages, and so on.
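
As a rough sketch of the splitting step, the code below chunks a URL list at the 50,000-URL limit and builds an index document pointing at the resulting files. The function names chunk_urls and build_index and the sitemap file names in the usage lines are hypothetical, and only the URL-count limit is enforced here, not the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into consecutive lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_urls):
    """Return a sitemap index document (as a string) listing the given files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(root, encoding="unicode")


# Usage: 120,001 URLs need three files (50,000 + 50,000 + 20,001).
all_urls = [f"https://example.com/p{i}" for i in range(120_001)]
chunks = chunk_urls(all_urls)
index_xml = build_index(
    [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
```

A real generator would usually group URLs by content type before chunking, as described above, so that each sub-sitemap stays meaningful on its own.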

Specialized Sitemaps: Images and Videos

Beyond standard page sitemaps, Google supports specialized sitemap extensions for images and videos. An image sitemap tells Google about images embedded in your pages that it might miss during normal crawling, particularly images loaded via JavaScript or displayed in ways that aren't easily parseable from the HTML source. A video sitemap provides metadata about video content including title, description, thumbnail URL, and duration, helping Google surface your videos in video search results.
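
For illustration, the image extension adds namespaced elements inside each url entry. The sketch below reads them with Python's standard library; the sample document, the example.com URLs, and the function name images_by_page are assumptions for the example, while the image namespace URI is the one Google documents for image sitemaps.

```python
import xml.etree.ElementTree as ET

IMAGE_NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

# Illustrative sample: one page that declares two images.
IMAGE_SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/gallery</loc>
    <image:image>
      <image:loc>https://example.com/img/one.jpg</image:loc>
    </image:image>
    <image:image>
      <image:loc>https://example.com/img/two.jpg</image:loc>
    </image:image>
  </url>
</urlset>
"""


def images_by_page(xml_bytes):
    """Map each page URL to the list of image URLs declared for it."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for url in root.findall("sm:url", IMAGE_NS):
        page = url.findtext("sm:loc", default="", namespaces=IMAGE_NS).strip()
        result[page] = [
            loc.text.strip()
            for loc in url.findall("image:image/image:loc", IMAGE_NS)
            if loc.text
        ]
    return result


gallery_images = images_by_page(IMAGE_SAMPLE)
```

Video sitemaps follow the same pattern with their own namespace and a larger set of child elements, so a similar function could be written for them.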

Keeping Your Sitemap Accurate

One of the most damaging things you can do with a sitemap is include URLs that return errors. If your sitemap lists 500 URLs and 200 of them return 404 errors, you're explicitly pointing Google to broken pages. This wastes crawl budget and signals poor site maintenance. Most modern CMS platforms (WordPress, Shopify, etc.) generate sitemaps automatically and keep them updated, but it's worth auditing periodically to make sure only live, indexable pages are included. Use our XML Sitemap Checker to quickly scan any domain's sitemap and catch issues before they affect your crawl efficiency.
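
If you want to script a basic audit yourself, a sketch along these lines can flag sitemap URLs that do not answer with HTTP 200. It is an illustration under stated assumptions, not how the checker works: the function names http_status and broken_urls are hypothetical, it sends HEAD requests (which some servers reject), and because urllib follows redirects, a redirected URL is judged by its final destination.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or None on network failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses arrive as exceptions that carry the code.
        return error.code
    except urllib.error.URLError:
        # DNS failure, refused connection, timeout, and similar.
        return None


def broken_urls(urls, get_status=http_status):
    """Return (url, status) pairs for every URL that does not answer 200."""
    broken = []
    for url in urls:
        status = get_status(url)
        if status != 200:
            broken.append((url, status))
    return broken
```

The get_status parameter lets you swap in a stub for testing or a rate-limited client for large sites, without changing the audit loop itself.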