XML sitemap
Definition
An XML sitemap is an XML file that lists every URL you want search engines to crawl and (usually) index. It's not a ranking factor, but it helps Google discover and re-crawl your pages faster.
What it means
A sitemap follows the Sitemaps.org protocol: an XML file with a root <urlset> element containing one <url> entry per page, each with a required <loc> and optional <lastmod>, <changefreq>, and <priority> tags. A single sitemap file can hold up to 50,000 URLs or 50 MB uncompressed; beyond that, use a sitemap index that points to multiple sub-sitemaps (a common pattern is one per section). Submit your sitemap via Google Search Console and Bing Webmaster Tools, and reference it in robots.txt. For large sites, per-section sitemaps (products, posts, categories) are easier to debug and faster to update than one giant file.
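The index pattern looks like this (a minimal sketch; the sub-sitemap filenames and dates are placeholders, not a required naming scheme):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2026-04-20</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-04-18</lastmod>
  </sitemap>
</sitemapindex>
```

In robots.txt, a single line such as `Sitemap: https://example.com/sitemap_index.xml` points crawlers at the index (the filename is illustrative; any URL works).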
Why it matters
For a new site, a sitemap dramatically speeds up initial indexing. For a large site, it ensures deep pages don't get missed. For a fast-changing site, <lastmod> helps Google re-crawl just the pages that changed. Missing a sitemap won't stop Google from finding your content, but it slows discovery meaningfully — especially for deep content.
Example
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-04-20</lastmod>
    <priority>1.0</priority>
  </url>
</urlset>
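For a site without CMS sitemap support, a file like the one above can be generated with Python's standard library alone. This is a sketch, not a production generator; the URL and date passed in are placeholder assumptions:

```python
import xml.etree.ElementTree as ET

# Namespace required by the Sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build a sitemap XML string from (url, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    # encoding="unicode" returns a str; prepend the XML declaration when
    # writing the file to disk.
    return ET.tostring(urlset, encoding="unicode")

# Placeholder entry for illustration.
xml = build_sitemap([("https://example.com/", "2026-04-20")])
```

Serving the result at a stable URL (conventionally /sitemap.xml) is enough for crawlers to pick it up once it is referenced in robots.txt or submitted in Search Console.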
Frequently asked questions
Do priority and changefreq matter?
Google mostly ignores them. The essential fields are <loc> and <lastmod>. Don't spend time fine-tuning the other fields.
Should I submit my sitemap to Google?
Yes — via Google Search Console. This is the fastest way to get new URLs discovered.
What if my site has more than 50,000 URLs?
Use a sitemap index — a small XML file that lists multiple sitemaps. Most CMS and frameworks support this out of the box.
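If your framework doesn't handle this for you, the splitting logic is simple enough to sketch. The helper names and base URL below are illustrative assumptions; 50,000 is the per-file cap from the Sitemaps.org spec:

```python
MAX_URLS = 50_000  # per-file cap from the Sitemaps.org protocol

def chunk(urls, size=MAX_URLS):
    """Yield successive sub-lists of at most `size` URLs."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]

def index_entries(urls, base="https://example.com/sitemaps/"):
    """Return the sub-sitemap URLs a sitemap index file would list."""
    return [f"{base}sitemap-{n}.xml" for n, _ in enumerate(chunk(urls), start=1)]

# 120,000 URLs split into 3 sub-sitemaps of at most 50,000 each.
subs = index_entries([f"https://example.com/p/{i}" for i in range(120_000)])
```

Each chunk is written out as its own <urlset> file, and the index file lists the URLs returned by `index_entries`.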
Related terms
- robots.txt: a small text file served at /robots.txt that instructs search-engine crawlers which parts of a site they can and can't crawl. It's a suggestion, not a lock — well-behaved bots honor it.
- Canonical URL: the one 'official' URL for a piece of content, declared to search engines via a <link rel="canonical"> tag. It tells Google 'if you find this page at multiple URLs, treat this one as the main version.'