What is an XML Sitemap?
Definition
An XML sitemap is a machine-readable file listing website URLs and metadata that helps search engines discover, crawl, and understand site structure.
Why XML sitemaps matter
XML sitemaps matter because they give search engines a direct roadmap of the pages you want crawled. Rather than relying solely on link discovery, sitemaps ensure crawlers know about all important pages.
Sitemaps communicate page metadata that helps search engines prioritize crawling. Last modified dates signal when content has changed, prompting re-crawling of updated pages.
For large or complex sites, sitemaps are essential for crawl efficiency. They help ensure crawl budget is spent on important pages rather than wasted on duplicate or low-value content.
Key concepts and types
- URL listing: The comprehensive list of pages you want search engines to index.
- Lastmod element: Timestamp indicating when a page was last modified.
- Changefreq element: Hint about how often page content changes (often ignored by Google).
- Priority element: Relative importance of pages on your site (often ignored by Google).
- Sitemap index: File linking to multiple sitemaps, for sites exceeding the 50,000-URL limit of a single sitemap.
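To make these concepts concrete, here is a minimal sketch in Python that builds a sitemap and a sitemap index using only the standard library. The function names (`build_sitemap`, `build_sitemap_index`, `chunk`) and the example.com URLs are illustrative assumptions, not part of any particular tool. Changefreq and priority are left out because, as noted above, Google often ignores them.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the W3C date format, e.g. 2024-01-15.
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index XML document pointing at several sitemaps."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="unicode")


def chunk(pages, size=MAX_URLS_PER_SITEMAP):
    """Split a page list into groups small enough for one sitemap each."""
    pages = list(pages)
    return [pages[i:i + size] for i in range(0, len(pages), size)]


if __name__ == "__main__":
    pages = [
        ("https://example.com/", date(2024, 1, 15)),
        ("https://example.com/pricing", date(2024, 2, 3)),
    ]
    print(build_sitemap(pages))
```

A site with more than 50,000 URLs would call `chunk` first, write one sitemap per group, and then list those files with `build_sitemap_index`.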
Common misconceptions
- Sitemaps guarantee pages will be indexed
- Priority and changefreq significantly impact crawling
- You only need to submit a sitemap once
- Every URL should be in your sitemap
- Sitemaps replace the need for internal linking
FAQs
What's the difference between XML and HTML sitemaps?
XML sitemaps are for search engines: machine-readable files listing URLs and metadata. HTML sitemaps are for users: web pages listing site links to aid navigation. Both can be valuable.
How do you submit an XML sitemap to Google?
Submit through Google Search Console by adding your sitemap URL in the Sitemaps section. You can also reference it in robots.txt with a Sitemap: line. Google will then regularly check for updates.
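As a rough illustration of the robots.txt route, the sketch below uses Python's standard `urllib.robotparser` to read the `Sitemap:` lines out of a robots.txt file, which is one way to check that your sitemap is actually declared. The helper name `declared_sitemaps` and the example.com URL are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap: line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    robots_txt = (
        "User-agent: *\n"
        "Allow: /\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(declared_sitemaps(robots_txt))
```

`RobotFileParser.site_maps()` requires Python 3.8 or later.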
What shouldn't be included in an XML sitemap?
Exclude pages with noindex tags, redirect URLs, duplicate content, error pages, and anything you don't want indexed. Include only canonical, indexable URLs you want to rank.
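The exclusion rules above can be sketched as a simple filter. This is a minimal example under stated assumptions: the `Page` record and its fields (`status_code`, `noindex`, `canonical_url`) are hypothetical and stand in for whatever your CMS or crawler actually exposes.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical page record; field names are assumptions for this sketch."""
    url: str
    status_code: int    # HTTP status returned for the URL
    noindex: bool       # True if the page carries a noindex directive
    canonical_url: str  # URL the page declares as canonical


def belongs_in_sitemap(page):
    """Keep only canonical, indexable URLs that return 200."""
    if page.status_code != 200:
        return False  # drops redirects (3xx) and error pages (4xx/5xx)
    if page.noindex:
        return False  # drops pages you do not want indexed
    # Drops duplicates whose canonical points at another URL.
    return page.canonical_url == page.url


if __name__ == "__main__":
    pages = [
        Page("https://example.com/a", 200, False, "https://example.com/a"),
        Page("https://example.com/old", 301, False, "https://example.com/a"),
        Page("https://example.com/a?ref=x", 200, False, "https://example.com/a"),
    ]
    print([p.url for p in pages if belongs_in_sitemap(p)])
```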