Glossary Term
Sitemaps
History and Purpose of Sitemaps
- Google introduced Sitemaps 0.84 in June 2005
- Google, Yahoo!, and Microsoft announced joint support for the Sitemaps protocol in November 2006
- Ask.com and IBM announced support for Sitemaps in April 2007
- In May 2007, the state governments of Arizona, California, Utah, and Virginia announced they would use Sitemaps on their websites
- The Sitemaps protocol is based on ideas from "Crawler-friendly Web Servers"
- Sitemaps are beneficial for websites where some areas are not reachable through the browsable interface
- Useful for websites with rich Ajax, Silverlight, or Flash content not processed by search engines
- Helps crawlers avoid overlooking new or recently updated content on very large websites
- Effective for websites with isolated or poorly linked pages
- Useful for websites with few external links
File Format and Element Definitions
- Sitemaps use XML tags and must be UTF-8 encoded
- Sitemaps can also be plain text lists of URLs
- Sitemaps can be compressed in .gz format
- Sitemap index files are needed when a site exceeds the per-sitemap limit of 50MiB or 50,000 URLs and must split its URLs across several sitemaps
- Sitemap index files reference separate sitemaps
- The 'urlset' element is required; it is the document-level element that wraps every entry in the Sitemap
- The 'url' element is required and serves as the parent element for each entry
- The 'sitemapindex' element is required for Sitemap index files
- The 'sitemap' element is required and serves as the parent element for each entry in the index
- The 'loc' element is required and provides the full URL of the page or sitemap
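The required elements listed above can be illustrated with a minimal sketch that builds a Sitemap containing only 'urlset', 'url', and 'loc'. The function name build_urlset and the example URLs are assumptions made for illustration; the namespace is the one published for version 0.9 of the protocol.

```python
import xml.etree.ElementTree as ET

# Namespace of the Sitemaps protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(urls):
    """Return a UTF-8 encoded Sitemap (bytes) with one <url>/<loc> per URL.

    build_urlset is a hypothetical helper name, not part of any library.
    """
    # Serialize the protocol namespace as the default xmlns.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc_el = ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc")
        # ElementTree escapes characters such as & when serializing.
        loc_el.text = page_url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Example URLs are placeholders.
    print(build_urlset(["https://example.com/", "https://example.com/about"]).decode("utf-8"))
```

Optional child elements are left out here so that only the required structure is visible.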
Other Formats and Search Engine Submission
- Sitemaps can be in the form of a simple list of URLs in a text file
- Syndication feeds can be used to submit URLs to crawlers
- Having a syndication feed as a delta update can supplement a complete sitemap
- Search engine submission of Sitemaps provides status information and processing errors
- Sitemap location can be included in the robots.txt file or specified in the search engine's submission URL.
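As a rough illustration of the robots.txt option above, the sketch below collects Sitemap locations from the text of a robots.txt file. It assumes the location is declared on a line of the form "Sitemap: <url>"; the function name sitemap_urls_from_robots and the sample file contents are made up for the example.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt string.

    sitemap_urls_from_robots is a hypothetical helper name.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Split only on the first colon so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Disallow: /private/\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(sitemap_urls_from_robots(sample))
```

The field name is matched case-insensitively here as a lenient design choice; a stricter reader could require the exact spelling.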
Sitemap Limits
- Sitemap files have a limit of 50,000 URLs and 50MiB (52,428,800 bytes) per sitemap.
- Sitemaps can be compressed using gzip, reducing bandwidth consumption.
- Multiple sitemap files are supported, with a Sitemap index file serving as an entry point.
- A Sitemap index file may list no more than 50,000 Sitemaps and must be no larger than 50MiB; like a Sitemap, it can be compressed.
- You can have more than one Sitemap index file.
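The limits above suggest a simple splitting scheme, sketched below: divide the URL list into groups of at most 50,000 and list the resulting files in a Sitemap index. The helper names chunk_urls and build_sitemap_index, and the file locations in the usage example, are assumptions for illustration. The size check is applied to the serialized index before any compression.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_ENTRIES = 50_000      # URLs per Sitemap, or Sitemaps per index
MAX_BYTES = 52_428_800    # 50MiB


def chunk_urls(urls, size=MAX_ENTRIES):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_locs):
    """Return a UTF-8 encoded Sitemap index (bytes) listing each location."""
    if len(sitemap_locs) > MAX_ENTRIES:
        raise ValueError("a Sitemap index may list at most 50,000 Sitemaps")
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc in sitemap_locs:
        sitemap_el = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap_el, f"{{{SITEMAP_NS}}}loc").text = loc
    data = ET.tostring(index, encoding="utf-8", xml_declaration=True)
    if len(data) > MAX_BYTES:
        raise ValueError("Sitemap index exceeds 50MiB")
    return data


if __name__ == "__main__":
    # Placeholder URLs; a real site would supply its own list.
    all_urls = [f"https://example.com/page/{n}" for n in range(120_001)]
    groups = chunk_urls(all_urls)
    # Hypothetical file naming scheme for the per-group Sitemaps.
    locs = [f"https://example.com/sitemap-{i + 1}.xml.gz" for i in range(len(groups))]
    print(len(groups), "sitemaps;", len(build_sitemap_index(locs)), "bytes of index")
```

Each group would then be written out as its own Sitemap file, optionally gzip-compressed, at the location named in the index.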
Additional Sitemap Types
- A number of additional XML sitemap types outside of the scope of the Sitemaps protocol are supported by Google.
- Video and image sitemaps are intended to improve the capability of websites to rank in image and video searches.
Subgroup: Video Sitemaps
- Video sitemaps indicate data related to embedding and autoplaying, preferred thumbnails to show in search results, publication date, video duration, and other metadata.
- Video sitemaps are also used to allow search engines to index videos that are embedded on a website, but that are hosted externally, such as on Vimeo or YouTube.
Subgroup: Image Sitemaps
- Image sitemaps are used to indicate image metadata, such as licensing information, geographic location, and an image's caption.
Subgroup: Google News Sitemaps
- Google supports a Google News sitemap type for facilitating quick indexing of time-sensitive news subjects.