History and Purpose of Sitemaps
– Google introduced Sitemaps 0.84 in June 2005
– Google, Yahoo!, and Microsoft announced joint support for the Sitemaps protocol in November 2006
– Ask.com and IBM announced support for Sitemaps in April 2007
– In May 2007, the state governments of Arizona, California, Utah, and Virginia announced they would use Sitemaps on their websites
– The Sitemaps protocol is based on ideas from the paper "Crawler-friendly Web Servers"
– Sitemaps are beneficial for websites where some areas are not reachable through the browsable interface
– Useful for websites with rich Ajax, Silverlight, or Flash content that search engines do not normally process
– Helps crawlers avoid overlooking new or updated content on large websites
– Effective for websites with isolated or poorly linked pages
– Useful for websites with few external links
File Format and Element Definitions
– Sitemaps use XML tags and must be UTF-8 encoded
– Sitemaps can also be plain text lists of URLs
– Sitemaps can be compressed in .gz format
– Sitemap index files are necessary for large sites whose URLs exceed the limit for a single sitemap of 50MiB or 50,000 URLs
– Sitemap index files reference separate sitemaps
– The ‘urlset’ element is required and encloses the entire Sitemap
– The ‘url’ element is required and serves as the parent element for each entry
– The ‘sitemapindex’ element is required for Sitemap index files
– The ‘sitemap’ element is required and serves as the parent element for each entry in the index
– The ‘loc’ element is required and provides the full URL of the page or sitemap
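
The element structure above can be sketched in a few lines of Python using the standard library's xml.etree.ElementTree module. This is a minimal illustration rather than a reference implementation: the function name build_sitemap and the example.com addresses are hypothetical, and the namespace URI is the one published for version 0.9 of the protocol.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Sitemaps 0.9 schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a UTF-8 encoded sitemap (bytes) listing the given URLs.

    Hypothetical helper for illustration only.
    """
    # <urlset> is the required root element.
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for address in urls:
        # Every entry has a required <url> parent with a required <loc> child.
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(url, "loc").text = address
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    sitemap = build_sitemap(["https://example.com/", "https://example.com/about"])
    print(sitemap.decode("utf-8"))
```

Building the document through an XML library rather than string concatenation keeps URLs containing characters such as an ampersand correctly escaped.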
Other Formats and Search Engine Submission
– Sitemaps can be in the form of a simple list of URLs in a text file
– Syndication feeds can be used to submit URLs to crawlers
– Having a syndication feed as a delta update can supplement a complete sitemap
– Search engine submission of Sitemaps provides status information and processing errors
– Sitemap location can be included in the robots.txt file or specified in the search engine’s submission URL.
– Sitemap files have a limit of 50,000 URLs and 50MiB (52,428,800 bytes) per sitemap.
– Sitemaps can be compressed using gzip, reducing bandwidth consumption.
– Multiple sitemap files are supported, with a Sitemap index file serving as an entry point.
– A Sitemap index file may list no more than 50,000 Sitemaps and must be no larger than 50MiB; it can also be compressed.
– You can have more than one Sitemap index file.
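
The limits and the index mechanism above can be sketched as follows, assuming the list of URLs fits in memory. The helper names (chunk_urls, build_sitemap_index, robots_txt_line) and the example.com addresses are hypothetical; only the 50,000-entry limit and the element names come from the protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_ENTRIES = 50_000  # limit on URLs per sitemap and on sitemaps per index


def chunk_urls(urls, size=MAX_ENTRIES):
    """Split a list of URLs into consecutive groups of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a UTF-8 encoded Sitemap index (bytes) referencing each sitemap."""
    if len(sitemap_urls) > MAX_ENTRIES:
        raise ValueError("a Sitemap index may list at most 50,000 sitemaps")
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for address in sitemap_urls:
        # Each referenced file gets a <sitemap> parent with a <loc> child.
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = address
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


def robots_txt_line(sitemap_url):
    """Return the robots.txt line that advertises a sitemap or index location."""
    return f"Sitemap: {sitemap_url}"


if __name__ == "__main__":
    pages = [f"https://example.com/page/{n}" for n in range(120_001)]
    groups = chunk_urls(pages)
    # One sitemap file per group; the file names are made up for this sketch.
    names = [f"https://example.com/sitemap{n}.xml.gz" for n in range(1, len(groups) + 1)]
    print(build_sitemap_index(names).decode("utf-8"))
    print(robots_txt_line("https://example.com/sitemap_index.xml"))
```

The sketch checks only the entry count. A real generator would also need to check that each uncompressed file stays within 52,428,800 bytes.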
Additional Sitemap Types
– Google supports a number of additional XML sitemap types that fall outside the scope of the Sitemaps protocol.
– Video and image sitemaps are intended to improve the capability of websites to rank in image and video searches.
Subgroup: Video Sitemaps
– Video sitemaps indicate data related to embedding and autoplaying, preferred thumbnails to show in search results, publication date, video duration, and other metadata.
– Video sitemaps are also used to allow search engines to index videos that are embedded on a website, but that are hosted externally, such as on Vimeo or YouTube.
Subgroup: Image Sitemaps
– Image sitemaps are used to indicate image metadata, such as licensing information, geographic location, and an image’s caption.
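
As a rough illustration of how such an extension sits alongside the core protocol, the sketch below attaches image locations to page entries using Google's image namespace. It assumes that namespace URI and the image:image and image:loc element names. The function name build_image_sitemap and the example.com addresses are hypothetical, and tags for the metadata listed above are left out rather than guessed at.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumed namespace of Google's image sitemap extension.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize the core namespace as the default and the extension as "image:".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """Return sitemap bytes for a mapping of page URL -> list of image URLs."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            # Each image is nested inside the <url> entry of the page showing it.
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    pages = {"https://example.com/gallery": ["https://example.com/img/a.jpg"]}
    print(build_image_sitemap(pages).decode("utf-8"))
```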
Sitemaps is a protocol in XML format meant for a webmaster to inform search engines about URLs on a website that are available for web crawling. It allows webmasters to include additional information about each URL: when it was last updated, how often it changes, and how important it is in relation to other URLs of the site. This allows search engines to crawl the site more efficiently and to find URLs that may be isolated from the rest of the site's content. The Sitemaps protocol is a URL inclusion protocol and complements robots.txt, a URL exclusion protocol.
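
The three pieces of per-URL information map onto the protocol's optional lastmod, changefreq, and priority elements. The sketch below shows one way a consumer might read them. The sample document, the function name read_entries, and the example.com addresses are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Prefix map used in the XPath-style lookups below.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2021-03-01</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/orphan-page</loc>
  </url>
</urlset>
"""


def read_entries(xml_text):
    """Return one dict per <url> entry; absent optional fields become None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", namespaces=NS),
            "priority": url.findtext("sm:priority", namespaces=NS),
        })
    return entries


if __name__ == "__main__":
    for entry in read_entries(SAMPLE):
        print(entry)
```

The second entry carries only a location, which is enough to make an otherwise unlinked page discoverable.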