Examples of scraper websites
– Search engines like Google scrape content from other websites to present it to their users.
– Some dating websites use scraping techniques, often combined with facial recognition.
– Scraper websites are also used for general image recognition and for identifying images of crops affected by pests and diseases.
– Scraper sites made for advertising are called ‘Made for AdSense’ sites.
– Some scraper sites link to other sites to improve their search engine ranking.
Made for advertising
– ‘Made for AdSense’ sites offer visitors no value; they exist only to generate ad clicks.
– These sites are considered search engine spam and dilute search results.
– Some scraper sites use private blog networks to improve their ranking.
– Auto blogs, a type of scraper site, were common among black-hat marketers.
– Scraper sites can be used to manipulate search engine results.
Legality
– Scraper sites may violate copyright law, even when scraping from open content sites.
– Some licenses, like GFDL and CC-BY-SA, require republishers to inform readers and give credit to the original author.
– Republishing scraped content without meeting the terms of its license is a copyright violation.
– Copyright infringement can occur even when scraping from Wikipedia.
– Violating copyright licenses can have legal consequences.
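As a rough illustration of the attribution requirement described above, the sketch below builds a one-line credit for reused content. The `SourceWork` class, the `attribution_notice` function, and all example values are hypothetical; the exact wording and notices a given license requires are set by the license text itself, not by this example.

```python
from dataclasses import dataclass


@dataclass
class SourceWork:
    """Metadata about reused content (all values used below are hypothetical)."""
    title: str
    author: str
    source_url: str
    license_name: str
    license_url: str


def attribution_notice(work: SourceWork) -> str:
    """Return a one-line credit naming the work, its author, and its license."""
    return (
        f'"{work.title}" by {work.author} ({work.source_url}), '
        f"licensed under {work.license_name} ({work.license_url})."
    )


# Hypothetical example of a republished page crediting its source.
example = SourceWork(
    title="Example article",
    author="Example contributors",
    source_url="https://example.org/article",
    license_name="CC BY-SA 4.0",
    license_url="https://creativecommons.org/licenses/by-sa/4.0/",
)
print(attribution_notice(example))
```

A republisher that shows such a notice alongside the copied text informs readers of the source and credits the original author, which is the behavior scraper sites typically omit.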
Techniques
– Different scraper sites target websites based on their objectives.
– Some scraper sites target sites with large amounts of content, like airlines or department stores, to gather pricing information.
– Other scraper sites pull snippets and text from high-ranking websites to improve their own search engine ranking.
– RSS feeds are vulnerable to scraping.
– Some scraper sites consist of advertisements and random paragraphs of words.
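One reason RSS feeds are vulnerable is that they publish content in a predictable, machine-readable structure. The sketch below shows how little code is needed to read the titles, links, and text out of a feed. The feed is a made-up RSS 2.0 sample inlined in the code, and `extract_items` is a hypothetical helper name; the parsing uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed, inlined so the example needs no network access.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>https://example.org/first</link>
      <description>Full text of the first post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.org/second</link>
      <description>Full text of the second post.</description>
    </item>
  </channel>
</rss>"""


def extract_items(feed_xml: str) -> list[dict[str, str]]:
    """Return the title, link, and description of every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


for entry in extract_items(SAMPLE_FEED):
    print(entry["title"], entry["link"])
```

Because every entry carries the same labeled fields, a feed that includes full article text hands over complete, ready-to-republish content, whereas a feed limited to summaries exposes less.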
Domain hijacking
– Scraper sites may purchase recently expired domains to utilize their SEO power.
– Expired domains often retain their existing backlinks and some of their historical ranking ability, which the new owner can exploit.
– Spammers may match the topic of the expired site, or copy its former content from the Internet Archive, so that the site appears authentic.
– Some expired domain registration agents provide services to find and gather HTML from expired domains.
– Domain hijacking can be used to create new sites or power private blog networks.
A scraper site is a website that copies content from other websites using web scraping. The content is then mirrored with the goal of creating revenue, usually through advertising and sometimes by selling user data.
Scraper sites come in various forms. Some provide little, if any, material or information and are intended to obtain user information, such as e-mail addresses, to be targeted with spam e-mail. Price aggregation and shopping sites access multiple listings of a product and allow a user to rapidly compare prices.
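The price comparison step can be sketched as a simple grouping of listings by product. The example below assumes the listings have already been collected; the `LISTINGS` data, seller names, and the `cheapest_by_product` function are all hypothetical and only illustrate the comparison itself.

```python
from collections import defaultdict

# Hypothetical listings, as an aggregator might hold them after collection.
LISTINGS = [
    {"product": "USB-C cable", "seller": "Shop A", "price": 9.99},
    {"product": "USB-C cable", "seller": "Shop B", "price": 7.49},
    {"product": "Desk lamp", "seller": "Shop A", "price": 24.00},
    {"product": "Desk lamp", "seller": "Shop C", "price": 26.50},
]


def cheapest_by_product(listings):
    """Group listings by product and return the lowest-priced offer for each."""
    grouped = defaultdict(list)
    for listing in listings:
        grouped[listing["product"]].append(listing)
    return {
        product: min(offers, key=lambda offer: offer["price"])
        for product, offers in grouped.items()
    }


for product, offer in cheapest_by_product(LISTINGS).items():
    print(f"{product}: {offer['seller']} at {offer['price']:.2f}")
```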