Manage, don't mitigate: What you need to know about bots and scrapers
What are web bots and scrapers?
A bot, or web robot, is a software application that runs automated tasks over the Internet. Web bots and screen scrapers – which scrape information from website pages – may be deployed by search engines, competitors, and other third parties. Examples range from so-called good bots, such as Googlebot, which collect information to index and display website pages, to bots that overtax your servers and can appear to be Distributed Denial-of-Service (DDoS) attacks.
What are web bots used for?
Web bots, scrapers, and other automated apps may be used to gather competitive intelligence, aggregate content, track location, and serve metasearch engines. For example, a web bot may be used by a financial aggregator to scrape portfolio data quickly. In other cases, a competitor may deploy a bot to grab inventory details and sell similar products at a lower price.
How are web bots identified?
Bots can be identified in a variety of ways, including the information in a request, the volume and rate of requests made, and other behavioral indicators. They fall into four primary categories, based on their desirability and aggressiveness:
- High desirability, low aggressiveness
- Low desirability, high aggressiveness
- High desirability, high aggressiveness
- Low desirability, low aggressiveness
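The four categories above amount to a two-by-two grid, which can be sketched in a few lines of Python. This is a minimal illustration, not Akamai's method: the `BotObservation` type, the `categorize` function, and the `AGGRESSIVE_RPS` cutoff are hypothetical, and the article gives no numeric threshold for what counts as aggressive.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the article does not say how many requests per
# second make a bot "aggressive", so this value is an assumption.
AGGRESSIVE_RPS = 10.0


@dataclass(frozen=True)
class BotObservation:
    name: str
    desired: bool               # a business judgment, e.g. a search engine crawler
    requests_per_second: float  # measured request rate for this bot


def categorize(bot: BotObservation, aggressive_rps: float = AGGRESSIVE_RPS) -> str:
    """Place a bot in one of the four desirability/aggressiveness categories."""
    desirability = "high desirability" if bot.desired else "low desirability"
    is_aggressive = bot.requests_per_second >= aggressive_rps
    aggressiveness = "high aggressiveness" if is_aggressive else "low aggressiveness"
    return f"{desirability}, {aggressiveness}"


if __name__ == "__main__":
    observations = [
        BotObservation("search-crawler", desired=True, requests_per_second=2.0),
        BotObservation("price-scraper", desired=False, requests_per_second=500.0),
    ]
    for bot in observations:
        print(bot.name, "->", categorize(bot))
```

In practice the `desired` flag would come from a business decision about each bot, while the request rate would come from traffic measurement.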
Why are web bots a problem?
From an IT perspective, aggressive or poorly coded bots can overtax the web infrastructure, cause site slowdowns, and increase latency. In some cases observed by Akamai, bots have made several thousand requests per second – far in excess of what human users generate through a web browser. This volume of bot traffic, even from good bots, can impose an undesired load on IT infrastructure. For many organizations, bots can represent 40-60% of overall site traffic.
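Request rate is one of the signals that separates this kind of traffic from human browsing. As a rough sketch, the number of requests a single client makes within a sliding time window can be tracked as shown here. The `RequestRateTracker` class and its one-second default window are assumptions made for illustration, not a description of any particular product.

```python
from collections import deque


class RequestRateTracker:
    """Counts the requests one client makes within a sliding time window."""

    def __init__(self, window_seconds: float = 1.0) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()  # request times in seconds, oldest first

    def record(self, timestamp: float) -> int:
        """Record a request and return how many fall inside the current window."""
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Discard requests that are a full window or more older than this one.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


if __name__ == "__main__":
    tracker = RequestRateTracker(window_seconds=1.0)
    for t in (0.0, 0.25, 0.5, 1.5, 3.0):
        print(f"t={t}: {tracker.record(t)} request(s) in the last second")
```

A client whose count stays far above what a person could generate through a browser is a candidate for closer inspection, though the exact cutoff depends on the site.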
From a business perspective, web bots can have both positive and negative impacts. Good web bots, like Googlebot, are essential to your online business strategy. Other bots, such as those operated by competitors, content aggregators, or scalpers, get between you and your customers. This results in less control over your online business strategy, with fewer sales opportunities and damage to customer relationships.
What is the best approach to lower the negative impacts of web bots?
Traditionally, organizations applied a one-size-fits-all approach to mitigating (blocking) identified bots. This can lead to lower search engine visibility and negatively affect online business objectives. In addition, it is ineffective over the long term, as blocked bots often mutate and return better disguised.
Reducing the negative IT and business impacts from bots should include identification, categorization, and differential treatment of all bot types on your site. Akamai recommends a "manage, don’t mitigate" approach to bot intervention. Site owners should apply a different management policy to each bot category, depending on its desirability and the load it generates.
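One way to picture differential treatment is a lookup table that maps each bot category to a management action. The sketch below is illustrative only: the `Action` names and the specific pairings in `POLICIES` are hypothetical choices, since the article does not prescribe which action belongs to which category.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical management actions; the article does not name specific ones.
    ALLOW = "allow"
    THROTTLE = "throttle"
    SERVE_CACHED = "serve_cached"
    MONITOR = "monitor"


# Keys are (desired, aggressive). The pairings are assumptions for illustration.
POLICIES = {
    (True, False): Action.ALLOW,          # e.g. a well-behaved search crawler
    (True, True): Action.THROTTLE,        # wanted, but slowed to protect servers
    (False, True): Action.SERVE_CACHED,   # unwanted and heavy: keep load off the origin
    (False, False): Action.MONITOR,       # unwanted but light: watch for changes
}


def choose_action(desired: bool, aggressive: bool) -> Action:
    """Return the management action for a bot in the given category."""
    return POLICIES[(desired, aggressive)]


if __name__ == "__main__":
    print(choose_action(desired=True, aggressive=True).value)
```

Keeping the policy in a table rather than in scattered conditionals makes it easy to adjust the treatment of one category without touching the others.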
By implementing a "manage, don’t mitigate" approach to web bots, site owners can lower operating costs by reducing the infrastructure and IT overhead needed to handle the additional bot traffic, improve the experience for human users, maintain a competitive advantage, and combat fraudulent activity.