The internet is saturated with goods and services of all types, including protected content like books, music, video and branded products. Most often this protected content is classified as intellectual property and is protected by law through copyrights, patents and trademarks.
Intellectual property is the lifeblood of many companies. Due to substantial investments in production and advertising, the profits are often realized only after the products and services are sold through authorized distribution channels.
Without the protection of the law, the production of these goods and services is not feasible for financial reasons. Thankfully, web scraping is now emerging as the new hero that can be deployed to protect intellectual property for the benefit of both producers and consumers.
About the author
Andrius Palionis is VP Enterprise Solutions at Oxylabs
Piracy is on the rise
As more users from countries all over the world come online, the downloading of copyrighted content is increasing dramatically. According to recent research, 38% of consumers aged 16-64 download copyrighted music, and illegal sales of ebooks resulted in a loss of approximately $315 million in 2017. On top of that, roughly one third of all users in the surveyed countries admitted to watching a movie or TV series from an illegal channel in 2017.
Web scraping can be used to identify and report copyrighted material on illegal websites. Bots armed with keywords can be deployed to crawl pre-determined sites and locate the content. Once it’s found, complaints can be filed with search engines, requesting that the offending sites be removed from the index.
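As a rough illustration of the keyword approach, the sketch below strips a page down to its visible text and reports which keywords appear in it. It is a minimal example using only the Python standard library, not a production crawler; the function names, the User-Agent string and the sample keywords are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def find_keyword_hits(html, keywords):
    """Return the keywords (case-insensitive) that appear in the page text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so multi-word keywords still match.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [kw for kw in keywords if kw.lower() in text]


def fetch(url, timeout=10):
    """Download one page; the User-Agent value is a placeholder."""
    req = Request(url, headers={"User-Agent": "ip-monitor-sketch/0.1"})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def scan(urls, keywords):
    """Map each URL to the keywords found on it, omitting pages with no hits."""
    report = {}
    for url in urls:
        hits = find_keyword_hits(fetch(url), keywords)
        if hits:
            report[url] = hits
    return report
```

Calling scan() with a list of target URLs and a list of protected titles would return only the pages that mention at least one title. A keyword match is a lead for a human reviewer rather than proof of infringement.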
Brands are also intellectual property that must be protected
Along with protected media content, brands represent another intangible commodity of immeasurable value.
Branding is vitally important because it makes a substantial difference in how a product is marketed and priced. Products of inferior quality that look similar are often priced at a fraction of the cost of a branded product. That’s due in part to the fact that brands engage in expensive marketing campaigns that carry messages about the company’s core beliefs, which adds substantially to their overall value among consumers in the marketplace.
Since news moves with staggering speed on the internet, brands must be completely in tune and ready to address any attacks from consumers and competitors.
Companies must be on guard to ensure that the conversations about their brands remain positive. Scraping comments on public social media sites and in forums can help companies monitor the conversation.
The nature of our rapidly changing digital landscape means that a single complaint on a profile can reach the other side of the world within minutes. Web scraping helps companies address any issues before they go viral so their brand reputation remains spotless.
Web scraping protects branded goods from counterfeiting
Along with monitoring brand reputation, web scraping can help protect companies from the counterfeiting and sale of their products.
The production and sale of counterfeit goods is increasing as more companies come online from parts of the world that lack regulation. OECD statistics show the same trend worldwide, which is a major concern for manufacturers of branded goods.
Footwear accounted for 22% of the total value of counterfeit goods seized by customs agents in 2016, with clothing coming in second at 16%. Online sales of counterfeited items are reaching staggering proportions and are currently valued at USD 590 billion per year, according to the OECD.
Just as with pirated content and brand monitoring, the power of web scraping can be unleashed to protect companies from counterfeiting.
Using a set of predefined keywords, bots can be deployed to scrape target websites. Once proof of counterfeiting is found, companies can file Digital Millennium Copyright Act (DMCA) complaints with search engines to request removal of the sites from the index.
Overcoming web scraping challenges
While web scraping provides an opportunity to build a stable intellectual property protection process, there are many challenges:
Web scraping solutions must be able to scale up
The number of sites selling illegal or counterfeited content grows every day, and part of the challenge involves keeping track of new sites while continuing to monitor existing ones. Additionally, these sites can change their layouts and underlying code frequently, which can make existing in-house web scraping efforts obsolete.
One solution is to continuously upgrade the web scraping code to accommodate site changes. Another is to opt for a ready-to-use solution that takes care of the technical issues so efforts can be focused on analyzing data.
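One way to soften the impact of layout changes is to try several extraction rules in order and record which one matched, so that a drift away from the primary rule signals that the scraper needs attention. The sketch below is only an illustration of that idea: the patterns, the "product-title" class name and the extract_title function are hypothetical, and regular expressions are a crude stand-in for a proper HTML parser.

```python
import re

# Hypothetical patterns for a product title, preferred layout first.
TITLE_PATTERNS = [
    re.compile(
        r'<h1[^>]*class="[^"]*product-title[^"]*"[^>]*>(.*?)</h1>',
        re.S | re.I,
    ),
    re.compile(r'<meta\s+property="og:title"\s+content="([^"]*)"', re.I),
    re.compile(r"<title>(.*?)</title>", re.S | re.I),
]


def extract_title(html, patterns=TITLE_PATTERNS):
    """Try each pattern in order; return (title, pattern_index).

    Returns (None, None) when no pattern matches, which is the cue
    that the site layout has changed beyond what the rules cover.
    """
    for index, pattern in enumerate(patterns):
        match = pattern.search(html)
        if match:
            # Normalize whitespace inside the captured title.
            title = " ".join(match.group(1).split())
            if title:
                return title, index
    return None, None
```

Logging the returned index over time shows when a site has moved off the preferred layout, before extraction fails outright.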
Web scraping solutions must work globally
Illegal activity is prevalent all over the world. However, some websites restrict access by geographic location, which makes them harder to scrape.
Infringers monitoring incoming traffic can spot web scrapers if they are coming from a single data center IP. Typical responses include blocking access to the website or showing incorrect data that can throw off cybersecurity analysts.
The solution to this issue is to use residential proxies that draw on a large pool of IP addresses from different locations. Rather than appearing to come from a single IP, requests sent through these proxies look like ordinary traffic and will rarely be blocked.
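To make the idea concrete, the sketch below rotates requests across a pool of proxies in round-robin order using only the Python standard library. The proxy addresses are placeholders and the ProxyRotator class is a name invented for this example; a real pool would come from a proxy provider and would typically need authentication and retry handling.

```python
import itertools
from urllib.request import ProxyHandler, build_opener

# Placeholder endpoints; these hosts do not exist.
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
]


class ProxyRotator:
    """Hands out proxies round-robin so consecutive requests use different IPs."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)

    def next_proxy(self):
        """Return the next proxy address in the rotation."""
        return next(self._cycle)

    def fetch(self, url, timeout=10):
        """Fetch url through the next proxy; return (proxy_used, body_bytes)."""
        proxy = self.next_proxy()
        opener = build_opener(ProxyHandler({"http": proxy, "https": proxy}))
        with opener.open(url, timeout=timeout) as resp:
            return proxy, resp.read()
```

Round-robin is the simplest policy. Picking proxies by country would be a natural extension for the geo-restricted sites mentioned above.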
For many companies, intellectual property is their most valuable asset. Pirated content and brand counterfeiting on the internet cut directly into profits and compromise the ability of many companies to continue creating products and services for the paying public.
Web scraping is emerging as a new hero that can be deployed to detect counterfeited products and sites sharing protected content. The use of modern web scraping tools is integral to the process and can provide an edge over infringers so businesses can continue to operate with confidence on the digital landscape.