Web scraping by hedge funds is growing rapidly

Despite information pouring in from billions of websites, poor performance plagued the hedge fund industry in 2018 — pushing investment managers to increase their already-massive web scraping programs.

One out of every 20 web page visits last year was made by a hedge fund or sell-side research institution scraping websites for information, according to a new report by Opimas Analysis. That comes out to roughly 10.2 billion page visits a day, equal to the number of daily users of Google’s search function, and the figure is expected to “grow rapidly.”

By 2020, managers’ web page visits for the purpose of scraping, or extracting information from a website using an automated software program, will eclipse 17 billion and cost more than $1.8 billion — nearly double what it currently costs — as managers invest in software, talent and outside vendors to clean and store the loads of data.

While the alternative data scene is exploding with new providers offering obscure info, research firm Opimas calls the web “the ultimate dataset.”

“In the coming few years, we will see increasing efforts on the part of investment firms to harness and leverage web data in their decision-making processes,” the report says.

Hedge fund managers are under pressure to find new sources of alpha-generating data wherever they can, as poor performance and high fees have frustrated investors. Asset managers' spending on alternative datasets covering subjects like weather trends, oil output and flight patterns is around $3 billion and growing, according to JPMorgan.

The most prolific scrapers in the investment management space individually record hundreds of millions of web page visits a day, gathering actionable data on agriculture trends, earnings reports, transportation intel, real estate prices and more. Less than half of all web traffic in 2018 was from humans, as web-scraping bots made up 51% of the more than 200 billion web page visits per day.

“Opimas also expects to see more firms trying to monetize their web data by reselling the information that they have gathered through web scraping by making these datasets available to other firms that might not have the necessary scale to engage in large-scale web data harvesting exercises themselves,” the report reads.
