LinkedIn denies exposure of 700 million user records is a data breach

Data relating to 700 million users of the LinkedIn networking platform has appeared for sale, but the firm says it is the victim of data scraping, not a security breach

Alex Scroxton, Security Editor

Published: 30 Jun 2021 15:53

LinkedIn has forcefully denied that the exposure of data relating to 700 million users of its workplace networking platform, more than 90% of its total user base, constitutes a data breach. The data has been offered for sale on the dark web, but the firm insists that because it was scraped by malicious actors, LinkedIn itself is not at fault.

According to PrivacySharks, which was the first to report the incident on 27 June, a user of RaidForums claimed on 22 June to be in possession of the data dump and provided a sample of a million records as proof.

The organisation’s researchers confirmed the data involved includes full names, gender, email addresses, phone numbers and employment information. The full dump does not appear to include any financial or password records, although users are advised to immediately change their login details as a precaution, and should be keeping an eye out for suspicious activity on their credit cards as a matter of course.

In a statement, LinkedIn said: “Our teams have investigated a set of alleged LinkedIn data that has been posted for sale. We want to be clear that this is not a data breach and no private LinkedIn member data was exposed. Our initial investigation has found that this data was scraped from LinkedIn and other various websites and includes the same data reported earlier this year in our April 2021 scraping update.

“Members trust LinkedIn with their data, and any misuse of our members’ data, such as scraping, violates LinkedIn terms of service. When anyone tries to take member data and use it for purposes LinkedIn and our members haven’t agreed to, we work to stop them and hold them accountable.”

LinkedIn’s assessment is likely correct: the dataset appears to combine data from previous leaks with information scraped from public-facing profiles, and the firm’s own systems have not been compromised. That does not make the sale of the data to malicious actors any less problematic.

Even without financial records, personal data records of the type contained in the dataset can be easily used in identity theft scams, or to conduct targeted social engineering and phishing attacks that may form the precursor to more serious security incidents, such as ransomware attacks. Data could also end up in the hands of online advertisers and marketing organisations which may be less than scrupulous in how they handle it.

Tim Mackey, principal security strategist at the Synopsys CyRC (Cybersecurity Research Centre), said that even though LinkedIn is technically correct in its assessment, for its users there was no difference between an attack on a company’s servers and the misuse of an application programming interface (API) to obtain data. “Data loss is data loss, and attackers will find the simplest way to obtain the data they need to fund their operations,” he said.

Indeed, added Mackey, such scraping attacks were likely to become more commonplace going forward. “As successful attacks on infrastructure become more difficult to execute, attackers will naturally shift their focus to abusing legitimate access methods like APIs provided by businesses to access data,” he said.

“Where legitimate users care about terms of service, criminals won’t. This is an important detail for anyone exposing an API on the internet – it’s only a matter of time before your APIs are discovered and abused,” he added. “The key question then becomes, how quickly can you detect abnormal usage and take corrective action? The more powerful your API, the more attractive it will be to criminals.”
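To make Mackey’s question about detecting abnormal usage concrete, here is a minimal sketch of one common approach: counting each client’s requests inside a sliding time window and flagging any client that exceeds a limit. It is not a description of how LinkedIn or any other platform monitors its APIs. The class name, method names, client identifiers and thresholds are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowMonitor:
    """Hypothetical helper: flags clients whose recent request count exceeds a limit."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        # Both thresholds are illustrative assumptions, not recommended values.
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps a client identifier to the timestamps of its recent requests.
        self._hits = defaultdict(deque)

    def record(self, client_id: str, timestamp: float) -> bool:
        """Record one request; return True if the client is now over the limit."""
        hits = self._hits[client_id]
        hits.append(timestamp)
        # Discard timestamps that have fallen out of the window.
        while hits and hits[0] <= timestamp - self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests


# Allow at most 3 requests per 60 seconds per client (arbitrary numbers).
monitor = SlidingWindowMonitor(max_requests=3, window_seconds=60)
flags = [monitor.record("client-a", t) for t in (0, 10, 20, 30)]
later = monitor.record("client-a", 100)
```

In this run the fourth request inside the window is the first to be flagged, and a request made after the window has passed is not. A volume check of this kind only catches heavy, concentrated traffic; a scraper that spreads requests across many accounts or addresses would look like ordinary use to it.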

Comparitech privacy advocate Paul Bischoff said data scraping was a problem that was hard for online platforms to combat. “To LinkedIn, scrapers are often indistinguishable from legitimate users, which makes it very difficult to block them. No matter what LinkedIn says about enforcing its terms of service, the truth is that scrapers won’t be stopped any time soon,” he said.

“Facebook and other social networks similarly struggle to block scrapers, and Facebook is reportedly trying to normalise the practice after hundreds of millions of its users’ profiles were scraped and dumped online,” he added.

“Although scraping is against most social networks’ terms of service, scrapers aren’t illegal. There are many people who argue that any information that’s publicly accessible is fair game for scrapers, and that scrapers can be used for legitimate purposes like academic research and journalism.

“End users are ultimately responsible for protecting their personal information. If your LinkedIn page or other social media profile contains personal information and is publicly viewable, then you should assume it will be scraped,” said Bischoff.
