27 November 2024
Barings Law plans to sue Microsoft and Google over AI training data
Microsoft and Google are using people’s personal data without proper consent to train artificial intelligence models, alleges Barings Law, as it prepares to launch a legal challenge against the tech giants
A Manchester law firm has started on-boarding clients for a probable class action against Microsoft and Google, which it believes to be unlawfully collecting and using people’s personal data to train their artificial intelligence (AI) models.
Following a two-year-long investigation into the data practices of the tech giants, Barings Law believes the extensive information being collected about users – including voice data, demographic data, app usage information, metadata, payment details and a range of other personal details – is potentially being shared for the training and development of various AI large language models (LLMs).
Barings claims this is all happening without proper authorisation or consent from users: while they may understand that data is being collected, they may be unaware of the role this data plays in the training of LLMs.
“Both companies are collecting data such as the sports teams you follow, the programming languages you prefer, the stocks you track, your local weather or traffic, the route you take to work and what your voice sounds like,” said Adnan Malik, head of data breach at Barings Law. “We are shocked and disgusted to learn about the level of data that has been and continues to be collected.”
Malik added that while the proliferation of AI is transforming the world as we know it, the development of the technology must not come at the expense of people’s right to privacy.
“Individuals have the right to know what data of theirs is being stored and what it is being used for,” he said. “They also have the right to opt out of their behaviours, voice, likeness, habits and knowledge being used to train AI for the profit of tech giants.
“As technologies continue to develop, individual data has become the most valuable commodity in the world. We know that it’s illegal to steal commodities like money, gold and oil. As a society, we cannot accept that it’s acceptable to steal the commodity of personal data.”
Joining the lawsuit
Barings is now inviting anyone with a Microsoft or Google account, or those who’ve used either firm’s services, to join the lawsuit. This includes those who have used platforms and services such as YouTube, Gmail, Google Docs, Google Maps, LinkedIn, OneDrive, Outlook, Microsoft 365 and Xbox.
The firm said it was expecting to be “inundated” with sign-ups, and plans to formally begin court proceedings at the beginning of 2025.
Microsoft and OpenAI, the firm behind ChatGPT, are facing a separate class action lawsuit in the US from Clarkson Law Firm, over allegations they have violated the privacy of hundreds of millions of internet users by secretly scraping vast amounts of personal data to train AI chatbots. Filed with a federal court in San Francisco on 28 June 2023, that lawsuit is seeking damages of $3bn.
Another lawsuit has also been filed against Google, again by Clarkson Law Firm, which alleges the tech giant has accessed millions of users’ data for use in the development of its AI chatbot, Bard, which has since been rebranded to Gemini. The lawsuit claims Google has surreptitiously stolen “everything ever created and shared on the internet by hundreds of millions of Americans”.
Malik said that while the cases are similar, and taken together are a testament to the growing international concern around data security, Barings is taking action against Microsoft and Google, rather than OpenAI.
“If you are shocked, upset, appalled or annoyed that your data is being used without your knowledge and consent, my message to you is simple – do something about it by joining the fight,” he said. “Sign up today and let’s take the future of our data and AI into our own hands.”
Computer Weekly contacted both Microsoft and Google about the lawsuit. While Microsoft declined to comment, Google did not respond by time of publication.
Other AI developers have already made various arguments to defend their use of people’s personal data and copyrighted material in the training of their models, including that the material falls under “fair use”, which permits the limited use of copyrighted material without permission for purposes such as criticism, news reporting, teaching and research.
For example, in a copyright lawsuit filed by music publishers in January 2024 against LLM developer Anthropic AI, the Amazon-backed firm argued that “using works to train Claude is fair as it does not prevent the sale of the original works, and, even where commercial, is still sufficiently transformative”.
Anthropic also argued that “today’s general-purpose AI tools simply could not exist” if AI companies had to pay licences for the material, adding that it is not alone in using data “broadly assembled from the publicly available internet”, and that “in practice, there is no other way to amass a training corpus with the scale and diversity necessary to train a complex LLM with a broad understanding of human language and the world in general”.