AI firms can’t be trusted to voluntarily share risk information

Workers at frontier AI firms have warned that their employers – including OpenAI, DeepMind and Anthropic – can’t be trusted to voluntarily share information about their systems’ capabilities and risks with governments or civil society

Sebastian Klovig Skelton, Data & ethics editor

Published: 10 Jun 2024 15:15

Artificial intelligence (AI) companies cannot be relied on to voluntarily share information about system capabilities and risk, say current and former employees, in an open call for greater whistleblower protections.

During the second global AI Summit in Seoul, 16 companies signed the Frontier AI Safety Commitments, a voluntary set of measures for how they will safely develop the technology by, for example, assessing the risks posed by their models at every stage of the AI lifecycle, setting unacceptable risk thresholds to deal with the most severe threats, and providing public transparency over the whole risk assessment process.

Under one of the key voluntary commitments, the companies said they will also not develop or deploy AI systems if the risks cannot be sufficiently mitigated.

However, less than two weeks after the summit, a group of current and former workers from OpenAI, Anthropic and DeepMind – the first two of which signed the safety commitments in Seoul – have said the current voluntary arrangements will not be enough to ensure effective oversight of AI-developing companies.

They added that while the companies themselves, along with governments and other AI experts, have acknowledged the clear risks posed by the technology – which “range from the further entrenchment of existing inequalities, to manipulation and misinformation, to the loss of control of autonomous AI systems potentially resulting in human extinction” – the firms have “strong financial incentives” to avoid effective oversight.

“We do not believe bespoke structures of corporate governance are sufficient to change this,” they wrote in an open letter dated 4 June 2024.

“AI companies possess substantial non-public information about the capabilities and limitations of their systems, the adequacy of their protective measures, and the risk levels of different kinds of harm. However, they currently have only weak obligations to share some of this information with governments, and none with civil society. We do not think they can all be relied upon to share it voluntarily.”

Confidentiality agreements

The letter – signed by both anonymous and named employees – added that “broad confidentiality agreements” are blocking them from voicing concerns, “except to the very companies that may be failing to address these issues”.

“Ordinary whistleblower protections are insufficient because they focus on illegal activity, whereas many of the risks we are concerned about are not yet regulated,” they added. “Some of us reasonably fear various forms of retaliation, given the history of such cases across the industry. We are not the first to encounter or speak about these issues.”

The letter ends with calls for greater transparency and accountability from the companies, asking them to commit to not entering into or enforcing any agreement that prohibits criticism over risk-related concerns; to facilitate an anonymous process for employees to raise concerns; and to support internal cultures of open criticism that allow both current and former employees to raise concerns with a wide range of groups.

It also calls on the companies to not retaliate against current and former employees who publicly share risk-related confidential information after other processes have failed: “We accept that any effort to report risk-related concerns should avoid releasing confidential information unnecessarily. Therefore, once an adequate process for anonymously raising concerns to the company’s board, to regulators, and to an appropriate independent organisation with relevant expertise exists, we accept that concerns should be raised through such a process initially.

“However, as long as such a process does not exist, current and former employees should retain their freedom to report their concerns to the public.”

The letter was also signed by prominent AI experts Stuart Russell, Geoffrey Hinton and Yoshua Bengio, the last of whom was selected by 27 countries during the first global AI summit to lead the first-ever frontier AI State of the Science report, which assesses existing research on the risks and capabilities of the technology.

Commenting on the safety commitments made by the 16 AI firms at the time, Bengio said that while he is pleased to see so many leading AI companies sign up – and particularly welcomes their commitments to halt models where they present extreme risks – they will need to be backed up by more formal regulatory measures down the line. “This voluntary commitment will obviously have to be accompanied by other regulatory measures, but it nonetheless marks an important step forward in establishing an international governance regime to promote AI safety,” he said.

