AI safeguards improving, says UK government-backed body

Inaugural AI Security Institute report claims that safeguards in place to ensure AI models behave as intended seem to be improving

The safeguards in place to ensure that artificial intelligence (AI) models behave appropriately and as intended appear to be improving, according to the UK government’s AI Security Institute (AISI), which is today launching an in-depth report drawing on two years of AI research and experimentation in cyber security and other scientific disciplines.

The Frontier AI trends report is a public assessment of how advanced AI systems are evolving, designed to provide a “clear, evidence-based” view of those systems and to ground discussions that are all too often driven by speculation and a lack of evidence.

“This report shows how seriously the UK takes the responsible development of AI. That means making sure protections are robust, and working directly with developers to test leading systems, find vulnerabilities and fix them before they are widely used,” said AI minister Kanishka Narayan.

“Through the world-leading AI Security Institute, we are building scientific capability inside government to understand these systems as they evolve, not after the fact, and to raise standards across the sector,” he said.

“This report puts evidence, not speculation, at the heart of how we think about AI, so we can unlock its benefits for growth, better public services and national renewal, while keeping trust and safety front and centre.”

The AISI said that while every system it tested was vulnerable to some form of bypass, and protection measures vary widely, significant strides are still being made. One such stride is in the time it took the institute’s red-teamers to find a universal jailbreak of a model’s safety rules, which increased from minutes to several hours across successive model generations.

On other cyber security matters, the AISI found that AI models attempting apprentice-level cyber tasks succeeded around half the time, compared with under 10% of the time just 24 months ago.

Moreover, the duration of cyber tasks that AI systems can complete without any human direction appears to be doubling every eight months. And this year, for the first time, said the AISI, an AI model completed an expert cyber task, defined as one that a human would need up to 10 years of work experience to accomplish.
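
On that trend, the length of task an AI agent can handle unassisted grows exponentially: a doubling period of eight months implies roughly an eight-fold increase over two years. The short Python sketch below illustrates the arithmetic only; the one-hour starting duration and the time horizon are illustrative assumptions, not figures from the report.

```python
# Minimal sketch of the trend described above: unassisted task duration
# doubling every eight months. The one-hour starting point is an
# illustrative assumption, not a figure from the AISI report.

def projected_duration_hours(initial_hours: float, months_elapsed: float,
                             doubling_period_months: float = 8.0) -> float:
    """Exponential growth: duration = initial * 2^(elapsed / doubling period)."""
    return initial_hours * 2 ** (months_elapsed / doubling_period_months)

for months in (0, 8, 16, 24):
    print(f"{months:>2} months: ~{projected_duration_hours(1.0, months):.0f}h")
# Prints: 0 months ~1h, 8 months ~2h, 16 months ~4h, 24 months ~8h
```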

Other key findings – unrelated to cyber security – include insight into the pace of evolution of AI models for software engineering, many of which can now complete hour-long tasks more than 40% of the time, up from 5% in 2023. And in the fields of biology and chemistry, some systems are reportedly now outperforming PhD-level researchers in scientific knowledge tests, bringing higher-level lab expertise within reach of laypeople.

The AISI’s analysis also identified some early signs of capabilities linked to autonomy, but these were seen only in tightly controlled experimental conditions, and none of the AI models tested showed harmful or spontaneous behaviour. The institute nevertheless pointed out that such factors need to be accounted for and tracked sooner rather than later.

Supporting AI decision-makers

The AISI has been careful not to position its report – which it hopes will be the first of many – as a set of policy recommendations for Westminster, instead framing it as a means of giving technology decision-makers clear data on what AI systems can do, improving transparency and prompting clear-headed discussion about further developments.

The government’s role in this will be to continue to invest in evaluation and AI science alongside industry, researchers and international partners, with the intention of helping ensure AI delivers growth, jobs, improved public services and national renewal.

“This report offers the most robust public evidence from a government body so far of how quickly frontier AI is advancing,” said Jade Leung, AISI chief technology officer and AI adviser to the prime minister.

“Our job is to cut through speculation with rigorous science. These findings highlight both the extraordinary potential of AI and the importance of independent evaluation to keep pace with these developments,” added Leung. 
