
Should we trust Humphrey to boost public sector efficiency?
Labour is betting on Humphrey, an AI toolkit named after a Yes, Minister character, to drive public sector efficiency
In a twist of bureaucratic brilliance that Yes, Minister fans will appreciate, the civil service is rolling out a suite of AI tools named – yes – Humphrey. Its namesake, Sir Humphrey Appleby, was the oh-so-helpful civil servant who was actually a master of obstruction through cooperation. The initiative is designed to streamline services, cut delays, and help unlock £45bn in annual productivity gains across the public sector. But for those familiar with the BBC classic, the choice of name feels less like a nod to innovation and more like a cautionary tale. Because just like its namesake, this new digital civil servant might end up subtly steering us in the wrong direction.
Humphrey is a suite of tools – including Consult, Parlex, Minute, Redbox and Lex – that targets bureaucratic pain points: duplicated administration, siloed data and slow decision-making. Executed well, it could reduce the need for external consultants, speed up decisions and improve the public’s experience of government services.
It is part of a broader push to bring the state into the digital age, streamlining processes across the public sector by providing online data processing, automating routine administrative tasks, and accelerating the time-consuming research that can slow policy development. By enabling secure, interoperable data flows, Humphrey can improve citizens’ experience while cutting civil service costs and reducing reliance on external consultants to process and analyse data.
No, Minister, AI can't fix your data problems
However, there is a lesson from Yes, Minister that still holds - a well-meaning assistant can mislead while appearing helpful. This is especially true for the latest generation of AI tools. These systems are only as good as the data that feeds them. There's also increasing evidence that as their reasoning and other specialist capabilities improve, these systems tend to "hallucinate" more.
Poorly curated datasets can lead AI tools to deliver confident-sounding but nonsensical results, a risk with serious implications for public trust. One striking example involved a GPT-3.5 model trained on 140,000 internal Slack messages. When prompted to write content, it responded, “I shall work on that in the morning.” Rather than performing the task, the model – built with the Smart Connections plugin – had mimicked the procrastination habits embedded in its training data. It performed an entirely different function from the one anticipated because it had been trained on a fundamentally unsuitable dataset, albeit one that superficially appeared appropriate due to its size.
In addition to the right training data, AI requires access to AI-ready, well-governed, task-relevant datasets. Despite a wealth of open data on platforms like data.gov.uk, much of it is not readily usable for training or fine-tuning AI systems. A recent analysis by the Open Data Institute (ODI) revealed that, as of April 2024, the web-scale datasets behind most AI models make poor use of the statistical and other authoritative data published on such government portals. The 13,556 pages from data.gov.uk scraped into popular AI training datasets such as Common Crawl rarely contributed to answering citizens’ questions about public services accurately: across 195 such questions, AI models correctly referenced data.gov.uk statistics in only five cases. Instead, they drew on secondary or unreliable sources, such as social media posts or opinion articles, or simply fabricated answers. This disconnect is dangerous; it opens the door to misinformation generated by government-deployed AI tools.
A reason for this is that government data is often not published in AI-ready formats – lacking machine-readable metadata or accessible summaries, for example – which essentially renders the information invisible to AI models. Moreover, our understanding of which sources AI-enabled digital services should prioritise is limited. Compare that with the technical solutions that previous-generation AI tools, such as traditional search engines, put in place to ensure that, for citizen questions about public services, government pages and other authoritative sources rank higher than secondary information. With generative AI, we are only just starting out on that journey.
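To make “AI-ready” concrete: one practical step is publishing machine-readable metadata alongside each dataset, so that crawlers and AI pipelines can discover what the data is, who produced it and under what licence. The sketch below is a minimal illustration using the widely adopted schema.org Dataset vocabulary; the dataset name, URLs and values are hypothetical, invented purely for this example.

```python
import json

# A minimal sketch of machine-readable dataset metadata using the
# schema.org "Dataset" vocabulary. All names, URLs and values below
# are hypothetical, for illustration only.
metadata = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "School absence statistics (example)",
    "description": "Illustrative absence rates by year group and term.",
    "license": "https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/",
    "creator": {"@type": "Organization", "name": "Example Department"},
    "temporalCoverage": "2023-09/2024-07",
    "distribution": [{
        "@type": "DataDownload",
        "encodingFormat": "text/csv",
        "contentUrl": "https://example.gov.uk/data/absence.csv",
    }],
}

# Embedding this as JSON-LD in a dataset's landing page makes it
# discoverable to crawlers and usable by AI pipelines without scraping.
print(json.dumps(metadata, indent=2))
```

Metadata like this is cheap to publish and, unlike a PDF summary, can be read by both search engines and AI systems, so authoritative statistics stand a better chance of being surfaced ahead of secondary sources.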
Digitising bad decision-making
Using AI to process data, research policy, or write documents requires an understanding of how these technologies work, the data they rely on, and their limitations. This is the only way workers can validate AI’s outputs. However, researchers at Harvard Business School found that while AI offers real value, its unpredictable failure points make both the benefits and risks hard to gauge, for individuals, organisations and governments alike.
The National Data Strategy, published under the previous Conservative Government, acknowledged problems such as ‘a fragmentation of leadership and a lack of depth in data skills at all levels’, and a culture which overemphasises the risks of misusing data, leading to ‘a chronic underuse of data and a woeful lack of understanding of its value’. This urgently needs to change. If civil servants don’t understand how AI works, how can they question its outputs?
Poor understanding at senior levels has particular consequences. For example, school absence data records year group and indicators of disadvantage, such as Child in Need status, but misses granular detail like neurodivergence, despite evidence that a very high proportion of children who struggle with attendance are autistic. This gap blinds policymakers to why many pupils are persistently absent, encouraging punitive responses like parental fines rather than tailored support. Better AI literacy, supported by the thoughtful use of AI tools themselves, can help civil servants not only understand data but learn how to question it.
Other countries are already moving ahead. Estonia, for example, has introduced Bürokratt, an AI chatbot aimed at reducing civil service workloads and accelerating service delivery. But crucially, Estonia isn’t just investing in tools; it’s investing in training its staff. The Estonian Ministry of Economic Affairs and Communications has launched the Digital State Academy, offering free courses on digital governance, AI, and data handling to civil servants.
Britain should take note. While there have been efforts to upskill the UK civil service, most initiatives have focused on advanced data skills rather than the foundational data literacy required across the board. Policymakers don’t need to code in Python, but if they can’t spot bias in a dataset or question an AI’s output, then no amount of automation will deliver better decisions. It will just hide bad ones behind a sleek digital interface.
Streamlining the “creaking old bureaucratic machine”
In 1980, Minister Jim Hacker optimistically declared in Yes, Minister, “We're going to cut through all the red tape, streamline this creaking old bureaucratic machine”. Over forty years later, the government hopes AI could finally fulfil that promise - and drive broad-based economic growth along the way. Technology minister Peter Kyle estimates “a £45bn jackpot” for the public sector if the civil service successfully adopts AI. To unlock it, investment is needed not just in tools like Humphrey, but in the training and infrastructure to support their use.
The ODI is calling for a ten-year National Data Infrastructure Roadmap to do just that. The roadmap would underpin the AI Opportunities Action Plan by focusing on three pillars - interoperability, AI-ready data, and privacy-preserving technologies. While the Plan sets a strong direction, it lacks detail on how standards will be set and monitored, and on how foundational data infrastructure will be funded.
Transparency about the provenance and lineage of datasets used to train and operate AI in public services is critical. Without it, we can’t scrutinise how AI influences decisions that affect our lives. To build public trust, we need to explore participatory stewardship of key datasets so that the people most affected by public sector algorithms can help shape how their data is used.
This is where frameworks such as the ODI’s new Framework for AI-Ready Data become vital. The framework sets out four core principles for preparing datasets for effective and ethical use in AI: technical optimisation, data quality and standards, legal compliance, and responsible collection. It goes beyond general principles like FAIR (findable, accessible, interoperable and reusable), pointing to practical steps that non-specialist data publishers can follow to ensure that data is not just machine-readable, but meaningful, lawful and fair.
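To illustrate the direction of travel, here is a minimal, hypothetical sketch of the kind of automated checks a data publisher might run before release. The metadata fields, formats and rules are my own illustrative assumptions, loosely inspired by those four principles - they are not the ODI framework itself.

```python
# A hypothetical sketch of automated "AI-readiness" checks a data
# publisher might run before release. Fields and rules are
# illustrative assumptions, not the ODI framework itself.

REQUIRED_FIELDS = {
    "license",          # legal compliance: is reuse clearly permitted?
    "description",      # data quality: is the dataset documented?
    "encoding_format",  # technical optimisation: is the format machine-readable?
    "provenance",       # responsible collection: where did the data come from?
}

MACHINE_READABLE = {"text/csv", "application/json", "application/parquet"}

def ai_ready_issues(metadata: dict) -> list[str]:
    """Return human-readable issues; an empty list means all checks passed."""
    issues = [f"missing field: {field}" for field in REQUIRED_FIELDS - metadata.keys()]
    fmt = metadata.get("encoding_format")
    if fmt and fmt not in MACHINE_READABLE:
        issues.append(f"format {fmt!r} is not machine-readable")
    return issues

# Example: a PDF-only release with no provenance record fails two checks.
example = {
    "license": "OGL-UK-3.0",
    "description": "Illustrative absence statistics.",
    "encoding_format": "application/pdf",
}
for issue in ai_ready_issues(example):
    print(issue)
```

Even simple, automated checks like these, run as part of the publication process, would catch the most common failures - undocumented, unlicensed or PDF-only releases - before they render a dataset invisible to AI systems.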
To harness data for public good, we must think long-term, build solid data foundations and, above all, stay vigilant about the risks of digitising dysfunction. Otherwise, the most powerful new civil servant in Whitehall won’t be human - it will be an AI called Humphrey. And like its namesake, it will appear endlessly helpful while subtly shaping outcomes to suit the data it’s trained on. Civil servants risk becoming modern-day Jim Hackers, trying valiantly to streamline a creaking old machine while being quietly outmanoeuvred by their new digital colleague.
Elena Simperl is the director of research at the ODI.