The Netherlands is set to develop its own open language model, GPT-NL, in an important step towards the transparent, fair and testable use of AI. The model will be developed by independent research organisation TNO, the Netherlands Forensic Institute and SURF, a cooperative association of Dutch educational and research institutions in which the members combine their strengths. The Ministry of Economic Affairs and Climate is allocating €13.5m for the development of the Dutch language model.
A language model helps artificial intelligence (AI) to understand a language and communicate in it. It underlies, for example, ChatGPT, which works in multiple languages.
“Large language models [LLMs] such as ChatGPT offer promising technical opportunities to address societal challenges,” TNO said in a news release.
However, there are also concerns about the legal and ethical aspects of these developments, as most models are developed by foreign big tech. Those technology companies are not transparent about how they collect data. “The best-known example is ChatGPT, but other US and Chinese big tech parties are also developing their own models,” TNO said.
“These parties are usually not open to users, researchers and governments about the data used or the trained models, and this raises all kinds of questions.”
There are therefore serious concerns about the fairness of these models. The European Union (EU) has already warned AI organisations that stricter legislation is around the corner, and in the US, artists have already sued AI companies for copyright infringement. “We aim for a much fairer and more responsible model,” Selmar Smit, founder of GPT-NL, told Dutch news channel RTL Z. “The source data and the algorithm will be completely public.”
This will allow anyone to see how the underlying software works and how the AI system comes to certain conclusions. It also means that anyone can search that data, and if someone disagrees with the use of certain data, it is possible to object, Smit explains. This transparency is missing from many commercial AI initiatives.
Digital sovereignty at stake
With GPT-NL, the Netherlands is taking an important step in developing public expertise and experience in GenAI language models. Having its own open language model strengthens its position on this topic, and also means a boost for AI research and innovation. Technology can make a significant contribution to solutions for societal challenges in, for example, healthcare, the labour market, mobility, energy and many other industries. TNO said LLMs will play an important role in this.
Due to the lack of understanding of how foreign models are trained, there is a legitimate concern whether Dutch social, legal and ethical values are being pursued. How do algorithms avoid bias, how does a model make decisions, and what about its interpretability? Are confidentiality, privacy and intellectual property respected? Are international or national laws or policy frameworks such as the EU AI Act respected? And can such models actually be used for important decisions? “Our Dutch digital sovereignty on a critical technology like AI is at stake here,” TNO argues.
The Netherlands is therefore building its own language model and ecosystem, developed according to Dutch values and guidelines. The open language model will be a virtual facility with an ecosystem of academic institutions, researchers, companies, governments and users. GPT-NL can be used by anyone who wants to do research on AI or develop something with it, and the creators of the Dutch LLM expect at least academic institutions, researchers and governments to use it.
“It enables them to research and try out language models in general, including specific applications in the fields of security, health, education, services and numerous other domains,” SURF states.
The new AI language model makes the Netherlands less dependent on commercial parties while contributing to more openness, transparency and protection of users’ data privacy.
In addition, the developing parties are paying attention to sustainability aspects, such as the energy consumption of this type of AI. The Dutch developers want GPT-NL to contribute to solutions for major societal challenges in line with the Sustainable Development Goals, such as reducing inequality and promoting digital inclusiveness and quality education. In time, the technology can also help alleviate labour market tightness and boost conversational AI for better information exchange between people and AI systems, said TNO.
Retaining Dutch AI talent
With GPT-NL, the Netherlands is strengthening its strategic autonomy, knowledge and technology in the field of large language models. “We also hope the initiative will contribute to the recruitment and retention of AI talent, ensuring our country a healthy and competitive ecosystem for artificial intelligence,” said TNO.
In doing so, GPT-NL aligns seamlessly with the objectives of the Top Sector ICT, the Knowledge and Innovation Agenda Security of the Dutch Research Council, and the Dutch AI Coalition for applying digital innovations and the key technologies of AI, Data Science, and Data Spaces.
It will still take some time before the language model can be used, because the project is being carried out in two phases. The first year focuses on the actual development of the Dutch language model, with the active involvement of the academic sector. In the phase that follows, the language model will be run on a computer.
To this end, a connection will be made with the Dutch national supercomputer. SURF is developing its own language model for education and research in parallel with the project.
Whether the €13.5m promised by the government will be enough remains to be seen. AI expert Remy Gieling pointed out in an interview in RTL Z’s studio that big tech firms, such as Microsoft and Google, are investing billions in AI. OpenAI is even estimated to be worth $90bn. “I hope we can get by with €13.5m, but I am healthily skeptical,” he said.