Not only has ChatGPT made a big splash, but other applications are coming out too, many of which use small, embedded devices. “I think we can safely say that AI is reaching its iPhone moment,” says Moeyersoms. “The mass adoption of this technology is coming, and we are seeing breakthroughs at a very fast pace.”
AI will have at least as much impact on society as previous technological breakthroughs, such as the personal computer or the internet. But exponential growth in applications will bring exponential growth in computational workloads. Machine learning algorithms already require billions of operations and leave a large carbon footprint, and the problem is growing fast.
The state-of-the-art neural networks underpinning most of today’s AI applications are increasing computational demands by a factor of 100 every two years, far surpassing Moore’s Law. “If we do not address power consumption, we are sure to hit a roadblock,” warns Moeyersoms.
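To put those two growth rates side by side, here is a rough back-of-the-envelope sketch. It takes the 100-fold-every-two-years figure quoted above and reads Moore’s Law as a doubling roughly every two years. The function name and the six-year horizon are illustrative assumptions, not figures from Imec.

```python
def growth_factor(factor_per_period: float, period_years: float, years: float) -> float:
    """Compound growth after `years`, given a multiplier applied once per period."""
    return factor_per_period ** (years / period_years)


YEARS = 6  # illustrative horizon (assumption, not from the article)

# 100x every two years: the figure quoted in the article
ai_demand = growth_factor(100, 2, YEARS)
# 2x every two years: a common reading of Moore's Law (assumption)
moore = growth_factor(2, 2, YEARS)

print(f"AI compute demand after {YEARS} years: {ai_demand:,.0f}x")
print(f"Moore's Law after {YEARS} years: {moore:,.0f}x")
print(f"Gap between the two: {ai_demand / moore:,.0f}x")
```

Under these assumptions the gap itself compounds: demand outpaces transistor scaling by a factor of 50 in every two-year period, which is why hardware improvements alone are unlikely to close it.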
“What’s even more challenging is that AI models will need to run on embedded devices in the near future,” says Steven Latré, who leads AI research at Imec, and is also a part-time professor at the University of Antwerp.
“You don’t want to go to the cloud for every query. You need to get intelligence into an embedded device, a smartphone or something else at the edge. If you have that type of computational power at the edge, there’s a lot you can do, from automotive to life science applications.”
“The challenge, of course, is power efficiency,” says Latré. “If you brought the current neural networks to an edge device, it would just burn a lot of energy. In fact, it wouldn’t even be possible to run the application on any type of battery that exists today.”
Overcoming the bottleneck
Imec researchers are rising to this challenge with a full-stack approach that involves innovation across the application stack to achieve system technology co-optimisation. Co-optimisation means two or more components are optimised interdependently: improvements are made to one component on the assumption that the other components will be present.
Starting with a good knowledge of the workload of a given application, researchers adapt machine learning algorithms and hardware architecture. Both the software and hardware are tailored to the application. Moreover, the software is optimised to run on specific hardware and vice-versa.
“We are slowly moving away from the general-purpose von Neumann architectures, towards more domain-specific, and even application-specific, architectures – especially when there is a high demand for computing resources and power consumption is high. Example applications are automotive, healthcare and robotics,” says Moeyersoms.
“Artificial intelligence technology runs both in the cloud and on these embedded devices,” she adds. “Each environment needs a specific architecture. We would like to have the power of a ChatGPT on an embedded device. And that will be needed because these edge devices have computational requirements that are increasing exponentially as well.”
The applications that will require embedded AI include complex sensor fusion, where multiple sensors pre-process raw data so the data can be fused with other sensor data to serve intelligent cars, for example. Collaborative robots (cobots) are another example. They need to interact freely with their environment, without having to query a central system. This means they need embedded AI.
“Being able to do the inference as well as the learning on the device itself will be crucial to the success of a lot of applications,” says Moeyersoms.
Imec’s approach to energy-efficient embedded AI is based on intensive hardware and software co-design using an agile development method. Both hardware and software are changed to reach the required levels of performance and power consumption, and, more importantly, the two are finely tuned to one another for a specific application.
But Imec has also been developing expertise in neuromorphic computing – an approach that might make embedded AI more practical.
Applying neuromorphic computing to embedded AI
“The human brain is able to do one exaflop with 20 watts of power,” says Latré. “Compare that with the fastest supercomputer in the world, Frontier at Oak Ridge National Laboratory – to do one exaflop, Frontier needs 20 megawatts. So, in a very basic sense, the human brain is one million times more power efficient than the world’s fastest supercomputer.”
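The arithmetic behind that comparison can be checked in a few lines. This sketch uses only the figures quoted above (one exaflop, 20 watts, 20 megawatts); the variable and function names are illustrative.

```python
EXAFLOP = 1e18  # floating-point operations per second

brain_watts = 20.0      # figure quoted by Latré
frontier_watts = 20e6   # 20 megawatts, figure quoted by Latré


def flops_per_watt(flops: float, watts: float) -> float:
    """Energy efficiency expressed as operations per second per watt."""
    return flops / watts


brain_eff = flops_per_watt(EXAFLOP, brain_watts)
frontier_eff = flops_per_watt(EXAFLOP, frontier_watts)
ratio = brain_eff / frontier_eff

print(f"Brain:    {brain_eff:.1e} FLOPS per watt")
print(f"Frontier: {frontier_eff:.1e} FLOPS per watt")
print(f"Ratio:    {ratio:,.0f}x")
```

Because both systems are credited with the same throughput, the efficiency ratio reduces to the ratio of their power draw: 20 megawatts divided by 20 watts.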
Imec does a lot of research and development on neuromorphic computing, building software and hardware based on what we know about how the human brain works. It already uses neuromorphic chips as the basis for smart sensors.
“We have several types of neuromorphic chips, each supported by a neural network tailored to the type of chip,” says Moeyersoms. “These neuromorphic chips are not general-purpose AI accelerators. Instead, they are designed to process very specific sensory inputs.”
Smart sensors will be an important part of future AI applications and will be implemented with some combination of embedded AI and neuromorphic computing. Rather than process entire images, neuromorphic sensors process only the parts of an image where change has occurred. This and the concept of distributing intelligence are further examples of the way technology can mimic nature.
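The idea of processing only what has changed can be illustrated with a toy software sketch. It compares two small greyscale frames and emits an event only for pixels whose intensity moved by more than a threshold. This is an assumption-laden simplification for illustration: the function name, threshold and frame data are hypothetical, and it is not a description of how Imec’s neuromorphic sensors are built.

```python
from typing import List, Tuple

Frame = List[List[int]]  # greyscale image as rows of pixel intensities


def changed_pixels(prev: Frame, curr: Frame, threshold: int = 10) -> List[Tuple[int, int, int]]:
    """Return (row, col, delta) events for pixels that changed by more than `threshold`."""
    events = []
    for r, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            delta = q - p
            if abs(delta) > threshold:
                events.append((r, c, delta))
    return events


# Hypothetical 3x4 frames: only two pixels differ between them
prev_frame = [[0, 0, 0, 0],
              [0, 50, 50, 0],
              [0, 0, 0, 0]]
curr_frame = [[0, 0, 0, 0],
              [0, 50, 120, 0],
              [0, 0, 0, 90]]

events = changed_pixels(prev_frame, curr_frame)
total = len(curr_frame) * len(curr_frame[0])
print(f"{len(events)} of {total} pixels need processing: {events}")
```

In this example only two of the twelve pixels produce events, so any downstream network has far less data to handle than if it received the whole frame.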
“Biological systems process data at the edge, before sending it to the brain,” explains Ilja Ocket, programme manager for neuromorphic computing at Imec. “They perform perception tasks on the edge, for example.”
“We will need to do the same thing for future applications. You can’t keep sending raw data to the cloud or to some central computer. The sensors will need embedded AI to perform some of the tasks at the edge, and that will only be possible when we bring the power consumption down.”
With a new bag of tricks to bring power consumption down, the future is looking better for AI applications.