When AlexNet, a convolutional neural network, won the ImageNet image recognition competition in 2012, the world became enamoured with the possibilities of artificial intelligence (AI).
For the first time, a neural network powered by graphics processing units (GPUs) performed much faster and more accurately than classical algorithmic approaches. Since then, these networks have been able to recognise objects more accurately than humans can.
“And with more compute and data, we can get even higher performance algorithms in essentially simple programs that are taught to learn from data, creating fundamental breakthroughs,” said Nigel Toon, co-founder and CEO of Graphcore, a UK-based chip startup.
That’s starting to play out in the field of natural language processing (NLP). In May 2019, researchers from Google published a paper on a bidirectional transformer network called Bert, which learns, unsupervised, from corpora of written text, paving the way for improved sentiment analysis.
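The "bidirectional" behaviour underlying transformer networks such as Bert can be illustrated with a toy sketch of single-head self-attention, where every token attends to every other token, to its left and right. This is a minimal NumPy illustration of the general technique, not Google's implementation:

```python
import numpy as np

def self_attention(x):
    """Minimal single-head self-attention: each token's output mixes in
    context from every other position, in both directions."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                    # token-to-token similarity
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over all positions
    return weights @ x                               # context-mixed representations

# Three toy token embeddings of dimension 4 (one-hot for clarity)
tokens = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])
out = self_attention(tokens)  # shape (3, 4)
```

Unlike earlier left-to-right language models, nothing here restricts a token to attending only to preceding positions, which is what lets such models capture the full context of a sentence.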
But as the number of parameters in the models used by such networks increases exponentially, reaching hundreds of millions in some cases, a new computing paradigm is needed.
“We need to move towards compute that’s a lot more sparse to support much larger models,” said Toon at an AI seminar organised by Nanyang Technological University in Singapore. “We need new machines to compute for AI; we cannot use existing CPUs or GPUs to move forward.”
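The sparsity Toon refers to can be made concrete with a rough illustration: in a weight matrix where most entries are zero (here, an assumed ~90%), storing and computing only the nonzeros cuts the work by roughly the same fraction, something conventional dense hardware is poorly placed to exploit.

```python
import numpy as np

rng = np.random.default_rng(0)
dense = rng.random((1000, 1000))
dense[dense < 0.9] = 0.0          # zero out ~90% of the weights

v = rng.random(1000)

# A dense matrix-vector product touches every entry...
dense_ops = dense.size            # 1,000,000 multiply-accumulates

# ...while a sparse representation stores and computes only the nonzeros
rows, cols = np.nonzero(dense)
vals = dense[rows, cols]
sparse_ops = vals.size            # roughly 100,000

result_dense = dense @ v
result_sparse = np.zeros(1000)
np.add.at(result_sparse, rows, vals * v[cols])  # same result, ~10x fewer ops
```

The two results agree; the difference is how much arithmetic and memory traffic each approach needs, which is the gap new AI-specific hardware aims to close.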
Sensing the opportunities presented by the growing interest in the use of AI in industry, Toon and his co-founder Simon Knowles started Graphcore in 2016 to develop what the company calls an intelligence processing unit (IPU).
Toon said the IPU was designed to solve some of the problems that stand in the way of achieving “machine intelligence”.
“Today, when you look at machine intelligence, we’re primarily classifying static data, training systems using huge amounts of labelled data, and deploying systems that do not learn from experience.
“What we really need are machines that will understand the sequences of words, the meaning of those words and the context of the conversation,” he said, using NLP as an example. “Then, they memorise the context of the conversation and use that to understand future conversations.”
To improve processing speed and efficiency, the IPU holds both the machine learning model and its data on the chip itself, rather than in external memory, to minimise latency. “If the memory is far away and very large, we can stream it in later, but during compute, all the data is held inside the processor,” said Toon.
Each IPU has 1,216 specialised cores, each with six independent program threads operating in parallel on the machine learning model. The chip is capable of delivering 125 teraflops of compute power.
“So, rather than look at a model layer by layer, we’re looking across layers,” he said, adding that multiple IPUs can be joined up to solve a problem. “We’re able to understand the context, and we’re able to deal with many more parallel processes across the processors.”
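Taken at face value, the published figures imply the scale of fine-grained parallelism available on each chip (simple arithmetic on the numbers quoted above, not vendor-verified microarchitecture detail):

```python
# Aggregate parallelism per IPU, from the figures quoted in the article
cores_per_ipu = 1216      # specialised cores per chip
threads_per_core = 6      # independent program threads per core

parallel_threads = cores_per_ipu * threads_per_core
print(parallel_threads)   # 7296 independent program threads per IPU
```

It is this large pool of independent threads, rather than the wide lock-step vector units of a GPU, that lets the architecture work across layers and handle many irregular, parallel processes at once.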
Toon said the IPU architecture is also suited to both training and inference, which are often treated as two separate tasks in machine learning today.
“In future, when we have probabilistic systems that learn from experience, those systems will continue to learn from experience after they have been deployed,” he said. “The division between training and inference will disappear.”
On the software side, the IPU supports well-known machine learning frameworks such as TensorFlow, as well as Graphcore’s own graph framework that lets developers build their own library functions and neural network capabilities.
Since its inception, Graphcore has raised over $300m in funding. Toon said the company has “large amounts of cash on our balance sheet” and expects significant revenues next year.