Intel kicked off CES 2017 in Las Vegas with the declaration that Moore’s Law is still relevant as it slated its first 10nm (nanometre) processor chips for release later this year.
Despite this, engineers are facing real issues in how to continue to push system performance to cope with the growing demands of new and emerging datacentre workloads.
This isn’t the first time the end of Moore’s Law has been proclaimed, but Intel and other chip makers have so far found new tricks for shrinking transistors to meet the goal of doubling density every two years, with a knock-on boost for compute performance.
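To put that doubling in perspective, the short sketch below works through the compounding arithmetic. It is only a back-of-envelope illustration: the function name and the starting density of 1.0 are assumptions made for the example, not figures from Intel.

```python
def projected_density(initial_density, years, doubling_period_years=2.0):
    """Project transistor density after `years`, assuming it doubles
    once every `doubling_period_years` (the Moore's Law cadence)."""
    doublings = years / doubling_period_years
    return initial_density * 2 ** doublings


# Relative density after a decade, starting from an arbitrary baseline of 1.0
# (hypothetical starting value, chosen only to show the growth factor).
growth_after_decade = projected_density(1.0, 10)
print(f"Growth factor after 10 years: {growth_after_decade:.0f}x")
```

Five doublings in ten years multiply density 32-fold, which is why even a one-year slip in the cadence is significant.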
Intel chief executive Brian Krzanich said at CES: “I’ve been in this industry for 34 years and I’ve heard the death of Moore’s Law more times than anything else in my career. And I’m here today to really show you and tell you that Moore’s Law is alive and well and flourishing. I believe Moore’s Law will be alive well beyond my career, alive and well and kicking.”
Approaching physical limits
Yet the pace is slowing as Intel works at developing 7nm and 5nm technologies to follow on from 10nm. The introduction of 10nm itself has already been delayed by a year because of difficulties with the manufacturing process, and these difficulties are likely to increase as the size approaches physical limits on how small the on-chip circuitry can be made.
“I can’t see them getting much beyond 5nm, and Moore’s Law will then run out because we will have reached the end of the silicon era,” says Ovum principal analyst Roy Illsley. Some industry observers think this will happen in the next 10 years or so.
As to what will ultimately replace silicon, such as optical processing or quantum computing, there appears to be no consensus so far. However, this does not mean that compute power will cease to expand, as both hardware and software in the datacentre have evolved since the days of single-chip servers and monolithic applications.
“The way apps are written has changed,” says Illsley. “They are now distributed and scalable, so Moore’s Law is a rather pointless metric for what a computer can do, anyway.”
In fact, the industry hit a similar crisis some time ago, when Intel discovered that its single-core chips simply overheated as ever-increasing clock speeds started to approach 4GHz. The solution then was to change tack and deliver greater processing power by using the extra transistors to put multiple processor cores onto the same chip. Comparable architectural shifts will enable the industry to continue to boost processing power.
Such an approach can be seen in the growing interest in complementing conventional central processing units (CPUs) with specialised accelerators that may be better suited to handling specific tasks or workloads. A good example of this is the graphics processing unit (GPU), which has long been used to accelerate 3D graphics. It has also found its way into high-performance computing (HPC) clusters, because its massively parallel architecture makes it excellent for performing complex calculations on large datasets.
In 2016, Nvidia launched its DGX-1 server, which sports eight of its latest Tesla GPUs with 16GB of memory apiece and is aimed at applications involving deep learning and analytics accelerated by artificial intelligence (AI). “Nvidia’s system can do what would have taken a whole datacentre of servers a few years ago, at a pretty competitive price,” says Illsley.
Another example is the field programmable gate array (FPGA), which is essentially a chip full of logic blocks that can be configured to perform specific functions. It provides a hardware circuit that can perform those functions much faster than can be done in software, but which can be reconfigured under software control, if necessary.
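The idea of a configurable logic block can be sketched in software as a lookup table, which is the basic building block of most FPGA logic cells. The class below is a simplified model for illustration only: the names are hypothetical, and a real FPGA is programmed through vendor tools and hardware description languages, not Python.

```python
class LookupTable:
    """Toy model of an FPGA lookup table (LUT): a small memory whose
    contents define which logic function the block computes."""

    def __init__(self, num_inputs, truth_table):
        self.num_inputs = num_inputs
        self.configure(truth_table)

    def configure(self, truth_table):
        # Reconfiguring the block just means loading a new truth table.
        if len(truth_table) != 2 ** self.num_inputs:
            raise ValueError("truth table must have 2**num_inputs entries")
        self.truth_table = list(truth_table)

    def evaluate(self, *inputs):
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        # Treat the input bits as a binary index, first input most significant.
        index = 0
        for bit in inputs:
            index = (index << 1) | (1 if bit else 0)
        return self.truth_table[index]


# Configure a two-input block as an AND gate...
lut = LookupTable(2, [0, 0, 0, 1])
and_result = lut.evaluate(1, 1)

# ...then reconfigure the same block as an XOR gate.
lut.configure([0, 1, 1, 0])
xor_result = lut.evaluate(1, 1)

print(and_result, xor_result)
```

The same block gives a different answer for the same inputs once its table is reloaded, which is the property that lets an FPGA be repurposed under software control.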
One notable adopter of FPGAs is Microsoft, which uses the technology in its Azure datacentre servers to speed up Bing searches and accelerate software-defined networking (SDN).
Intel is also working on integrating FPGA circuitry into some of its Xeon server chips, which could lead to broader adoption. In 2016, the firm showed off a Xeon coupled with a discrete FPGA inside a chip package, but its goal is to get both onto a single piece of silicon.
Meanwhile, Intel prefers to push its Xeon Phi platform rather than GPU acceleration for demanding workloads. These “many integrated core” chips combine a large number of CPU cores (up to 72 in the latest Knights Landing silicon) which are essentially x86 cores with 512-bit vector processing extensions, so they can run much of the same code as a standard Intel processor.
However, one issue with having so many cores on one chip is getting access to data in system memory for all those cores. Intel has addressed this by integrating 16GB of high-speed memory inside each Xeon Phi chip package, close to the CPU cores.
HPE has shown a different approach with The Machine, its experimental prototype for a next-generation architecture. This has been described as memory-driven computing, and is based around the notion of a massive, global memory pool that is shared between all the processors in a system, enabling large datasets to be processed in memory.
A working version, demonstrated at HPE Discover in December 2016, saw each processor directly controlling eight dual inline memory modules (DIMMs) as a local memory pool, with a much larger global pool of memory comprising clusters of eight DIMMs connected via a memory fabric interface that also links to the processors. In the demo, all the memory was standard DRAM, but HPE intends The Machine to have a non-volatile global memory pool.
In fact, focusing on processors overlooks the fact that memory and storage are a bigger brake on performance. Even flash-based storage takes tens of microseconds to read a block of data, during which time the processor could execute hundreds of thousands of instructions. So anything that can speed memory and storage access will deliver a welcome boost to system performance. A number of technologies are being developed, such as Intel and Micron’s 3D XPoint or IBM’s phase-change memory, which promise to be faster than flash memory, although their cost is likely to see them used at first as a cache for a larger pool of slower storage.
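A back-of-envelope calculation shows the scale of the gap. The sketch below assumes a 3GHz core retiring one instruction per clock cycle, and the latency figures are illustrative round numbers chosen for the example, not measurements of any particular product.

```python
def instructions_during_wait(latency_ns, clock_ghz=3.0, instructions_per_cycle=1.0):
    """Estimate how many instructions a core could have executed while
    waiting `latency_ns` nanoseconds for data to arrive.

    A clock of N GHz completes N cycles per nanosecond.
    """
    return latency_ns * clock_ghz * instructions_per_cycle


# Illustrative latencies in nanoseconds (assumed round figures).
assumed_latencies_ns = {
    "DRAM access": 100,
    "flash block read": 100_000,       # 100 microseconds
    "hard disk seek": 10_000_000,      # 10 milliseconds
}

for device, latency_ns in assumed_latencies_ns.items():
    stalled = instructions_during_wait(latency_ns)
    print(f"{device}: ~{stalled:,.0f} instructions")
```

Under these assumptions, the cost of each storage access is counted in hundreds of thousands of lost instructions for flash and tens of millions for disk, which is why faster memory tiers can matter more than faster cores.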
These are being developed alongside new I/O interfaces that aim to make it quicker and easier to move data between memory and the processor or accelerator. Examples include Nvidia’s NVLink 2.0 for accelerators and the Gen-Z standard that aims to deliver a high-speed fabric for connecting both memory and new “storage-class” memory technologies.
One thing Illsley thinks we may see in the future is systems that are optimised for specific workloads. Currently, virtually all computers are general-purpose designs that perform different tasks by running the appropriate software. But some tasks may call for a more specialised application-specific architecture to deliver the required performance, especially if AI approaches such as deep learning become more prevalent.
Moore’s Law, which started out as an observation and prediction by Intel co-founder Gordon Moore about the exponential growth in the number of transistors on integrated circuits, has lasted five decades. We may be reaching the point where it no longer holds true for silicon chips, but whatever happens, engineers will ensure that compute power continues to expand to meet the demands thrown at it.