But there is another way to use flash in datacentres – server-side flash.
To this end, numerous suppliers have developed software that loads server-located flash storage with cache copies of hot data.
But take-up has been limited, which is surprising given the apparent virtues of such software. Here we look at flash-based caching: its pros and cons, industry adoption so far, and our outlook for the technology.
What is server-side caching?
Caching has been part of mainframe, server, PC and storage array architectures for decades.
The concept is to put a copy of hot data into a storage tier that is faster than the others, and often closer to the processors that access the data.
Server-side flash caching software does this by loading frequently-accessed data into server-installed flash drives. This offers a way to solve performance problems for targeted, specific apps. It incurs only limited costs, and – very importantly – involves little or no disruption to existing infrastructures, and no data migration into new arrays.
Because flash is in the servers that need to access data, the performance boost is potentially much greater than from flash drives in hybrid or all-flash arrays that are on the other side of latency-inducing storage networks. Common applications have included databases, and storage for virtual desktops and servers.
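The core idea described above – keep a copy of hot data on the fastest tier and fall back to slower back-end storage on a miss – can be sketched as a simple read-through LRU cache. This is a hypothetical illustration of the technique, not any vendor's implementation:

```python
from collections import OrderedDict

class ReadThroughCache:
    """Minimal read-through LRU cache: hot blocks live in fast
    storage (the dict); misses fall back to the slow back end."""

    def __init__(self, backend, capacity):
        self.backend = backend          # authoritative copy of the data
        self.capacity = capacity        # how many blocks fit in flash
        self.cache = OrderedDict()      # block id -> data, in LRU order

    def read(self, block_id):
        if block_id in self.cache:              # cache hit: fast path
            self.cache.move_to_end(block_id)    # mark as recently used
            return self.cache[block_id]
        data = self.backend[block_id]           # cache miss: slow path
        self.cache[block_id] = data             # populate the cache
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)      # evict least recently used
        return data

backend = {n: f"block-{n}" for n in range(100)}
cache = ReadThroughCache(backend, capacity=3)
cache.read(1); cache.read(2); cache.read(3)
cache.read(1)                # hit: block 1 becomes most recently used
cache.read(4)                # miss: evicts block 2, the least recently used
print(sorted(cache.cache))   # [1, 3, 4]
```

The master copy in `backend` is never moved, only copied from – which is why deployment of real caching products involves no data migration.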
Where flash is used in the datacentre
In the latest survey of enterprise storage professionals conducted by 451 Research's TheInfoPro service, 21% of respondents named performance as a storage pain point.
This is by far the biggest reason for the widespread and growing use of flash in datacentres. Flash storage avoids the complications of trying to tune disk storage (by short-stroking) to boost performance, and in any case is now reaching price parity with high-speed disk.
The most common way to use flash in datacentres is within hybrid disk and flash storage arrays. The Wave 18 survey found that 67% of respondents use flash in hybrid arrays, and a further 13% plan to do so within the next 18 months.
The second approach is to use flash within all-flash arrays. The survey found that 8% of respondents use flash in all-flash arrays, with a further 19% expecting to do so within the next 18 months.
The third way to use flash is in flash drives and PCIe cards installed in servers. Server-side flash drives or PCIe cards can be used to store entire data volumes or files containing, for example, complete databases or operating systems.
Although this has proved popular as a straightforward way to boost performance, the amount of flash required to hold entire datasets has restricted the approach to smaller workloads, such as small databases, parts of databases and operating systems.
The other type of data that can be stored in server-side flash is a cache: a copy of frequently accessed hot data. This subset is copied from the master copy, which continues to reside in back-end storage attached to the server over a SAN or other storage network.
This data is identified and copied into server-side drives by flash caching software installed on servers. This can be a low-cost but very effective way to boost performance for specific, targeted apps, such as databases and virtual desktops.
The Wave 18 survey found 25% of respondents used server-side flash, while 40% of those respondents – or 10% of the total survey – reported that their organisations use server-side flash in conjunction with caching software.
Caching: pros and cons
There are multiple advantages to server-side flash caching architecture:
- Minimal capital expenditure: Because server-side caching only requires enough flash capacity to store a small subset of hot data, packages of caching software and flash drives costing well under $10,000 per server can provide a huge boost in performance for multiple apps.
- Very high performance in terms of low latency: Server-side caching can cut latency by an order of magnitude or more compared with array-based flash. Standalone storage systems are connected to servers via SANs or other storage networks that introduce performance-sapping latency. Flash drives installed inside servers do not suffer that extra latency. While latency from all-flash arrays usually averages one millisecond or less, PCIe flash card latencies are measured in microseconds.
- Very rapid and non-disruptive deployment: The master copy of data continues to be held in back-end storage, which eliminates the difficulties of migrating data to a new all-flash or hybrid array and then hooking up that array to disaster-recovery and backup systems. It also eliminates the problem that any new array is likely to introduce a separately-managed storage silo.
However, as with any engineering solution to a problem, there are also drawbacks to server-side flash caching:
- Not all customers believe they need extremely low latency: Although caching product suppliers and all-flash array suppliers stress the low latencies of their products, enterprises are apparently more interested in performance measured in IOPS, according to the Wave 18 survey.
- Latency is not as consistent as it is in all-flash arrays: Even the best caching algorithms cannot predict with 100% certainty which data will be accessed by an application. When caching algorithms fail to make the right prediction, a “cache miss” occurs, and data must be read from back-end storage, hugely increasing latency for that particular I/O operation, often to levels well above that of an all-flash array, because the back end for server-side caching will usually be disk.
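The impact of cache misses on average latency can be seen with a back-of-the-envelope calculation. The figures below are illustrative assumptions, not measured vendor numbers: 100 microseconds for a flash cache hit, 5 milliseconds for a disk-backed miss:

```python
def effective_latency_us(hit_ratio, hit_us, miss_us):
    """Average I/O latency for a cache with the given hit ratio."""
    return hit_ratio * hit_us + (1 - hit_ratio) * miss_us

# Illustrative assumptions: 100 us flash hit, 5,000 us disk-backed miss.
for hit_ratio in (0.99, 0.95, 0.80):
    avg = effective_latency_us(hit_ratio, hit_us=100, miss_us=5000)
    print(f"hit ratio {hit_ratio:.0%}: average latency {avg:.0f} us")
```

Even at a 99% hit ratio, the rare disk-backed misses add more to the average than all the hits combined, and at 80% the average climbs above the millisecond latency of a typical all-flash array.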
- The performance boost is not guaranteed for all applications: Some apps have data access patterns that are hard for caching algorithms to identify, and these non-cache-friendly apps will suffer frequent cache misses. The size of the hot or frequently accessed data – also known as the working set – also varies between apps. Because of these factors, application I/O profiles must be understood before the performance benefit can be predicted.
- Support for virtualised servers can be limited, depending on the sophistication of the caching software.
Perceived coherency issues
Caching software makers claim their products require no changes to existing storage environments. This is not quite true. Read-only caches will contain invalid data whenever the master copy of data held in the back-end array is rolled back in time or restored to an earlier array-based snapshot.
For read and write caching, this problem of maintaining data coherency or consistency extends into array-based snapshot creation, as well as array-to-array replication of snapshots for disaster recovery.
Caching vendors are unanimously adamant that these problems can be solved easily by automatically flushing, or emptying, a cache before snapshots are taken or restored. One company points out that servers and storage arrays have automatically flushed their DRAM caches in the same way for many years to ensure snapshot coherency, with no problems for customers.
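The flush-before-snapshot workflow the vendors describe can be sketched as follows. The names and the write-back behaviour here are hypothetical simplifications; real products automate this coordination:

```python
class WriteBackCache:
    """Toy write-back cache: writes land in server-side flash first
    and are destaged to the back end later, so the back-end copy
    can be stale until a flush."""

    def __init__(self, backend):
        self.backend = backend
        self.dirty = {}                 # blocks written but not yet destaged

    def write(self, block_id, data):
        self.dirty[block_id] = data     # acknowledged from flash, not disk

    def flush(self):
        self.backend.update(self.dirty) # destage all dirty blocks
        self.dirty.clear()

def take_coherent_snapshot(cache):
    """Flush the server-side cache before the array snapshot so the
    snapshot captures every acknowledged write."""
    cache.flush()                       # back end is now up to date
    return dict(cache.backend)          # stand-in for the array snapshot

backend = {1: "old"}
cache = WriteBackCache(backend)
cache.write(1, "new")                   # back end still holds "old"
snapshot = take_coherent_snapshot(cache)
print(snapshot[1])                      # new
```

Without the flush, the snapshot would capture the stale back-end copy – which is exactly the coherency problem the survey respondents were worried about.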
Nevertheless, 56% of respondents that answered a question about flash cache coherency in the Wave 18 survey indicated the issue was extremely or very important. This does not mean vendors cannot fully eliminate enterprise concerns, but it does show that the issue is top of mind for many potential buyers.
This is an advantage for incumbent suppliers of back-end storage systems that have branched out into server-side flash caching. They can more easily soothe worried customers by arguing that because they provide caching software and the back-end array, they are in the best position to ensure there will be no data coherency issues.
A promised future of cooperative caching
Some incumbent storage providers have said that in future their arrays will cooperate more deeply with server-side caches, so that the server-side cache becomes an extension of the array, specifically of its caching and tiering mechanisms.
This might involve the array sending “hints” to the caching software about which data is best to cache. As yet, these plans have not been implemented. If they are, they could give incumbent vendors an advantage over others, depending on the effectiveness of the array cooperation and hints.
However, support for server-side flash caching in general is not strong among incumbent storage providers. 451 Research believes that while caching technology may eventually establish a solid customer base, the current adoption rate suggests it will struggle to become as widely implemented as the other major ways of using flash in the datacentre.
Simon Robinson is research vice president and Tim Stammers is senior analyst at 451 Research.