Open source storage: It’s an idea that makes so much sense. After all, the storage systems most of us buy simply comprise a bunch of disks with proprietary controller software on top. Such disk systems account for the largest chunk of most storage budgets, and a proprietary system locks buyers into their vendor’s roadmap and support structure.
Open source storage offers a potential way out of this Faustian accord. Unlike in a fully commercial product, the controller software is open source. This doesn’t necessarily mean it’s free, but it can be, or nearly so. And unless you choose to buy a preconfigured system from an open source vendor, you are free to build your storage with commodity drives. Whatever you do with open source, it’s likely to cost you far less than proprietary storage and offer flexibility that you wouldn’t get by striking a pact with a fully commercial vendor.
The fundamental concept of open source products is that the development community produces the software and opens the source code to anyone who wants it. If companies want to develop products incorporating that code, copyleft licences such as the GPL (General Public License) require them to make their alterations to that source code available too when they distribute the result. Not every open source licence imposes this condition.
Consequently, while there's no restriction on selling open source software, the most common business model is to provide the software free and sell a package of support services on top. Linux OS distributor Red Hat is probably the most successful and widely known company to rely on this business model.
The benefits of open source storage are the same as those of other open source categories. If you need a new feature or modifications to the product, you can suggest them to the developers or develop new features yourself.
With open source storage, you may not get the latest enterprise features, such as replication, data deduplication or thin provisioning, although these are found in an increasing number of open source products. But what open source storage will deliver is reliable, high-performing software, together with the ability for users to tailor its configuration to the precise needs of their situation.
Vendors and developers include Nexenta, whose NexentaStor is based on OpenSolaris, and FreeNAS, whose product is based on FreeBSD. Both offer community and enterprise editions of the product -- the latter requires a fee and includes a support package -- and both utilise Sun's open source file system ZFS. Both sets of products are aimed at the NAS market, share files using CIFS and FTP, and will attach using NFS and iSCSI. Openfiler, a Linux-based open source storage system, offers similar features.
All the above include snappy GUIs that help to simplify and automate management tasks. However, support for Apple's file-sharing protocol, AFP, lags: neither Nexenta nor Openfiler supports it, although Nexenta is considering it.
If brand name familiarity is important, Red Hat offers a fully featured version of Linux in the form of its Enterprise Linux Advanced Platform, and although this is not a storage-specific implementation, it has a file system and storage management elements.
Having downloaded and installed open source storage software, you will need to marry it to the disk hardware. Some open source suppliers do this for you. In the case of FreeNAS, iXsystems, the commercial sponsor of the software, can provide a hardware platform with FreeNAS installed.
Open source storage projects plug the gap between simplified, home-user products from companies such as Iomega and NetGear and enterprise-level systems from companies such as EMC and NetApp. Open source storage generally does not compete in large enterprises, where storage systems are complex and where the cost of acquisition is dwarfed by the need for and cost of high-end features such as global management, single namespace and active failover.
An exception is Gluster, which has developed an open source, distributed file system for managing huge data stores in data centres; the company claims the file system offers high performance and can scale quickly to hundreds of petabytes using a single namespace. The system is designed to run on and manage large numbers of nodes based on inexpensive hardware.
Nexenta planned to launch similar technology with its Namespace Cluster product in August 2011. The product is intended to offer a single namespace and management for up to 256 machines.
Generally speaking, while open source storage is behind open source server software in terms of its evolution, there seem to be few technological barriers to its future success. Server software has already blazed the trail for open source software in general, so any hurdles are likely to be technological rather than a matter of whether open source is acceptable.
Open source storage at Vesk
Vesk, a provider of hosted virtual desktops, uses Nexenta's open source storage software to deliver services to its customers. According to James Mackie, Vesk's technical director, "People need high-speed desktops as they use them all day, so they need high performance. The performance comes from storage—it's about IOPS on the disk rather than the servers."
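Mackie's point about IOPS can be illustrated with a back-of-envelope sizing calculation. The sketch below is not Vesk's method; the function name `max_desktops` and every figure in it (array IOPS, per-desktop IOPS, headroom) are hypothetical values chosen only to show the arithmetic.

```python
def max_desktops(array_iops: int, iops_per_desktop: int, headroom_pct: int = 30) -> int:
    """Estimate how many virtual desktops a storage array can serve.

    All inputs are illustrative assumptions, not figures from the article.
    headroom_pct is the share of IOPS held in reserve for peaks such as
    login storms; integer arithmetic avoids floating-point rounding.
    """
    if array_iops < 0 or iops_per_desktop <= 0:
        raise ValueError("IOPS figures must be positive")
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in the range 0-99")
    usable_iops = array_iops * (100 - headroom_pct) // 100
    return usable_iops // iops_per_desktop


# Hypothetical array delivering 5,000 IOPS, 20 IOPS per desktop, 30% reserve.
print(max_desktops(array_iops=5000, iops_per_desktop=20))  # prints 175
```

The calculation suggests why a desktop service can hit a ceiling at a given user count regardless of server capacity: once the per-desktop IOPS demand multiplied by the number of users exceeds what the disks deliver, adding servers does not help.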
Vesk has about 150 TB under its command in three JBODs attached using LSI enclosures and host bus adapters (HBAs), with two of them mirrored using ZFS RAID-Z and the other as a hot standby. "It gives us about 100 TB of usable storage," Mackie said, adding that the architecture is fairly simple and expansion is easy: Vesk just buys extra disk capacity.
"Each JBOD is connected by SAS so the head node can fail and you get an eight-second failover," Mackie said. "The VMs [containing the virtual desktops] all remain logged in and working. We use SSDs for ZFS' read and write cache, all mirrored in each JBOD so anything can fail, including the power supply unit, and you still have the data and the performance."
Mackie said Vesk had tried systems from Dell EqualLogic but they were not fast enough. "With more than 150 users, it hit performance snags," he said. For Mackie, the other problem was that the Dell system was "a lot more expensive, and it didn't do what we needed. We learned the hard way that we had to do everything ourselves."
Going open source while aiming for zero downtime meant testing for reliability, performance and security, with configuration done using the command line. "Nexenta command lines are very powerful, but you can just type help and the command, and it tells you how to set it up," Mackie said.
There are drawbacks, of course. "You need to manage it more closely than if you had a NetApp or EqualLogic," Mackie said. "The tools in Nexenta are great, but we have to manage disk failure, and we need to understand how the high availability works. There's just more work that has to go in with open source."
"For example, are you prepared to manage disks, JBODs, software, applications? That all needs to be thought about and managed. Everything else is a benefit, and the benefits outweigh the drawbacks. We have four third-line guys who all know how the storage works. All of us understand that and have documented it and how to fix it."
Mackie said it would be possible to bring in outside consultants to manage the systems but that this would raise costs to the level of proprietary systems. "We build our own storage," he said. "Moving away from that would not be good for us."
Oxford Archaeology goes open source
Oxford Archaeology is an independent archaeology and heritage practice with about 350 specialist staff and permanent offices in Oxford, Lancaster and Cambridge.
CIO Chris Puttick said the company uses open source storage for its flexibility, features and lower costs. For Puttick, that applies particularly to TCO, including exit costs. "Support was the main requirement—it had to be clear where third-party support could be sourced from and that there were multiple sources. The quality of available documentation was also considered before choosing the technology."
The company uses two open source storage technologies. One is a 3 TB Oracle Solaris OpenStorage-based system, which manages a Sun Storage J4200 chassis; the other is an Ubuntu Linux-based solution managing 24 TB of expandable storage built on Dell PowerVault MD1000 hardware. Puttick’s team members put the systems together themselves and maintain them without outside help.
"The Linux-based solution is the future and has the capacity for growth. It's flexible; it can deliver the data volumes as iSCSI or as SMB shares, among other options," he said. Puttick plans to convert the system to BTRFS (B-tree File System) once it is stable. "This will give feature parity or better with proprietary solutions, with the added advantage of many of the features being at a file system level, and thus the storage layer itself becomes portable to other front ends," he said.
Open source software manages most of the company's storage and provides backup targets. "Over the coming 12 months, all data will be migrated to the main 20 TB file store, and an asynchronous replica set up in a branch office to provide business continuity," Puttick said. "Replication will be asynchronous because of bandwidth limitations, and we estimate [a delay of] a few hours … at peak creation and churn times."
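The relationship between bandwidth and replication delay that Puttick describes can be sketched as a simple estimate. This is an illustration only: the function name `replication_lag_hours`, the 20 GB backlog, the 20 Mbit/s link and the 80% link efficiency are all assumptions, not Oxford Archaeology's figures.

```python
def replication_lag_hours(backlog_gb: float, link_mbps: float,
                          efficiency: float = 0.8) -> float:
    """Hours needed to ship a backlog of changed data to a remote replica.

    backlog_gb  -- changed data waiting to be replicated (decimal gigabytes)
    link_mbps   -- nominal link speed in megabits per second
    efficiency  -- assumed fraction of the link usable for replication
    """
    if backlog_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid backlog, link speed or efficiency")
    backlog_megabits = backlog_gb * 8 * 1000  # GB -> megabits
    seconds = backlog_megabits / (link_mbps * efficiency)
    return seconds / 3600


# Hypothetical peak: 20 GB of new and changed files over a 20 Mbit/s link.
print(f"{replication_lag_hours(backlog_gb=20, link_mbps=20):.1f} hours")  # prints 2.8 hours
```

With these assumed inputs the replica trails the main store by just under three hours, which is the order of magnitude Puttick cites for peak creation and churn times.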
The benefits of open source storage for Puttick are that feature parity "is more or less already there, and costs are considerably lower. Additionally, open source can survive the financial collapse of a given supplier, whereas closed source software does not." The main drawback is that "the solution is not fronted by a single shiny GUI, so each element has to be configured separately, which requires the staff involved to be of the more capable variety," he said.
The challenge for open source generally is one of brand recognition. "Without the sales and marketing money available to open source storage vendors, buyers tend to not recognise it as valid," Puttick said. "Instead of properly assessing solutions available to them, they tend to ignore those they do not know."