The French Alternative Energies and Atomic Energy Commission (CEA) has chosen object storage software from Scality for a testbed project to develop a storage architecture for its high-performance computing (HPC) operations.
The CEA hopes to incorporate Scality into the storage architecture of its Military Applications Division (DAM). The benefits it seeks are independence from specific storage hardware and more efficient throughput from disk media.
The CEA/DAM site near Paris carries out multi-petaflop research, with processing spread across hundreds of thousands of CPUs and tens of petabytes of storage capacity.
Currently, its storage resources comprise DDN for bulk storage and faster Xyratex file- and block-based storage using the Lustre parallel file system.
The DAM has commenced a test programme using Scality's Ring distributed object-based software-defined storage, with a view to replacing file and block storage for HPC operations by 2020.
As well as gaining the ability to use any storage hardware – Scality is a software-based product – it hopes to achieve much greater input/output (I/O) efficiency and improve bandwidth to physical storage, said Jacques-Charles Lafoucrière, head of the scientific computing complex department at CEA.
“Based on access via block and file models we get bandwidth efficiency of 20% to 30% from spinning disk drives. We think if we move to the object storage interface that will increase to nearer 100%,” he said.
The inefficiencies arise, said Lafoucrière, from the Posix interface and the tree-like file structures of block and file storage, which prevent the application and the storage from getting the best from the hardware.
“If we go to an object interface it will be able to handle objects of hundreds of megabytes and carry out I/O optimisation and not have to manage the complexity of data and metadata,” he said.
“File- and block-access storage has limitations, from its tree-like structure and sets of data in multiple directories that must be collected and must follow the Posix semantics. This results in problems scaling and with hundreds of thousands of CPUs going to the same set of data.”
Lafoucrière added: “The object storage interface interacts with the I/O load at a higher level than with file and block. It doesn't see small blocks but the whole object at one time. File and block aren't built for many parallel streams at one time. With Posix the lower level mechanism can get in the way. Object storage is more flexible in terms of size and interaction with the data.”
Scality Ring software runs on commodity hardware and uses an object storage core to scale as a single distributed system across multiple sites and potentially thousands of standard x86 servers. Its architecture provides concurrent access to data.
Scality and other object storage vendors use representational state transfer (Rest) interfaces over HTTP to store very large amounts of data in a flat address space, where each object is located by a unique identifier and described by its metadata rather than by a directory path.
This contrasts with traditional file systems, which use a tree-like hierarchical structure. That hierarchy limits file systems because performance overheads increase as they grow into millions and billions of files.
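The contrast between a flat object namespace and a hierarchical file tree can be illustrated with a minimal Python sketch. It is a toy model only, not Scality's implementation: the class names, the use of a SHA-256 content hash as the object key and the step counter are all assumptions made for illustration.

```python
import hashlib


class FlatObjectStore:
    """Toy flat namespace: every object is reached by one key lookup."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, metadata: dict) -> str:
        # Assumption: the key is a content hash. Real products use their own key schemes.
        key = hashlib.sha256(data).hexdigest()
        self._objects[key] = (data, dict(metadata))
        return key

    def get(self, key: str):
        # A single dictionary lookup, however many objects are stored.
        return self._objects[key]


class TreeFileSystem:
    """Toy hierarchical namespace: each path component is resolved in turn."""

    def __init__(self):
        self._root = {}

    def write(self, path: str, data: bytes) -> None:
        parts = path.strip("/").split("/")
        node = self._root
        for name in parts[:-1]:
            node = node.setdefault(name, {})
        node[parts[-1]] = data

    def read(self, path: str):
        parts = path.strip("/").split("/")
        node = self._root
        steps = 0
        for name in parts:
            node = node[name]  # one lookup per directory level
            steps += 1
        return node, steps


store = FlatObjectStore()
key = store.put(b"simulation output", {"project": "demo"})
data, metadata = store.get(key)

fs = TreeFileSystem()
fs.write("/runs/2015/job42/output.dat", b"simulation output")
contents, lookups = fs.read("/runs/2015/job42/output.dat")
```

In this sketch the tree read needs one lookup per path component, while the flat store needs a single lookup regardless of how the data is organised. Real file systems add locking and metadata consistency on top of that traversal, which the sketch does not attempt to model.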
Object storage aims to sidestep these performance difficulties with its flat structure. It protects data using a form of Reed-Solomon erasure coding, which splits objects into fragments and stores them in different places. The method can reconstruct the original data if some of the fragments are lost.
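The principle of splitting an object into fragments and rebuilding it after a loss can be sketched in Python. This example deliberately swaps in single XOR parity, the simplest erasure code, in place of Reed-Solomon: it tolerates only one lost fragment, whereas Reed-Solomon schemes can tolerate several. The function names and fragment count are hypothetical.

```python
def split_with_parity(data: bytes, k: int) -> list:
    """Split data into k equal chunks plus one XOR parity chunk."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = bytearray(chunk_len)
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return chunks + [bytes(parity)]


def rebuild(fragments: list, original_len: int) -> bytes:
    """Reconstruct the data when at most one fragment is missing (None)."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can recover only one lost fragment")
    fragments = list(fragments)
    if missing:
        chunk_len = len(next(f for f in fragments if f is not None))
        recovered = bytearray(chunk_len)
        # XOR of all surviving fragments equals the missing one.
        for frag in fragments:
            if frag is not None:
                for i, byte in enumerate(frag):
                    recovered[i] ^= byte
        fragments[missing[0]] = bytes(recovered)
    # Drop the parity chunk and the padding.
    return b"".join(fragments[:-1])[:original_len]


original = b"hello object storage"
fragments = split_with_parity(original, 4)  # 4 data fragments + 1 parity
fragments[2] = None                         # simulate a lost fragment
restored = rebuild(fragments, len(original))
```

In a distributed system each fragment would sit on a different server or disk, so the failure of one device leaves enough fragments to rebuild the object.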
“We chose Scality because we think it is a promising technology and object storage will allow us the flexibility to provide the best I/O for our storage system,” said Lafoucrière.
“Our main interest in object storage is to replace file and block access and we expect it to get close to 100% bandwidth usage on our storage devices, or at least 80% or 90%.”