Total shuns flash storage for 17PB SGI HPC supercomputer

Oil company Total unveils 17PB €60m supercomputer based on short-stroked Sata drives and tape. Flash storage rejected as too costly until the price comes down

French oil company Total has implemented a €60m 17PB high-performance computing (HPC) platform using SGI compute and storage hardware.

Total rejected flash storage for the project on cost grounds and is instead relying on roughly 1,100 Sata drives spinning at 7,200rpm, plus tape.

The SGI HPC platform is the latest in a line of Cray and SGI supercomputers at Total, stretching back to 1983, that have been used to process data from the company’s oil exploration work.

Total’s investment in the new supercomputer – called Pangea – will allow researchers to develop 3D visualisations of seismic landscapes and run simulations at 10 times the resolution of existing oil and gas reservoir models.  

The previous SGI platform, installed in 2008, delivered 0.5PFlop (half a quadrillion floating point operations per second) by 2010, but had become too slow for the volume and complexity of exploration data in use.

“We can now complete the processing of a seismic survey in nine days that would have taken four-and-a-half months on the previous platform,” said Philippe Malzac, CIO for exploration and production at Total. “This is because there is more data, we use a more sophisticated algorithm and we are modelling to a higher resolution.”

After an evaluation that also considered products from IBM, Cray, Bull, HP and Fujitsu, Total implemented a water-cooled SGI ICE X HPC platform with SGI InfiniteStorage in a tiered storage arrangement. The project spanned 18 months.

Total's Pangea supercomputer

The supercomputer – claimed by Total to be the world’s largest commercial system – is based at the company’s Jean Feger Scientific and Computing Centre at Pau in France. It is a 2.3PFlop system based on Intel Xeon E5-2670 processors, totalling 110,592 cores with 442 terabytes of memory and using the Lustre file system.
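As a rough sanity check on those figures, the sketch below estimates theoretical peak performance and memory per core. The core count, memory size and quoted PFlop figures come from the article; the 2.6GHz clock speed and the eight double-precision operations per core per cycle are assumptions about the Xeon E5-2670 that the article does not state, so treat the result as an estimate rather than a vendor specification.

```python
# Figures quoted in the article
CORES = 110_592
MEMORY_TB = 442
QUOTED_PFLOPS = 2.3
PREVIOUS_PFLOPS = 0.5

# Assumptions, NOT from the article
CLOCK_GHZ = 2.6        # assumed base clock of the Xeon E5-2670
FLOPS_PER_CYCLE = 8    # assumed double-precision flops per core per cycle


def peak_pflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in PFlop: cores x cycles per second x flops per cycle."""
    return cores * clock_ghz * 1e9 * flops_per_cycle / 1e15


def memory_per_core_gb(memory_tb: float, cores: int) -> float:
    """Memory per core in GB, using decimal units (1TB = 1,000GB)."""
    return memory_tb * 1000 / cores


if __name__ == "__main__":
    peak = peak_pflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
    print(f"Estimated peak: {peak:.2f}PFlop (quoted: {QUOTED_PFLOPS}PFlop)")
    print(f"Memory per core: {memory_per_core_gb(MEMORY_TB, CORES):.1f}GB")
    print(f"Ratio to previous platform: {QUOTED_PFLOPS / PREVIOUS_PFLOPS:.1f}x")
```

Under these assumptions the estimate lands close to the quoted 2.3PFlop, with about 4GB of memory per core and roughly 4.6 times the peak performance of the previous platform.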

Storage for the Pangea platform – which has a total cost of ownership of €60m over four years – begins with a 4PB tape library into which raw seismic data is first ingested. From there, data is moved via a 10Gbps link to an intermediate tier of 500 Sata drives (7,200rpm) totalling 6PB, which supports processing operations.

As data is processed, it is written to a 600-drive 7PB “scratch” storage Sata tier which affords rapid access – via a 300Gbps Infiniband connection – to existing datasets, and provides a mirror to the intermediate storage for data protection purposes.
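The tier sizes and link speeds above can be put side by side in a short sketch. The capacities, drive counts and link speeds are from the article; the `Tier` class and helper names are hypothetical, and the transfer times assume decimal units (1PB = 10^15 bytes) and a fully saturated link with no protocol overhead, which real transfers would not achieve.

```python
from dataclasses import dataclass

BYTES_PER_PB = 10**15  # decimal petabyte (assumption)


@dataclass(frozen=True)
class Tier:
    name: str
    capacity_pb: float
    drives: int  # 0 for the tape library, which has no drive count in the article


# Tiers as described in the article
TIERS = [
    Tier("tape library", 4, 0),
    Tier("intermediate Sata", 6, 500),
    Tier("scratch Sata", 7, 600),
]


def total_capacity_pb(tiers) -> float:
    """Sum of the capacities of all tiers, in PB."""
    return sum(t.capacity_pb for t in tiers)


def transfer_days(capacity_pb: float, link_gbps: float) -> float:
    """Best-case days needed to move capacity_pb over a link of link_gbps."""
    bits = capacity_pb * BYTES_PER_PB * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 86_400


if __name__ == "__main__":
    print(f"Total capacity: {total_capacity_pb(TIERS)}PB")
    print(f"Sata drives: {sum(t.drives for t in TIERS)}")
    print(f"4PB over 10Gbps: {transfer_days(4, 10):.1f} days")
    print(f"7PB over 300Gbps: {transfer_days(7, 300):.2f} days")
```

The three tiers add up to the 17PB headline figure. Even in this best case, moving the full contents of the tape library over the 10Gbps link would take weeks, whereas the 300Gbps Infiniband connection could in principle stream the whole scratch tier in a couple of days.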

Flash storage too expensive

Total decided not to go for flash storage to provide rapid access to data. Instead it has used large numbers of the slowest type of spinning disk – 7,200rpm Sata – “short stroked” and connected via a very high-speed network to the compute element of the platform. Short stroking means drives are not filled to capacity, so data resides only on the outer tracks of the platters, where it can be accessed most quickly.
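The effect of short stroking on head movement can be illustrated with a toy model. This is a simplified sketch, not a description of Total's configuration: it assumes requests land uniformly at random within the used portion of the head's travel, and the 30% stroke fraction is a hypothetical value, since the article does not say how much of each drive is used. Real seek time is also not strictly proportional to distance, and rotational latency is unaffected by short stroking.

```python
import random


def mean_seek_fraction(stroke_fraction: float) -> float:
    """Mean seek distance as a fraction of full stroke.

    For two positions drawn uniformly from [0, f], the expected
    distance between them is f / 3.
    """
    if not 0 < stroke_fraction <= 1:
        raise ValueError("stroke_fraction must be in (0, 1]")
    return stroke_fraction / 3


def simulated_mean_seek_fraction(stroke_fraction: float,
                                 samples: int = 200_000,
                                 seed: int = 0) -> float:
    """Monte Carlo check of the closed-form result above."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        start = rng.uniform(0, stroke_fraction)
        end = rng.uniform(0, stroke_fraction)
        total += abs(start - end)
    return total / samples


def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: half a revolution, in milliseconds."""
    return 60_000 / rpm / 2


if __name__ == "__main__":
    hypothetical_fraction = 0.3  # assumed, not from the article
    print(f"Full stroke mean seek: {mean_seek_fraction(1.0):.3f}")
    print(f"Short stroke mean seek: {mean_seek_fraction(hypothetical_fraction):.3f}")
    print(f"Simulated: {simulated_mean_seek_fraction(hypothetical_fraction):.3f}")
    print(f"Rotational latency at 7,200rpm: {avg_rotational_latency_ms(7200):.2f}ms")
```

In this model, restricting the heads to 30% of their travel cuts the average seek distance from a third of the full stroke to a tenth, while the average rotational latency of a 7,200rpm drive stays at about 4.2ms.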

Malzac said flash drives are too expensive now, but Total would consider them in the future.

“For the time being, flash is not more cost-effective for us than spinning disk – 7PB of flash is going to cost a fortune,” he said. “Of course, it’s more efficient, but it comes at a very high cost. But as the cost decreases it will be an option for us in future.”

