Storage Is Key to HPC
Often overlooked in the bandolier of high-performance computing components is storage, where speed equals results and reliability equals success.
August 1, 2013
EMC Isilon Platform Nodes and Accelerators offer scale-out network-attached storage that promises to increase performance for file-based data applications and workflows.
Not all storage is created equal. Different types of storage solutions suit different types of applications. Vendors of storage technology are keenly aware of those nuances and are striving to offer solutions that can balance speed against scale. One of the more important elements affecting HPC storage today is the increased adoption of Lustre, a parallel distributed file system generally used for large-scale cluster computing. Lustre derives its name from the combination of Linux and cluster, is commonly used with supercomputers, and is highly scalable.
Lustre can support multiple compute clusters with tens of thousands of client nodes, tens of petabytes (PB) of storage on hundreds of servers, and more than a terabyte per second (TB/s) of aggregate I/O throughput. That makes the Lustre file system a popular choice for businesses with large data centers.
Of course, there are multiple paths for delivering stored data to HPC solutions. The type of HPC environment in use normally dictates those paths. For example, a single HPC cluster or workstation may benefit most from local storage, in the form of cache cards and internal NAND flash/solid-state drive (SSD) options. Other environments, such as those used for big-data analytics, where multiple systems participate in the parsing of data, normally leverage SAN- or NAS-based file systems that distribute data across multiple machines.
When it comes to HPC and storage, there is no one-size-fits-all solution, although vendors are developing platforms that can be used across as many scenarios as possible.
The Vendor Battlefield
Dozens of vendors compete in the field of high-performance storage solutions, many offering their own take on how storage should interact with HPC jobs. Some of those differences are subtle, while others entail a paradigm shift that reinterprets how data flows to and from storage devices on the network.
Take, for example, EMC, which offers its Isilon storage platform for HPC environments. EMC’s Isilon is designed as a NAS solution. The advantages offered by NAS over SAN are debatable; however, in Isilon’s case, NAS hits its stride in scalability (as much as 15 petabytes per cluster), speed (more than 100 gigabytes per second throughput) and flexibility (data replication, failover, failback, hot-swap drives).
Isilon natively supports the Hadoop Distributed File System (HDFS) and offers support for many industry standard protocols.
The Panasas ActiveStor parallel storage systems are available in a variety of configurations, including shelves and racks.
EMC competitor NetApp offers several options under its E-Series Storage Platform, the latest of which is the NetApp E5500, a modular storage system that is available in either 2U or 4U form factors, with the ability to hold as many as 60 drives, for a stacked total of 360 drives. The E5500 series sports eight 6Gb/s serial attached small computer system interface (SAS) ports and four 40Gb/s InfiniBand ports, and uses a SAN approach for connectivity into the network. NetApp says the E5500 Series offers performance 2.5 times faster than competitors, and reports the platform can deliver 8,855.70 SPC-2 MB/s. (SPC-2 is the latest tested methodology used by the vendor-neutral Storage Performance Council.)
Other hardware manufacturers, such as Supermicro, also offer storage solutions for the HPC market. The company offers several products, many of which are adaptable to HPC storage needs. Take, for example, the company’s 3U rackmount SuperServer series, which can house 16 3.5-in. hard drives. Supermicro’s storage unit features multiple network connectivity controllers, such as dual 10 Gigabit Ethernet interfaces. Supermicro’s storage solutions allow buyers to custom configure the servers and integrate them into the HPC network. Other offerings from Supermicro include a product line of SuperStorage Solutions, which are available in stackable 2U, 3U and 4U form factors.
Industry giant HP has long been a player in the HPC field, offering everything from servers to data center-level processing to software solutions. HP tackles the HPC storage market with the X9000 series of storage products. The X9000 scales up to a 5U enclosure that can house 70 drives, and can be configured to offer as much as 16 petabytes of storage. HP offers its own StoreAll OS, which can integrate 1,024 nodes into a single global namespace to simplify management.
Dell is offering its Terascala HPC Storage Solution to HPC operators. The company claims that performance reaches 6.2GB/s read and 4.2GB/s write sequential throughput per active/active base object storage server pair. It sells the Terascala as a “complete, pre-configured and tested solution and provides on-site installation, configuration and customer training to help minimize deployment time.” Its management software simplifies Lustre-based storage management.
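To put the per-pair figures in context, the following minimal Python sketch estimates aggregate sequential throughput as object storage server pairs are added. It assumes throughput scales linearly with the number of pairs, which real deployments rarely achieve exactly; the function names and the `efficiency` parameter are hypothetical and exist only for illustration.

```python
import math
from typing import Tuple

# Vendor-claimed sequential throughput per active/active OSS pair (from the article).
READ_GBPS_PER_PAIR = 6.2
WRITE_GBPS_PER_PAIR = 4.2


def aggregate_throughput(pairs: int, efficiency: float = 1.0) -> Tuple[float, float]:
    """Return estimated (read, write) throughput in GB/s for `pairs` OSS pairs.

    `efficiency` is a hypothetical scaling factor: 1.0 means perfectly linear
    scaling, lower values model network or metadata overhead.
    """
    if pairs < 1:
        raise ValueError("pairs must be at least 1")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    read = pairs * READ_GBPS_PER_PAIR * efficiency
    write = pairs * WRITE_GBPS_PER_PAIR * efficiency
    return read, write


def pairs_needed(target_read_gbps: float, efficiency: float = 1.0) -> int:
    """Return the smallest number of OSS pairs that meets a read-throughput target."""
    if target_read_gbps <= 0:
        raise ValueError("target_read_gbps must be positive")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the range (0, 1]")
    return math.ceil(target_read_gbps / (READ_GBPS_PER_PAIR * efficiency))


if __name__ == "__main__":
    read, write = aggregate_throughput(pairs=4)
    print(f"4 pairs: about {read:.1f} GB/s read, {write:.1f} GB/s write")
    print(f"Pairs needed for 50 GB/s read: {pairs_needed(50)}")
```

Lowering `efficiency` (for example, to 0.8) gives a more conservative estimate when planning capacity against a throughput target.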
Specialty vendor Fusion-io offers multiple products for the HPC market. However, when it comes to storage, it is pretty hard to ignore the company’s Acceleration line of products. When speed is a primary component for HPC, Fusion-io offers Direct Acceleration hardware, which comes incorporated into hardware cards designed for PCI Express (PCIe) slots. Several different cards are available; all feature integrated NAND flash hardware and storage. As a card-based solution, Fusion-io’s acceleration technology is designed for individual servers and workstations, eliminating much of the latency that is found in NAS- or SAN-based solutions.
Boutique vendor Padova Technologies offers customized supporting products for HPC environments. Along with the company’s HPC cluster and supercomputing products, Padova markets its enterprise SAN, redundant array of independent disks (RAID) and storage systems. At the top of Padova’s storage lineup is the Infortrend ESVA Enterprise Storage product line, which incorporates Fibre Channel and iSCSI SAN connectivity. However, Padova’s top contribution to the HPC market comes from the company’s N-series of servers and cluster solutions, which incorporate high-speed local storage to accelerate HPC tasks.
HPC storage vendor Panasas has made a name for itself with its ActiveStor line of products. ActiveStor offers multiple shelf configurations that provide as much as 168GB of cache per shelf, and address as much as 83TB per shelf. The shelves are also available as customized implementations, tuned for a given HPC environment. The company offers its PanFS operating system, which enables high-speed, parallel access to a single file system via DirectFlow, NFS and CIFS protocols.
A well-known name in personal computing storage, Western Digital recently shipped its first rack-mount storage system. The WD Sentinel RX4100 is a 1U rack-mount storage server targeted at smaller businesses. It promises simplified connectivity, automated backup and restore, and collaboration via its “on-premise cloud storage” accessibility. The Sentinel RX4100 comes in 8TB, 12TB and 16TB configurations with pre-installed Western Digital hard drives that are factory configured in RAID 5.
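Because RAID 5 reserves the equivalent of one drive for parity, usable space is lower than raw capacity. The short Python sketch below shows the arithmetic. It assumes, purely for illustration, that each quoted configuration is raw capacity spread evenly across four drives; the article does not state the drive count, and the function name is hypothetical.

```python
def raid5_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Return usable capacity in TB for a RAID 5 set of equal-size drives.

    RAID 5 stores one drive's worth of parity, so usable space is
    (drive_count - 1) * drive_tb. At least three drives are required.
    """
    if drive_count < 3:
        raise ValueError("RAID 5 requires at least three drives")
    if drive_tb <= 0:
        raise ValueError("drive_tb must be positive")
    return (drive_count - 1) * drive_tb


if __name__ == "__main__":
    ASSUMED_DRIVES = 4  # assumption for illustration, not stated in the article
    for raw_tb in (8, 12, 16):
        per_drive_tb = raw_tb / ASSUMED_DRIVES
        usable = raid5_usable_tb(ASSUMED_DRIVES, per_drive_tb)
        print(f"{raw_tb}TB raw -> about {usable:.0f}TB usable under RAID 5")
```

Under these assumptions, roughly a quarter of the raw capacity goes to parity; formatting and file system overhead would reduce the usable figure further.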
Of course, the vendors mentioned here are only a sample of what is becoming a large market segment. After all, fast storage systems are needed by more than just the HPC environments of the world, especially with the growth of cloud services, virtualized data centers and virtual desktop infrastructures, all of which need speed, economy, scale and, ultimately, reliability.