NVIDIA Software Eliminates HPC Bottlenecks

New Magnum IO software provides up to 20x faster data processing for data scientists, AI researchers

NVIDIA’s new Magnum IO software suite helps data scientists, as well as AI and HPC researchers, process massive amounts of data up to 20x faster than previously possible. Magnum IO can run on any NVIDIA-powered system, including the DGX SuperPOD pictured here. Image courtesy of NVIDIA.

By DE Editors

November 26, 2019

At last week’s SC19 conference, NVIDIA announced NVIDIA Magnum IO, a suite of software to help data scientists and AI and high performance computing (HPC) researchers process massive amounts of data in minutes, rather than hours.

Optimized to eliminate storage and input/output bottlenecks, Magnum IO delivers up to 20x faster data processing for multi-server, multi-GPU computing nodes when working with massive datasets to carry out complex financial analysis, climate modeling and other HPC workloads.
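To put the "up to 20x" figure in concrete terms, the short sketch below works through the arithmetic. It is a back-of-the-envelope illustration only: the function name `best_case_minutes` and the baseline runtimes are hypothetical, not figures from NVIDIA, and it assumes the entire job benefits from the full speedup, which real workloads may not.

```python
def best_case_minutes(baseline_minutes: float, speedup: float = 20.0) -> float:
    """Best-case runtime, assuming the whole job gets the full speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_minutes / speedup


# Hypothetical baseline runtimes in minutes (illustrative, not from NVIDIA).
for baseline in (60, 120, 480):
    print(f"{baseline} min -> {best_case_minutes(baseline):.0f} min")
```

Under these assumptions, a two-hour job would finish in about six minutes, which is the "minutes, rather than hours" shift the announcement describes. Because "up to" marks an upper bound, actual gains would depend on how much of a given workload is limited by storage and I/O.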

NVIDIA has developed Magnum IO in close collaboration with industry leaders in networking and storage, including DataDirect Networks, Excelero, IBM, Mellanox and WekaIO.

“Processing large amounts of collected or simulated data is at the heart of data-driven sciences like AI,” said Jensen Huang, founder and CEO of NVIDIA. “As the scale and velocity of data grow exponentially, processing it has become one of data centers’ great challenges and costs.

“Extreme compute needs extreme I/O. Magnum IO delivers this by bringing NVIDIA GPU acceleration, which has revolutionized computing, to I/O and storage. Now, AI researchers and data scientists can stop waiting on data and focus on doing their life’s work,” he added.

Magnum IO leverages GPUDirect, which provides a path for data to bypass CPUs and travel on “open highways” offered by GPUs, storage and networking devices, according to NVIDIA. Compatible with a wide range of communications interconnects and APIs — including NVIDIA NVLink and NCCL, as well as OpenMPI and UCX — GPUDirect is composed of peer-to-peer and RDMA elements.

Its newest element is GPUDirect Storage, which enables researchers to bypass CPUs when accessing storage and quickly access data files for simulation, analysis or visualization.

NVIDIA Magnum IO software is available now, with the exception of GPUDirect Storage, which is currently available to select early-access customers. Broader release of GPUDirect Storage is planned for the first half of 2020, the company said.

GPU-Accelerated Arm Servers

NVIDIA also announced a new reference design platform that enables companies to quickly build GPU-accelerated Arm-based servers.

According to NVIDIA, the platform, which consists of hardware and software building blocks, is a response to growing demand in the HPC community for the ability to harness a broader range of CPU architectures. It allows supercomputing centers, hyperscale-cloud operators and enterprises to combine the advantages of NVIDIA’s accelerated computing platform with the latest Arm-based server platforms.

To build the reference platform, NVIDIA is teaming with Arm and its ecosystem partners — including Ampere, Fujitsu and Marvell — to ensure NVIDIA GPUs can work seamlessly with Arm-based processors. The reference platform also benefits from strong collaboration with Cray, a Hewlett Packard Enterprise company, and HPE, two early providers of Arm-based servers. Additionally, a wide range of HPC software companies have used NVIDIA CUDA-X libraries to build GPU-enabled management and monitoring tools that run on Arm-based servers.

“There is a renaissance in high performance computing,” Huang said. “Breakthroughs in machine learning and AI are redefining scientific methods and enabling exciting opportunities for new architectures. Bringing NVIDIA GPUs to Arm opens the floodgates for innovators to create systems for growing new applications from hyperscale-cloud to exascale supercomputing and beyond.”

Collaboration with Broader HPC Ecosystem

In addition to making its own software compatible with Arm, NVIDIA is working closely with its broad ecosystem of developers to bring GPU acceleration to Arm for HPC applications such as GROMACS, LAMMPS, MILC, NAMD, Quantum Espresso and Relion. NVIDIA and its HPC-application ecosystem partners have compiled extensive code to bring GPU acceleration to their applications on the Arm platform.

To enable the Arm ecosystem, NVIDIA collaborated with leading Linux distributors Canonical, Red Hat, Inc., and SUSE, as well as the industry’s leading providers of essential HPC tools.

Leading supercomputing centers have begun testing GPU-accelerated Arm-based computing systems. These include Oak Ridge and Sandia National Laboratories, in the United States; the University of Bristol, in the United Kingdom; and Riken, in Japan.