Nvidia finalizes GPUDirect Storage 1.0 to accelerate AI, HPC

The Magnum IO GPUDirect Storage software that Nvidia introduced in late 2019 to speed up AI, analytics and high-performance computing workloads has finally reached 1.0 status after more than a year of beta testing.

The Magnum IO GPUDirect Storage driver lets users bypass the server CPU and transfer data directly between high-performance GPU memory and storage, by way of a PCIe switch, to reduce I/O latency and increase throughput for the most demanding data-intensive applications.
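In practice, applications reach the driver through Nvidia's cuFile API, which ships with GPUDirect Storage. The C sketch below is a minimal illustration of that flow, assuming a hypothetical file path and a GDS-capable file system; error checking is pared down for brevity, and a production caller would check every cuFile and CUDA return code.

    /* Minimal cuFile read: DMA a file straight into GPU memory,
     * bypassing the CPU bounce buffer.
     * Compile with: nvcc gds_read.c -o gds_read -lcufile */
    #include <cuda_runtime.h>
    #include <cufile.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        const size_t size = 1 << 20;   /* 1 MiB read */
        cuFileDriverOpen();            /* initialize the GDS driver */

        /* O_DIRECT lets the DMA engine skip the OS page cache. */
        int fd = open("/mnt/nvme/sample.bin", O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        /* Register the POSIX file descriptor with cuFile. */
        CUfileDescr_t descr;
        memset(&descr, 0, sizeof(descr));
        descr.handle.fd = fd;
        descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
        CUfileHandle_t handle;
        cuFileHandleRegister(&handle, &descr);

        /* The destination buffer lives in GPU memory, not host RAM. */
        void *dev_ptr = NULL;
        cudaMalloc(&dev_ptr, size);
        cuFileBufRegister(dev_ptr, size, 0);

        /* Storage-to-GPU transfer; the CPU never touches the data. */
        ssize_t n = cuFileRead(handle, dev_ptr, size,
                               0 /* file offset */, 0 /* buffer offset */);
        printf("read %zd bytes directly into GPU memory\n", n);

        cuFileBufDeregister(dev_ptr);
        cudaFree(dev_ptr);
        cuFileHandleDeregister(handle);
        close(fd);
        cuFileDriverClose();
        return 0;
    }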

Dion Harris, lead technical product marketing manager of accelerated computing at Nvidia, said GPUDirect Storage lowers CPU utilization by a factor of three and lets CPUs focus on the work they were built for: running processing-intensive applications.

At this week’s ISC High Performance 2021 Digital conference, Nvidia announced that it had added the Magnum IO GPUDirect Storage software to its HGX AI supercomputing platform, along with the new A100 80 GB PCIe GPU and NDR 400G InfiniBand networking. Nvidia had to collaborate with enterprise network and storage providers to enable GPUDirect Storage.

Storage vendors support GPUDirect

Storage vendors with generally available products integrating GPUDirect Storage include DataDirect Networks, Vast Data and WekaIO. Others with products in the works include Dell Technologies, Excelero, Hewlett Packard Enterprise, Hitachi Vantara, IBM, Micron, NetApp, Pavilion Data Systems and ScaleFlux.

Steve McDowell, a senior technology analyst at Moor Insights & Strategy, said Nvidia’s GPUDirect Storage software will most often see use with high-performance storage arrays that can deliver the throughput the GPUs require and that support a high-performance remote direct memory access (RDMA) interconnect such as InfiniBand. Examples of GPUDirect Storage pairings include IBM’s Elastic Storage System (ESS) 3200, NetApp’s EF600 all-flash NVMe array and Dell EMC’s PowerScale scale-out NAS platform, he said.

“GPUDirect Storage is built for production-level and large research deep-learning environments,” McDowell said, noting the technology targets installations with a number of GPUs working on training algorithms where I/O is a bottleneck.

Nvidia DGX SuperPod with IBM ESS 3200

IBM announced this week that it had updated its storage reference architectures for two-, four- and eight-node Nvidia DGX Pod configurations and committed to supporting a DGX SuperPod with its ESS 3200 by the end of the third quarter. SuperPods start at 20 Nvidia DGX A100 systems and can scale to 140 systems.

Douglas O’Flaherty, program director of portfolio GTM and alliances at IBM, said using GPUDirect Storage on a two-node Nvidia DGX A100 can nearly double the data throughput, from 40 GB per second to 77 GBps, with a single IBM ESS 3200 running Spectrum Scale.

“What it showcases for Nvidia is just how much data a GPU can start to work through. And what it showcases for us is that, as your developers and applications embrace this, especially for these big data environments, you really need to make sure that you haven’t just moved the bottleneck down into storage,” O’Flaherty said. “With our latest version of ESS 3200, we did a tremendous amount of throughput with just a very few systems.”

O’Flaherty said the customers most interested in Nvidia GPUDirect Storage include auto manufacturers working on self-driving vehicles, telecommunications providers with data-heavy natural language processing workloads, financial services companies looking to reduce latency, and genomics organizations with large, complex data sets.

Startup Vast Data has already received a handful of large orders for GPUDirect Storage-enabled systems, according to CMO and co-founder Jeff Denworth. Examples include a media studio doing volumetric data capture to create 3D video, financial services firms running the Apache Spark analytics engine on the Rapids open GPU data science framework, and HPC centers and manufacturers using PyTorch machine learning libraries.

Denworth said that using GPUDirect Storage in Rapids and PyTorch projects has enabled Vast Data to feed a common Spark or Postgres database about 80 times faster than a standard NAS system could.

“We’ve been pleasantly surprised by the number of projects that we’re being engaged on for this new technology,” Denworth said. “And it really isn’t simply a matter of just making certain AI applications run faster. There’s a whole gamut of GPU-oriented workloads where customers are now starting to gravitate toward this GPUDirect Storage approach as the way to feed these very hungry machines.”

Carol Sliwa is a TechTarget senior writer covering storage arrays and drives, flash and memory technologies, and enterprise architecture.