Artificial intelligence (AI) is on the crest of a wave right now. And Nvidia is too, arguably. So, at its GTC 2024 event last week, lots of storage players leapt on the opportunity to publicise link-ups with the graphics processing unit (GPU) behemoth.
Storage suppliers’ responses have centred on tackling the input/output (I/O) bottleneck so data can be delivered efficiently to large numbers of (very costly) GPUs.
Those efforts have run the gamut from integrations, via Nvidia microservices – notably NeMo for training and NIM for inference – to storage product validation with Nvidia AI infrastructure offerings (such as BasePOD), and to entire AI infrastructure offerings like those from HPE.
Another thrust evident in recent announcements has been the development of retrieval-augmented generation (RAG) pipelines and hardware architectures. RAG aims to make AI output more reliable by grounding it in external, trusted information retrieved at query time, in part to tackle so-called hallucinations.
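As a rough illustration of the RAG idea, the minimal sketch below retrieves the most relevant passages from a small trusted corpus and places them in the prompt ahead of the question. It is not any supplier's pipeline: the function names, the bag-of-words similarity and the sample passages are all assumptions made for illustration, and a production system would typically use an embedding model and a vector store, then send the prompt to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents that share vocabulary with the query."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Augment the question with the retrieved passages (hypothetical template)."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical trusted corpus, standing in for enterprise data.
DOCS = [
    "The AI400X2 Turbo has a maximum bandwidth of 120GBps.",
    "Dell AI Factory can be purchased via Apex subscriptions.",
    "The WEKApod delivers 765GBps in an eight-node cluster.",
]

if __name__ == "__main__":
    question = "What is the maximum bandwidth of the AI400X2 Turbo?"
    print(build_prompt(question, retrieve(question, DOCS, top_k=1)))
```

The prompt that reaches the model then carries the trusted passage alongside the question, which is what allows the answer to be checked against a known source.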
What was Nvidia’s core message at GTC 2024?
Core to Nvidia’s message at GTC 2024, delivered by CEO Jensen Huang, is a shift in the computer industry based around AI workloads and their tendency towards generation of data rather than retrieval of data. Of course, there is a fair bit of retrieval in AI, as data is sucked into training runs.
Core to product development at Nvidia are bigger and more powerful GPUs and their processors, with its new Blackwell chip running to 208 billion transistors and the ability to handle trillion-parameter large language models (LLMs) at a much lower cost and power usage than its predecessor.
Such compute power and GPUs are built by Nvidia into server systems – OVX and DGX (and the OEMed HGX) – and into reference architectures and turnkey infrastructure offerings – BasePOD and SuperPOD.
Here we look at some storage supplier announcements around Nvidia GTC 2024.
Cohesity
Backup provider Cohesity announced it will offer Nvidia NIM microservices and integration of Nvidia AI Enterprise into its Gaia multicloud data platform. Cohesity Gaia allows the use of backup and archive data to form a source of training data and then a source of company intelligence.
Cohesity also announced Nvidia had become an investor.
DataDirect Networks
Long-time high-performance computing (HPC) storage specialist DataDirect Networks (DDN) announced AI400X2 Turbo, which is aimed at AI workloads and provides a 33% bandwidth improvement over its AI400X2 in the same form factor due to an increase in memory and better networking.
DDN is a big player among service providers that offer GPU-as-a-service. Its ability to saturate GPUs has seen it transition from HPC storage provider to a key AI storage player.
The AI400X2 Turbo has a maximum bandwidth of 120GBps compared with the AI400X2’s 90GBps.
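The 33% improvement quoted for the Turbo model follows from the two bandwidth figures. A quick check, with a helper name chosen here purely for illustration:

```python
def percent_increase(old, new):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100.0


AI400X2_GBPS = 90.0         # AI400X2 maximum bandwidth, from the article
AI400X2_TURBO_GBPS = 120.0  # AI400X2 Turbo maximum bandwidth, from the article

if __name__ == "__main__":
    gain = percent_increase(AI400X2_GBPS, AI400X2_TURBO_GBPS)
    print(f"{gain:.1f}% bandwidth improvement")
```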
Dell
Dell unveiled its Dell AI Factory, which comes as an integrated stack spanning desktop, laptop and server PowerEdge XE9680 compute, PowerScale F710 storage, software and services validated with Nvidia’s AI infrastructure and Spectrum-X Ethernet networking fabric.
Dell AI Factory can be purchased via pay-as-you-go Apex subscriptions.
HPE
HPE announced availability of generative AI (GenAI) supercomputing systems with Nvidia components and Cray AMD compute, GenAI enterprise computing systems with Nvidia components, a RAG reference architecture that uses Nvidia’s NeMo microservices, plans to use Nvidia’s NIM microservices for inference workloads and future products based on Nvidia’s Blackwell platform.
HPE’s enterprise GenAI system focuses on AI model tuning and inference and is pre-configured around ProLiant DL380a servers, Nvidia L40S GPUs, BlueField-3 DPUs and Spectrum-X Ethernet networking, plus HPE’s machine learning and analytics software.
The RAG reference architecture consists of Nvidia’s NeMo Retriever microservices, HPE Ezmeral data fabric software, and GreenLake for File Storage, which is Alletra MP hardware and VAST Data storage software.
Hitachi Vantara
Hitachi Vantara launched Hitachi iQ, which provides industry-specific AI systems that use Nvidia DGX and HGX GPU systems with the company’s storage.
Hitachi iQ will be available from the second quarter of 2024 and will include Nvidia BasePOD certification with a range of Nvidia GPU options, Nvidia AI Enterprise software support, plus the latest release of Hitachi Content Software for File (HCFS) – WekaIO’s rebranded WekaFS file system software – with accelerated storage nodes for AI workloads.
NetApp
NetApp announced integration with the Nvidia NeMo Retriever microservice, a RAG software offering that connects directly to OnTap customers’ hybrid cloud storage. It is available to OnTap customers that subscribe to the Nvidia AI Enterprise software platform and gives LLMs access to an enterprise’s unstructured data without the need to create a separate repository.
Pure Storage
Pure Storage announced that it has created a RAG pipeline that uses Nvidia NeMo-based microservices in concert with Nvidia GPUs and its storage.
Also in RAG territory, Pure Storage announced RAGs for specific industry verticals – aimed only at financial services for now, but with healthcare and public sector to follow.
Pure also announced it had gained validation for its storage with Nvidia OVX server infrastructure, which adds to existing Nvidia DGX BasePOD compute compatibility announced last year.
Weka
Parallel hybrid cloud NAS maker Weka announced the launch of a hardware appliance certified to work with Nvidia’s DGX SuperPOD AI datacentre infrastructure.
The WEKApod uses the latest PCIe 5 and comes with performance numbers of 18.3 million input/output operations per second (IOPS) and 765GBps in a single, 1PB (petabyte), eight-node cluster.
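To put the cluster figures in per-node terms, the sketch below divides them across the eight nodes. It assumes performance is spread evenly between nodes, which the announcement does not state, so the results are indicative only.

```python
def per_node(cluster_total, nodes):
    """Average per-node share of a cluster-wide figure, assuming an even split."""
    return cluster_total / nodes


NODES = 8
CLUSTER_IOPS = 18_300_000  # 18.3 million IOPS, from the article
CLUSTER_GBPS = 765.0       # 765GBps, from the article

if __name__ == "__main__":
    print(f"{per_node(CLUSTER_IOPS, NODES):,.0f} IOPS per node")
    print(f"{per_node(CLUSTER_GBPS, NODES):.3f} GBps per node")
```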
Weka is a certified partner for Nvidia DGX BasePOD and announced at the show that it will be part of validation for Nvidia OVX.