
Nvidia’s Grace Hopper Superchips for generative AI enter full production


May 29, 2023



Nvidia announced that the Nvidia GH200 Grace Hopper Superchip is in full production, set to power systems that run complex AI programs.

Also targeted at high-performance computing (HPC) workloads, the GH200-powered systems join more than 400 system configurations based on Nvidia’s latest CPU and GPU architectures, including Nvidia Grace, Nvidia Hopper and Nvidia Ada Lovelace, created to help meet the surging demand for generative AI.

At the Computex trade show in Taiwan, Nvidia CEO Jensen Huang revealed new systems, partners and additional details surrounding the GH200 Grace Hopper Superchip, which brings together the Arm-based Nvidia Grace CPU and Hopper GPU architectures using Nvidia NVLink-C2C interconnect technology.

The interconnect delivers up to 900GB/s of total bandwidth, seven times that of the standard PCIe Gen5 lanes found in traditional accelerated systems, to serve the most demanding generative AI and HPC applications.
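As a rough check on the "seven times" figure, the sketch below divides the 900GB/s quoted above by an assumed PCIe Gen5 x16 figure. The 128GB/s value (about 64GB/s in each direction) is an assumption that does not appear in the article, and the function name is purely illustrative.

```python
# Assumption (not from the article): a PCIe Gen5 x16 link carries roughly
# 64 GB/s per direction, so about 128 GB/s when both directions are counted.
PCIE_GEN5_X16_TOTAL_GBPS = 128.0

# From the article: NVLink-C2C delivers up to 900 GB/s total bandwidth.
NVLINK_C2C_TOTAL_GBPS = 900.0


def bandwidth_ratio(fast_gbps: float, slow_gbps: float) -> float:
    """Return how many times more bandwidth the first link has than the second."""
    if slow_gbps <= 0:
        raise ValueError("slow_gbps must be positive")
    return fast_gbps / slow_gbps


ratio = bandwidth_ratio(NVLINK_C2C_TOTAL_GBPS, PCIE_GEN5_X16_TOTAL_GBPS)
print(f"NVLink-C2C vs. PCIe Gen5 x16: about {ratio:.1f}x")
```

Under that assumption the ratio comes out at roughly seven, which matches the claim. A different PCIe lane count or a per-direction comparison would give a different multiple.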



“Generative AI is rapidly transforming businesses, unlocking new opportunities and accelerating discovery in healthcare, finance, business services and many more industries,” said Ian Buck, vice president of accelerated computing at Nvidia, in a statement. “With Grace Hopper Superchips in full production, manufacturers worldwide will soon provide the accelerated infrastructure enterprises need to build and deploy generative AI applications that leverage their unique proprietary data.”

Global hyperscalers and supercomputing centers in Europe and the U.S. are among several customers that will have access to GH200-powered systems.

“We’re all experiencing the joy of what giant AI models can do,” Buck said in a press briefing.

Hundreds of accelerated systems and cloud instances

Taiwan manufacturers are among the many system manufacturers worldwide introducing systems powered by the latest Nvidia technology, including Aaeon, Advantech, Aetina, ASRock Rack, Asus, Gigabyte, Ingrasys, Inventec, Pegatron, QCT, Tyan, Wistron and Wiwynn.

Additionally, global server manufacturers Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo, Supermicro, and Eviden, an Atos company, offer a broad array of Nvidia-accelerated systems.

Cloud partners for Nvidia H100 include Amazon Web Services (AWS), Cirrascale, CoreWeave, Google Cloud, Lambda, Microsoft Azure, Oracle Cloud Infrastructure, Paperspace and Vultr.

Nvidia AI Enterprise, the software layer of the Nvidia AI platform, offers over 100 frameworks, pretrained models and development tools to streamline development and deployment of production AI, including generative AI, computer vision and speech AI.

Systems with GH200 Superchips are expected to be available beginning later this year.

Nvidia unveils MGX server specification


To meet the diverse accelerated computing needs of data centers, Nvidia today unveiled the Nvidia MGX server specification, which provides system manufacturers with a modular reference architecture to quickly and cost-effectively build more than 100 server variations to suit a wide range of AI, high performance computing and Omniverse applications.

ASRock Rack, Asus, Gigabyte, Pegatron, QCT and Supermicro will adopt MGX, which Nvidia says can cut development costs by up to three-quarters and reduce development time by two-thirds, to just six months.
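The savings figures imply a baseline that the article does not state outright. The sketch below works backward from "two-thirds less time, down to six months" to that implied baseline, and computes the share of cost that remains after a three-quarters cut. The function name is hypothetical, and the result is only what the quoted fractions imply, not a figure published by Nvidia.

```python
from fractions import Fraction


def implied_baseline(after: Fraction, reduction: Fraction) -> Fraction:
    """Recover the original quantity from the value left after a fractional cut."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return after / (1 - reduction)


# From the article: development time falls by two-thirds, to six months.
baseline_months = implied_baseline(Fraction(6), Fraction(2, 3))

# From the article: development costs fall by up to three-quarters.
remaining_cost_share = 1 - Fraction(3, 4)

print(f"Implied baseline development time: {baseline_months} months")
print(f"Share of original cost remaining: {remaining_cost_share}")
```

Exact fractions are used here so the two-thirds reduction does not pick up floating-point rounding.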

“Enterprises are seeking more accelerated computing options when architecting data centers that meet their specific business and application needs,” said Kaustubh Sanghani, vice president of GPU products at Nvidia, in a statement. “We created MGX to help organizations bootstrap enterprise AI, while saving them significant amounts of time and money.”

With MGX, manufacturers start with a basic system architecture optimized for accelerated computing for their server chassis, and then select their GPU, DPU and CPU. Design variations can address unique workloads, such as HPC, data science, large language models, edge computing, graphics and video, enterprise AI, and design and simulation.

Multiple tasks like AI training and 5G can be handled on a single machine, while upgrades to future hardware generations can be frictionless. MGX can also be easily integrated into cloud and enterprise data centers, Nvidia said.

QCT and Supermicro will be the first to market, with MGX designs appearing in August. Supermicro’s ARS-221GL-NR system, announced today, will include the Nvidia Grace CPU Superchip, while QCT’s S74G-2U system, also announced today, will use the Nvidia GH200 Grace Hopper Superchip.

Additionally, SoftBank plans to roll out multiple hyperscale data centers across Japan and use MGX to dynamically allocate GPU resources between generative AI and 5G applications.

“As generative AI permeates across business and consumer lifestyles, building the right infrastructure for the right cost is one of network operators’ greatest challenges,” said Junichi Miyakawa, CEO at SoftBank, in a statement. “We expect that Nvidia MGX can tackle such challenges and allow for multi-use AI, 5G and more depending on real-time workload requirements.”

MGX differs from Nvidia HGX in that it offers flexible, multi-generational compatibility with Nvidia products to ensure that system builders can reuse existing designs and easily adopt next-generation products without expensive redesigns. In contrast, HGX is based on an NVLink-connected multi-GPU baseboard tailored to scale to create the ultimate in AI and HPC systems.

Nvidia announces DGX GH200 AI Supercomputer


Nvidia also announced a new class of large-memory AI supercomputer — an Nvidia DGX supercomputer powered by Nvidia GH200 Grace Hopper Superchips and the Nvidia NVLink Switch System — created to enable the development of giant, next-generation models for generative AI language applications, recommender systems and data analytics workloads.

The Nvidia DGX GH200’s shared memory space uses NVLink interconnect technology with the NVLink Switch System to combine 256 GH200 Superchips, allowing them to perform as a single GPU. This provides 1 exaflop of performance and 144 terabytes of shared memory — nearly 500x more memory than in a single Nvidia DGX A100 system.
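The per-chip numbers behind these totals can be sketched with simple division. The figures for superchip count, shared memory and exaflops come from the article. The 1,024GB-per-TB conversion and the 320GB GPU memory figure for a baseline DGX A100 (eight 40GB GPUs) are assumptions added here for illustration.

```python
# From the article.
SUPERCHIPS = 256
SHARED_MEMORY_TB = 144
TOTAL_EXAFLOPS = 1

# Assumptions (not from the article): binary terabytes, and a baseline
# DGX A100 configuration with 8 x 40 GB = 320 GB of GPU memory.
GB_PER_TB = 1024
DGX_A100_GPU_MEMORY_GB = 320
PETAFLOPS_PER_EXAFLOP = 1000

total_memory_gb = SHARED_MEMORY_TB * GB_PER_TB
per_chip_gb = total_memory_gb / SUPERCHIPS
memory_ratio = total_memory_gb / DGX_A100_GPU_MEMORY_GB
petaflops_per_chip = TOTAL_EXAFLOPS * PETAFLOPS_PER_EXAFLOP / SUPERCHIPS

print(f"Memory per superchip: {per_chip_gb:.0f} GB")
print(f"Memory vs. one DGX A100: about {memory_ratio:.0f}x")
print(f"Compute per superchip: about {petaflops_per_chip:.1f} petaflops")
```

Under these assumptions the memory multiple lands in the mid-400s, in the neighborhood of the "nearly 500x" quoted above. The exact multiple depends on which DGX A100 configuration and which unit convention are used.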

“Generative AI, large language models and recommender systems are the digital engines of the modern economy,” said Huang. “DGX GH200 AI supercomputers integrate Nvidia’s most advanced accelerated computing and networking technologies to expand the frontier of AI.”

GH200 Superchips eliminate the need for a traditional CPU-to-GPU PCIe connection by combining an Arm-based Nvidia Grace CPU with an Nvidia H100 Tensor Core GPU in the same package, using Nvidia NVLink-C2C chip interconnects. This increases the bandwidth between GPU and CPU by 7x compared with the latest PCIe technology, cuts interconnect power consumption by more than 5x, and provides a 600GB Hopper architecture GPU building block for DGX GH200 supercomputers.

DGX GH200 is the first supercomputer to pair Grace Hopper Superchips with the Nvidia NVLink Switch System, a new interconnect that enables all GPUs in a DGX GH200 system to work together as one. The previous-generation system allowed only eight GPUs to be combined with NVLink as one GPU without compromising performance.

The DGX GH200 architecture provides 10 times more bandwidth than the previous generation, delivering the power of a massive AI supercomputer with the simplicity of programming a single GPU.

Google Cloud, Meta and Microsoft are among the first expected to gain access to the DGX GH200 to explore its capabilities for generative AI workloads. Nvidia also intends to provide the DGX GH200 design as a blueprint to cloud service providers and other hyperscalers so they can further customize it for their infrastructure.

“Building advanced generative models requires innovative approaches to AI infrastructure,” said Mark Lohmeyer, vice president of Compute at Google Cloud, in a statement. “The new NVLink scale and shared memory of Grace Hopper Superchips address key bottlenecks in large-scale AI and we look forward to exploring its capabilities for Google Cloud and our generative AI initiatives.”

Nvidia DGX GH200 supercomputers are expected to be available by the end of the year.

Lastly, Huang announced that a new supercomputer called Nvidia Taipei-1 will bring more accelerated computing resources to Asia to advance the development of AI and industrial metaverse applications.

Taipei-1 will expand the reach of the Nvidia DGX Cloud AI supercomputing service into the region with 64 DGX H100 AI supercomputers. The system will also include 64 Nvidia OVX systems to accelerate local research and development, and Nvidia networking to power efficient accelerated computing at any scale. Owned and operated by Nvidia, the system is expected to come online later this year.

Leading Taiwan education and research institutes will be among the first to access Taipei-1 to advance healthcare, large language models, climate science, robotics, smart manufacturing and industrial digital twins. National Taiwan University plans to study large language model speech learning as its initial Taipei-1 project.

“National Taiwan University researchers are dedicated to advancing science across a broad range of disciplines, a commitment that increasingly requires accelerated computing,” said Shao-Hua Sun, assistant professor, Electrical Engineering Department at National Taiwan University, in a statement. “The Nvidia Taipei-1 supercomputer will help our researchers, faculty and students leverage AI and digital twins to address complex challenges across many industries.”

