Hopper GPUs represent NVIDIA’s newest generation of high-performance graphics processors, purpose-built for AI and high-performance computing (HPC). Featuring a cutting-edge architecture with advanced Tensor Cores, they integrate multiple innovative technologies to deliver maximum efficiency. Ideal for a wide range of workloads, Hopper GPUs support AI inference, deep learning training, generative AI, and more.

What is the architectural design of NVIDIA’s Hopper GPUs?

The name “Hopper GPU” is derived from the Hopper architecture, the GPU microarchitecture that forms the foundation of these high-performance graphics processors and is optimized for AI workloads and HPC applications. Hopper GPUs are manufactured by TSMC on a 4-nanometer-class process and contain over 80 billion transistors, making them some of the most advanced graphics processors available on the market.

With the Hopper architecture, NVIDIA combines the latest generation of Tensor Cores with five groundbreaking innovations: the Transformer Engine, the NVLink/NVSwitch/NVLink Switch System, confidential computing, second-generation Multi-Instance GPU (MIG) and DPX instructions. These technologies enable Hopper GPUs to achieve up to 30x faster AI inference than the previous generation (measured by NVIDIA on the Megatron 530B chatbot, one of the world’s largest generative language models).
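The Transformer Engine achieves much of its speedup by computing in 8-bit floating point (FP8) and dynamically scaling tensors into FP8’s narrow range. The sketch below is not NVIDIA’s implementation; it is a toy Python model of the scale–quantize–rescale pattern, in which the FP8 E4M3 format is only crudely approximated by rounding to a coarse grid.

```python
# Illustrative sketch only (not NVIDIA's Transformer Engine): the core idea
# is to scale a tensor so it fits FP8's limited range, compute, then rescale.
# Real FP8 E4M3 has non-uniformly spaced values; we mimic the precision loss
# of a 3-bit mantissa by rounding to eighths after scaling.

def quantize_fp8_sketch(values, max_representable=448.0):
    """Scale values toward the FP8 E4M3 max (~448), then coarsely round.

    Returns the quantized values and the scale factor needed to undo it.
    """
    amax = max(abs(v) for v in values)
    scale = max_representable / amax if amax > 0 else 1.0
    quantized = [round(v * scale * 8) / 8 for v in values]  # coarse grid
    return quantized, scale

def dequantize(quantized, scale):
    """Rescale quantized values back to their original magnitude."""
    return [q / scale for q in quantized]

activations = [0.0123, -0.5, 3.2, 100.0]
q, s = quantize_fp8_sketch(activations)
restored = dequantize(q, s)
# Large values survive almost exactly; tiny values lose precision,
# which is why per-tensor scaling matters.
```

Note how the smallest activation is rounded away entirely: that loss is exactly what the Transformer Engine’s per-tensor scaling statistics are designed to manage.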

What are the innovative features of Hopper GPUs?

Hopper GPUs have several new features that improve performance, efficiency and scalability. We present the most important innovations below:

  • Transformer Engine: With the help of the Transformer Engine, Hopper GPUs can train AI models up to nine times faster. For inference on language models, they achieve up to 30 times the speed of the previous generation.
  • NVLink Switch System: The fourth generation of NVLink delivers a bidirectional GPU bandwidth of 900 GB/s, while NVSwitch ensures better scalability of H200 clusters. This allows AI models with trillions of parameters to be processed efficiently.
  • Confidential computing: The Hopper architecture ensures that your data, AI models and algorithms remain protected even while they are being processed.
  • Multi-Instance GPU (MIG) 2.0: The second generation of MIG technology allows a single Hopper GPU to be split into up to seven isolated instances. This lets several users run different workloads simultaneously without interfering with each other.
  • DPX instructions: DPX instructions allow dynamic programming algorithms to be computed up to seven times faster than on GPUs of the Ampere architecture.
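Dynamic programming, the class of algorithms DPX instructions accelerate, solves a problem by combining the answers to overlapping subproblems with cheap min/add operations. A classic example of this recurrence structure is the Levenshtein edit distance, sketched here in plain Python purely for illustration (the GPU-accelerated counterparts are algorithms like Smith-Waterman in genomics):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings via classic dynamic programming.

    The inner min/add recurrence is exactly the kind of operation that
    Hopper's DPX instructions speed up; this pure-Python version only
    illustrates the algorithmic structure.
    """
    prev = list(range(len(b) + 1))  # distances for the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# Example: "kitten" -> "sitting" requires 3 edits.
distance = levenshtein("kitten", "sitting")
```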
Note

In our article comparing server GPUs, we present the best graphics processors for your server. You can also find out everything there is to know about GPU servers in another of our helpful articles.

What use cases are Hopper GPUs suitable for?

NVIDIA GPUs based on the Hopper architecture are designed for a wide range of high-performance workloads. The main areas of application for Hopper GPUs are:

  • Inference tasks: The GPUs are among the industry-leading solutions for the productive use of AI inference. Whether recommendation systems in e-commerce, medical diagnostics or real-time predictions for autonomous driving, Hopper GPUs can process huge amounts of data quickly and efficiently.
  • Generative AI: The high-end GPUs provide the computing power needed to train and run generative AI tools. Parallel processing allows more efficient calculations for creative tasks such as text, image and video generation.
  • Deep learning training: With their high computing power, Hopper GPUs are ideal for training large neural networks. The Hopper architecture significantly shortens the training times of AI models.
  • Conversational AI: Optimized for natural language processing (NLP), Hopper GPUs are ideal for AI-powered language systems such as virtual assistants and AI chatbots. They accelerate the processing of large AI models and ensure responsive interaction that can be seamlessly integrated into business processes, such as support.
  • Data analysis and big data: Hopper GPUs handle huge amounts of data at high speed and accelerate complex calculations through massive parallel processing. This enables companies to evaluate big data faster in order to make forecasts and take the right actions.
  • Science and research: As the GPUs are designed for HPC applications, they are ideal for highly complex simulations and calculations. Hopper GPUs are used, for example, in astrophysics, climate modeling and computational chemistry.

Current models from NVIDIA

With the release of the NVIDIA H100 and the NVIDIA H200, the U.S.-based company has brought two Hopper GPUs to market. (The NVIDIA A30, by contrast, is still based on the previous Ampere architecture.) Technically speaking, the H200 isn’t a completely new model but rather an enhanced version of the H100. The following overview highlights the key differences between these two GPUs:

  • Memory and bandwidth: While the NVIDIA H100 is equipped with 80 GB of HBM3 memory, the H200 features 141 GB of HBM3e. The H200 is also clearly ahead in memory bandwidth, with 4.8 TB/s compared to 2 TB/s for the H100.
  • Performance for AI inference: The NVIDIA H200 delivers up to twice the inference performance for models such as Llama 2 70B. This enables not only faster processing but also more efficient scaling.
  • HPC applications and scientific computing: The H100 already offers first-class performance for complex calculations, and the H200 surpasses it: inference speed is up to twice as high and HPC performance around 20 percent higher.
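As a back-of-envelope illustration of the bandwidth gap, the figures above can be used to estimate how long each GPU needs to stream its entire memory once at peak bandwidth (real workloads rarely reach peak, so these numbers are lower bounds, not benchmarks):

```python
# Back-of-envelope calculation using the capacity and bandwidth figures
# from the comparison above. Peak bandwidth is a theoretical maximum.

def full_memory_read_time(capacity_gb: float, bandwidth_tb_s: float) -> float:
    """Seconds to stream the whole memory once at the quoted peak bandwidth."""
    return capacity_gb / (bandwidth_tb_s * 1000.0)  # 1 TB/s = 1000 GB/s

h100 = full_memory_read_time(80.0, 2.0)    # H100: 80 GB at 2 TB/s -> 0.040 s
h200 = full_memory_read_time(141.0, 4.8)   # H200: 141 GB at 4.8 TB/s -> ~0.029 s
```

Despite holding 76 percent more data, the H200 can sweep its memory faster than the H100, which is why memory-bound workloads such as LLM inference benefit so directly from the upgrade.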