NVIDIA Blackwell Sets New Standard in Generative AI with MLPerf Inference Performance

NVIDIA’s new Blackwell platform has set a new benchmark for generative AI, outperforming the NVIDIA H100 Tensor Core GPU by up to 4x in the MLPerf Inference v4.1 tests. This leap in performance is powered by Blackwell’s second-generation Transformer Engine and FP4 Tensor Cores, making it a game-changer for large language models like Llama 2 70B. The NVIDIA H200 Tensor Core GPU also excelled, particularly with the Mixtral 8x7B mixture of experts (MoE) LLM, demonstrating NVIDIA’s dominance in AI inference.

NVIDIA’s platforms, from the data center to the edge, showed significant performance gains, with the NVIDIA Jetson platform achieving a 6.2x improvement in throughput and a 2.4x improvement in latency on GPT-J workloads. NVIDIA’s ongoing software innovation, including enhancements to Triton Inference Server, continues to maximize the value of its AI infrastructure.

This latest round of MLPerf Inference results solidifies NVIDIA’s leadership in AI inference, driving the future of generative AI across industries.