
Industry's First-to-Market Supermicro NVIDIA HGX™ B200 Systems Demonstrate AI Performance Leadership on MLPerf® Inference v5.0 Results

Latest Benchmarks Show Supermicro Systems with the NVIDIA B200 Outperformed the Previous Generation of Systems with 3X the Token Generation Per Second

Super Micro Computer, Inc. | April 3, 2025

About this update from Super Micro Computer, Inc.

SAN JOSE, Calif., April 3, 2025 /PRNewswire/ -- Super Micro Computer, Inc. (SMCI), a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge, is announcing first-to-market industry leading performance on several MLPerf Inference v5.0 benchmarks, using the NVIDIA HGX™ B200 8-GPU. The 4U liquid-cooled and 10U air-cooled systems achieved the best performance in select benchmarks. Supermicro demonstrated more than 3 times the tokens per second (Token/s) generation for Llama2-70B and Llama3.1-405B benchmarks compared to H200 8-GPU systems.

"Supermicro remains a leader in the AI industry, as evidenced by the first new benchmarks released by MLCommons in 2025," said Charles Liang, president and CEO of Supermicro. "Our building block architecture enables us to be first-to-market with a diverse range of systems optimized for various workloads. We continue to collaborate closely with NVIDIA to fine-tune our systems and secure a leadership position in AI workloads."

Learn more about the new MLPerf v5.0 Inference benchmarks at: https://mlcommons.org/benchmarks/inference-datacenter/

Supermicro is the only system vendor publishing record MLPerf inference performance (on select benchmarks) for both the air-cooled and liquid-cooled NVIDIA HGX™ B200 8-GPU systems. Both air-cooled and liquid-cooled systems were operational before the MLCommons benchmark start date. Supermicro engineers optimized the systems and software to showcase the impressive performance. Within the operating margin, the Supermicro air-cooled B200 system exhibited the same level of performance as the liquid-cooled B200 system. Supermicro has been delivering these systems to customers while we conducted the benchmarks.

MLCommons emphasizes that all results be reproducible, that the products are available and that the results can be audited by other MLCommons members. Supermicro engineers optimized the systems and software, as allowed by the MLCommons rules.

The SYS-421GE-NBRT-LCC (8x NVIDIA B200-SXM-180GB) and SYS-A21GE-NBRT (8x NVIDIA B200-SXM-180GB) showed performance leadership running the Mixtral 8x7B Inference, Mixture of Experts benchmarks with 129,000 tokens/second. The Su...
