NVIDIA Claims 2x Higher Performance & Almost 3x Efficiency For Ampere A100 GPUs Versus AMD’s Instinct MI250
NVIDIA has already announced its next-generation H100 GPU, based on the Hopper GPU architecture, which will ship to customers later this year. The Hopper GPU is estimated to deliver a 26x increase in performance over the Pascal P100, which was released six years ago, and that is 3x faster than the trajectory offered by Moore’s Law.

For the performance tests, NVIDIA ran the Ampere A100 GPU in both single and multi-GPU configurations. The same configurations were used for the Instinct MI250 from AMD. Some of the most popular data center workloads, namely LAMMPS, NAMD, OpenMM, GROMACS and AMBER, were used for the tests. NVIDIA’s single Ampere A100 GPU turned out to be up to 1.9x faster than the AMD Instinct MI250 GPU accelerator, while the quad-GPU Ampere system showed a gain of up to 2.1x. In energy efficiency, the quad-GPU solution provided 2.8x higher perf/watt.

Following are the notes from the testing:

A100 also presents as a single processor to the operating system, requiring that only one MPI rank be launched to take full advantage of its performance. And, A100 delivers excellent performance at scale thanks to the 600-GB/s NVLink connections between all GPUs in a node.

It should be noted that the AMD Instinct MI250 used here isn’t the full configuration, since that is reserved for the MI250X, but based on these results, the A100 should still be very competitive against the AMD CDNA 2 offerings. With Hopper coming soon, NVIDIA will push these numbers even further, and that’s where AMD’s Instinct MI300 comes in with its brand-new APU-like design.

AMD MI250 measured on a GIGABYTE M262-HD5-00 with 2x AMD EPYC 7763 CPUs and 4x AMD Instinct™ MI250 OAM (128 GB HBM2e) 500W GPUs with AMD Infinity Fabric™ technology.
NVIDIA runs on ProLiant XL645d Gen10 Plus using dual EPYC 7713 CPUs and 4x A100 (80 GB) SXM4 | LAMMPS develop_db00b49 (AMD), develop_2a35ec2 (NVIDIA), datasets ReaxFF/c, Tersoff, Lennard-Jones, SNAP | NAMD 3.0alpha9, dataset STMV_NVE | OpenMM 7.7.0, ensemble runs for datasets amber20-stmv, amber20-cellulose, apoa1pme, pme | GROMACS 2021.1 (AMD), 2022 (NVIDIA), datasets ADH-Dodec (h-bond), STMV (h-bond) | AMBER 20.xx_rocm_mr_202108 (AMD) and 20.12-AT_21.12 (NVIDIA), datasets Cellulose_NVE, STMV_NVE | 1x MI250 has 2x GCD (via NVIDIA)
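
As a rough sanity check on the Moore’s Law comparison quoted earlier (a 26x gain over six years, described as 3x faster than the Moore’s Law trajectory), the arithmetic can be sketched as below. The doubling period of two years is an assumption made here for illustration and is not stated in the article; the function name `moores_law_factor` is likewise hypothetical.

```python
# Back-of-envelope check of the "3x faster than Moore's Law" figure.
# Assumption (not from the article): Moore's Law is modeled as a
# doubling of performance every two years.

def moores_law_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Return the growth factor implied by doubling every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)

claimed_speedup = 26.0  # H100 vs. P100 estimate quoted in the article
years_elapsed = 6.0     # P100 was released six years earlier, per the article

baseline = moores_law_factor(years_elapsed)
ratio = claimed_speedup / baseline

print(f"Moore's Law baseline over {years_elapsed:.0f} years: {baseline:.1f}x")
print(f"Claimed speedup relative to that baseline: {ratio:.2f}x")
```

Under that assumption the baseline is 8x over six years, so a 26x gain lands a little above 3x the baseline, which is in line with the rounded figure in the text. A different doubling period (for example 18 months) would give a noticeably different ratio.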