AMD Intros EPYC 97x4 “Bergamo” CPUs: 128 Zen 4c CPU Cores For Servers, Shipping Now
by Ryan Smith on June 13, 2023 2:30 PM EST
Kicking off a busy day of product announcements and updates for AMD’s data center business group, this morning AMD is finally announcing their long-awaited high density “Bergamo” server CPUs. Based on AMD’s density-optimized Zen 4c architecture, the new EPYC 97x4 chips offer up to 128 CPU cores, 32 more cores than AMD’s current-generation flagship EPYC 9004 “Genoa” chips. According to AMD, the new EPYC processors are shipping now, though we’re still awaiting further details about practical availability.
AMD first teased Bergamo and the Zen 4c architecture over 18 months ago, outlining their plans to deliver a higher density EPYC CPU designed particularly for the cloud computing market. The Zen 4c cores would use the same ISA as AMD’s regular Zen 4 architecture – making both sets of architectures fully ISA compatible – but it would offer that functionality in a denser design. Ultimately, whereas AMD’s mainline Zen 4 EPYC chips are designed to hit a balance between performance and density, these Zen 4c EPYC chips are purely about density, boosting the total number of CPU cores available for a market that is looking to maximize the number of vCPUs they can run on top of a single, physical CPU.
While we’re awaiting additional details on the Zen 4c architecture itself, at this point we do know that AMD has taken several steps to boost their CPU core density. This includes redesigning the architectural layout to favor density over clockspeeds – high clockspeed circuits are a trade-off with density, and vice versa – as well as cutting down the amount of cache per CPU core. AMD has also outright stuffed more CPU cores within an individual Core Complex Die (CCD); whereas Zen 4 is 8 cores per CCD, Zen 4c goes to 16 cores per CCD. Which, amusingly, means that the Zen 4c EPYC chips have fewer CCDs overall than their original Zen 4 counterparts.
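The CCD arithmetic behind that last point can be checked in a few lines. This is a minimal sketch using only the core counts quoted in this article; the helper name `ccd_count` is hypothetical, and it assumes fully enabled CCDs with no disabled cores.

```python
import math


def ccd_count(total_cores: int, cores_per_ccd: int) -> int:
    """Number of CCDs needed to supply total_cores, assuming every
    core on every CCD is enabled (an assumption, not an AMD spec)."""
    return math.ceil(total_cores / cores_per_ccd)


# Core counts as quoted in the article.
genoa_ccds = ccd_count(96, 8)      # Zen 4: 8 cores per CCD
bergamo_ccds = ccd_count(128, 16)  # Zen 4c: 16 cores per CCD

print(f"Genoa: {genoa_ccds} CCDs, Bergamo: {bergamo_ccds} CCDs")
```

Under those assumptions, the 128-core Bergamo part needs fewer CCDs than the 96-core Genoa part despite having a third more cores.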
AMD EPYC 97x4 Bergamo Processors
AnandTech | Cores | Threads | Base Freq (MHz) | 1T Freq (MHz) | L3 Cache | PCIe | Memory | TDP (W) | Price (1KU) |
9754 | 128 | 256 | 2250 | 3100 | 256MB | 128 x 5.0 | 12 x DDR5-4800 | 360 | $11,900 |
9754S | 128 | 128 | 2250 | 3100 | 256MB | 128 x 5.0 | 12 x DDR5-4800 | 360 | $10,200 |
9734 | 112 | 224 | 2200 | 3000 | 256MB | 128 x 5.0 | 12 x DDR5-4800 | 320 | $9,600 |
Despite these density-focused improvements, Bergamo is still a hefty chip overall in terms of total transistor count. A fully kitted out chip is composed of 82B transistors, down from roughly 90B transistors in a full Genoa chip. Accounting for the larger number of CPU cores available with Bergamo, that works out to a single Zen 4c core having about 68% of the transistor count of a Zen 4 core, when amortized over the entire transistor count of the chip. In reality, the savings at the CPU core level alone are likely not as great, but it goes to show how many transistors AMD has been able to save by cutting down on everything that isn’t a CPU core.
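For readers who want to check the amortized figure, here is a minimal sketch of the arithmetic. It uses only the transistor and core counts quoted above; the function name `transistors_per_core` is hypothetical, and the result is a whole-chip average rather than a measurement of the cores themselves.

```python
def transistors_per_core(total_transistors: float, cores: int) -> float:
    """Whole-chip transistor budget divided evenly across CPU cores.
    This folds cache, fabric and I/O into the per-core figure."""
    return total_transistors / cores


# Figures as quoted in the article (Genoa's is "roughly" 90B).
bergamo = transistors_per_core(82e9, 128)
genoa = transistors_per_core(90e9, 96)
ratio = bergamo / genoa

print(f"Bergamo: {bergamo / 1e6:.0f}M per core")
print(f"Genoa:   {genoa / 1e6:.0f}M per core")
print(f"Ratio:   {ratio:.0%}")
```

Because the Genoa count is only approximate, the ratio should be read as "about 68%" rather than as a precise value.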
Meanwhile, as these are a subset of the EPYC 9004 series, the 97x4 EPYC chips are socket compatible with the rest of the 9004 family, using the same SP5 socket. A BIOS update will be required to use the chips, of course, but server vendors will be able to pop them into existing designs.
As noted earlier, the primary market for the EPYC 97x4 family is the cloud computing market; the ‘c’ in Zen 4c even stands for “cloud”, according to AMD. The higher core counts and less aggressive clockspeeds make the resulting chips, on a core-for-core basis, more energy efficient than Genoa designs. For AMD’s target market that is a huge consideration, given that power is one of their greatest ongoing costs. As part of today’s presentation, AMD is touting a 2.7x improvement in energy efficiency, though it is unclear what baseline that figure is being compared against.
With their higher core density and enhanced energy efficiency, AMD is especially looking to compete with Arm-based rivals in this space, with Ampere, Amazon, and others using Arm architecture cores to fit 128 (or more) cores into a single chip. AMD will also eventually be fending off Intel in this space, though not until Sierra Forest in 2024.
Pure compute users will also want to keep an eye on these new Bergamo chips, as the high core count changes the current performance calculus a bit. In terms of pure compute throughput, on paper the new chips offer even more performance than 96 core Genoa chips, as the extra 32 CPU cores more than offset the clockspeed losses. With that said, the cache and other supporting hardware of a server CPU boost performance in other ways, so the performance calculus is rarely so simple for real-world workloads. Still, if you just need to let rip a lot of semi-independent threads, then Bergamo may offer some surprises.
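The on-paper comparison can be sketched with a crude "cores times clock" proxy. The Bergamo figures come from the table in this article; the 2.4 GHz Genoa base clock is an assumption taken from outside this article (the published EPYC 9654 base frequency), and the function name `core_ghz` is hypothetical. The proxy ignores IPC, cache, memory bandwidth and boost behavior, so treat it as an upper-bound illustration only.

```python
def core_ghz(cores: int, clock_ghz: float) -> float:
    """Crude throughput proxy: core count multiplied by clock speed."""
    return cores * clock_ghz


# Bergamo EPYC 9754: 128 cores at 2.25 GHz base (from the article's table).
bergamo = core_ghz(128, 2.25)

# ASSUMPTION: 96-core Genoa flagship at 2.4 GHz base (not stated in this article).
GENOA_BASE_GHZ = 2.4
genoa = core_ghz(96, GENOA_BASE_GHZ)

# All-core clock a 96-core chip would need to match Bergamo on this proxy.
# This value depends only on figures from the article.
break_even_ghz = bergamo / 96

print(f"Bergamo: {bergamo:.1f} core-GHz, Genoa: {genoa:.1f} core-GHz")
print(f"Advantage on paper: {bergamo / genoa:.2f}x")
print(f"Genoa break-even all-core clock: {break_even_ghz:.2f} GHz")
```

The break-even clock is the more useful number here: whenever a 96-core part sustains less than that across all cores, the 128-core part wins on this simple measure.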
We’ll have more on the EPYC 97x4 series chips in the coming days and weeks, including more on the Zen 4c core architecture, as AMD releases more information on that. So until then, stay tuned.
14 Comments
michael2k - Wednesday, June 14, 2023 - link
I don't understand your point. AMD has (theoretically) the same CPU tailored for high clock speed or tailored for high density.
Meaning in theory if you underclocked a Genoa Zen 4 (to save power in a laptop, for example) you would see the same (absent cache differences) performance as a Zen 4c part at the same clock.
The difference is the size: you can fit 3 Zen 4c cores in the same area as 2 Zen cores.
There's a tipping point where the heat generated by Zen 4 cores forces the complex to underclock itself (throttling due to excess heat) and you can see it very clearly:
https://www.anandtech.com/show/18763/amd-announces...
The more cores in a CPU, the lower the base clock. At the 360W TDP they offer both a 96c and 64c part, and there's a 700MHz drop in base clock in adding 32 more cores. There's another 350MHz drop between 64 and 48, which I would believe is due to the binning process.
At 290/280W they offer 48c at 2.75GHz or 32c at 3.25GHz
But the Bergamo parts can fit 128 cores at 2.25GHz in the same 360W envelope as the 9654 with only 96 cores. A hybrid design might see 48 high performance cores running at 3GHz and 64 high efficiency cores running at 2.25GHz
You can imagine different mixes and matches are possible (assuming AMD did the work to engineer the possibility). At worst case when throttled by heat all the CPU cores will be more or less the same due to clock speed limits. At best you get more cores than is possible using straight Zen 4 and you get more performance than is possible using straight Zen 4c due to clock speed and cache differences.
Silver5urfer - Friday, June 16, 2023 - link
On HPC side you aint seeing hybrid designs, that will cause a gigantic mess on the VMWare and other similar workloads. It will all be homogeneous not even ARM has such type of processors. Ofc heat is an issue but you can check the AMD website on how base clocks are for Genoa and Bergamo there's a clock rate cut for Bergamo.
sjkpublic@gmail.com - Friday, June 16, 2023 - link
Nice increase in cores/efficiency. But what is the effect of L3 caching when the cores/threads increase proportionally more than the cache?
supdawgwtfd - Tuesday, June 27, 2023 - link
Depends on workload as said in the article.