Leaked Data on AMD’s Milan-X Epyc CPUs Claims 768MB L3, Reduced Base Clocks


This site may earn affiliate commissions from the links on this page. Terms of use.


There’s a new set of leaks around AMD’s upcoming server CPU, codenamed Milan-X. Milan-X uses the same microarchitecture as AMD’s current Milan, but with one significant difference: up to 768MB of L3, divided among eight chiplets and 64 cores.

AMD’s plans to staple an additional 64MB of L3 cache onto every chiplet using its new V-Cache structures have been well discussed, but these leaks, if accurate, give us some idea of what kind of trade-offs the company is contemplating between TDP, clock speed, and cache size.

There are four parts in total (a 32-core and a 24-core round things out), but the top-end and bottom-end parts are the most interesting.

At the high end, AMD is trading ~10 percent of base clock frequency for an additional 512MB of L3. At the top of the 16-core mark, the 7373X trades off ~13 percent of its frequency, but offers no less than 48MB of L3 cache per core (768MB / 16 cores). If Milan-X uses the same chiplet configurations as Milan, AMD is only lighting up two cores per chiplet for a CPU like this, but the company has done something similar before. AMD currently ships an eight-core CPU with 256MB of L3 in total, or 32MB of L3 per core. AMD may be reserving V-Cache for its high-power products; most of AMD’s 16-core chips target a TDP below 240W.
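The cache arithmetic above can be checked with a short sketch. It assumes the leaked figures are accurate and that Milan-X keeps Milan's eight-chiplet layout with 32MB of base L3 per chiplet (256MB / 8); the function and constant names are illustrative only.

```python
# Figures taken from the article; all of them depend on the leak being accurate.
CHIPLETS = 8
BASE_L3_PER_CHIPLET_MB = 32    # Milan today: 256MB / 8 chiplets
VCACHE_PER_CHIPLET_MB = 64     # rumored stacked V-Cache per chiplet


def total_l3_mb(chiplets=CHIPLETS, stacked=True):
    """Total L3 across all chiplets, with or without the stacked cache."""
    per_chiplet = BASE_L3_PER_CHIPLET_MB
    if stacked:
        per_chiplet += VCACHE_PER_CHIPLET_MB
    return chiplets * per_chiplet


def l3_per_core_mb(total_mb, cores):
    """L3 available per active core, in MB."""
    return total_mb / cores


if __name__ == "__main__":
    milan_x = total_l3_mb(stacked=True)
    milan = total_l3_mb(stacked=False)
    print("Milan-X total L3 (MB):", milan_x)
    print("Extra L3 over Milan (MB):", milan_x - milan)
    print("16-core Milan-X, L3 per core (MB):", l3_per_core_mb(milan_x, 16))
    print("8-core Milan, L3 per core (MB):", l3_per_core_mb(milan, 8))
    print("Active cores per chiplet on a 16-core part:", 16 // CHIPLETS)
```

The same helper shows that a fully enabled 64-core part would have 12MB of L3 per core, which is why the low-core-count parts stand out on this metric.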

Elsewhere, AMD has suggested that its V-Cache is worth ~15 percent performance, which might seem to imply the company is giving up most of its gain by trading away base clock speed. This probably isn’t true, for several reasons. First, base clock reflects the minimum clock, not necessarily the sustained CPU clock. Second, server workloads don’t scale according to the same factors as desktop workloads in all cases.
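For reference, the naive reading that this paragraph argues against looks like the sketch below. It assumes performance scales linearly with base clock and that the ~15 percent V-Cache uplift applies uniformly, neither of which is likely to hold in practice; the function name is hypothetical.

```python
def naive_net_speedup(cache_gain, clock_loss):
    """Net speedup if cache uplift and clock loss simply multiply.

    This is the pessimistic back-of-envelope model, not a prediction:
    base clock is a floor rather than the sustained clock, and server
    workloads do not all scale with frequency.
    """
    return (1 + cache_gain) * (1 - clock_loss)


if __name__ == "__main__":
    # ~15% cache uplift against the ~10% and ~13% base clock reductions.
    print(f"64-core part: {naive_net_speedup(0.15, 0.10):.4f}x")
    print(f"16-core part: {naive_net_speedup(0.15, 0.13):.4f}x")
```

Under this model the net gain shrinks to a few percent or less, which is the impression the paragraph above cautions against taking at face value.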

Desktop workloads tend to be latency bound as opposed to throughput bound. Certainly, there are server applications that also run into latency bottlenecks, but AMD’s 64-core CPU requires quite a lot of memory bandwidth to feed it. The large L3 cache on these chips will offset memory bandwidth requirements. If AMD builds a new 64-core Threadripper on one of these core solutions (and I see no reason to think it won’t), we can expect that chip in particular to deliver better performance. Tests of the 3990X against the 3995WX showed that the former is really memory bandwidth limited. AMD may also save some power on fabric if it can keep data local to the CPU more often, though this could be workload-dependent.
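To put the bandwidth point in rough numbers, the sketch below estimates peak theoretical DRAM bandwidth per core. The memory configurations are assumptions drawn from AMD's published platform specs rather than from this article: DDR4-3200 with 64-bit (8-byte) channels, four channels on the 3990X, and eight on the 3995WX and Epyc. Real sustained bandwidth will be lower.

```python
def peak_bandwidth_gbs(channels, mt_per_s=3200, bytes_per_transfer=8):
    """Peak theoretical DRAM bandwidth in GB/s (assumed DDR4-3200, 64-bit channels)."""
    return channels * mt_per_s * bytes_per_transfer / 1000


def bandwidth_per_core_gbs(channels, cores):
    """Peak bandwidth divided evenly across cores, in GB/s."""
    return peak_bandwidth_gbs(channels) / cores


if __name__ == "__main__":
    cores = 64
    for name, channels in (("3990X (4-channel)", 4), ("3995WX / Epyc (8-channel)", 8)):
        total = peak_bandwidth_gbs(channels)
        per_core = bandwidth_per_core_gbs(channels, cores)
        print(f"{name}: {total:.1f} GB/s total, {per_core:.2f} GB/s per core")
```

With only a few GB/s of peak bandwidth per core even in the eight-channel case, any request served from a larger L3 instead of DRAM eases pressure on the memory controllers.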

Slapping a big L3 cache on top of a chip doesn’t necessarily sound like much of an advance, but AMD’s ability to stack the die vertically and the work it has done to keep the cache responsive make this a very interesting chip. No one has brought a commercial high-end CPU product to market with a 3D chip stack like this, though Intel has its own extensive plans around 3D die stacking via technologies like Foveros and EMIB. With Milan, AMD had to make some power tradeoffs between interconnect and cores, so we’ll see how an additional 64MB of L3 cache per chiplet changes the power equation in the next few months.
