AMD Previews V-Cache, Provides +15% FPS on Average in Gaming. Coming Soon to Servers?

Dr. Lisa Su showing off prototype “5000-series” [5900X] CPU with V-Cache [Image Credit: AMD]

During AMD’s Computex 2021 presentation, Dr. Lisa Su highlighted the advanced technologies AMD has brought to market, such as 30+ products on TSMC’s 7nm node and 2.5D HBM chiplet stacking. She then announced a new one: AMD’s 3D V-Cache, which uses TSMC’s 3D Fabric technology.

Evolution of AMD’s packaging technologies [Image Credit: AMD]

Starting with HBM, AMD brought GPU memory onto the GPU package using an interposer. This enabled a massive jump in available video RAM bandwidth and shrank the PCB by eliminating the traditional arc of GDDR chips that encircled the GPU die on the board. MCM allowed AMD to produce CPUs with higher core counts cost-effectively, ushering in higher compute performance in the datacenter. With chiplets, AMD was able to use different process nodes for the CPU cores and IO die in the same package, providing another leap forward in performance and capabilities.

AMD’s breakdown of the new V-Cache for their 5000-series prototype Zen3 core complex [Image Credit: AMD]

With 3D Chiplets, AMD is leveraging TSMC’s 3D Fabric technology to combine chiplet packaging with die stacking. Dr. Lisa Su says the first application of the technology will be an active-on-active 3D vertical cache, which AMD is calling 3D V-Cache. AMD created a prototype 5000-series CPU using this technology by layering a 64MB 7nm SRAM die directly on top of the existing L3 cache of the Zen3 CCD. This effectively triples the Zen3’s 32MB of L3 cache to 96MB per CCD, and brings the total L3 cache up to 192MB in two-CCD CPUs such as the 5900X and 5950X.
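The capacity figures above follow from simple per-CCD arithmetic. Here is a minimal sketch of that calculation; the function name and constants are illustrative and use only the numbers quoted in AMD’s presentation, not any AMD tool or API.

```python
# Figures quoted in AMD's Computex 2021 presentation.
BASE_L3_MB_PER_CCD = 32   # L3 cache already on each Zen3 CCD
V_CACHE_MB_PER_CCD = 64   # stacked SRAM die added on top of each CCD


def total_l3_mb(ccd_count: int, stacked: bool = True) -> int:
    """Total L3 in MB for a CPU with `ccd_count` CCDs (hypothetical helper)."""
    per_ccd = BASE_L3_MB_PER_CCD + (V_CACHE_MB_PER_CCD if stacked else 0)
    return ccd_count * per_ccd


print(total_l3_mb(1))                  # 96  (one CCD with V-Cache)
print(total_l3_mb(2))                  # 192 (5900X/5950X-style, with V-Cache)
print(total_l3_mb(2, stacked=False))   # 64  (same CPU without V-Cache)
```

The same helper works for any CCD count, on the assumption that every CCD carries exactly one 64MB stack.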

This SRAM cache is bonded directly to the Zen3 CCD using through-silicon vias (TSVs), which, you might recall, are also what interconnect HBM’s multiple layers. The connection will support more than 2 terabytes per second of bandwidth for the new 3D V-Cache. During manufacturing, the cache die is thinned and structural silicon is added (shown as semi-transparent white blocks flanking the cache die in the image above) to create a seamless surface for the combined chip. This will allow for even contact with the IHS for better thermal dissipation, as opposed to the issues we saw with some Vega GPUs that used unmolded HBM packages.

Showing off the new 3D V-Cache [Image Credit: AMD]

In the picture above, Dr. Lisa Su shows off a prototype 5900X processor. They’ve exposed the left CCD to show the 6mm x 6mm square 64MB SRAM die hybrid-bonded to the CCD. Normally the CCD would be obscured as seen with the right CCD.

[Image Credit: AMD]

The hybrid-bond and TSV approach AMD is using provides >200X the interconnect density of traditional 2D chiplets, as well as >15X the interconnect density of micro bump 3D stacking solutions. AMD points out that instead of the solder micro bumps used with HBM, they’re using direct copper-to-copper bonds. This provides “dramatically” better thermals, transistor density, and interconnect pitch, and uses only one-third the energy per signal compared to traditional micro bump technology. This is what provides the “>3X interconnect energy efficiency” shown in the slide.

AMD demo of Gears 5 using two 5900Xs, locked at 4GHz, one with 3D V-Cache [Image Credit: AMD]

Obviously, since this is just GameCache [AMD YouTube vid] on steroids, AMD wanted to demo this new technology on games, treating us to a side-by-side recording of Gears 5 with the resulting average FPS overlaid. The video on the left is the standard Ryzen 9 5900X locked at 4GHz, and on the right is the prototype 5900X with 3D V-Cache, also locked at 4GHz. The result was a 12% performance boost, with only the 3D V-Cache making the difference.

3D V-Cache Games Performance [Image Credit: AMD]

Additional game results were shown as bar charts for four other games, talking up a 15% FPS boost on average, 3 percentage points more than the Gears 5 demo they baited us with first. What ISN’T mentioned is that the “15%” figure is actually an average of 32 PC games run at 1080p High Quality. 1080p is the contentious resolution where AMD tended to perform worse than Intel CPUs (at least until the 5000-series), but here AMD takes that resolution head-on with 3D V-Cache to deliver what amounts to an architectural generation’s worth of performance improvement.
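To put the gap between the two headline numbers in perspective, here is a small sketch that compares the 12% Gears 5 uplift with the 15% average uplift. The variable names are illustrative, and the only inputs are the two percentages AMD quoted; no actual frame rates are assumed.

```python
# Uplift figures quoted by AMD (as fractions of the baseline FPS).
gears5_uplift = 0.12    # Gears 5 demo
average_uplift = 0.15   # average across AMD's tested games

# Gap expressed in percentage points (15% - 12%).
point_gap = (average_uplift - gears5_uplift) * 100

# How much faster a "+15%" result is than a "+12%" result on the same baseline.
relative_fps_gap = (1 + average_uplift) / (1 + gears5_uplift) - 1

print(f"{point_gap:.1f} percentage points")        # 3.0 percentage points
print(f"{relative_fps_gap:.1%} higher final FPS")  # 2.7% higher final FPS
```

In other words, on an identical baseline the average result would land a little under 3% ahead of the Gears 5 result in actual frames per second.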

News Done. Leak and Speculation Time. Bring Salt.

Of course, this 3D V-Cache, in addition to boosting gaming performance, dovetails quite nicely into the recent Milan-X leak, which we talked about in our Genoa leak article earlier, as servers would also greatly benefit from this 3D V-Cache technology. This is probably exactly what Dr. Lisa Su was alluding to at the end of her presentation [AMD YouTube timestamp link], saying “we’ll be ready to start production on our highest-end products with 3D chiplets by the end of this year” [our emphasis]. Obviously, the highest-end products are top-end server CPUs like Milan (or Milan-X).

If you recall, AMD did a mid-cycle launch of three 7xF3 EPYC CPUs that was notable because all three had a full 8 CPU dies with the full 256MB L3 cache available, even down to the 16-core 73F3, demonstrating a precise targeting of this exact market in the server space. I’ve also heard (through a leak source, not officially) that Milan-X will be using only “3D,” not “2.5D,” and AMD calls out 2.5D as HBM in the above slides and specifically denotes 3D V-Cache as a “3D” technology. I wouldn’t be shocked at all to see Milan-X being nothing more than these 3D V-Cache CCDs put on a lineup of EPYC Milan CPUs like the 7xF3 lineup, providing 384MB (4 CCDs) and/or 768MB (8 CCDs) L3 cache variants.

This would allow AMD to do a limited mass production run to prep the way for 3D V-Cache going mainstream in (some?) EPYC and desktop CPUs for Zen4, which lines up nicely with “by end of this year,” just in time to start sampling (or selling) Genoa and desktop Zen4 early next year. It would also give AMD a huge 3D packaging lead over Intel, which at best is (allegedly) eyeballing packaging HBM (likely using the now-old-school microbumps) on some Sapphire Rapids Xeons in addition to using EMIB to glue up to four Xeon dies together.

Additionally, Dr. Ian Cutress from AnandTech confirmed we’ll see “Ryzen Zen 3 products” with V-Cache, which specifically refers to desktop CPUs. This means we’re likely to see the prototype 5900X with 3D V-Cache, and of course the 5950X, giving AMD halo products to hold the top of the benchmark charts against the Intel CPUs coming out next year.

Update (6/2/2021):

AnandTech did get some additional information about 3D V-Cache, which confirmed it for Zen 3 Ryzen processors, though nothing was said about EPYC. It also confirms that z-height won’t be a problem, and that the cache is a single 64MB die rather than a multi-layer stack: it is simply a denser SRAM than the L3 underneath it, fitting 2X the capacity into the same 6x6mm space.
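As a rough sanity check on the density claim, the sketch below works out capacity per square millimetre. It assumes, as the update states, that the 64MB die and the 32MB L3 beneath it both occupy about the same 6x6mm footprint; the variable names are illustrative, and the results are back-of-the-envelope figures rather than AMD-published densities.

```python
# Footprint quoted for the stacked SRAM die (assumed to match the L3 area below it).
DIE_SIDE_MM = 6.0
area_mm2 = DIE_SIDE_MM ** 2          # 36 mm^2

V_CACHE_MB = 64                      # stacked SRAM die capacity
BASE_L3_MB = 32                      # on-CCD L3 capacity under it

v_cache_density = V_CACHE_MB / area_mm2   # MB per mm^2 for the stacked die
base_l3_density = BASE_L3_MB / area_mm2   # MB per mm^2 for the on-CCD L3
density_ratio = v_cache_density / base_l3_density

print(f"V-Cache: {v_cache_density:.2f} MB/mm^2")   # V-Cache: 1.78 MB/mm^2
print(f"Base L3: {base_l3_density:.2f} MB/mm^2")   # Base L3: 0.89 MB/mm^2
print(f"Ratio:   {density_ratio:.1f}x")            # Ratio:   2.0x
```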
