Warning! We’re going deep into the realm of speculation, using rumors and leaked information, to ascertain the relative quality binning of Zen2 chiplets. Any number of particular data points could be proven wrong in the future. It’s OK. This is merely a journey for the enjoyment of the experience. Let’s enjoy the trip down the rabbit hole.
AMD’s chiplet-based architecture has the potential to revolutionize the computing industry. We’ve already seen how rapidly the desktop has shifted away from iterative 4-core designs, where higher core counts barely stretched to ten cores and were reserved for dream machines priced higher than most people’s entire computer rig. Even server CPUs had been limited in core counts, stuck below 20 cores per CPU until 2016, necessitating 4- and even 8-way CPU designs to get monolithic compute from a single server. When Ryzen launched in March of 2017, a new paradigm was set, giving 6- and 8-core desktop CPUs to the masses at prices we had only just recently been paying for 4-core CPUs.
How were these chiplets able to be so cost effective, while simultaneously giving us double the performance we had normally been seeing at similar price points? Let’s discuss the newest Zen2 CPU design to find out.
A Familiar Lineup
|CPU Model||MSRP||Chiplets||Cores||Threads||Base Frequency||Max Boost||TDP||Total Cache|
|Ryzen 9 3950X||$749||2||16||32||3.5 GHz||4.7 GHz||105 W||72 MB|
|Ryzen 9 3900X||$499||2||12||24||3.8 GHz||4.6 GHz||105 W||70 MB|
|Ryzen 7 3800X||$399||1||8||16||3.9 GHz||4.5 GHz||105 W||36 MB|
|Ryzen 7 3700X||$329||1||8||16||3.6 GHz||4.4 GHz||65 W||36 MB|
|Ryzen 5 3600X||$249||1||6||12||3.8 GHz||4.4 GHz||95 W||35 MB|
|Ryzen 5 3600||$199||1||6||12||3.6 GHz||4.2 GHz||65 W||35 MB|
The one column I added to the usual table we see everywhere is the Chiplet column. This is rather obvious for the desktop SKUs, as each chiplet (CCD) contains eight cores (two CCXs, containing four cores each). Each CCX also contains its own 16MB piece of L3 cache, which is what gives each CCD a total of 32MB L3 cache. AMD’s “GameCache” branding likes to add this L3 cache and the L2 cache (512 KB per enabled core) together for our “Total Cache” column. It’s why our Ryzen 5 six-core CPUs are 35MB, while their Ryzen 7 eight-core counterparts are 36MB of cache. These cache sizes also plainly revealed to us, even before physical processors were in hand, that the eight-core Ryzen 7 CPUs were single-chiplet CPUs, rather than the two-chiplet 4+4 core configuration some people were speculating about. We do, however, get to see such theorized four-core (with four cores disabled) chiplets on the server side, in EPYC Rome CPUs.
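As a quick check of the “Total Cache” arithmetic, here is a minimal Python sketch. It assumes only the figures stated above (32 MB of L3 per CCD, 512 KB of L2 per enabled core); the function name is made up for illustration.

```python
L3_PER_CCD_MB = 32      # two CCXs x 16 MB each, as described above
L2_PER_CORE_MB = 0.5    # 512 KB per enabled core


def total_cache_mb(chiplets: int, cores: int) -> float:
    """Total cache = L3 of every CCD plus L2 of every enabled core."""
    return chiplets * L3_PER_CCD_MB + cores * L2_PER_CORE_MB


for name, chiplets, cores in [
    ("Ryzen 9 3950X", 2, 16),
    ("Ryzen 9 3900X", 2, 12),
    ("Ryzen 7 3800X", 1, 8),
    ("Ryzen 5 3600", 1, 6),
]:
    print(f"{name}: {total_cache_mb(chiplets, cores):g} MB")
```

Under the same assumptions, a hypothetical 4+4 two-chiplet eight-core would come out at 68 MB rather than 36 MB, which is why the cache figure gives the single-chiplet layout away.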
The Leaked EPYC Rome Lineup
|CPU Model||MSRP (USD)||Chiplets||Cores||Threads||Max Boost||TDP||Total Cache|
|EPYC 7742||$7774.18||8||64||128||3.4 GHz||225 W||256 MB|
|EPYC 7702||$7215.60||8||64||128||3.35 GHz||200 W||256 MB|
|EPYC 7702P||$4955.03||8||64||128||3.35 GHz||200 W||256 MB|
|EPYC 7642||$5345.84||6||48||96||3.4 GHz||225 W||192 MB|
|EPYC 7552||$4307.82||6||48||96||3.35 GHz||200 W||192 MB|
|EPYC 7542||$3810.05||6||48||96||3.4 GHz||225 W||192 MB|
|EPYC 7502||$2916.80||4||32||64||3.35 GHz||180 W||128 MB|
|EPYC 7502P||$2581.77||4||32||64||3.35 GHz||180 W||128 MB|
|EPYC 7452||$2275.30||4||32||64||3.35 GHz||155 W||128 MB|
|EPYC 7402||$2004.17||4||24||48||3.35 GHz||180 W||128 MB|
|EPYC 7402P||$1403.70||4||24||48||3.35 GHz||180 W||128 MB|
|EPYC 7352||$1457.51||4||24||48||3.2 GHz||155 W||128 MB|
|EPYC 7302||$1099.45||4||16||32||3.2 GHz||155 W||128 MB|
|EPYC 7302P||$929.70||4||16||32||3.3 GHz||155 W||128 MB|
|EPYC 7282||$706.29||2||16||32||3.3 GHz||120 W||64 MB|
|EPYC 7272||$678.99||2||12||24||3.2 GHz||120 W||64 MB|
|EPYC 7262||$650.75||2||8||16||3.4 GHz||155 W||64 MB|
|EPYC 7252||$518.30||2||8||16||3.2 GHz||120 W||64 MB|
|EPYC 7252P||$490.38||2||8||16||3.2 GHz||120 W||64 MB|
This EPYC leak comes courtesy of Planet 3DNow! (German). The MSRP was converted into USD from Euros, after removing the 21% VAT included in the leaked Euro pricing. Now, as is the nature of all leaks, any particular data point, most notably price, is subject to change even just days before official launch. We saw this poignantly with AdoredTV’s own leaked Ryzen 3000 lineup, where the targets were mostly met but at a complete upshift in pricing, due to Intel languishing on 14nm for longer than expected and increased prices in the target market. Zen2 brought an eight-core CPU that challenges the i9-9900K at a price point $150 cheaper. AMD knew they could charge more for their CPUs because they performed very competitively, and so they priced them accordingly.
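The VAT removal and currency conversion mentioned above can be sketched as follows. The 21% VAT is from the text; the exchange rate is not given anywhere in the article, so it is left as a parameter, and the function name is hypothetical.

```python
VAT_RATE = 0.21  # VAT share stated in the text


def eur_gross_to_usd_net(price_eur_incl_vat: float, usd_per_eur: float) -> float:
    """Strip VAT from a Euro retail price, then convert the net price to USD.

    usd_per_eur is whatever exchange rate you choose to assume;
    the article does not say which rate was used.
    """
    net_eur = price_eur_incl_vat / (1 + VAT_RATE)
    return net_eur * usd_per_eur
```

Note that removing a 21% VAT means dividing by 1.21, not multiplying by 0.79; the two differ by a few percent, which matters at these price points.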
Back to the EPYC leak chart. You’ll notice I, once again, added the Chiplet column. Since only the L3 cache total was listed in the “Total Cache” column, it was a simple matter of dividing by 32 MB to calculate the number of active chiplets in the CPU. I say “active” because with EPYC Naples and Threadripper, “dummy dies” were used to fill in the blank spots on the package where a chiplet could have been, in order to level out and support the heat-spreader (according to AMD). These dummy dies were most likely etched chiplets from the reject pile that were mostly or completely dead. Depending on the yield of the node, there may not be enough dead chips to fill all the dummy die slots, at which point they may very well have used unetched silicon.
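The Chiplet column calculation is simple enough to express in a few lines. This sketch assumes, as the text does, that every active chiplet exposes its full 32 MB of L3; the helper name is illustrative only.

```python
L3_PER_CHIPLET_MB = 32  # full L3 of one CCD


def active_chiplets(l3_total_mb: int) -> int:
    """Infer the number of active chiplets from the listed L3 total."""
    chiplets, remainder = divmod(l3_total_mb, L3_PER_CHIPLET_MB)
    if remainder:
        # A non-multiple of 32 MB would mean partially disabled L3,
        # which this simple model does not handle.
        raise ValueError(f"{l3_total_mb} MB is not a multiple of 32 MB")
    return chiplets


for l3 in (256, 192, 128, 64):
    print(f"{l3} MB L3 -> {active_chiplets(l3)} chiplets")
```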
With Zen2, there may not be a need to insert dummy dies as before. We see with the single-chiplet Ryzen CPUs, there’s a rather obvious vacancy left where the second chiplet would go on the dual die Ryzen 9 CPUs. This may suggest that Threadripper 3 and EPYC could have some vacancies on the package as well, rather than necessitating dummy (or active) dies. Which leads us, once again, back to our EPYC chart and the calculated Chiplet column.
Based on the cache size, only the top three EPYC SKUs even have a full package of eight chiplets. The rest have a mixture of chiplets, all the way down to just two on the package. I will make my first real assumption (as so far our speculation has been rather straight-forward logic). I believe, if dummy dies are needed at all (for supporting the heat-spreader or the like), that support will require at least one chiplet die in each of the four die quadrants (the clusters of two chiplets seen on the EPYC package shown). With 7nm yields being as high as they are, I personally doubt there’s enough rejected silicon to fill all eight chiplet locations in every server CPU, especially since the four-chiplet CPUs will likely be the main sellers of the lineup. But if one die per quadrant is enough, those four chiplets could sit, one per quadrant, and evenly support the heat-spreader with no dummy dies at all, leaving the 2-chiplet EPYC CPUs as the only server CPUs in need of extra dies to support the heat-spreader. No more dark silicon!
Chiplet Cost and the Potential Profit Margins
Now for the exciting part! Just how much do the IO die and chiplets cost to manufacture, and just how much does AMD make from each chiplet? Since we can’t accurately even guess how many R&D dollars went into designing chiplets or just Zen2, nor the marketing and other such verticals, we’ll have to just determine “profits” and know that some of that profit has to go into those unaccounted-for expenses. We’re also going to have to make assumptions on costs for things like packaged coolers (desktop Ryzen), packaging (attaching the chips to the CPU package, and cost of package itself), product packaging, shipping, etc.
Manufacturing costs for semiconductors are a fairly secretive affair. Each company gets special pricing based on volume, partnership, available capacity, etc. Therefore, we don’t know the exact values AMD pays for chips. But we can make an educated guess!
I’ve devised the following table, using a cost-per-wafer estimate that I felt was quite reasonable based on known wafer pricing for dies of similar size to AMD’s. This gave us our starting 14nm price of $4000 per 300mm wafer. AMD’s own slide (above) was used to confirm our $8000/wafer calculation for 7nm. Since Apple’s A12 mobile SoC was technically TSMC’s initial 7nm run in the last half of 2018, AMD likely benefited from a potential discount from the rather high initial silicon costs of a new node, making $8000/wafer a logical number. Note, I went fairly pessimistic with yield rates too. The linked yield reports said “over X%” which I interpreted down to “exactly X%.” With no current numbers on 14nm, I went with a simple 0.1 defect density which yielded 88%, which is frankly a very conservative estimate for such a mature process.
|Node||Yield||Defect Density||Good Dies||Defective Dies||Max Dies||Cost/Wafer||Cost/Good Die|
These yields were calculated with Caly Technologies die yield calculator if you want to experiment too. Here’s a table of my estimated die sizes and dimensions that I used for my calculations:
|Die||Area (mm²)||Node||Dimensions|
|CPU Chiplet||74.55||7nm||7.1mm x 10.5mm|
|Ryzen IO Die||120||12nm||9.14mm x 13.1mm|
|EPYC IO Die||471.24||14nm||15.4mm x 30.6mm|
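For readers who want to play with the numbers without the online calculator, the yield and cost-per-good-die arithmetic can be roughly sketched in Python. This is a simplified model under stated assumptions: it uses Murphy’s yield model and a common dies-per-wafer approximation, and it ignores scribe lanes, edge exclusion and die aspect ratio, so it will not reproduce the calculator’s output exactly. All function names are illustrative.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Common approximation: wafer area / die area, minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    by_area = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(by_area - edge_loss)


def murphy_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Murphy's model: ((1 - exp(-A*D)) / (A*D)) ** 2, with A in cm^2."""
    ad = (die_area_mm2 / 100.0) * defects_per_cm2
    if ad == 0:
        return 1.0
    return ((1 - math.exp(-ad)) / ad) ** 2


def cost_per_good_die(wafer_cost_usd: float, die_area_mm2: float,
                      defects_per_cm2: float) -> float:
    """Wafer cost spread over the dies expected to be defect-free."""
    good = gross_dies_per_wafer(die_area_mm2) * murphy_yield(die_area_mm2, defects_per_cm2)
    return wafer_cost_usd / good


# A 120 mm^2 die at 0.1 defects/cm^2, matching the Ryzen IO die estimate
print(f"{murphy_yield(120, 0.1):.1%}")  # 88.8%
```

With a 120 mm² die and a defect density of 0.1 per cm², Murphy’s model lands at roughly 88.8%, in line with the 88% figure quoted above.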
The important column is the Cost per Good Die calculation. Based on these estimates, Zen2 CPU Chiplets only cost $14.60/ea to make. The Ryzen IO die is only $12.85. Since the IO die is also what is used for the X570 chipset die, it certainly explains some of the added cost for the latest generation of motherboards. I wouldn’t expect AMD to charge too much of a markup on this IO die though, as they should want to keep their partners happy and their real profits will come from the CPUs themselves. Let’s update our Ryzen lineup chart with a new column, while dropping the cache column to make room:
|CPU Model||MSRP||Chiplets||$/Chiplet||Cores||Threads||Base Frequency||Max Boost||TDP||TDP/Chiplet|
|Ryzen 9 3950X||$749||2||$347||16||32||3.5 GHz||4.7 GHz||105 W||47.5 W|
|Ryzen 9 3900X||$499||2||$222||12||24||3.8 GHz||4.6 GHz||105 W||47.5 W|
|Ryzen 7 3800X||$399||1||$344||8||16||3.9 GHz||4.5 GHz||105 W||95 W|
|Ryzen 7 3700X||$329||1||$274||8||16||3.6 GHz||4.4 GHz||65 W||55 W|
|Ryzen 5 3600X||$249||1||$194||6||12||3.8 GHz||4.4 GHz||95 W||85 W|
|Ryzen 5 3600||$199||1||$144||6||12||3.6 GHz||4.2 GHz||65 W||55 W|
We have a new $Cost per Chiplet column! This takes the MSRP, subtracts $10 for the stock cooler, an estimated $30 for “packaging” (assembly of the CPU package, boxing, etc), and $15 for the IO die (yes, we calculated it was $12.85, but they would have to ship it from Global Foundries to a packaging facility to assemble the package), then divides what is left by the number of chiplets. Also note, this calculation assumes that AMD isn’t trying to make a profit on the IO die, but rather on the CPU chiplets and the performance that the composition and binning of those chiplets bring. In the vein of binning, I also added an estimated TDP budget per chiplet column. Since Ryzen’s IO die, via the chipset wattage numbers, has been shown to have about a 10W TDP budget, the remainder gets distributed across the one or two chiplets to calculate this column. Lower TDP chips are either clocked lower, or are just simply more-efficient silicon.
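The two derived columns can be reproduced with a short Python sketch. The deductions ($10 cooler, $30 packaging, $15 IO die, 10 W IO die budget) are the estimates given above; the function and constant names are just for illustration.

```python
COOLER_USD = 10       # stock cooler estimate
PACKAGING_USD = 30    # package assembly, boxing, etc.
IO_DIE_USD = 15       # IO die, including shipping to the packaging facility
IO_DIE_TDP_W = 10     # estimated IO die share of the TDP


def dollars_per_chiplet(msrp_usd: float, chiplets: int) -> float:
    """MSRP minus the fixed per-CPU estimates, split across chiplets."""
    return (msrp_usd - COOLER_USD - PACKAGING_USD - IO_DIE_USD) / chiplets


def tdp_per_chiplet(tdp_w: float, chiplets: int) -> float:
    """TDP left after the IO die budget, split across chiplets."""
    return (tdp_w - IO_DIE_TDP_W) / chiplets


lineup = [
    ("Ryzen 9 3950X", 749, 2, 105),
    ("Ryzen 9 3900X", 499, 2, 105),
    ("Ryzen 7 3800X", 399, 1, 105),
    ("Ryzen 7 3700X", 329, 1, 65),
    ("Ryzen 5 3600X", 249, 1, 95),
    ("Ryzen 5 3600", 199, 1, 65),
]
for name, msrp, chiplets, tdp in lineup:
    print(f"{name}: ${dollars_per_chiplet(msrp, chiplets):g}/chiplet, "
          f"{tdp_per_chiplet(tdp, chiplets):g} W/chiplet")
```

Changing any of the four constants shifts every row by the same amount per CPU, so the relative ranking of the single-chiplet parts stays the same even if the absolute estimates are off.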
Reviewing this new column, out of the Ryzen lineup, the Ryzen 9 3950X and Ryzen 7 3800X are nearly tied in how valuable AMD believes the chiplet dies to be. (Spoiler!) You’ll see later that these are actually valued fairly in line with the die value of the four-chiplet EPYC Rome CPUs, making these two CPUs upper-mid-grade chips, likely binned by top clocks (and obviously fully-functioning eight-cores), with some moderate emphasis on efficiency in the case of the 3950X. The rest of the CPUs are fairly bottom barrel, particularly the six-core Ryzen 5 CPUs, only netting sub $200 for their chiplets. The Ryzen 7 3700X is a fairly nice value contender at $274, with obvious efficiency leanings: a 65W TDP on a fully-functioning eight-core chiplet. Even AMD themselves (well, AMD_Robert, from their marketing staff posting on Reddit) said they only initially seeded the 3700X and 3900X to demonstrate the two main sides of Zen2: efficiency and performance (respectively). The TDP/chiplet column also points out the power required to push these chips up to the 3.8 and 3.9 GHz base-clock range. If we had an EPYC CPU pushing 8 chiplets trying to hit 3.8 GHz all-core base-clock, we’d be looking at an estimated 700W TDP CPU! That’s in the realm of a certain CPU maker’s 28-core “5.0 GHz” demo CPU…
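The text does not spell out how the 700 W figure is derived. One reading that reproduces it, offered here purely as an assumption, takes the 85 W per-chiplet budget of the 3.8 GHz Ryzen 5 3600X, multiplies it by eight chiplets, and adds a 20 W allowance for the larger EPYC IO die.

```python
def package_tdp_estimate(chiplets: int, watts_per_chiplet: float,
                         io_die_watts: float) -> float:
    """Naive package TDP: per-chiplet budget times chiplets, plus the IO die."""
    return chiplets * watts_per_chiplet + io_die_watts


# Assumed inputs: 85 W/chiplet (3600X row), 20 W EPYC IO die allowance
print(package_tdp_estimate(8, 85, 20))  # 700
```

Using the 3800X’s 95 W per chiplet instead would push the estimate even higher, so 700 W is, if anything, on the gentle side for this thought experiment.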
The Cost of EPYC
|CPU Model||MSRP (USD)||Chiplets||$/Chiplet||Cores||Threads||$/Thread||Max Boost||TDP||TDP/Chiplet|
|EPYC 7742||$7774.18||8||$957||64||128||$60.74||3.4 GHz||225 W||25.63 W|
|EPYC 7702||$7215.60||8||$887||64||128||$56.37||3.35 GHz||200 W||22.5 W|
|EPYC 7702P||$4955.03||8||$604||64||128||$38.71||3.35 GHz||200 W||22.5 W|
|EPYC 7642||$5345.84||6||$871||48||96||$55.69||3.4 GHz||225 W||34.17 W|
|EPYC 7552||$4307.82||6||$698||48||96||$44.87||3.35 GHz||200 W||30 W|
|EPYC 7542||$3810.05||6||$615||48||96||$39.69||3.4 GHz||225 W||34.17 W|
|EPYC 7502||$2916.80||4||$699||32||64||$45.58||3.35 GHz||180 W||40 W|
|EPYC 7502P||$2581.77||4||$615||32||64||$40.34||3.35 GHz||180 W||40 W|
|EPYC 7452||$2275.30||4||$539||32||64||$35.55||3.35 GHz||155 W||33.75 W|
|EPYC 7402||$2004.17||4||$471||24||48||$41.75||3.35 GHz||180 W||40 W|
|EPYC 7402P||$1403.70||4||$321||24||48||$29.24||3.35 GHz||180 W||40 W|
|EPYC 7352||$1457.51||4||$334||24||48||$30.36||3.2 GHz||155 W||33.75 W|
|EPYC 7302||$1099.45||4||$245||16||32||$34.36||3.2 GHz||155 W||33.75 W|
|EPYC 7302P||$929.70||4||$202||16||32||$29.05||3.3 GHz||155 W||33.75 W|
|EPYC 7282||$706.29||2||$293||16||32||$22.07||3.3 GHz||120 W||50 W|
|EPYC 7272||$678.99||2||$279||12||24||$28.29||3.2 GHz||120 W||50 W|
|EPYC 7262||$650.75||2||$265||8||16||$40.67||3.4 GHz||155 W||67.5 W|
|EPYC 7252||$518.30||2||$199||8||16||$32.39||3.2 GHz||120 W||50 W|
|EPYC 7252P||$490.38||2||$185||8||16||$30.65||3.2 GHz||120 W||50 W|
This chart puts the desktop Ryzens in a whole new light! With the lower base clocks (estimated to be in the 2.4 GHz range) and a conservative boost clock, the chiplet TDPs are significantly lower than the highly-clocked desktop SKUs. We’re also assuming the EPYC IO die is a 20W TDP chip, even though it’s nearly four times the size of desktop Ryzen’s IO die. Six more memory channels and potentially 128 PCIe 4.0 lanes make for a significant power draw. Just remember, we’re going conservative here. If the EPYC IO die does draw more than 20W TDP (as AMD’s slide suggests it most certainly will), that only makes our TDP/chiplet column even more efficient than the numbers we put down, making them even more separated from the desktop SKUs. With the clock difference, this wouldn’t be a surprise.
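The EPYC per-chiplet columns follow the same pattern as the desktop ones. In this sketch the 20 W IO die budget is the assumption stated above. The $120 fixed deduction is not stated anywhere in the text; it is inferred from the table, whose $/Chiplet values are all consistent with (MSRP − $120) / chiplets, so treat it as a reconstruction rather than the author’s stated method.

```python
EPYC_IO_DIE_TDP_W = 20   # IO die TDP assumption from the text
EPYC_FIXED_USD = 120     # inferred from the table values, not stated in the text


def epyc_tdp_per_chiplet(tdp_w: float, chiplets: int) -> float:
    """TDP left after the IO die budget, split across chiplets."""
    return (tdp_w - EPYC_IO_DIE_TDP_W) / chiplets


def epyc_dollars_per_chiplet(msrp_usd: float, chiplets: int) -> float:
    """MSRP minus the inferred fixed costs, split across chiplets."""
    return (msrp_usd - EPYC_FIXED_USD) / chiplets


for name, msrp, chiplets, tdp in [
    ("EPYC 7742", 7774.18, 8, 225),
    ("EPYC 7502", 2916.80, 4, 180),
    ("EPYC 7282", 706.29, 2, 120),
]:
    print(f"{name}: ${epyc_dollars_per_chiplet(msrp, chiplets):.0f}/chiplet, "
          f"{epyc_tdp_per_chiplet(tdp, chiplets):.2f} W/chiplet")
```

Raising `EPYC_IO_DIE_TDP_W` lowers every TDP/chiplet value, which is the point made above: a hungrier IO die only widens the gap to the desktop SKUs.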
You can see, EPYC has a complete lineup of chips offering great efficiency per chiplet (EPYC 7552 or any of the eight-chiplet parts), or the exotic four-cores per chiplet configuration people were theorizing for the 3800X that we end up seeing with the EPYC 7262 and 7252(P) chips. As usual, Enterprise customers pay a premium for the core density provided in Rome, with everything above the 7402P costing significantly more per chiplet than desktop. If we calculate cost-per-thread, only the EPYC 7282 lands in the desktop value range at $22.07/thread, with the most expensive desktop cost/thread being the 3950X and 3800X at $23.41 and $24.94 respectively. This can be compared to the i9-9900K, which costs $30.31/thread, or a 7282-equivalent 16-core chip, the Xeon Gold 5218, at $39.78/thread.
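The cost-per-thread comparison is plain division of MSRP by thread count. A minimal sketch, using only prices and thread counts from the tables in this article:

```python
def dollars_per_thread(msrp_usd: float, threads: int) -> float:
    """List price divided by hardware thread count."""
    return msrp_usd / threads


for name, msrp, threads in [
    ("Ryzen 9 3950X", 749, 32),
    ("Ryzen 7 3800X", 399, 16),
    ("EPYC 7282", 706.29, 32),
    ("EPYC 7742", 7774.18, 128),
]:
    print(f"{name}: ${dollars_per_thread(msrp, threads):.2f}/thread")
```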
A user by the handle of vpcf90 on Twitter conducted some benchmarks using undervolting and PPT settings (separately) in BIOS to determine the best method of cutting wattage (and core temperature) while trying to maintain performance. He came up with an interesting chart showing a clear inflection at (coincidentally?) 3.3 GHz and 48W on his 3700X. This magic spot just happens to be practically the max boost clock for all the EPYC CPUs, most likely keeping their TDP in check. Obviously, if you’re looking to cut power draw and temperatures on your Ryzen 3000 CPU, you’ll want to adjust the CPU’s PPT (Package Power Tracking) to do it, rather than the classic undervolting method. I found my PPT settings buried under my Zen options in BIOS, or you can configure it in Ryzen Master.
With AMD making a fairly hefty profit off each CPU they sell, usually in the $200-$300 range for desktop, they have a significant opportunity to easily make ROI on their Zen2 R&D and even fund future projects and R&D. AMD had a gross margin nearing 40% as of March 31, 2019, and I see that significantly increasing, perhaps even exceeding the 60-65% margins Intel has been seeing for the past nine years. I wouldn’t expect those margins to last, as competition will heat up when Intel’s 10nm finally gets off the ground, or perhaps when their 7nm node comes out in 2021. By then, AMD will have moved on to TSMC’s 7nm EUV or even 5nm node and we’ll be on Zen4 as well.
These are certainly exciting times in the computing industry, both from a desktop perspective and from a server perspective and I’m fortunate enough to have a foot in both.