Exclusive: Intel Arctic Sound Specifications and Intel Architecture On NIC
Update: We have been made aware of the codename “Phantom Lake” which may be a product based on IAONIC or a new name for IAONIC.
Today we have a few new tidbits to share about Arctic Sound and Intel’s new NIC based product. Nothing major, just tidbits, and fairly technical ones at that.
Let’s start with this package size comparison. From left to right we have dual tile Arctic Sound, single tile Arctic Sound, DG1, Tiger Lake U, Tiger Lake Y, Lakefield, and Ice Lake U. There are no specific technical details to go along with this image, but if you were wondering how the dimensions of these parts compare, then there you go. This image also gives you a really good idea of just how small Lakefield is and why it’s sometimes preferable to the also respectably small Tiger Lake Y. You might also notice that quad tile Arctic Sound is missing. That might have some important implications, but it’s hard to draw conclusions from this one small detail.
Anyways, I’ll move on to the actual technical details I have to share today, starting with some of the specifications for Intel’s Arctic Sound GPU. The primary piece of info here is that each Arctic Sound GPU tile has 512 execution units (EUs), which each have 8 threads, for a total of 4096 threads per tile. This is actually something WCCFTech reported first, though what I call threads, they called cores. This makes Xe HP more than 5 times larger than its LP counterpart with respect to core count, though the comparison might not be appropriate given that LP cores are probably not the same as HP cores, at least in terms of power and performance.
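As a rough sanity check, the per-tile arithmetic can be worked through in a few lines of Python. The 512 EUs per tile and 8 threads per EU come from the figures above; the 96 EU count for Xe LP is an assumption that is not part of this leak (it is the largest Xe LP configuration I am aware of), and the names used here are purely illustrative.

```python
# Figures quoted in the article.
EUS_PER_TILE = 512
THREADS_PER_EU = 8

# Assumption, not from the leak: largest Xe LP configuration has 96 EUs.
XE_LP_EUS = 96


def total_threads(eus, threads_per_eu=THREADS_PER_EU):
    """Total threads ("cores" in WCCFTech's wording) for a given EU count."""
    return eus * threads_per_eu


hp_tile_threads = total_threads(EUS_PER_TILE)
lp_threads = total_threads(XE_LP_EUS)
hp_to_lp_ratio = hp_tile_threads / lp_threads

print(f"Xe HP, one tile: {hp_tile_threads} threads")
print(f"Xe LP (assumed 96 EUs): {lp_threads} threads")
print(f"HP tile / LP: {hp_to_lp_ratio:.2f}x")
```

Under that assumption a single HP tile works out to a little over 5 times the LP thread count, which lines up with the comparison above.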
I was actually curious about where these cores and threads are coming from since I had never heard of such a term with respect to any Intel GPU, so I went and skimmed through Intel’s Gen 11 iGPU documentation and found a diagram that breaks down the anatomy of an EU. Each Gen 11 EU has two SIMD floating point units or ALUs, each of which can perform 4 addition or multiplication FP32 operations per clock, for 8 per EU. This seems to be how a thread or core is being defined, but I’m not entirely sure, especially since this is a Gen 11 diagram.
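To make that guess concrete, here is a minimal sketch of how the "8 per EU" figure could fall out of the Gen 11 EU layout. It assumes each of the two ALUs is 4 FP32 lanes wide, which is my reading of Intel’s Gen 11 material, and it assumes Arctic Sound counts things the same way, which is unconfirmed. The function name is hypothetical.

```python
def fp32_lanes_per_eu(alus_per_eu, lanes_per_alu):
    """FP32 add-or-multiply operations one EU can issue per clock."""
    return alus_per_eu * lanes_per_alu


# Assumed Gen 11 layout: two SIMD ALUs, each 4 FP32 lanes wide.
GEN11_ALUS_PER_EU = 2
GEN11_LANES_PER_ALU = 4

lanes = fp32_lanes_per_eu(GEN11_ALUS_PER_EU, GEN11_LANES_PER_ALU)
print(f"FP32 lanes per EU: {lanes}")

# If a "thread" or "core" is really one FP32 lane, this should match the
# 8 threads per EU quoted for Arctic Sound.
print("matches 8 threads per EU:", lanes == 8)
```

If that reading is right, a "core" in these leaks is closer to a SIMD lane than to anything resembling a CPU core.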
The memory hierarchy is also interesting. It actually goes from L1 straight to L3 to the VRAM, which is HBM2. There is L2 cache, but it’s distributed among multiple EU clusters/subslices, unlike L1, which is specific to an EU cluster/subslice. I know this point about the L2 will sound weird, but this is basically how it was described to me. There is a memory related bottleneck in Arctic Sound: the HBM suffers from performance issues, especially in burstier, shorter tasks and non-linear fetch tasks. This gets worse with multiple tiles. Bandwidth efficiency does increase with greater queue depth, however. I believe that Intel will likely use the fastest HBM2 possible in order to mitigate these issues.
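The queue depth point can be illustrated with a toy model based on Little’s law: the bandwidth you can sustain is roughly the number of bytes in flight divided by the memory latency, capped at the peak the memory can deliver. This is a generic back-of-the-envelope sketch, not a description of how Arctic Sound actually behaves, and every number in it (peak bandwidth, latency, request size) is a made-up placeholder rather than a leaked figure.

```python
def effective_bandwidth_gbs(queue_depth, request_bytes, latency_ns, peak_gbs):
    """Little's law estimate of sustained bandwidth, capped at peak.

    Bytes in flight divided by latency in nanoseconds gives bytes/ns,
    which is numerically the same as GB/s.
    """
    in_flight_bytes = queue_depth * request_bytes
    return min(in_flight_bytes / latency_ns, peak_gbs)


# Hypothetical placeholder values, NOT leaked specifications.
PEAK_GBS = 400.0
LATENCY_NS = 500.0
REQUEST_BYTES = 64

for depth in (1, 8, 64, 512, 4096):
    bw = effective_bandwidth_gbs(depth, REQUEST_BYTES, LATENCY_NS, PEAK_GBS)
    print(f"queue depth {depth:>4}: {bw:7.2f} GB/s ({bw / PEAK_GBS:.1%} of peak)")
```

In this model, short or bursty work that never builds up a deep queue leaves most of the bandwidth unused, while deep queues approach the peak, which is the general shape of the behavior described above.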
Finally, we have Intel Architecture On NIC, or IAONIC. This is a new product that has been greenlighted recently and it already has multiple customers that have working prototypes. There are two versions. The NS has a NIC, CPU cores (probably Snow Ridge), and a hardware queue manager (HQM). The NX has a switch, inline crypto, NIC, CPU cores (again probably Snow Ridge), and an HQM. It could be some time until we see this announced.
And that will do for this leak. This one is a lot drier than usual, but I hope that the package size comparison was at least interesting. Jim and I have another leak coming out soon which should be far more interesting, so stay tuned.
Liked it? Take a second to support Matthew Connatser on Patreon!