We had pegged MI100 as an A100 competitor, but only in single-precision workloads, because of its unusually high FP32 performance. Just about everyone regarded this 42 TFLOPs figure as unusual, but we were confident it was correct. Anyway, let’s take a look at what AMD has announced:
32 GB HBM2
11.5 TFLOPs FP64
23 TFLOPs FP32
46 TFLOPs FP32 Matrix
185 TFLOPs FP16
300 Watt TDP
For once, our expectations were actually beaten, and quite thoroughly. We’re not sure exactly how AMD did it (most likely clock speeds were raised during development while TDP somehow stayed the same), but AMD did in fact deliver an MI100 faster than we expected. Good job, guys.
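To put a number on "faster than we expected", here is a quick back-of-the-envelope sketch in Python. It assumes our 42 TFLOPs expectation is best compared against the announced 46 TFLOPs FP32 Matrix figure, which is our reading rather than anything AMD has stated. The names `announced`, `expected_fp32` and `percent_over` are purely illustrative.

```python
# Announced MI100 throughput figures from the list above, in TFLOPs.
announced = {
    "FP64": 11.5,
    "FP32": 23.0,
    "FP32 Matrix": 46.0,
    "FP16": 185.0,
}

# Our pre-launch FP32 expectation, in TFLOPs.
# Assumption: this is compared against the FP32 Matrix figure.
expected_fp32 = 42.0


def percent_over(actual, expected):
    """Return how far `actual` exceeds `expected`, as a percentage."""
    return (actual / expected - 1.0) * 100.0


uplift = percent_over(announced["FP32 Matrix"], expected_fp32)
print(f"Announced FP32 Matrix vs. our expectation: +{uplift:.1f}%")

# Ratios between the announced figures themselves.
print("FP32 / FP64:", announced["FP32"] / announced["FP64"])
print("FP32 Matrix / FP32:", announced["FP32 Matrix"] / announced["FP32"])
```

On these numbers the announced figure lands a little under 10% above our expectation, and both FP32 over FP64 and FP32 Matrix over FP32 work out to exactly 2x.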
Thanks to this increased performance, AMD is also going after A100 in FP64 workloads. Additionally, AMD estimates that it provides about twice the bang for the buck compared to A100 in both FP64 and FP32 workloads. This is about in line with our expectations; AMD probably needs to offer this kind of deal in order to gain a foothold in this market.
As an aside, the delay SemiAccurate reported for Milan appears to be confirmed, as it has slipped to Q1. I had expected Milan to launch very late in Q4, but evidently that was not feasible for AMD. As for MI100’s launch date, today seems to be it, and we should expect MI100-based systems to crop up in December or perhaps even this month.