Here’s the configuration of our test system for today. I’ve split the disks into two separate RAID6 arrays and will only be benchmarking one of them. Twenty-one disks is still a rather large RAID6 nonetheless.
Since these types of systems are usually ordered from a value-added reseller, quoted prices may vary (sometimes significantly), so be sure to get quotes from three or so VARs to make sure you’re getting the best price. Also, once you’ve picked the best of those three, be sure to ask for additional “special pricing,” since you may be able to get an even better deal if you’re ordering at the end of a quarter, at year-end, or for some other special reason.
We’re starting with a quick storage benchmark to see what type of performance this LSI 3108-based MegaRAID controller gets with 21 disks in a RAID6. A stripe that wide may make our random IO suffer, but it will likely boost our sequential throughput. I’ve provided numbers from a couple of mainstream SSDs in a different system for comparison purposes.
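To put the stripe remark in rough numbers, here is a minimal sketch of RAID6 geometry for a 21-disk group. RAID6 spends two disks' worth of capacity on parity, so 19 disks carry data, and any write smaller than a full stripe forces the controller to read and update parity. The 256 KiB per-disk strip size below is purely an assumption for illustration; the value actually configured on this controller isn't stated here.

```python
def raid6_layout(total_disks: int, strip_kib: int) -> dict:
    """Return basic geometry for a single RAID6 group."""
    if total_disks < 4:
        raise ValueError("RAID6 needs at least 4 disks")
    # RAID6 uses two disks' worth of capacity for parity (P and Q).
    data_disks = total_disks - 2
    return {
        "data_disks": data_disks,
        # A full stripe is one strip on every data disk.
        "full_stripe_kib": data_disks * strip_kib,
        # Share of raw capacity that ends up usable.
        "usable_fraction": data_disks / total_disks,
    }


# strip_kib=256 is an assumed value, not the setting used in this review.
layout = raid6_layout(total_disks=21, strip_kib=256)
print(layout)
```

With those assumptions a full stripe is several megabytes wide, which is great for large sequential transfers and far larger than a 4K random write.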
Our first benchmark uses AS-SSD, a staple of SSD benchmarking. It uses randomization to create an incompressible stream of data, which reveals the real read/write performance characteristics of an SSD. It is heavy enough that running it against a normal hard disk isn’t recommended, as it will take a very long time to complete by comparison. However, the hallmarks of performance (sequential speed, random IO, and access time) are all measured and broken out for reads and writes, making it a useful tool for gauging how well a disk performs.
Sure enough, we see massive sequential read and write numbers. However, we’re also presented with some surprising single-thread 4K read/write results, with a strong 89.49MB/s (22,909 IOPS) on reads, which is more than twice the performance of the two mainstream SATA SSDs. Reads scale up to 319.07MB/s (81,682 IOPS) with 64 threads, which keeps up with the SSDs and even surpasses the SU800. If we look at the access time, however, we can see that in terms of responsiveness the array will still feel like you’re reading from a spindle drive.
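For anyone wanting to check the IOPS figures against the throughput numbers, the conversion is simple arithmetic. The sketch below assumes the tool's "MB/s" means 1024 KiB per second and that each IO is 4 KiB; that assumption is inferred from the fact that it reproduces the IOPS values quoted above.

```python
def mb_per_s_to_iops(mb_per_s: float, block_kib: int = 4) -> int:
    """Convert throughput to IOPS for a fixed block size.

    Assumes 1 MB = 1024 KiB, which matches the figures quoted in this review.
    """
    return round(mb_per_s * 1024 / block_kib)


single_thread = mb_per_s_to_iops(89.49)
sixty_four_threads = mb_per_s_to_iops(319.07)
print(single_thread)       # 22909
print(sixty_four_threads)  # 81682
```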
If you’re curious how my post-benchmarking production RAID60 performs, the sequential read improved by 260MB/s, while single-thread 4K reads dropped by half, into the ~45MB/s range from the 89.49MB/s I benched with the RAID6. Surprisingly, that didn’t affect the 64-thread 4K reads (or writes, but those hit the RAID cache anyway). The read access time went up from 12.168ms to 13.499ms.
Geekbench is a detailed benchmark that gives results on a variety of workloads and summarizes them into an overall score, all at the press of a button. I like to use it because it tests commonly used algorithms and tasks, such as AES, LZMA, JPEG, PDF rendering, SQLite, and HTML5 parsing. This helps determine whether a class of CPU, or an architecture, is going to be beneficial for a specific workload, rather than having that information obscured by an overall score.
This is a rather telling benchmark, and partly the reason I didn’t run these Intel Xeon Silver 4114 CPUs through an exhaustive suite of benchmarks: they simply get decimated, and they’ve already been benchmarked by others. I compared this server’s 20C/40T against my 3900X, and the Ryzen’s 12C/24T beats it in every benchmark (except memory latency, as I’m running a full 64GB using dual-rank CL19 RAM clocked at 2933MHz). Surprisingly, the dual-channel desktop CPU shows triple the memory bandwidth of the 6-channel Xeons. It’s a good thing this server wasn’t selected for its CPU performance. Supermicro should really consider making a single-socket AMD EPYC-based version though.
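To show why that memory bandwidth result is surprising, here is a back-of-the-envelope calculation of theoretical peak bandwidth (channels × transfer rate × 8 bytes per transfer). The DDR4-2400 speed for the Xeon is an assumption based on the Silver 4114's rated maximum as I understand it, not a value measured on this server, and the figure is per socket.

```python
def peak_bandwidth_gb_s(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal).

    channels * MT/s * bytes-per-transfer gives MB/s; divide by 1000 for GB/s.
    """
    return channels * mt_per_s * bus_bytes / 1000


# Dual-channel DDR4-2933, as described for the 3900X system.
ryzen_peak = peak_bandwidth_gb_s(channels=2, mt_per_s=2933)
# Six channels per socket; DDR4-2400 is an assumed speed for the Xeon Silver 4114.
xeon_peak = peak_bandwidth_gb_s(channels=6, mt_per_s=2400)

print(f"3900X peak: {ryzen_peak:.1f} GB/s")
print(f"Xeon peak (per socket): {xeon_peak:.1f} GB/s")
```

On paper the Xeon should have well over twice the bandwidth per socket, so the measured result says more about how the benchmark exercises memory on this platform than about the raw capability of the memory channels.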
CPU-Z provides a quick single- and multi-threaded benchmark that proves rather useful for a quick comparison, especially since it includes a rather good database of performance scores for many generations of CPUs.
Once again, the 2.2GHz clockspeed on the Xeon Silvers holds back single-thread performance, but even the multi-thread score falls short of the 3900X, which has just over half the number of cores.
7-Zip has quickly come to the forefront of open-source archiving tools, and is a personal favorite of mine for being efficient and feature-rich. The built-in benchmark gives us a great indicator of CPU horsepower and memory bandwidth as well.
I put a Skylake CPU in to show what desktop Skylake does versus its Xeon big brother. It wasn’t too long ago that the i7-6600K was the best we were offered for mainstream desktop performance. Times have changed indeed.
Cinebench R15 and R20
Cinebench is a benchmarking tool that measures a computer’s performance in the Cinema 4D animation and rendering engine. R20 improves on R15 by taking better advantage of multiple cores and adding the ability to use recent processor features. Those tweaks most likely included optimizations for Ryzen, as it benefits greatly from the move from R15 to R20.
Handbrake 1.2.2 x64
Handbrake is a free, open-source transcoder that works with libraries such as FFmpeg, x265, and libvpx to convert from most video formats to most common formats. Transcoding tends to be a fairly intensive process for the whole system, using plenty of RAM, bandwidth, cache, and cores. Since streaming, on-demand recording, instant replay, and always-on recording (for kill-cam sharing, of course) are more popular today than ever, encoding performance is quickly becoming a crucial benchmark for many.
For this benchmark, we used two versions of the Big Buck Bunny (Sunflower) video, which are freely available to download here. The first version is the Full HD 1080p60 2D video and the second is the 4K 60fps 2D video. We transcode these with the General HQ 1080p30 Surround and HQ 720p30 Surround presets to test H.264 encoding, and with the Matroska H.265 MKV 1080p30 and H.265 MKV 720p30 presets to test H.265, since H.265 is more processing-intensive than H.264.
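If you want to repeat this kind of run without clicking through the GUI, something like the following sketch could time each source/preset combination using HandBrakeCLI. This is not necessarily how the numbers in this review were produced. The source file names are hypothetical placeholders, and the preset names are assumed to match the GUI preset names listed above, so check them against the preset list of your own HandBrake version first.

```python
import os
import shutil
import subprocess
import time
from pathlib import Path

# Hypothetical local file names for the two Big Buck Bunny downloads.
SOURCES = ["bbb_sunflower_1080p_60fps.mp4", "bbb_sunflower_4k_60fps.mp4"]
# Preset names assumed to match the ones named in the text.
PRESETS = [
    "HQ 1080p30 Surround",
    "HQ 720p30 Surround",
    "H.265 MKV 1080p30",
    "H.265 MKV 720p30",
]


def build_command(source: str, preset: str) -> list:
    """Build a HandBrakeCLI command line for one source/preset pair."""
    ext = "mkv" if preset.startswith("H.265 MKV") else "mp4"
    slug = preset.lower().replace(" ", "_").replace(".", "")
    output = f"{Path(source).stem}_{slug}.{ext}"
    return ["HandBrakeCLI", "-i", source, "-o", output, "--preset", preset]


def time_transcode(cmd: list) -> float:
    """Run one transcode and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


JOBS = [build_command(s, p) for s in SOURCES for p in PRESETS]

if __name__ == "__main__":
    if shutil.which("HandBrakeCLI") is None:
        print("HandBrakeCLI not found on PATH; nothing to run.")
    else:
        for cmd in JOBS:
            if not os.path.exists(cmd[2]):  # cmd[2] is the source file
                print(f"Skipping missing source: {cmd[2]}")
                continue
            print(f"{cmd[-1]} on {cmd[2]}: {time_transcode(cmd):.1f}s")
```

Timing wall-clock seconds per job keeps the comparison simple; average frames per second, which HandBrake also reports in its own log, would work just as well.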
Here we once again only pit it against the 3900X as a point of reference, since the desktop CPU is a different class of hardware. The Xeons benefit from ECC RAM, which guards against RAM errors that could corrupt the transcode, but the desktop CPU, with its significantly higher clockspeed and newer architecture, sails ahead on all fronts, even at a significant thread disadvantage. That suggests video transcoding may not scale with core count as well as one would hope.
Which brings us to our next round of benchmarks.