Multiprocessing at a Glance (VIII): Conclusion

Let’s summarize what we’ve been going through.

About processors:
  • Originally, processors had single processing units (i.e. single core processors)
  • Eventually, processors were built with several distinct processing units (i.e. cores) in the same package
  • There are also multi-processor systems, mostly used in server environments
About operating systems:
  • Originally, they were simple and only allowed a single process to run at a time
  • They evolved to employ multi-tasking, whereby a program would only run for a limited amount of time before another process was executed, and so on
  • Multi-tasking is nowadays preemptive: the operating system stops the running program and resumes another, based on a particular algorithm implemented by the task scheduler
  • Later on, multi-threading support was added
About software in general:
  • Simple programs are single-threaded, meaning that the entire execution is one long list of instructions that must be performed sequentially; in these cases multi-tasking doesn’t bring benefits
  • Complex programs that need to use different resources, or need to perform the same operation on different sets of data, do benefit from multi-threading, especially on multi-core processors
  • Multi-threading can be counterintuitive to implement and requires the programmer to think differently
What understanding should we have gained so far?

We should understand a little better how processors execute instructions, which processor(s) we need for our daily use case and what the differences are between the various products on the market. If all one does is run applications that are not able (or don’t need) to take advantage of multi-threading, then a processor with good single-threaded performance is better. On the other hand, if one needs to execute programs that do benefit from using many threads, or to run many programs at the same time (for example background services), then having many cores is an advantage.

How is this relevant to you?

Today’s operating systems of all flavors are designed with background services in mind. That is, even if you only run a single program at a time, there is a plethora of background services running quietly and invisibly. Modern processors help operating systems by allowing them to schedule several of those services to “actually” run at the same time on different cores. This increases the responsiveness and the total throughput of the system.

How many cores are too few, and how many are too many?

This cannot be answered definitively. It largely depends on the time period you are asking about. Ten years ago, it seemed like having 2 to 4 cores with SMT was good enough for the typical mainstream desktop system running various utility and gaming software. Few applications were heavily threaded. Today, it would seem like the magic number is somewhere between 8 and 16 cores with SMT, mostly because the software industry at large, and the gaming industry in particular, is catching up. Probably in another ten years, 32 to 64 cores with SMT will be the mainstream norm. I cannot say where the core counts will stabilize though. The fact is that CPUs need to run far fewer SIMD workloads than GPUs do. On the other hand, CPUs have a lot more peripherals, so… But hey, I can’t even tell you if in 20 years’ time there will be desktop computers anymore. Probably not? Or at least not as we think of a desktop today.

It’s clear, though, that writing single-threaded software today for workloads that could easily be parallelized should be frowned upon. It’s no longer a missing feature, it’s simply a bug.
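
To give a feeling for how little it can take to parallelize such a workload, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the workload (counting primes by trial division) and the names `count_primes`, `split_range` and `count_primes_parallel` are hypothetical and do not come from the earlier episodes. It also uses worker processes rather than threads, because in standard CPython threads do not run CPU-bound Python code in parallel.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def count_primes(bounds):
    """Count primes in the half-open range [lo, hi) by trial division."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        # n is prime if no d in [2, sqrt(n)] divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(lo, hi, parts):
    """Split [lo, hi) into `parts` contiguous, near-equal sub-ranges."""
    step, extra = divmod(hi - lo, parts)
    ranges, start = [], lo
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def count_primes_parallel(lo, hi, workers=None):
    """Same result as count_primes((lo, hi)), spread over worker processes."""
    workers = workers or os.cpu_count() or 1
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each worker process counts the primes of one sub-range.
        return sum(pool.map(count_primes, split_range(lo, hi, workers)))
```

Called from a script, under an `if __name__ == "__main__":` guard, `count_primes_parallel(0, 200_000)` should return the same count as `count_primes((0, 200_000))`, only with the work spread over the available cores. The sub-ranges are independent of each other, which is exactly what makes this kind of workload easy to parallelize.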

Inside a server farm
Is there such a thing as too many cores?

Having too many cores cannot hurt the overall performance of the system. However, the downside of having more cores than necessary is the cost of the CPU and the energy consumption of the system. Let’s not forget that excess CPU energy consumption counts twice (or even three times) toward the total energy consumption: first directly, as energy that is wasted and converted into heat; second, as the increased cost and energy consumption of the (more expensive) cooling solution that moves that heat away from the system; and third, as the energy spent moving the heat out of the room.

Personally, I “save” that third share of energy by turning the heating in my office, where my PC lives, down by roughly the amount of heat my system produces. However, I still notice that when I play for longer periods, the room gets hotter (as both the CPU and the GPU warm up).

You might have already heard news (and I am sure you will hear more and more in the future) of big data centers being built thousands of kilometers away, usually in the North where the weather is cool, instead of just 20 kilometers away from Vegas. The limiting factors in remote areas are the availability of a suitable power grid, transport and communications infrastructure, and qualified personnel in those locations. But as the need for powerful AI systems becomes more and more a matter of national security for major world players, the necessary investments in power, transportation and communication infrastructure in such areas will be prioritized and this trend will continue.

Big countries like China and India, which operate large data centers but don’t have access to high latitudes, do have access to high altitudes, which in this case amounts to essentially the same thing.

What about the gigahertz war?

The performance of a general-purpose CPU is roughly given by this formula:

perf = (freq / memory_latency_factor) * (instr / clock) * (cores / socket) * sockets

Frequency is only one parameter out of 7. Improvements in any of those can translate into visible performance gains. The magic 5 GHz barrier that many people in the desktop market are obsessed with is mostly a marketing ploy. So much mental energy is being wasted comparing this metric alone.

Lately, the core count is another single metric which people tend to focus on, and I fear that this is becoming the new obsession. But at least the core counts today are more relevant since we are starting from lower numbers. A jump from 4 to 8 cores represents a doubling of cores, which has the potential to offer a significant overall performance boost of up to 100%. A jump of 200 MHz from 4.8 to 5.0 GHz may give you a boost of only about 4% in the best case, at a high energy penalty.
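
The arithmetic behind those two numbers can be checked with a few lines of Python. This is only a sketch of the rough formula above, under the assumption that everything else (memory latency factor, IPC, socket count) stays the same and that the workload scales perfectly with cores and frequency; the function name `relative_perf` and the placeholder value 1.0 are mine, chosen purely for illustration.

```python
def relative_perf(freq_ghz, memory_latency_factor, ipc, cores_per_socket, sockets):
    """Rough performance estimate following the formula in the text."""
    return (freq_ghz / memory_latency_factor) * ipc * cores_per_socket * sockets


# Placeholder values: memory latency factor and IPC are set to 1.0 because
# only the ratio between configurations matters here, not the absolute number.
baseline = relative_perf(4.8, 1.0, 1.0, 4, 1)      # 4 cores at 4.8 GHz
more_cores = relative_perf(4.8, 1.0, 1.0, 8, 1)    # 8 cores at 4.8 GHz
higher_clock = relative_perf(5.0, 1.0, 1.0, 4, 1)  # 4 cores at 5.0 GHz

cores_gain = more_cores / baseline - 1
clock_gain = higher_clock / baseline - 1

print(f"4 -> 8 cores:   +{cores_gain:.1%}")
print(f"4.8 -> 5.0 GHz: +{clock_gain:.1%}")
```

These are best-case figures: a real program only approaches the doubling if it can actually keep all 8 cores busy.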

Similarly, IPC and memory latency improvements may give you a double-digit performance increase from one CPU generation to the next. So try not to obsess about any single parameter when comparing hardware, and follow reviews and benchmarks relevant to your workloads. One or two extra frames per second in your favorite game is not worth 50 dollars and 50 Watts extra. Sorry, but it simply isn’t.

How to interpret benchmarks?

You have probably seen many product reviews and benchmarks before. There are many serious online publications and YouTube channels (this one included) periodically reviewing the latest developments in hardware and comparing the latest products with previous iterations or competitor offerings.

Of note, I would like to concentrate a bit on using games as benchmarking tools. Some outlets obsess over using games to benchmark new hardware (be it processors or graphics cards). That is usually because their target audience is dominated by gamers.

It is known that some games consistently favor one manufacturer of processors or graphics cards over another. What gives? Well, reading this miniseries should have made you aware that not all games are equal and that they are most likely implemented differently.

Sometimes, I wonder if the hardware reviewers are inadvertently benchmarking games instead of benchmarking hardware.

Yes, if you mostly game on your PC, then it’s OK to use primarily games as a measuring tool between different hardware offerings. But be smart. Don’t let any single game influence your decision, and try to think ahead. In other words, is the hardware under review better prepared for the games of tomorrow, or is it just optimized for yesterday’s games? And don’t get blinded by the bells and whistles. If a feature is not mature or not good enough to be really usable, don’t take it into account when comparing offerings.

The point is, take it with a pinch of salt when you see games being used as benchmarks. And do the same when you see results from official benchmarking software. Always ask yourself if the benchmark is relevant for your workload, and use your judgment. It’s not unheard of for certain hardware manufacturers to put pressure on, or provide “sweeteners” to, game or benchmark software studios in order to favor their own architecture, skewing the results.

Here is an excellent example of what I mean. But rest assured, no large corporation will refrain from doing that, wherever your allegiances lie.

What about operating systems?

As the name implies, operating systems sit at the core of any system. They are as relevant as the hardware – when it comes to performance. As such, they must be designed to take full advantage of the available hardware.

As of today – regarding multi-tasking in general – it seems that Linux is ahead of Windows since it is able to utilize the cores much more efficiently. But Windows is still the dominant operating system (for non-server workloads) so it’s a shame that most people cannot take full advantage of their hardware’s capability. However, recent developments have shown that Windows – while still behind – is trying to catch up.


Finally, after a long time of stagnation, lack of innovation and market segmentation – all due primarily to a lack of competition on the hardware market – it’s nice to see that things are progressing again. And now, it’s high time for software (be it productivity applications, games or operating systems in general) to do a similar push and unleash the power of the hardware to our collective benefit.

I wish you all a Happy New Year! We certainly do live in interesting times.


Previous: Episode VII – Practical example
