Now let’s have a look at how single-threaded and multi-threaded software works, and what are the differences. For this, we need to use an example, and for the sake of simplicity we will use a trivial one: to print all the numbers from 0 to 999 in increasing order.
We will be exemplifying using pseudo-code, so that on one hand we don’t tie ourselves down to any particular programming language, and on the other hand so that we don’t get bogged into tedious minutia about whether the syntax is correct, etc.
So let’s start with the pseudo-code of the single-threaded version:
function main() print("Listing numbers from 0 to 999:") for(each number in 0..999) do print(number) print("Done!")
Each time you would be running this program, it will output this:
Listing numbers from 0 to 999:
0
1
2
3
4
…
998
999
Done!
Looking at this code, we see that for each individual [number] we perform two main actions
- for every number between 0 and 999 (inclusive)
- we print the value of [number] to the screen
Now let’s assume we want to parallelize this application. It’s stupid – I know – but let’s just do it to see what happens. First, we would need to print the numbers in a dedicated function which we will later use to instantiate a second thread.
global number = 0 function printNumber() print(number) function main() print("Listing numbers from 0 to 999:") do printNumber() number = number + 1 while(number < 1000) print("Done!")
This program is still single threaded. Running it as-is will produce the exact same output as the first one. The only difference is that now we’ve moved the printing in a dedicated function.
Now let’s try to parallelize it. Let’s spawn a thread which will be in charge of doing the number printing, and the main thread will continue iterating through the value range for [number].
global number = 0 function printNumber() do print(number) while(number < 999) function main() print("Listing numbers from 0 to 999:") spawn_thread(printNumber()) do number = number + 1 while(number < 999) print("Done!")
First, you’ll notice that we had to make the printing function into a loop, so we added a do-while() construct. The reason for this is that if we want the thread to survive, we need the loop to prevent it from exiting the function. If we wouldn’t have the loop, the thread would simply die when the function would exit.
And here we hit a snag. I wouldn’t be able to tell you exactly what this program will print with this new thread added. It may produce the exact same output like our single-threaded counterpart, but most likely it won’t. Moreover, every run of this program may print different number sequences, even if it was run on the same machine repeatedly. How is that possible?
Disclaimer: I really do hope that my analysis below will be entirely correct. I’ve obviously never executed this pseudo-code on any machine other than on the jelly between my ears. I will try to use my experience to analyze what this program may do, but there is a chance that I might be missing a corner case. Let this stand as a testament to the fact that multi-threaded code is sometimes difficult to interpret, even if it is apparently simple.
So, let’s first understand what does this new version of the program really do. This program calls a fictive spawn_thread () function which presumably creates our extra thread that executes the function printNumber (). This new thread runs in parallel with the main thread.
Let’s consider a possible output:
Listing numbers from 0 to 999:
1
2
10
11
12
204
204
204
204
368
567
…
997
998
998
999
Done!
What do we observe? We see some numbers which are completely skipped (not printed) and we see that some other numbers are repeatedly printed. Moreover, our program started from 1, not 0 as we wanted.
Surely, we have introduced a bug somewhere…
Let me assure you that [number] is not really skipping values and that the program really does start from 0. Before we analyze further this output, allow me to repeat that every time you would run this program you might see a different result:
- you might see a result that doesn’t print any number at all, only the first line (
Listing numbers
…) followed by the last (Done!
) - you might see a listing similar to the one above – with some numbers missing, others repeatedly printed
- you might see listings where we have either numbers missing, but no duplicates or the opposite, duplicated numbers without a single one being skipped
- you also might see a “perfect” sequence in which every number is printed only once as in the single-threaded example (no skipping, no duplication)
What you will never see:
- the number 999 printed more than once
- numbers bigger than 999 being printed
- numbers printed in the reverse order (smaller number following a larger one)
Now that we’ve established that, let’s dissect it to understand what happens. And let’s start the analysis from our fictive output. For simplicity, let’s name the two threads, and see what each of them does.
The main thread (M):
- prints the first line
- increments [number] and verifies that it doesn’t go above 999
- prints the last line
The second thread (S):
- prints the current value of [number]
- verifies that the number has not yet reached 999, and if not, it continues printing
The problem with missing or repeated numbers is due to the fact that one of the two threads could be scheduled to run for a longer period of time at the expense of the other. Remember, the scheduling is not up to the program itself. It’s the operating system’s task scheduler which controls this. So from inside the program you cannot control when (M) or (S) are executing – or we can, but we will talk in a future episode about how.
For this reason, if (S) is executing for too long (at the expense of (M)), then you will get repeated numbers in the output as it is printing the same number over and over again. On the other hand, if (M) executes too long and (S) is not scheduled, it will increase the [number] repeatedly without (S) having a chance to print every value, hence you will notice “missing” numbers in the output.
The same reasoning also guarantees that the value 999 will only be printed once (after printing it, the condition in (S) will become false and break the loop), [number] will never exceed 999 (the increment and the comparison is also done in the same thread), and that the values will never be out of order since [number] is only incremented.
One last clarification. Having one or several cores will not change the caveats of running this program. Running this program in both single-core or multi-core CPU may give you similar issues: missing numbers, repeated numbers, and at the same time you will not see reversed numbers or numbers greater than 999.
Just because we expect that the thread (M) should increment [number] and then thread (S) should print it, it doesn’t mean that this is what we have actually instructed the processor to do.
Computers don’t do what we want them to do, they do what we tell them to do.
The reality is that, in this form, the multi-threaded version of the program is buggy because it suffers from a race condition. We will discuss in the future episodes about the various pitfalls one needs to be aware of when writing multi-threaded code. The good news is that we can fix it to make it run as we want by adding thread synchronization.
The eagle-eyed readers may have noticed that in the single threaded code we finish the do…while() loop by comparing [number] to 1000, yet in the multi-threaded variant we compare [number] to 999 in (M).
The reason for that is the same pesky race condition. Had we left the comparison with 1000 in (M), it would have meant that [number] could have reached that value. As such, under the right conditions, if the scheduler would have paused thread (S) just before the printing instruction is executed, and then it would have been resumed after thread (M) would have incremented [number] all the way up to 1000, we could have ended up with the value 1000 being printed. Which is pretty embarrassing to see after a line saying Listing numbers from 0 to 999
.
To be continued…
Previous: Episode III – Processes and Threads
Next: Episode V – Thread synchronization