Concurrency and Asynchronous Programming: a Detailed Overview

The role of the operating system

The operating system (OS) stands in the center of everything we do as programmers (well, unless you’re writing an operating system or working in the embedded realm), so there is no way for us to discuss any kind of fundamentals in programming without talking about operating systems in a bit of detail.

Concurrency from the operating system’s perspective

This ties into what I talked about earlier when I said that concurrency needs to be talked about within a reference frame, and I explained that the OS might stop and start your process at any time.

What we call synchronous code is, in most cases, code that appears synchronous to us as programmers. Neither the OS nor the CPU lives in a fully synchronous world.

Operating systems use preemptive multitasking, and as long as the operating system you’re running is preemptively scheduling processes, you won’t have a guarantee that your code runs instruction by instruction without interruption.

The operating system will make sure that all important processes get some time from the CPU to make progress.

Note

This is not as simple when we’re talking about modern machines with 4, 6, 8, or 12 physical cores, since you might actually execute code on one of the CPUs uninterrupted if the system is under very little load. The important part here is that you can’t know for sure and there is no guarantee that your code will be left to run uninterrupted.

Teaming up with the operating system

When you make a web request, you’re not asking the CPU or the network card to do something for you – you’re asking the operating system to talk to the network card for you.

There is no way for you as a programmer to make your system optimally efficient without playing to the strengths of the operating system. You basically don’t have access to the hardware directly. You must remember that the operating system is an abstraction over the hardware.

However, this also means that to understand everything from the ground up, you’ll also need to know how your operating system handles these tasks. To be able to work with the operating system, you’ll need to know how you can communicate with it, and that’s exactly what we’re going to go through next.
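To make that concrete, here’s a tiny sketch of my own (not one of the book’s examples): even printing a line of text is really a request to the OS, which in turn talks to the hardware on our behalf.

```rust
use std::io::Write;

fn main() -> std::io::Result<()> {
    // We never touch the terminal hardware ourselves. On Linux, this call
    // ends up as a write(2) system call on file descriptor 1 (stdout), and
    // it's the OS that drives the actual device.
    std::io::stdout().write_all(b"hello from user space\n")?;
    Ok(())
}
```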

Choosing the right reference frame

When you write code that is perfectly synchronous from your perspective, stop for a second and consider how that looks from the operating system perspective.

The operating system might not run your code from start to end at all. It might stop and resume your process many times. The CPU might get interrupted and handle some inputs while you think it’s only focused on your task.

So, synchronous execution is only an illusion. But from the perspective of you as a programmer, it’s not, and that is the important takeaway:

When we talk about concurrency without providing any other context, we are using you as a programmer and your code (your process) as the reference frame. If you start pondering concurrency without keeping this in the back of your head, it will get confusing very fast.

The reason I’m spending so much time on this is that once you realize the importance of having the same definitions and the same reference frame, you’ll start to see that some of the things you hear and learn that might seem contradictory really are not. You’ll just have to consider the reference frame first.

Asynchronous versus concurrent

So, you might wonder why we’re spending all this time talking about multitasking, concurrency, and parallelism, when the book is about asynchronous programming.

The main reason for this is that all these concepts are closely related to each other, and can even have the same (or overlapping) meanings, depending on the context they’re used in.

In an effort to make the definitions as distinct as possible, we’ll define these terms more narrowly than you’d normally see. However, just be aware that we can’t please everyone, and we do this for the sake of making the subject easier to understand. On the other hand, if you fancy heated internet debates, this is a good place to start. Just claim that someone else’s definition of concurrent is 100% wrong, or that yours is 100% correct, and off you go.

For the sake of this book, we’ll stick to this definition: asynchronous programming is the way a programming language or library abstracts over concurrent operations, and how we as users of a language or library use that abstraction to execute tasks concurrently.

The operating system already has an existing abstraction that covers this, called threads. Using OS threads to handle asynchrony is often referred to as multithreaded programming. To avoid confusion, we’ll not refer to using OS threads directly as asynchronous programming, even though it solves the same problem.

Given that asynchronous programming is now scoped to be about abstractions over concurrent or parallel operations in a language or library, it’s also easier to understand that it’s just as relevant on embedded systems without an operating system as it is for programs that target a complex system with an advanced operating system. The definition itself does not imply any specific implementation even though we’ll look at a few popular ones throughout this book.

If this still sounds complicated, I understand. Just sitting and reflecting on concurrency is difficult, but if we try to keep these thoughts in the back of our heads when we work with async code, I promise it will get less and less confusing.

Concurrency and its relation to I/O

As you might understand from what I’ve written so far, writing async code mostly makes sense when you need to be smart to make optimal use of your resources.

Now, if you write a program that is working hard to solve a problem, concurrency often doesn’t help. This is where parallelism comes into play, since it gives you a way to throw more resources at the problem if you can split it into parts that you can work on in parallel.

Consider the following two different use cases for concurrency:

  • When performing I/O and you need to wait for some external event to occur
  • When you need to divide your attention and prevent one task from waiting too long

The first is the classic I/O example: you have to wait for a network call, a database query, or something else to happen before you can progress a task. However, you have many tasks to do, so instead of waiting, you continue to work elsewhere and either check in regularly to see whether the task is ready to progress, or make sure you are notified when that task is ready to progress.
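As a rough sketch of the “check in regularly” strategy (my own illustration, with example.com as a placeholder address), we can put a socket into non-blocking mode and poll it in a loop, doing other work whenever the OS tells us nothing is ready yet:

```rust
use std::io::{ErrorKind, Read, Write};
use std::net::TcpStream;

fn main() -> std::io::Result<()> {
    // Connect and send a minimal request, then switch the socket to
    // non-blocking mode so reads return immediately instead of parking
    // the thread until data arrives.
    let mut stream = TcpStream::connect("example.com:80")?;
    stream.write_all(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")?;
    stream.set_nonblocking(true)?;

    let mut buf = [0u8; 1024];
    loop {
        match stream.read(&mut buf) {
            Ok(0) => break,                      // connection closed
            Ok(n) => println!("got {n} bytes"),  // data was ready for us
            Err(e) if e.kind() == ErrorKind::WouldBlock => {
                // Nothing to read yet: instead of waiting, do other useful
                // work here, then check in again on the next iteration.
            }
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```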

The second is an example that is often the case when having a UI. Let’s pretend you only have one core. How do you prevent the whole UI from becoming unresponsive while performing other CPU-intensive tasks?

Well, you can stop whatever task you’re doing every 16 ms, run the update UI task, and then resume whatever you were doing afterward. This way, you will have to stop/resume your task 60 times a second, but you will also have a fully responsive UI that has a roughly 60 Hz refresh rate.
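A minimal sketch of that idea, where `do_some_work` and `update_ui` are hypothetical stand-ins for the CPU-intensive task and the UI update:

```rust
use std::time::{Duration, Instant};

// Hypothetical stand-ins: one chunk of the heavy task, and the UI refresh.
fn do_some_work(state: &mut u64) { *state = state.wrapping_add(1); }
fn update_ui(state: &u64) { println!("frame: {state}"); }

fn main() {
    let mut state = 0u64;
    let frame = Duration::from_millis(16);
    loop {
        let deadline = Instant::now() + frame;
        // Work on the heavy task, but stop once the 16 ms budget is spent...
        while Instant::now() < deadline {
            do_some_work(&mut state);
        }
        // ...then give the UI a chance to refresh, roughly 60 times a second.
        update_ui(&state);
    }
}
```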

What about threads provided by the operating system?

We’ll cover threads a bit more when we talk about strategies for handling I/O later in this book, but I’ll mention them here as well. One challenge when using OS threads to understand concurrency is that they appear to be mapped to cores. That’s not necessarily a correct mental model to use, even though most operating systems will try to map one thread to one core, up to the point where the number of threads equals the number of cores.

Once we create more threads than there are cores, the OS will switch between our threads and progress each of them concurrently using its scheduler to give each thread some time to run. You also must consider the fact that your program is not the only one running on the system. Other programs might spawn several threads as well, which means there will be many more threads than there are cores on the CPU.

Therefore, threads can be a means to perform tasks in parallel, but they can also be a means to achieve concurrency.
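Here is a small sketch of my own that shows this: we spawn several times more threads than the number of logical cores the OS reports, and the scheduler still lets all of them run to completion:

```rust
use std::thread;

fn main() {
    // Number of logical cores the OS reports (includes hyper-threads).
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    println!("logical cores: {cores}");

    // Spawn far more threads than cores; the OS scheduler interleaves them.
    let handles: Vec<_> = (0..cores * 4)
        .map(|i| {
            thread::spawn(move || {
                let sum: u64 = (0..5_000_000u64).sum();
                println!("thread {i} finished (sum = {sum})");
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
```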

This brings me to the last part about concurrency. It needs to be defined in some sort of reference frame.

Hyper-threading

As CPUs evolved and added more functionality such as several arithmetic logic units (ALUs) and additional logic units, the CPU manufacturers realized that the entire CPU wasn’t fully utilized. For example, when an operation only required some parts of the CPU, another instruction could be run on the ALU simultaneously. This became the start of hyper-threading.

Your computer today, for example, may have 6 physical cores and 12 logical cores. This is exactly where hyper-threading comes in. It “simulates” two cores on the same physical core by using unused parts of the CPU to drive progress on thread 2 while simultaneously running the code on thread 1. It does this by using a number of smart tricks (such as the one with the ALU).
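If you want to see this on your own machine, here’s a small sketch that prints both counts. It assumes the third-party num_cpus crate has been added as a dependency; the standard library alone only reports the logical count:

```rust
// Cargo.toml (assumed): num_cpus = "1"

fn main() {
    // Logical cores include hyper-threads; physical cores do not.
    let logical = num_cpus::get();
    let physical = num_cpus::get_physical();
    // On a typical hyper-threaded CPU this prints something like
    // "physical cores: 6, logical cores: 12".
    println!("physical cores: {physical}, logical cores: {logical}");
}
```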

Now, using hyper-threading, we could actually offload some work to one thread while keeping the UI interactive by responding to events in the second thread, even though we only had one CPU core, thereby utilizing our hardware better.

You might wonder about the performance of hyper-threading

It turns out that hyper-threading has been continuously improved since the 90s. Since you’re not actually running two CPUs, there will be some operations that need to wait for each other to finish. The performance gain of hyper-threading compared to multitasking on a single core seems to be somewhere close to 30%, but it largely depends on the workload.

Multicore processors

As most know, the clock frequency of processors has been flat for a long time. Processors get faster by improving caches, branch prediction, and speculative execution, and by optimizing their processing pipelines, but the gains seem to be diminishing.

On the other hand, processor cores are now so small that we can fit many of them on the same chip. Now, most CPUs have many cores, and most often, each core will also have the ability to perform hyper-threading.

Do you really write synchronous code?

Like many things, this depends on your perspective. From the perspective of your process and the code you write, everything will normally happen in the order you write it.

From the operating system’s perspective, it might or might not interrupt your code, pause it, and run some other code in the meantime before resuming your process.

From the perspective of the CPU, it will mostly execute instructions one at a time. It doesn’t care who wrote the code, though, so when a hardware interrupt happens, it will immediately stop and give control to an interrupt handler. This is how the CPU handles concurrency.
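One way to glimpse this from user space is a rough experiment of my own: time a tight loop and look for unusually large gaps between iterations, which usually mark the moments the OS paused our thread to run something else:

```rust
use std::time::Instant;

fn main() {
    let mut worst = std::time::Duration::ZERO;
    let mut last = Instant::now();

    // Spin for a while and record the largest gap between two iterations.
    for _ in 0..50_000_000u64 {
        let now = Instant::now();
        let gap = now - last;
        if gap > worst {
            worst = gap;
        }
        last = now;
    }

    // A single iteration takes nanoseconds; gaps in the millisecond range
    // are typically the scheduler preempting us to run other code.
    println!("worst gap between iterations: {worst:?}");
}
```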

An evolutionary journey of multitasking

In the beginning, computers had one CPU that executed a set of instructions written by a programmer, one by one. No operating system (OS), no scheduling, no threads, no multitasking. This was how computers worked for a long time. We’re talking back when a program was assembled into a deck of punched cards, and you got in big trouble if you were so unfortunate as to drop the deck onto the floor.

Operating systems were being researched very early on, and when personal computing started to grow in the 80s, operating systems such as DOS were the standard on most consumer PCs.

These operating systems usually yielded control of the entire CPU to the program currently executing, and it was up to the programmer to make things work and implement any kind of multitasking for their program. This worked fine, but as interactive UIs using a mouse and windowed operating systems became the norm, this model simply couldn’t work anymore.

Non-preemptive multitasking

Non-preemptive multitasking was the first method used to keep a UI interactive (and run background processes).

This kind of multitasking put the responsibility of letting the OS run other tasks, such as responding to input from the mouse or running a background task, in the hands of the programmer.

Typically, the programmer yielded control to the OS.

Besides offloading a huge responsibility to every programmer writing a program for the platform, this method was naturally error-prone. A small mistake in a program’s code could halt or crash the entire system.

Note

Another popular term for what we call non-preemptive multitasking is cooperative multitasking. Windows 3.1 used cooperative multitasking and required programmers to yield control to the OS by using specific system calls. One badly behaving application could thereby halt the entire system.
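To make the idea concrete, here is a toy sketch of my own of cooperative scheduling (it’s not how Windows 3.1 worked, just the shape of the idea): each task does a small chunk of work and then returns control to a simple scheduler loop, which is the moral equivalent of yielding to the OS. A task that never returned would stall everything, which is exactly the weakness described above.

```rust
// A task reports whether it has more work left after doing one small chunk.
trait Task {
    fn run_one_chunk(&mut self) -> bool;
}

struct Counter { n: u32, limit: u32 }

impl Task for Counter {
    fn run_one_chunk(&mut self) -> bool {
        self.n += 1;
        println!("counter at {}", self.n);
        self.n < self.limit // returning here is how this task "yields"
    }
}

fn main() {
    let mut tasks: Vec<Box<dyn Task>> = vec![
        Box::new(Counter { n: 0, limit: 3 }),
        Box::new(Counter { n: 100, limit: 103 }),
    ];

    // The "scheduler": keep handing out turns until every task is done.
    // If any task looped forever inside run_one_chunk, nothing else would run.
    while !tasks.is_empty() {
        tasks.retain_mut(|t| t.run_one_chunk());
    }
}
```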

Preemptive multitasking

While non-preemptive multitasking sounded like a good idea, it turned out to create serious problems as well. Letting every program and programmer out there be responsible for having a responsive UI in an operating system can ultimately lead to a bad user experience, since every bug out there could halt the entire system.

The solution was to place the responsibility of scheduling the CPU resources between the programs that requested it (including the OS itself) in the hands of the OS. The OS can stop the execution of a process, do something else, and switch back.

On such a system, if you write and run a program with a graphical user interface on a single-core machine, the OS will stop your program to update the mouse position before it switches back to your program to continue. This happens so frequently that we don’t usually observe any difference whether the CPU has a lot of work or is idle.

The OS is responsible for scheduling tasks and does this by switching contexts on the CPU. This process can happen many times each second, not only to keep the UI responsive but also to give some time to other background tasks and I/O events.

This is now the prevailing way to design an operating system.

Note

Later in this book, we’ll write our own green threads and cover a lot of basic knowledge about context switching, threads, stacks, and scheduling that will give you more insight into this topic, so stay tuned.


Asynchronous programming is one of those topics many programmers find confusing. You come to the point where you think you’ve got it, only to later realize that the rabbit hole is much deeper than you thought. If you participate in discussions, listen to enough talks, and read about the topic on the internet, you’ll probably also come across statements that seem to contradict each other. At least, this describes how I felt when I was first introduced to the subject.

The cause of this confusion is often a lack of context, or authors assuming a specific context without explicitly stating so, combined with terms surrounding concurrency and asynchronous programming that are rather poorly defined.

In this chapter, we’ll be covering a lot of ground, and we’ll divide the content into the following main topics:

  • Async history
  • Concurrency and parallelism
  • The operating system and the CPU
  • Interrupts, firmware, and I/O

This chapter is general in nature. It doesn’t specifically focus on Rust, or any specific programming language for that matter, but it’s the kind of background information we need to go through so we know that everyone is on the same page going forward. The upside is that this will be useful no matter what programming language you use. In my eyes, that fact also makes this one of the most interesting chapters in this book.

There’s not a lot of code in this chapter, so we’re off to a soft start. It’s a good time to make a cup of tea, relax, and get comfortable, as we’re about to start this journey together.

Technical requirements

All examples will be written in Rust, and you have two alternatives for running the examples:

  • Write and run the examples on the Rust playground
  • Install Rust on your machine and run the examples locally (recommended)

The ideal way to read this chapter is to clone the accompanying repository (https://github.com/PacktPublishing/Asynchronous-Programming-in-Rust/tree/main/ch01/a-assembly-dereference), open the ch01 folder, and keep it open while you read the book. There, you’ll find all the examples we write in this chapter, as well as some extra information that you might find interesting. You can, of course, also come back to the repository later if you don’t have it accessible right now.