When writing an operating system, after you get past the tedium of the arbitrary rituals you need to initialize your hardware, you get to the fun part – designing the abstractions that your kernel will use. One of the things we need to think about is processes and threads.

In operating systems textbooks, there are usually separate but related abstractions for processes and (user level) threads. A process is an entire program in execution, and the process abstraction relies on a data structure (the process control block) that keeps track of the resources the program uses and does bookkeeping about its state. A thread, in comparison, is more lightweight. It only needs to keep track of its stack and execution context, and shares everything else with its parent process (including memory mappings, which is why threads share memory).

Therefore, as far as the abstraction is concerned, a thread is usually a subset of a process, and thread bookkeeping can be kept in a smaller data structure that is pointed to by the process it belongs to. The sharing of memory maps also means there is no expensive TLB flush when context switching between threads of the same process.
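To make the textbook layout concrete, here is a minimal sketch in C. Every name in it (struct process, struct thread, process_add_thread, switch_needs_tlb_flush) is made up for illustration and does not come from any real kernel. It also assumes a CPU without tagged TLB entries, so that changing address spaces always costs a flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical textbook-style layout; not taken from any real kernel. */

struct address_space {
    uintptr_t page_table_root;      /* e.g. the value loaded into CR3 on x86 */
};

struct cpu_context {
    uintptr_t stack_pointer;
    uintptr_t instruction_pointer;
};

struct process;

/* The lightweight part: a stack, an execution context, and a back pointer. */
struct thread {
    int tid;
    struct cpu_context ctx;
    void *stack_base;
    struct process *owner;          /* everything else is reached through here */
    struct thread *next;            /* next thread of the same process */
};

/* The process control block: owns the shared resources and the thread list. */
struct process {
    int pid;
    struct address_space mm;        /* shared by every thread in the list */
    struct thread *threads;
};

/* Attach a thread to a process by pushing it onto the process's list. */
void process_add_thread(struct process *p, struct thread *t)
{
    assert(p != NULL && t != NULL);
    t->owner = p;
    t->next = p->threads;
    p->threads = t;
}

/* Threads of the same process share mappings, so only a switch that
 * crosses a process boundary has to change address spaces. */
bool switch_needs_tlb_flush(const struct thread *prev, const struct thread *next)
{
    return prev->owner != next->owner;
}
```

The point of the back pointer is that a scheduler can answer "same address space or not?" with a single comparison, which is where the cheap thread switch comes from.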

Linux does not make much of a distinction between processes and threads. They’re both implemented as task_struct under the hood. Threads share VM mappings and other resources, but the abstraction that groups them into a single unit is thin: tasks created as threads of one another simply carry the same thread group id. A single-threaded process is a single task, and its task id is the process id. A multithreaded process, however, is just a loose-ish collection of tasks, each with its own task id, and the “process” id is simply the task id of the first thread in the group (the thread group leader).
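You can see the two ids from user space. The sketch below is Linux-specific and assumes glibc with pthreads. The helper names (current_task_ids, ids_seen_by_new_thread) are made up for this post. It uses the raw gettid syscall because older glibc versions have no gettid() wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

struct task_ids {
    pid_t pid;  /* what getpid() reports: the thread group id */
    pid_t tid;  /* what gettid reports: the calling task's own id */
};

/* Read both ids for whichever task calls this function. */
struct task_ids current_task_ids(void)
{
    struct task_ids ids;
    ids.pid = getpid();
    ids.tid = (pid_t)syscall(SYS_gettid);
    return ids;
}

/* Thread entry point: store the ids this thread sees into *arg. */
static void *record_ids(void *arg)
{
    *(struct task_ids *)arg = current_task_ids();
    return NULL;
}

/* Start one extra thread and return the ids it observed. */
struct task_ids ids_seen_by_new_thread(void)
{
    struct task_ids ids = {0, 0};
    pthread_t thread;
    int rc = pthread_create(&thread, NULL, record_ids, &ids);
    assert(rc == 0);
    rc = pthread_join(thread, NULL);
    assert(rc == 0);
    return ids;
}
```

Called from the main thread, current_task_ids() normally returns the same number twice. The thread started by ids_seen_by_new_thread() reports the same pid as the main thread but a tid of its own.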

This design decision has led to some unfortunate consequences, as the Golang people found out when trying to apply setuid/setgid to a multithreaded process (in Go’s case, a program whose runtime spreads goroutines across several OS threads) only to find that it doesn’t take effect on every thread. The only reason it works on Linux in a POSIX-compliant way is that glibc applies the setuid/setgid change to every single task in the process when the call goes through the C library. If you’re making a “raw” syscall to the kernel without using glibc, then the abstraction leaks.
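The difference between the two paths is easy to show side by side. This is a sketch for Linux with glibc, and the function names are made up for this post. It also assumes an architecture where SYS_setuid takes a full-width uid, such as x86-64 (32-bit x86 has a separate setuid32 call).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Through the C library: glibc makes sure every thread in the process
 * ends up with the new uid, which is what POSIX expects. */
int change_uid_whole_process(uid_t uid)
{
    return setuid(uid);
}

/* Straight to the kernel: credentials are per task on Linux, so only the
 * calling task is changed and sibling threads keep their old uid. */
int change_uid_calling_task_only(uid_t uid)
{
    return (int)syscall(SYS_setuid, uid);
}
```

In a single-threaded program the two behave the same. The gap only shows up once a second task exists, which is why a runtime that issues its own syscalls has to do the per-thread broadcast itself.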

Every design decision made when constructing your kernel affects everything else in the system. Therefore, it’s always a good idea to keep things like clean abstractions in mind, as hacks and workarounds can lead to unintended side effects.
