Belated musings regarding the microkernel vs. monolithic kernel debate

published May 11, 2006, last modified Jun 26, 2013

What's the latest discussion topic in the kernel development sphere? It's a rehash of a fairly old one: the famous microkernel vs. monolithic kernel debate.

Of course, the (now quite old) debate between Linus Torvalds and Andy Tanenbaum is an amusing read. But let's not confuse an "amusing read" with a statement of fact. Both Linus and Andy were "stating the obvious while standing in their own camp". Both stated facts to defend their positions, but they rarely (if at all) discussed the assumptions that led each of them to choose his strategy and to defend his facts as true.

Let's recall:

  • Linus Torvalds reminds Andy Tanenbaum that monolithic kernels are both faster and easier to develop for
  • Andy Tanenbaum advocates microkernels because they have much stronger fault isolation between "servers"

(In microkernel parlance, the word "server" describes a set of related responsibilities running as an independent process or thread.)

Again, let's state the obvious... we'll talk about the consequences of each kernel design philosophy right after this:

  • Monolithic kernels have one distinguishing characteristic: all kernel processes share the same address space. In layman's terms, this means that any kernel process can directly manipulate any kernel data or code.
  • Microkernels divide responsibilities among different processes in different address spaces ("hybrid" microkernels can, however, share one address space, but then what's the point of doing a microkernel?). No "server" can touch other servers' memory, and they're forced to communicate among each other by passing messages.
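
If it helps to see the difference, here is a toy sketch of the two layouts. It is only a model: Python objects stand in for kernel memory, and every name in it (MonolithicKernel, Server, the "open_files" counter) is made up for illustration, not taken from any real kernel.

```python
# Toy model only: Python objects stand in for kernel memory.
# All names here are hypothetical and exist just for this illustration.

class MonolithicKernel:
    """Every subsystem sees the same state, as in a shared address space."""

    def __init__(self):
        self.state = {"vfs": {"open_files": 3}, "net": {"sockets": 1}}

    def buggy_net_code(self):
        # Nothing stops network code from scribbling over VFS data.
        self.state["vfs"]["open_files"] = -999


class Server:
    """A microkernel-style server: private state, reachable only by message."""

    def __init__(self, **state):
        self._state = dict(state)  # private by convention in this sketch

    def handle(self, message):
        # The server itself decides which requests it honours.
        if message == ("get", "open_files"):
            return self._state.get("open_files")
        return "rejected"


mono = MonolithicKernel()
mono.buggy_net_code()
mono_result = mono.state["vfs"]["open_files"]  # the VFS data was overwritten

vfs = Server(open_files=3)
reply_to_bad_request = vfs.handle(("set", "open_files", -999))
reply_to_good_request = vfs.handle(("get", "open_files"))
```

In the first half the "network" code overwrites the VFS counter without anyone noticing. In the second half the only way in is a message, and the server is free to refuse it. A real microkernel enforces this with separate address spaces rather than with a naming convention, which is exactly what this sketch cannot show.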

Now, let's discuss the consequences of each different approach.

Consequences of the approaches

In monolithic kernels, one running kernel process can manipulate (or mess up) kernel data or code. A software fault in a monolithic kernel causes a (frequently unrecoverable) crash.

In microkernels, no server can touch another server's memory. This means that a buggy or malicious server cannot corrupt the memory of the rest of the system. But this comes at a cost: speed, in both development and execution.

Now, these two assertions aren't just "ideas" of mine. These are facts. Now, let's move on and discuss assumptions and tradeoffs.

Tradeoffs?

The costs of each strategy are, of course, debatable! Let's pick up that debate now.

The cost of a microkernel strategy...

...is twofold, but that doesn't mean it's necessarily bad. First, let's tackle the software development angle.

Developing a microkernel is generally harder. Why? Because there's more code to be written. Synchronization and locking between servers (remember that the servers are designed to cooperate with each other) and message passing are hard. An (arguably bad) example: the network layer cannot directly call or jump into a function in the virtual file system code at all -- in a microkernel, the network server has to request operations from the VFS server via messages, and wait for responses. Harder.
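
To make the "more code" point concrete, here is a minimal sketch of the same read done both ways. Everything in it is an assumption made for illustration: vfs_read, the ("read", path, reply_box) message shape, and the use of Python threads and queues as stand-ins for separate server processes and kernel IPC.

```python
import queue
import threading


def vfs_read(path):
    # Hypothetical stand-in for a real VFS operation.
    return f"contents of {path}"


# Monolithic style: the network code simply calls the function.
direct_result = vfs_read("/etc/hosts")

# Microkernel style: the VFS runs as its own server with an inbox.
vfs_inbox = queue.Queue()


def vfs_server():
    while True:
        request = vfs_inbox.get()
        if request is None:  # shutdown signal
            break
        operation, path, reply_box = request
        if operation == "read":
            reply_box.put(vfs_read(path))
        else:
            reply_box.put(None)  # unknown operation


server_thread = threading.Thread(target=vfs_server)
server_thread.start()


def net_server_read(path):
    reply_box = queue.Queue(maxsize=1)
    vfs_inbox.put(("read", path, reply_box))  # send the request...
    return reply_box.get(timeout=5)           # ...and wait for the response


message_result = net_server_read("/etc/hosts")

vfs_inbox.put(None)  # ask the server to stop
server_thread.join()
```

Both paths return the same string, but the second one needed a message format, a dispatch loop, a reply channel, a timeout and a shutdown protocol. Threads share memory, so this sketch shows only the plumbing, not the isolation.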

Of course, with the right abstractions, the coding phase can (and should) be accelerated. But the "right abstractions" usually come with a (performance) penalty of their own.

And the performance side of the issue doesn't get any brighter. Popular modern CPUs are very much optimized for the case where no context switching happens. And a context switch is expensive. A context switch happens when the operating system decides it's time to switch to a different task. Remember that microkernels run each responsibility set in a different process? Okay, what costs only the time of a function call in a monolithic kernel costs several context switches in a microkernel, because messages get passed around and, for them to be processed, the currently executing task needs to be changed frequently. Each context switch has a heavy cost, because a lot of things have to be changed at once to give the next process in line its illusion of having the CPU to itself. Think "cheating girlfriend who redecorates her room differently for each different boyfriend". Now picture her doing that thousands of times per second.
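
A back-of-the-envelope model shows how quickly this adds up. The two cost figures below are placeholders I picked for the sake of the arithmetic, not measurements of any real CPU, and the "two switches per request" figure (one into the server, one back) is the most optimistic assumption possible.

```python
# Placeholder figures chosen for illustration only, NOT measurements.
FUNCTION_CALL_NS = 5        # assumed cost of a plain function call
CONTEXT_SWITCH_NS = 2_000   # assumed cost of one context switch


def monolithic_cost_ns(calls):
    # In a monolithic kernel each request is just a function call.
    return calls * FUNCTION_CALL_NS


def microkernel_cost_ns(requests, switches_per_request=2):
    # At minimum: one switch into the server, one switch back.
    return requests * switches_per_request * CONTEXT_SWITCH_NS


requests = 10_000
mono_ns = monolithic_cost_ns(requests)    # 50_000 ns
micro_ns = microkernel_cost_ns(requests)  # 40_000_000 ns
slowdown = micro_ns / mono_ns             # 800.0
```

Plug in your own numbers; the shape of the result stays the same as long as a context switch costs far more than a function call. The model also ignores the cost of copying the messages themselves.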

So monolithic kernels will always win the performance race. That is, unless newer processors come with special tunings that make the cost of message passing and context switching negligible. Don't count on it happening too soon, but don't count on it never happening either: top-of-the-line processors already have built-in optimization strategies for paravirtualization and full virtualization (in turn, that case would be best exemplified by special kinds of "cheating girlfriends" who have "different rooms" for each boyfriend).

But microkernels can do wonderful things that monolithic kernels cannot. When a server fails, the microkernel (more accurately, another server) can kill it, isolate its damage, and restart it. If a malicious server finds its way into the system, the damage it can do is severely limited. These are all great advantages of a microkernel. Memory access is isolated, and allocation errors are detected without affecting other portions of the system. That's why many real-time and mission-critical systems use microkernel-based operating systems: even if partial damage or a hardware fault strips the system of some functionality, the rest of the system can keep on going, apply recovery techniques, or take evasive action.
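
The kill-and-restart idea can be sketched in a few lines. This is a simulation under loud assumptions: the crash is a Python exception rather than a real memory fault, and FlakyDriver and Supervisor are names I invented, not components of any actual microkernel.

```python
class ServerCrashed(Exception):
    """Raised when a (simulated) server dies."""


class FlakyDriver:
    """Hypothetical server: each instance crashes on its third request."""

    def __init__(self):
        self.handled = 0

    def handle(self, request):
        self.handled += 1
        if self.handled == 3:
            raise ServerCrashed("driver hit a bad pointer")
        return f"ok:{request}"


class Supervisor:
    """Plays the role of the server that watches over other servers."""

    def __init__(self, factory):
        self.factory = factory
        self.server = factory()
        self.restarts = 0

    def send(self, request):
        try:
            return self.server.handle(request)
        except ServerCrashed:
            # Throw the dead instance away, start a fresh one, retry once.
            self.server = self.factory()
            self.restarts += 1
            return self.server.handle(request)


supervisor = Supervisor(FlakyDriver)
replies = [supervisor.send(n) for n in range(5)]
```

The driver dies twice during those five requests, yet every caller gets an answer and nothing else in the "system" goes down with it. In a monolithic kernel the equivalent of that exception is the crash itself.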

Now, let's take a look at monolithic kernels.

The cost of a monolithic kernel strategy

On modern hardware, monolithic kernels win the performance race hands down. But they suffer from a huge problem.

One runaway pointer can crash the machine.

Yes, that's true. It takes only one. That's why modern operating systems (even Mac OS X) can crash due to device driver problems. Remember that modern operating systems make device drivers run in the kernel's address space. In other words, the kernel and the device drivers share the same memory, exactly the opposite of a microkernel.

It gets even worse. Under most systems, as root or Administrator, you can load kernel code into a running kernel. If you're a bad person (otherwise known as "malicious hacker"), you can run code which conceals itself and grants you permanent access, under any and all conditions, unbeknownst to the computer's owner. This is not fiction but a known fact, and its application goes by the name of "rootkit". Hacker bliss, indeed.

Of course, who's to say that this is harder on a microkernel-based system? Well, depending on the microkernel architecture (and there are several microkernels out there), some of them won't let you load code if you don't hold the proper credentials (because some microkernels have credential/identity/user context management as a server). Some of them will. But no microkernel will let you start a server that directly alters the memory of other servers.

Are you in the mood for some conclusions?

So, what's better?

I sincerely do not know. What, were you expecting an answer? I only wrote this article to let you make an informed decision.

Actually, I'm 99% confident that you won't be able to make much of a decision. I mean, I'm positive you understood the different facets of the discussion. I'm also positive that you've already come to like one of the two kernel development strategies better. But I also know that, if you're running a mainstream operating system (be it Windows, Linux or Mac OS X) you won't have much of a choice. So any decision you make (under your scenario) is moot.

Oh, let me state something before I go: I'm including Mac OS X in the monolithic kernel list, despite it being a hybrid microkernel, because it's actually a monolithic kernel (with processes sharing address spaces) running atop a microkernel. That seems like the dumbest strategy to me, since they are probably not reaping any of the microkernel rewards, but they're definitely paying the microkernel penalties.

And, with that, this article ends. I'm going to grab a bite. Have a great day!

By the way, in the interest of keeping this article unbiased, I refrained from putting my personal preference in its text. But the first comment of this article has a detailed opinion, courtesy of yours truly.