systemd and the crashing tweet

By now you have probably read that systemd is so easy to “crash” that the command to do it fits in a tweet (How to Crash Systemd in One Tweet).

This post led to a first response by David Timothy Strauss (How to Throw a Tantrum in One Blog Post), which was answered by Andrew Ayer (Systemd is not Magic Security Dust), to which David Timothy Strauss replied in turn (Ayer vs. systemd, Part 4).

But let’s take a look at the bug and the arguments against systemd presented in the first post by Andrew Ayer.

What about the bug?

The bug was introduced by d875aa8ce10b, which did not properly take into account the changes made in 5ba6985b6c8e. It was reported as issue #4234. The first suggested fix (531ac2b2349) was unfortunately incorrect, and two more commits (8523bf7dd51 and 9987750e7a4c) were required to get it right.

Thus this bug has a different impact depending on the systemd version you are running:

< 209: unaffected;
>= 209 & < 219: systemd will kill all services using the Watchdog feature. As journald and logind are using it, a lot of functionality will be missing;
>= 219: the assertion is reached and systemd will stop executing.

How does systemd handle this?

These asserts make sure that the arguments given to a function match its expectations. If an assert fails, then anything can be expected and the execution state of systemd can no longer be considered valid. A failed assert is handled by simply stopping systemd execution (and not by crashing). Everything else is kept running, but the system is obviously in a degraded state.

Unfortunately, this is the only way to properly handle such cases. It is much safer to stop as soon as possible when something unexpected happens than to keep going and make the bug much harder to discover.
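To illustrate the difference between stopping and crashing, here is a minimal sketch of an assertion that freezes the process instead of aborting it. This is not systemd's actual code: freeze_forever, assert_or_freeze and message_length_checked are hypothetical names chosen for this example.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: stop making progress but keep the process alive. */
static void freeze_forever(void) {
    for (;;)
        pause();    /* sleep until a signal arrives, then sleep again */
}

/* Hypothetical macro: report the failed expectation, then freeze. */
#define assert_or_freeze(expr)                                  \
    do {                                                        \
        if (!(expr)) {                                          \
            fprintf(stderr, "Assertion '%s' failed at %s:%d\n", \
                    #expr, __FILE__, __LINE__);                 \
            freeze_forever();                                   \
        }                                                       \
    } while (0)

/* Example caller: it expects a non-NULL, non-empty buffer.
 * Returns len unchanged when the expectations hold. */
size_t message_length_checked(const char *buf, size_t len) {
    assert_or_freeze(buf != NULL);
    assert_or_freeze(len > 0);
    return len;
}
```

A plain assert() would call abort() instead. For a regular daemon that is fine, but if PID 1 exits the kernel panics, so freezing is the less destructive way to stop.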

Should systemd be written in a safe language?

“The systemd developers understand none of this, opting to cram an enormous amount of unnecessary complexity into PID 1, which runs as root and is written in a memory-unsafe language.”

systemd should be written in a memory-safe language. The obvious picks are Rust or Go. But was this really an option back when the project was started?

The systemd project was started in 2009 (the first commit is from 2009-11-18), announced in April 2010 and the first release made in July 2010.

According to Wikipedia, Go was created around 2009, the first development release made in March 2011 and the 1.0 release in March 2012.

Rust (Wikipedia) was created around 2010, the 0.1 release made in January 2012 and the 1.0 release in May 2015.

Thus neither Go nor Rust was an option at the time. Don’t tell me it should have been written in OCaml, Haskell, Java, Ada, whatever. All of those memory-safe languages were either unfit for such low-level software (mandatory garbage collection) or too esoteric to be a good pick (not widely used enough in open source communities). Moreover, languages cannot be considered independently from the ecosystem of tools available for them.

But the systemd developers are aware that C is unsafe and that writing correct C code is hard. This is why they use some GNU C extensions, most notably to help with memory management and thus avoid the most common errors (see GCC “cleanup” extension for C).
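As a rough illustration of what the “cleanup” extension provides, the sketch below frees a buffer automatically on every return path. It requires GCC or Clang. The names free_charp, AUTO_FREE and joined_length are made up for this example and are not systemd's own macros.

```c
#include <stdlib.h>
#include <string.h>

/* Cleanup handler: called with a pointer to the variable leaving scope. */
static void free_charp(char **p) {
    free(*p);               /* free(NULL) is a no-op */
}

/* Hypothetical shorthand for the GNU C attribute. */
#define AUTO_FREE __attribute__((cleanup(free_charp)))

/* Returns the length of prefix followed by name, or -1 if allocation fails. */
int joined_length(const char *prefix, const char *name) {
    AUTO_FREE char *buf = NULL;
    size_t n = strlen(prefix) + strlen(name) + 1;

    buf = malloc(n);
    if (!buf)
        return -1;          /* early return: the handler still runs */

    strcpy(buf, prefix);
    strcat(buf, name);

    /* The return value is computed first, then buf is freed. */
    return (int) strlen(buf);
}
```

No explicit free() appears in joined_length, so a return path added later cannot forget it. This removes a whole class of leaks, but it does nothing against logic errors.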

Moreover, this is a logic bug, not directly connected to any specific weakness of C.

A possible path forward to solve this issue would be to start rewriting parts of systemd or libraries used by systemd in Rust. For example, take a look at the work in progress for syslog-ng (Syslog-ng and Rust).

Should systemd parse files or user provided content as non-root and outside of PID 1?

“In particular, any code that accepts messages from untrustworthy sources like systemd-notify should run in a dedicated process as an unprivileged user. The unprivileged process parses and validates messages before passing them along to the privileged process. This is called privilege separation and has been a best practice in security-aware software for over a decade. Systemd, by contrast, does text parsing on messages from untrusted sources, in C, running as root in PID 1.”

While this is definitely a good design principle in most cases, it does not make much sense here.

Let’s say we managed to split all input parsing into separate unprivileged processes. Those processes still have to give this information to systemd PID 1 at some point. Thus we will have to format it in some way and send it to PID 1, which will have to parse it.

The thing here is that this format could hardly be any simpler than what is already used. Most of the data parsed by systemd PID 1 comes from unit files or the notification socket. The format used to represent data in both cases (mostly one Key=Value per line, except for some substitutions in unit files) is not complex enough that parsing it in a separate process would bring any security advantage.

Simply splitting this into another process won’t help from a security point of view, and it would not have helped with the current issue, as this was not a parsing bug but a logic bug: empty notification messages are considered valid and are correctly parsed.
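To make the distinction between a parsing bug and a logic bug concrete, here is a small sketch of a parser for “one Key=Value per line” messages. It is not systemd's parser; count_assignments is a hypothetical function written only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Counts the KEY=VALUE lines in a newline-separated message.
 * Blank lines are skipped. Returns -1 if a non-blank line has no '='
 * or has an empty key. */
int count_assignments(const char *msg) {
    int count = 0;
    const char *line = msg;

    while (*line) {
        size_t len = strcspn(line, "\n");        /* length of this line */
        const char *eq = memchr(line, '=', len); /* first '=' in the line */

        if (len > 0) {
            if (!eq || eq == line)
                return -1;      /* malformed: no '=' or empty key */
            count++;
        }

        line += len;
        if (*line == '\n')
            line++;             /* step over the line separator */
    }
    return count;
}
```

An empty message goes through this parser without any error and yields zero assignments. Whether the caller then copes with “nothing to do” is a question of logic, and moving the parser to another process would not answer it.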

A possible path forward to solve this issue would be to use a per-unit unprivileged instance of systemd that would be responsible for notifications. Any issue related to notifications would then be properly confined to a process outside of PID 1. But this would increase the complexity of notification handling, as some notifications would still have to be transmitted to systemd PID 1.

Should systemd be split into several processes?

“Furthermore, the real init system, even when running as a non-PID 1 process, should be structured in a modular way such that a failure in one of the riskier components does not bring down the more critical components. For instance, a failure in the daemon management code should not prevent the system from being cleanly rebooted.”

Yes it should, but this is much harder to do correctly than to say. Because systemd is a single process without multiple threads, a lot of operations are really simple to handle: no concurrency, locking or communication is needed. Splitting those operations across several processes would significantly increase systemd’s complexity.

So sure, systemd should be split into several processes, but this is not going to happen overnight. The single-process design of systemd is the result of a trade-off between:

simple, thus less prone to failure, but with most failures critical;
complex, thus more prone to failure (and harder to develop and debug), with only some failures critical.

A possible path forward to solve this issue would be to split the core PID 1 functionality (reaping child processes and getting their return code) from everything else.
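For a sense of how small that core could be, here is a minimal sketch of the reaping part alone. It is an assumption about what such a split might look like, not a design taken from systemd, and reap_exited_children is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reaps every child that has already exited, without blocking.
 * If last_exit_code is not NULL, it receives the exit code of the last
 * child that terminated normally. Returns the number of children reaped. */
int reap_exited_children(int *last_exit_code) {
    int reaped = 0;
    int status;

    /* -1 means "any child". With WNOHANG, waitpid() returns 0 instead of
     * blocking when children exist but none has exited yet, and -1 when
     * there are no children at all. */
    while (waitpid(-1, &status, WNOHANG) > 0) {
        if (last_exit_code && WIFEXITED(status))
            *last_exit_code = WEXITSTATUS(status);
        reaped++;
    }
    return reaped;
}
```

A real PID 1 would call something like this whenever it receives SIGCHLD and forward the results to whatever process implements the service management logic. That forwarding is where the extra complexity discussed above comes from.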

Are the D-Bus interfaces of systemd daemons proprietary and non-standard?

“Systemd is far more than an init system: it is becoming a secondary operating system kernel, providing a log server, a device manager, a container manager, a login manager, a DHCP client, a DNS resolver, and an NTP client. These services are largely interdependent and provide non-standard interfaces for other applications to use.”

New interfaces are by definition non-standard. But D-Bus interfaces are also by definition open for introspection. Moreover, systemd interfaces are documented (see resolved for example) and details about their stability are provided (see Interface stability promise and Interface Portability And Stability Chart).

Should systemd stop using umask(0)?

“The default umask should be restrictive, so forgetting to change the umask when creating a file would result in a file that obviously doesn’t work.”

I could not find any reason not to use umask properly. This may be a valid issue.
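The effect of the umask on newly created files can be checked with a short sketch like the one below. The helper mode_created_under_umask is hypothetical and only meant to show the mechanism. Note that the umask is process-wide.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates path while requesting mode 0666 under the given umask, then
 * removes the file again. The previous umask is restored.
 * Returns the permission bits the file actually got, or -1 on error. */
int mode_created_under_umask(const char *path, mode_t mask) {
    struct stat st;
    mode_t old = umask(mask);   /* umask() returns the previous mask */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0666);
    int rc;

    umask(old);
    if (fd < 0)
        return -1;

    rc = fstat(fd, &st);
    close(fd);
    unlink(path);
    return rc < 0 ? -1 : (int) (st.st_mode & 0777);
}
```

With a umask of 0 the requested mode is applied as is, so a file created with 0666 ends up world-writable. With a umask of 0077 the same call yields 0600, and the caller has to open up permissions explicitly when that is really wanted.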

Conclusion

Some of the concerns raised in the original article are valid. If someone were to start a project like systemd right now, they should certainly think long and hard about the design, the language to use, what to put in which process.

But blaming the systemd developers for the design decisions and trade-offs they made when they started the project almost 7 years ago is useless. The track record of the project also shows that, while imperfect, the systemd architecture works quite well. There have not been as many critical bugs as some detractors would have liked.

Timothée Ravier