One of my colleagues recently asked me if there were any best practice guides for designing and testing software daemons (background processes). I hadn’t known of any before writing this blog post (and most of what I found while researching was related to the mechanics of writing daemons), but we came up with a few ideas together and maybe this can serve as a starting point for others.

Note: all of this is from a Linux perspective. I haven’t written any daemons for macOS, and some of your concerns might be different with Windows Services.

Restarts and supervision

A daemon process runs in the background and is generally expected to stay running until it is explicitly stopped. Sometimes software stops unexpectedly (e.g., a crash), and you might want to think about what happens afterward:

Process supervisors can help with some of the mechanics of running daemons. Supervisors are separate programs (usually either daemons themselves or integrated into an init system) responsible for monitoring the state of daemons and potentially defining restart policies around them. Some common supervisors include supervisord and systemd, though there are many others. In a container-based system, you’ll frequently see the role of a supervisor performed by a container orchestrator like Amazon ECS or Kubernetes.

Many process supervisors allow you to define policy around restarts. Some good things to think about in defining that policy include:
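To make this a bit more concrete, here’s a minimal sketch of what a restart policy can look like under systemd; the service name (mydaemon), paths, and specific values are hypothetical and illustrative rather than recommendations:

    # /etc/systemd/system/mydaemon.service (hypothetical)
    [Unit]
    Description=Example daemon (illustrative only)
    # Stop retrying if the service fails 5 times within 60 seconds
    StartLimitIntervalSec=60
    StartLimitBurst=5

    [Service]
    # Assumes the daemon runs in the foreground (see the init systems section below)
    ExecStart=/usr/local/bin/mydaemon --foreground
    # Restart on crashes and non-zero exits, but not on clean stops
    Restart=on-failure
    # Wait 5 seconds between restart attempts
    RestartSec=5

    [Install]
    WantedBy=multi-user.target

Container orchestrators express the same ideas in their own configuration (for example, a Kubernetes Pod’s restartPolicy), but the underlying questions are the same.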

Monitoring and logging

Since daemons are typically background processes that run without interactivity, it can be a challenge to know what’s going on with one. Is it running? Is it receiving requests? Many daemons emit logs recording information about their activity. Some daemons write logs to files directly, while others use log facilities like syslog or simply write to stdout and expect another process to be reading from that descriptor. Here are some questions to get started:
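As a small sketch of the “write to stdout” and “send to syslog” approaches (in Python; the logger name, format, and handler choices are made up for illustration):

    import logging
    import logging.handlers
    import sys

    logger = logging.getLogger("mydaemon")  # hypothetical daemon name
    logger.setLevel(logging.INFO)

    # Option 1: write to stdout and let whatever is supervising the daemon
    # (systemd, a container runtime, etc.) capture and route the stream.
    stdout_handler = logging.StreamHandler(sys.stdout)
    stdout_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    logger.addHandler(stdout_handler)

    # Option 2: send records to the local syslog socket instead.
    syslog_handler = logging.handlers.SysLogHandler(address="/dev/log")
    syslog_handler.setFormatter(
        logging.Formatter("mydaemon[%(process)d]: %(message)s"))
    logger.addHandler(syslog_handler)

    logger.info("daemon started, entering main loop")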

Monitoring your daemon is also generally useful. Some daemons have workload-specific information that is useful to monitor, and there is also process-level, (sometimes) language-level, and system-level data that’s useful to know. Daemons might be running under a supervisor, and that supervisor might also have useful information. A variety of mechanisms exist for both exposing and exporting this data, including tooling like Prometheus. Let’s do the question thing again:
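On the exposing/exporting side, here’s a hedged sketch using the Python prometheus_client library; the metric names, port, and fake workload are all made up for illustration:

    import random
    import time

    from prometheus_client import Counter, Gauge, start_http_server

    # Hypothetical workload-level metrics for this daemon.
    REQUESTS_TOTAL = Counter("mydaemon_requests_total",
                             "Total requests processed")
    QUEUE_DEPTH = Gauge("mydaemon_queue_depth",
                        "Items currently waiting to be processed")

    def main():
        # Serve metrics at http://localhost:8000/metrics for a scraper
        # (such as Prometheus) to collect.
        start_http_server(8000)
        while True:
            REQUESTS_TOTAL.inc()
            QUEUE_DEPTH.set(random.randint(0, 10))  # stand-in for real state
            time.sleep(1)

    if __name__ == "__main__":
        main()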

Upgrades, downgrades, and dependency changes

Upgrading daemons and their dependencies can be challenging because daemons are typically designed to stay running indefinitely. Daemons can be designed to interact with software changes in different ways. Some daemons stay running during upgrades and downgrades. Others integrate with package managers to trigger restarts as a result of an upgrade or a downgrade. There isn’t necessarily a single right answer here; what one particular daemon needs might not be needed by others. Daemons that operate as servers may want to stay running to continue processing requests. Daemons that have more asynchronous behavior may choose to restart as part of an upgrade or downgrade so that the running software reflects what’s installed on disk.

Keeping a daemon running during an upgrade has its own challenges: while the executable code of the main process will remain in memory (on Linux), a daemon that depends on a dynamically-linked library (a .so file) may behave unexpectedly if the library is upgraded and a different version is loaded. Restarting a daemon means integrating with the upgrade process and accepting some amount of unavailability during the upgrade. There might also be data compatibility issues; the daemon’s state might need to undergo a schema migration. In question form:
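If you do choose to restart as part of an upgrade, handling the stop signal gracefully can keep the unavailability window small. A rough sketch (in Python, assuming SIGTERM is the supervisor’s stop signal, which is the systemd default):

    import signal
    import time

    shutting_down = False

    def handle_sigterm(signum, frame):
        # Note that we should stop, but finish in-flight work first rather
        # than exiting mid-task.
        global shutting_down
        shutting_down = True

    signal.signal(signal.SIGTERM, handle_sigterm)

    while not shutting_down:
        # Stand-in for the daemon's real work loop.
        time.sleep(1)

    # Flush state to disk, close connections, etc., then exit cleanly so the
    # upgraded version starts from consistent on-disk data.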

Dependencies can also be challenging. Daemons might have multiple kinds of dependencies with different semantics: dynamically-linked libraries, kernel interfaces, remote APIs over a network, a message-passing system like D-Bus, persistent data stored in a particular format, and so on. Some of these might be affected by an upgrade/downgrade (dynamically-linked libraries, kernel interfaces) and some might not (remote APIs over a network), but all are worth thinking through.

Init systems and daemonizing

Init systems can act as the kind of process supervisor covered above, but not all of them do. There are a few special considerations for init systems that are worth thinking through when designing a daemon.

More traditional init systems (in the SysV style) expect programs to “daemonize” themselves: handle the mechanics of placing themselves into the background and running asynchronously. These init systems don’t typically perform much supervision, and may have fairly simple conventions around reporting status and handling dependency startup ordering. Daemons that run under these systems are generally expected to do the following for themselves:

  • Close all open file descriptors (especially standard input, standard output and standard error)
  • Change its working directory to the root filesystem, to ensure that it doesn’t tie up another filesystem and prevent it from being unmounted
  • Reset its umask value
  • Run in the background (i.e., fork)
  • Disassociate from its process group (usually a shell), to insulate itself from signals (such as HUP) sent to the process group
  • Ignore all terminal I/O signals
  • Disassociate from the control terminal (and take steps not to reacquire one)
  • Handle any SIGCLD signals

(from the daemonize tool website)
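A minimal Python sketch of those steps (the classic double-fork recipe) might look something like the following; it’s illustrative only, and tools like daemonize handle the edge cases (such as SIGCLD handling and pidfiles) more carefully:

    import os
    import signal
    import sys

    def daemonize():
        # First fork: return control to the shell and leave its process group.
        if os.fork() > 0:
            sys.exit(0)

        os.setsid()  # become a session leader, detaching from the terminal
        signal.signal(signal.SIGHUP, signal.SIG_IGN)

        # Second fork: ensure we are no longer a session leader, so we can
        # never reacquire a controlling terminal.
        if os.fork() > 0:
            sys.exit(0)

        os.chdir("/")  # don't pin a filesystem that might be unmounted
        os.umask(0)    # reset the umask

        # Redirect the standard descriptors to /dev/null.
        devnull = os.open(os.devnull, os.O_RDWR)
        for fd in (0, 1, 2):
            os.dup2(devnull, fd)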

This can be a lot to test and get right (there are fairly detailed guides too), so there is tooling like daemonize to help you do that. But there are also modern init systems like systemd and upstart (though upstart has generally been abandoned in favor of systemd) that prefer that your daemon not “daemonize” itself but instead run in the foreground; they take responsibility for the work of isolating the daemon process. These init systems may have additional recommendations, but the general process is simpler. Init systems like this may also provide features around dependency management, monitoring/log handling, and activation that can be useful to you. The amount of work that an init system like this abstracts away can make it enticing to tie your daemon to that init system; that can be appropriate for some use cases but can hinder broad adoption.

Activation/startup

How and why the daemon starts is another useful thing to think about. Some daemons are expected to start when the system boots and run indefinitely. Others might only be necessary when a particular piece of work comes in (a network request, a message, a new device) and can be started on demand. Modern init systems like systemd can help model these different activation behaviors.
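As one hedged example, systemd’s socket activation lets the init system own the listening socket and start the daemon only when the first connection arrives; the unit names and port here are hypothetical:

    # mydaemon.socket (hypothetical)
    [Socket]
    ListenStream=8080

    [Install]
    WantedBy=sockets.target

    # mydaemon.service (hypothetical)
    [Service]
    ExecStart=/usr/local/bin/mydaemon --foreground
    # The daemon receives the already-listening socket from systemd (for
    # example via the sd_listen_fds() API) rather than opening it itself.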

Further reading

This blog post is only a start, and only represents my way of thinking about daemons. I would be remiss if I failed to link to the resources I found as I was writing this article. Here they are, in no particular order:

I hope this blog post has been useful. If you have anything to add or any corrections you’d like to suggest, I’d appreciate it if you left a comment here!